Data processing of deep learning
2022-07-07 00:42:00 【Peng Xiang】
Data manipulation
- Data type: the most commonly used one is the array

Creating an array requires:
- Shape: the number of rows and columns
- Element type: int or float
- Element values
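The three ingredients listed above can be sketched as follows. This is a minimal example that assumes PyTorch (the library used later in this post); the variable names and values are illustrative and not from the original.

```python
import torch

# Shape and element type given explicitly; every element value is zero.
zeros = torch.zeros((2, 3), dtype=torch.int64)

# Element values 0..11 as floats, rearranged into 3 rows and 4 columns.
x = torch.arange(12, dtype=torch.float32).reshape(3, 4)

# Element values given directly; shape and type are inferred from the nested list.
explicit = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

print(zeros.shape, zeros.dtype)
print(x.shape, x.dtype)
print(explicit.shape, explicit.dtype)
```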
Array access methods:
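A minimal sketch of common ways to access array elements, assuming PyTorch tensors (the library used later in this post). The tensor `x` and the chosen indices are illustrative and not taken from the original.

```python
import torch

# Three rows of four values: [0..3], [4..7], [8..11].
x = torch.arange(12).reshape(3, 4)

element = x[1, 2]      # a single element: row 1, column 2
row = x[1, :]          # all of row 1
column = x[:, 1]       # all of column 1
block = x[1:3, 1:]     # rows 1 and 2, columns 1 onward
strided = x[::2, ::3]  # every 2nd row, every 3rd column

print(element, row, column, block, strided, sep="\n")
```

Indices start at 0, and a slice `start:stop` excludes `stop`.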
The broadcasting mechanism works as follows: first, expand one or both arrays by copying elements appropriately, so that after the transformation the two tensors have the same shape. Second, perform the elementwise operation on the resulting arrays.
Because a is a column matrix and b is a row matrix, their shapes do not match if we add them directly. Both matrices are therefore broadcast into a larger matrix: the columns of a are copied, the rows of b are copied, and the two are then added element by element.
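The two broadcasting steps described above can be sketched as follows. The shapes 3×1 and 1×2 are an assumption chosen for illustration, since the original example did not survive; `expand` is used only to make the copying step visible.

```python
import torch

a = torch.arange(3).reshape(3, 1)  # column matrix: [[0], [1], [2]]
b = torch.arange(2).reshape(1, 2)  # row matrix: [[0, 1]]

# Broadcasting happens implicitly when the shapes differ.
result = a + b

# The same computation with the copying step written out:
a_expanded = a.expand(3, 2)  # the column of a repeated across 2 columns
b_expanded = b.expand(3, 2)  # the row of b repeated across 3 rows
manual = a_expanded + b_expanded

print(result)
print(torch.equal(result, manual))
```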
Data preprocessing
Create a file and write data
import os
os.makedirs(os.path.join('.', 'data'), exist_ok=True)  # Create a data folder in the current directory
data_file = os.path.join('.', 'data', 'house_tiny.csv')  # Path of house_tiny.csv inside the data folder
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # Column names
    f.write('NA,Pave,127500\n')  # Each row represents a data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')
Read the file. For CSV files, the pandas library is commonly used.
import pandas as pd
data = pd.read_csv(data_file)
print(data)

Handling missing values and converting data
There are two ways to handle missing values: imputation and deletion. Imputation replaces a missing value with a value we supply; deletion simply drops the missing data so that it is no longer considered. Here we use imputation, replacing each missing value with the mean of its column.
# Select by position: the first two columns are the inputs, the third column is the output
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]
# Fill missing numeric values with the column mean (numeric_only skips the string column Alley)
inputs = inputs.fillna(inputs.mean(numeric_only=True))
# Alley is a string column whose only values are Pave and NaN, so it is split into two
# indicator columns, Alley_Pave and Alley_nan, each 1 where the row has that value and 0 otherwise
inputs = pd.get_dummies(inputs, dummy_na=True, dtype=float)
print(inputs)
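For comparison, the deletion and imputation approaches mentioned above can be sketched side by side. The small DataFrame below is a hypothetical stand-in that mirrors the CSV written earlier, so the example runs without the file.

```python
import pandas as pd

# Hypothetical frame mirroring the CSV above (NaN/None mark missing values).
data = pd.DataFrame({
    "NumRooms": [float("nan"), 2.0, 4.0, float("nan")],
    "Alley": ["Pave", None, None, None],
    "Price": [127500, 106000, 178100, 140000],
})

# Deletion: drop every row whose NumRooms is missing.
deleted = data.dropna(subset=["NumRooms"])

# Imputation: replace missing NumRooms with the mean of the observed values.
imputed = data.copy()
imputed["NumRooms"] = imputed["NumRooms"].fillna(imputed["NumRooms"].mean())

print(len(deleted))                  # number of rows that survive deletion
print(imputed["NumRooms"].tolist())  # all rows kept, gaps filled
```

Deletion keeps only fully observed rows, which halves this tiny dataset, while imputation keeps every sample at the cost of inserting estimated values.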

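Convert the data into tensors
import torch
# to_numpy(dtype=float) guarantees a purely numeric array even if some columns are boolean
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.to_numpy())
print(x, y)
At this point the data has been converted into tensors, a form that the computer can process.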
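One detail worth checking after the conversion is the element type. The sketch below uses small hypothetical arrays (not the tutorial's data) to show that `torch.tensor` keeps the NumPy element type, so 64-bit floats stay 64-bit unless they are cast.

```python
import numpy as np
import torch

# Hypothetical stand-ins for the feature matrix and the price column.
features = np.array([[3.0, 1.0], [2.0, 0.0]], dtype=np.float64)
prices = np.array([127500, 106000], dtype=np.int64)

x = torch.tensor(features)  # keeps float64
y = torch.tensor(prices)    # keeps int64
x32 = x.float()             # cast to float32, PyTorch's default float type

print(x.dtype, y.dtype, x32.dtype)
```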
Complete code:
import os
os.makedirs(os.path.join('.', 'data'), exist_ok=True)
data_file = os.path.join('.', 'data', 'house_tiny.csv')
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # Column names
    f.write('NA,Pave,127500\n')  # Each row represents a data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')
import pandas as pd
data = pd.read_csv(data_file)
print(data)
# The first two columns are the inputs, the third column is the output
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]
# Fill missing numeric values with the column mean
inputs = inputs.fillna(inputs.mean(numeric_only=True))
inputs = pd.get_dummies(inputs, dummy_na=True, dtype=float)
print(inputs)
import torch
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.to_numpy())
print(x, y)