Data processing of deep learning
2022-07-07 00:42:00 【Peng Xiang】
Data manipulation
- Data type: the most commonly used one is the array.


Creating an array requires:
- Shape: the number of rows and columns
- Element type: int or float
- Element values
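The three ingredients above can be sketched as follows. This is a minimal illustration that assumes PyTorch (the library imported later in this post); the variable names are chosen only for the example.

```python
import torch

# Shape: 3 rows and 4 columns, filled with zeros.
# The element type defaults to 32-bit float.
zeros = torch.zeros((3, 4))

# Element type: request integers explicitly through dtype.
ints = torch.ones((2, 3), dtype=torch.int64)

# Element values: supply them directly as a nested list.
values = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

print(zeros.shape, zeros.dtype)
print(ints.shape, ints.dtype)
print(values)
```

Shape, element type, and values can each be given explicitly or left to a default, which is why most constructors take a shape plus an optional `dtype`.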
Array access:
Code:




The broadcasting mechanism works as follows. First, one or both arrays are expanded by copying elements appropriately, so that after the conversion the two tensors have the same shape. Second, the operation is performed element by element on the resulting arrays.
Because a and b are matrices of different shapes, they cannot be added directly. Instead, both are broadcast into a larger matrix: the columns of matrix a are copied, the rows of matrix b are copied, and the two results are then added element by element.
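The two steps can be made concrete with a small sketch. The shapes used here (a as 3x1, b as 1x2) are an assumption for illustration, since the text does not state them; `expand` is used only to spell out the copying step that `a + b` performs implicitly.

```python
import torch

a = torch.arange(3).reshape(3, 1)   # column: [[0], [1], [2]]
b = torch.arange(2).reshape(1, 2)   # row: [[0, 1]]

# Broadcasting happens implicitly in a + b; the result has shape (3, 2).
c = a + b

# The same result with the copying step written out explicitly.
a_big = a.expand(3, 2)   # the column of a repeated across 2 columns
b_big = b.expand(3, 2)   # the row of b repeated across 3 rows
assert torch.equal(c, a_big + b_big)

print(c)
```

Broadcasting only applies when each pair of dimensions is either equal or one of them is 1; otherwise the addition raises an error.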
Data preprocessing
Create files and write data
import os

os.makedirs(os.path.join('.', 'data'), exist_ok=True)  # Create a data folder in the current directory
data_file = os.path.join('.', 'data', 'house_tiny.csv')  # Path of house_tiny.csv inside the data folder
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # Column names
    f.write('NA,Pave,127500\n')  # Each row represents one data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')
Read the file. For CSV files, the pandas library is the usual choice.
import pandas as pd

data = pd.read_csv(data_file)
print(data)
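One detail worth making explicit: by default `read_csv` treats the string `NA` as a missing value, so those cells arrive as NaN. The sketch below checks this on an in-memory copy of the first two data rows; `io.StringIO` stands in for the file and is not part of the original walkthrough.

```python
import io

import pandas as pd

# The header and the first two samples from house_tiny.csv, kept in memory.
csv_text = "NumRooms,Alley,Price\nNA,Pave,127500\n2,NA,106000\n"
data = pd.read_csv(io.StringIO(csv_text))

# True marks the cells that pandas parsed as missing.
print(data.isna())
# NumRooms becomes a float column because NaN is a floating-point value.
print(data.dtypes)
```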

Handling missing values and converting the data
There are two common ways to handle missing values: imputation and deletion. Imputation replaces a missing value with a substitute that we choose; deletion drops the affected rows or columns so that they are no longer considered. Here we impute, filling each missing value with the mean of its column.
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]  # Inputs: the first two columns; outputs: the third column (Price)
inputs = inputs.fillna(inputs.mean(numeric_only=True))  # Fill missing numeric values with the column mean; numeric_only skips the string column Alley
inputs = pd.get_dummies(inputs, dummy_na=True)  # Alley only takes the values Pave and NaN, so it becomes two indicator columns, Alley_Pave and Alley_nan
print(inputs)
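For contrast with the mean imputation used here, the deletion approach can be sketched with pandas `dropna`. This is a minimal illustration on an in-memory copy of the toy table, not part of the original walkthrough; the variable names are made up for the example.

```python
import numpy as np
import pandas as pd

# The same values as house_tiny.csv, built in memory.
data = pd.DataFrame({
    "NumRooms": [np.nan, 2.0, 4.0, np.nan],
    "Alley": ["Pave", np.nan, np.nan, np.nan],
    "Price": [127500, 106000, 178100, 140000],
})

# Deletion by row: keep only the rows with no missing value at all.
rows_kept = data.dropna(axis=0)

# Deletion by column: drop every column that has any missing value.
cols_kept = data.dropna(axis=1)

# A milder variant: drop rows only when NumRooms is missing.
rooms_known = data.dropna(subset=["NumRooms"])

print(len(rows_kept), list(cols_kept.columns), len(rooms_known))
```

On this table every row has at least one missing value, so row-wise deletion leaves nothing. That is the practical argument for imputing on small datasets.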

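Convert the data into tensors
import torch

# dtype=float also converts the indicator columns, which newer pandas versions emit as booleans
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.to_numpy(dtype=float))
print(x, y)
At this point the data has been converted into tensors, a form that the deep learning framework can process directly.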
Complete code:
import os

import pandas as pd
import torch

# Create the data folder and write the toy dataset
os.makedirs(os.path.join('.', 'data'), exist_ok=True)
data_file = os.path.join('.', 'data', 'house_tiny.csv')
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # Column names
    f.write('NA,Pave,127500\n')  # Each row represents one data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')

# Read the file
data = pd.read_csv(data_file)
print(data)

# Handle missing values and encode the categorical column
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]  # Inputs: the first two columns; outputs: Price
inputs = inputs.fillna(inputs.mean(numeric_only=True))  # Fill missing numeric values with the column mean
inputs = pd.get_dummies(inputs, dummy_na=True)  # One indicator column per Alley value, including NaN
print(inputs)

# Convert to tensors; dtype=float also converts boolean indicator columns
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.to_numpy(dtype=float))
print(x, y)