Data processing of deep learning
2022-07-07 00:42:00 【Peng Xiang】
Data manipulation
- Data type: the most commonly used one is the array
Creating an array requires:
- Shape: how many rows and columns
- Element type: int or float
- Element values
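The three ingredients listed above can be sketched as follows. This is a minimal illustration that assumes NumPy as the array library (the names `zeros`, `values`, and `grid` are made up for the example); torch tensors are created in much the same way.

```python
import numpy as np

# Shape: 3 rows and 4 columns; element type: 32-bit float; element values: all zeros
zeros = np.zeros((3, 4), dtype=np.float32)

# Explicit element values; the shape (2, 3) and an integer type are inferred from the nested list
values = np.array([[1, 2, 3], [4, 5, 6]])

# 12 consecutive integers (0 to 11) rearranged into 3 rows and 4 columns
grid = np.arange(12).reshape(3, 4)

print(zeros.shape, zeros.dtype)
print(values.shape, values.dtype)
print(grid)
```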
Array access methods:
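As a rough sketch of the usual access patterns, assuming a NumPy array `x` with 3 rows and 4 columns (an illustrative choice, not the original post's code), elements can be read by index and by slice:

```python
import numpy as np

x = np.arange(12).reshape(3, 4)  # rows: [0 1 2 3], [4 5 6 7], [8 9 10 11]

element = x[1, 2]      # a single element: row 1, column 2
row = x[1, :]          # one whole row
column = x[:, 1]       # one whole column
block = x[1:3, 1:]     # a sub-region: rows 1 and 2, columns 1 onward
strided = x[::2, ::3]  # every 2nd row and every 3rd column

print(element)
print(row)
print(column)
print(block)
print(strided)
```

Indices start at 0, and a slice `start:stop` excludes `stop`.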
The broadcasting mechanism works as follows: first, one or both arrays are expanded by copying elements appropriately so that, after the conversion, the two tensors have the same shape. Second, an element-wise operation is performed on the resulting arrays.
Because a and b are matrices of different shapes, they cannot be added directly. Broadcasting expands both into a larger common matrix: the columns of matrix a are copied, the rows of matrix b are copied, and the two are then added element by element.
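To make the copying step concrete, here is a small sketch that assumes `a` has shape (3, 1) and `b` has shape (1, 2) (the shapes are chosen for illustration) and uses NumPy. It compares the broadcast sum with the same sum computed after copying the elements by hand.

```python
import numpy as np

a = np.arange(3).reshape(3, 1)  # shape (3, 1): [[0], [1], [2]]
b = np.arange(2).reshape(1, 2)  # shape (1, 2): [[0, 1]]

# Broadcasting: both operands are treated as shape (3, 2) before the addition
c = a + b

# The same result built by hand
a_expanded = np.repeat(a, 2, axis=1)  # copy the column of a twice -> shape (3, 2)
b_expanded = np.repeat(b, 3, axis=0)  # copy the row of b three times -> shape (3, 2)
manual = a_expanded + b_expanded

print(c)
print(np.array_equal(c, manual))
```

Broadcasting does not actually allocate the expanded copies; the explicit `np.repeat` calls are only there to show what the rule is equivalent to.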
Data preprocessing
Create a file and write the data
import os
os.makedirs(os.path.join('.', 'data'), exist_ok=True)  # Create a data folder in the current directory
data_file = os.path.join('.', 'data', 'house_tiny.csv')  # Path of house_tiny.csv inside the data folder
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # Column names
    f.write('NA,Pave,127500\n')  # Each row represents one data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')
Read the file. For CSV files, the pandas library is the usual choice.
import pandas as pd
data=pd.read_csv(data_file)
print(data)
Handling missing values and converting the data
Missing values can be handled in two ways: insertion and deletion. Insertion fills the gap with a value that we supply; deletion discards the affected data so that it is no longer considered. Here we use insertion and replace each missing value with the mean of its column.
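The difference between the two approaches can be sketched on a small in-memory table. The DataFrame below is an illustrative stand-in with only numeric columns, not the CSV file itself.

```python
import numpy as np
import pandas as pd

# Illustrative data: two of the four NumRooms values are missing
df = pd.DataFrame({
    "NumRooms": [np.nan, 2.0, 4.0, np.nan],
    "Price": [127500, 106000, 178100, 140000],
})

# Deletion: drop every row that contains a missing value
deleted = df.dropna()

# Insertion: replace each missing value with the mean of its column
imputed = df.fillna(df.mean())

print(deleted)
print(imputed)
```

Deletion keeps only two of the four samples here, which is why insertion is often preferred when the data set is small.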
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]  # The first two columns are the inputs, the third column (Price) is the output
inputs = inputs.fillna(inputs.mean(numeric_only=True))  # Fill missing numeric values with the mean of the column
inputs = pd.get_dummies(inputs, dummy_na=True, dtype=float)  # Alley only takes the values Pave and NaN, so it is split into two 0/1 columns: Alley_Pave and Alley_nan
print(inputs)
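To see what the encoding step produces in isolation, the sketch below applies `pd.get_dummies` to a one-column DataFrame. The `alley` table is an illustrative stand-in for the Alley column of the file.

```python
import numpy as np
import pandas as pd

# Illustrative column: one "Pave" entry and three missing entries
alley = pd.DataFrame({"Alley": ["Pave", np.nan, np.nan, np.nan]})

# dummy_na=True adds an indicator column for missing values;
# dtype=int makes the indicators 0/1 instead of booleans
encoded = pd.get_dummies(alley, dummy_na=True, dtype=int)

print(encoded)
```

Each category, including "missing", gets its own column, so a row with Pave has a 1 in the Pave column and a 0 in the missing-value column, and the other rows have the opposite.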
Convert the data into tensors
import torch
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.values)  # Cast the inputs to float so every column has the same numeric type
print(x, y)
At this point the data has been converted into tensors, a format that torch can process directly.
Complete code:
import os
os.makedirs(os.path.join('.', 'data'), exist_ok=True)
data_file = os.path.join('.', 'data', 'house_tiny.csv')
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # Column names
    f.write('NA,Pave,127500\n')  # Each row represents one data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')
import pandas as pd
data = pd.read_csv(data_file)
print(data)
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]  # The first two columns are the inputs, the third column (Price) is the output
inputs = inputs.fillna(inputs.mean(numeric_only=True))  # Fill missing numeric values with the mean of the column
inputs = pd.get_dummies(inputs, dummy_na=True, dtype=float)  # Split Alley into the 0/1 columns Alley_Pave and Alley_nan
print(inputs)
import torch
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.values)
print(x, y)