Data processing of deep learning
2022-07-07 00:42:00 【Peng Xiang】
Data manipulation
- Data type: the most commonly used one is the array
Creating an array requires:
- Shape: the number of rows and columns
- Element type: int or float
- Element value
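The three ingredients listed above can be sketched as follows. NumPy is used here as an assumption, since the article does not show which array library it uses at this point; the variable names are illustrative only.

```python
import numpy as np

# Shape: 3 rows and 4 columns; element type: 32-bit float; element value: all zeros.
zeros = np.zeros((3, 4), dtype=np.float32)

# Element values 0..11 stored as integers, then arranged into 3 rows and 4 columns.
counts = np.arange(12, dtype=np.int64).reshape(3, 4)

# Explicit element values; the element type is inferred as float from the decimals.
explicit = np.array([[1.0, 2.5], [3.0, 4.5]])

print(zeros.shape, zeros.dtype)
print(counts.shape, counts.dtype)
print(explicit.shape, explicit.dtype)
```

Each array carries its shape in `.shape` and its element type in `.dtype`, so both can be checked right after creation.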
Array access methods:
Code:
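The article's own access code did not survive, so the following is only a minimal sketch of common access patterns, assuming a NumPy array; the names `x`, `single`, `last_row`, `second_col` and `block` are made up for illustration.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)  # rows: [0..3], [4..7], [8..11]

single = x[1, 2]       # one element: row 1, column 2
last_row = x[-1]       # the last row
second_col = x[:, 1]   # every row, column 1
block = x[1:3, :2]     # rows 1 and 2, columns 0 and 1

x[0, 0] = 100          # writing uses the same index syntax as reading

print(single)
print(last_row)
print(second_col)
print(block)
```

Indices start at 0, a negative index counts from the end, and a slice `start:stop` excludes `stop`.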
The broadcasting mechanism works as follows: first, expand one or both arrays by copying elements appropriately so that, after the conversion, the two tensors have the same shape. Second, perform an elementwise operation on the resulting arrays.
Because a and b are matrices of different shapes, their shapes do not match if we add them directly. We broadcast both into a larger matrix: the columns of matrix a are copied, the rows of matrix b are copied, and the two are then added elementwise.
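The two steps can be checked on a small case. The shapes below (a column of 3 values for `a`, a row of 2 values for `b`) are an assumption, since the article does not state them, and NumPy stands in for whichever array library is used.

```python
import numpy as np

a = np.arange(3).reshape(3, 1)  # shape (3, 1): [[0], [1], [2]]
b = np.arange(2).reshape(1, 2)  # shape (1, 2): [[0, 1]]

# Implicit broadcasting: both operands are stretched to shape (3, 2) before adding.
total = a + b

# The same result by hand: copy a's column twice and b's row three times.
a_big = np.repeat(a, 2, axis=1)  # shape (3, 2)
b_big = np.repeat(b, 3, axis=0)  # shape (3, 2)
manual = a_big + b_big

print(total)
print(np.array_equal(total, manual))
```

The explicit copies are only there to make the rule visible; in practice `a + b` is enough, and the library does not need to materialize the enlarged arrays.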
Data preprocessing
Create a file and write data
import os
os.makedirs(os.path.join('.', 'data'), exist_ok=True)  # Create a data folder in the current directory
data_file = os.path.join('.', 'data', 'house_tiny.csv')  # Create house_tiny.csv inside the data folder
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # Column names
    f.write('NA,Pave,127500\n')  # Each row represents one data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')
Read the file. For CSV files, the pandas library is the usual choice.
import pandas as pd
data = pd.read_csv(data_file)
print(data)
Handling missing values and converting data
Missing values can be handled in two ways: imputation and deletion. Imputation replaces a missing value with a value we supply; deletion drops the entry so that it is no longer considered. Here we replace each missing value with the mean of its column.
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]  # The first two columns are the inputs; the last column is the output
inputs = inputs.fillna(inputs.mean(numeric_only=True))  # Fill missing numeric values with the column mean; string columns such as Alley are skipped
inputs = pd.get_dummies(inputs, dummy_na=True)  # Alley only takes the values Pave and NaN, so it is split into two indicator columns, Alley_Pave and Alley_nan
print(inputs)
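For comparison, the deletion approach mentioned above can be placed next to mean imputation. This is only a sketch: the table is rebuilt in memory instead of being read from `house_tiny.csv`, and the names `deleted` and `imputed` are illustrative.

```python
import numpy as np
import pandas as pd

# Same four samples as house_tiny.csv, built directly in memory.
data = pd.DataFrame({
    "NumRooms": [np.nan, 2.0, 4.0, np.nan],
    "Alley": ["Pave", np.nan, np.nan, np.nan],
    "Price": [127500, 106000, 178100, 140000],
})

# Deletion: drop every row whose NumRooms is missing.
deleted = data.dropna(subset=["NumRooms"])

# Imputation: fill the missing NumRooms with the mean of the observed values.
imputed = data.copy()
imputed["NumRooms"] = imputed["NumRooms"].fillna(imputed["NumRooms"].mean())

print(deleted)
print(imputed)
```

Deletion leaves only two of the four samples here, which is one reason imputation is often preferred when the dataset is small.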
Convert the data into tensors
import torch
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.to_numpy(dtype=float))  # Cast to float first so that the indicator columns convert cleanly
print(x, y)
At this point the data has been converted into tensors, a form the computer can process.
Complete code:
import os
os.makedirs(os.path.join('.', 'data'), exist_ok=True)
data_file = os.path.join('.', 'data', 'house_tiny.csv')
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # Column names
    f.write('NA,Pave,127500\n')  # Each row represents one data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')
import pandas as pd
data = pd.read_csv(data_file)
print(data)
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]  # The first two columns are the inputs; the last column is the output
inputs = inputs.fillna(inputs.mean(numeric_only=True))  # Fill missing numeric values with the column mean
inputs = pd.get_dummies(inputs, dummy_na=True)  # Split Alley into the indicator columns Alley_Pave and Alley_nan
print(inputs)
import torch
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.to_numpy(dtype=float))
print(x, y)