Data processing of deep learning
2022-07-07 00:42:00 【Peng Xiang】
Data manipulation
- Data type: the most commonly used one is the array


Creating an array requires:
- Shape: the number of rows and columns
- Element type: int or float
- Element values
Array access: elements are selected by index or by slice.
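The three ingredients above (shape, element type, element values) and the usual access patterns can be sketched as follows. This is a minimal illustration, assuming PyTorch as the array library (the post uses it later); the variable names and the 3x4 example values are made up for this sketch.

```python
import torch

# Shape: 3 rows and 4 columns; element type: float32; element values: 0..11
x = torch.arange(12, dtype=torch.float32).reshape(3, 4)

# Same idea with an integer element type and all values set to 0
zeros = torch.zeros((2, 3), dtype=torch.int64)

# Access by index or slice (indices start at 0)
one_element = x[1, 2]    # the element in row 1, column 2
one_row = x[1, :]        # every element of row 1
one_column = x[:, 1]     # every element of column 1
sub_region = x[1:3, 1:]  # rows 1 and 2, columns 1 to the end

print(x.shape, x.dtype)
print(one_element, one_row, one_column, sub_region, sep='\n')
```

Slices follow the usual Python convention: the start index is included and the end index is excluded.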
The broadcasting mechanism works as follows. First, one or both arrays are expanded by copying elements appropriately so that, after the conversion, the two tensors have the same shape. Second, an elementwise operation is performed on the resulting arrays.
Because a and b are matrices whose shapes do not match, they cannot be added directly. Both are broadcast into a larger matrix: the columns of a are copied, the rows of b are copied, and the two are then added elementwise.
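The two steps can be made explicit in a short sketch. The shapes here are an assumption chosen for illustration (a is 3x1 and b is 1x2), since the original figure is not available; `expand` is used to show the "copying" step by hand, and the result is compared with what `a + b` produces automatically.

```python
import torch

a = torch.arange(3).reshape(3, 1)  # shape (3, 1): [[0], [1], [2]]
b = torch.arange(2).reshape(1, 2)  # shape (1, 2): [[0, 1]]

# Step 1: repeat elements so both arrays have the common shape (3, 2).
# expand() repeats values along size-1 dimensions (as a view, without real copies).
a_big = a.expand(3, 2)  # the single column of a is repeated
b_big = b.expand(3, 2)  # the single row of b is repeated

# Step 2: apply the operation element by element
manual = a_big + b_big

# Broadcasting performs both steps implicitly
auto = a + b

print(auto)
print(torch.equal(manual, auto))
```

Broadcasting only applies when the mismatched dimensions have size 1; arrays such as 3x2 and 2x3 cannot be added this way.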
Data preprocessing
Create a file and write data
import os
os.makedirs(os.path.join('.', 'data'), exist_ok=True)  # Create a data folder in the current directory
data_file = os.path.join('.', 'data', 'house_tiny.csv')  # Path of house_tiny.csv inside the data folder
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # Column names
    f.write('NA,Pave,127500\n')  # Each row represents one data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')
Read the file. For CSV files, the pandas library is the usual choice.
import pandas as pd
data = pd.read_csv(data_file)
print(data)

Handling missing values and converting data
There are two ways to handle missing values: imputation and deletion. Imputation replaces a missing value with a value we supply; deletion drops the missing entries so they are no longer considered. Here we impute each missing value with the mean of its column.
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]  # Select by position: the first two columns are the inputs, the third is the output
inputs = inputs.fillna(inputs.mean(numeric_only=True))  # Fill missing numeric values with the column mean; numeric_only skips the string column Alley
inputs = pd.get_dummies(inputs, dummy_na=True)  # Alley only takes the values Pave and NaN, so it is split into two indicator columns: Alley_Pave and Alley_nan
print(inputs)
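The deletion method mentioned above is not shown in the post. A minimal sketch of it, assuming the same four sample rows built directly as a DataFrame (the variable names are illustrative, not from the original):

```python
import pandas as pd

# Same sample values as house_tiny.csv, with missing entries as NaN/None
data = pd.DataFrame({
    'NumRooms': [float('nan'), 2.0, 4.0, float('nan')],
    'Alley': ['Pave', None, None, None],
    'Price': [127500, 106000, 178100, 140000],
})

# Deletion by row: keep only the samples whose NumRooms is known
kept_rows = data.dropna(subset=['NumRooms'])

# Deletion by column: drop the column with the most missing values
missing_per_column = data.isna().sum()
worst_column = missing_per_column.idxmax()
reduced = data.drop(columns=worst_column)

print(kept_rows)
print(worst_column)
print(reduced)
```

Deletion is simple but discards information; with only four samples, dropping rows removes half of the data, which is one reason to prefer imputation here.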

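Convert the data into tensors
import torch
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.to_numpy(dtype=float))  # dtype=float also converts the boolean indicator columns that newer pandas versions produce
print(x, y)
At this point the data has been converted into tensors, a numeric form that the framework can compute on directly.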
Complete code:
import os
os.makedirs(os.path.join('.', 'data'), exist_ok=True)
data_file = os.path.join('.', 'data', 'house_tiny.csv')
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # Column names
    f.write('NA,Pave,127500\n')  # Each row represents one data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')
import pandas as pd
data = pd.read_csv(data_file)
print(data)
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]  # Select by position: the first two columns are the inputs, the third is the output
inputs = inputs.fillna(inputs.mean(numeric_only=True))  # Fill missing numeric values with the column mean
inputs = pd.get_dummies(inputs, dummy_na=True)  # Split Alley into the indicator columns Alley_Pave and Alley_nan
print(inputs)
import torch
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.to_numpy(dtype=float))
print(x, y)