Data processing of deep learning
2022-07-07 00:42:00 【Peng Xiang】
Data manipulation
- Data type: the most commonly used one is the array


Creating an array requires:
- Shape: the number of rows and columns
- Element type: int or float
- Element values
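The three ingredients listed above (shape, element type, element values) can be sketched as follows. This is only an illustration and assumes NumPy as the array library; the variable names `zeros`, `ints` and `grid` are made up for the example.

```python
import numpy as np

# Shape (3, 4): 3 rows and 4 columns; element type float32; every value is 0.
zeros = np.zeros((3, 4), dtype=np.float32)

# Explicit element values with an explicitly chosen integer element type.
ints = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

# The values 0..11 arranged as 3 rows and 4 columns.
grid = np.arange(12).reshape(3, 4)

print(zeros.shape, zeros.dtype)
print(ints.shape, ints.dtype)
print(grid)
```

Shape and element type can always be read back through the `shape` and `dtype` attributes, which is a quick way to check an array before using it.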
Ways to access array elements:
Code:
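A minimal sketch of common access patterns, assuming NumPy and a hypothetical 3 x 4 array `x`; the names on the left are illustrative only.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)  # rows 0..2, columns 0..3

one_element = x[1, 2]    # a single element: row 1, column 2
one_row = x[1, :]        # every column of row 1
one_column = x[:, 1]     # every row of column 1
sub_block = x[1:3, 1:]   # rows 1 and 2, columns 1 onward
strided = x[::2, ::2]    # every second row and every second column

print(one_element)
print(one_row)
print(one_column)
print(sub_block)
print(strided)
```

Indices start at 0, and a slice `start:stop` includes `start` but excludes `stop`.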




This mechanism (broadcasting) works as follows: first, one or both arrays are expanded by copying elements appropriately so that, after the conversion, the two tensors have the same shape. Second, an element-wise operation is performed on the resulting arrays.
Because a and b are matrices of different shapes, their shapes do not match if we add them directly. We broadcast both matrices into a larger matrix as follows: the columns of matrix a are copied, the rows of matrix b are copied, and the results are then added element-wise.
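The two steps can be sketched as follows. The shapes are an assumption made for illustration (a is taken to be 3 x 1 and b to be 1 x 2), and NumPy is assumed as the array library. The manual expansion with `np.repeat` is only there to make the copying visible; broadcasting itself does not need it.

```python
import numpy as np

a = np.arange(3).reshape(3, 1)  # assumed shape (3, 1): a single column
b = np.arange(2).reshape(1, 2)  # assumed shape (1, 2): a single row

# Broadcasting: the shapes differ, yet the addition is defined.
total = a + b

# The same result, with the copying spelled out by hand.
a_expanded = np.repeat(a, 2, axis=1)  # copy the column of a -> shape (3, 2)
b_expanded = np.repeat(b, 3, axis=0)  # copy the row of b -> shape (3, 2)
manual = a_expanded + b_expanded

print(total)
print(np.array_equal(total, manual))
```

Both routes produce the same 3 x 2 matrix, but broadcasting avoids materialising the expanded copies yourself.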
Data preprocessing
Create a file and write data to it
import os

os.makedirs(os.path.join('.', 'data'), exist_ok=True)  # Create a data folder in the current directory
data_file = os.path.join('.', 'data', 'house_tiny.csv')  # Path of house_tiny.csv inside the data folder
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # Column names
    f.write('NA,Pave,127500\n')  # Each row represents one data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')
Read the file. For CSV files, the pandas library is the usual choice.
import pandas as pd

data = pd.read_csv(data_file)
print(data)

Handling missing values and converting the data
Missing values can be handled in two ways: imputation and deletion. Imputation fills in a value that we choose, while deletion discards the affected data so that it is no longer considered. Here we fill each missing value with the mean of its column.
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]  # The first two columns are the inputs, the third column is the output
inputs = inputs.fillna(inputs.mean(numeric_only=True))  # Fill missing numeric values with the column mean; numeric_only skips the string column Alley
inputs = pd.get_dummies(inputs, dummy_na=True)  # Alley only takes the values Pave and NaN, so it is split into two indicator columns, Alley_Pave and Alley_nan
print(inputs)
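For comparison, the deletion approach mentioned earlier can be sketched as follows. This is an illustration only: the CSV contents are copied into a string so the snippet runs on its own, and the names `worst_column`, `without_column` and `without_rows` are hypothetical.

```python
import io

import pandas as pd

# Same contents as house_tiny.csv, inlined so the example is self-contained.
csv_text = (
    "NumRooms,Alley,Price\n"
    "NA,Pave,127500\n"
    "2,NA,106000\n"
    "4,NA,178100\n"
    "NA,NA,140000\n"
)
data = pd.read_csv(io.StringIO(csv_text))

# Count the missing values in every column.
missing_per_column = data.isna().sum()

# Deletion by column: drop the column with the most missing values.
worst_column = missing_per_column.idxmax()
without_column = data.drop(columns=worst_column)

# Deletion by row: drop every sample that has at least one missing value.
without_rows = data.dropna(axis=0)

print(missing_per_column)
print(without_column)
print(len(without_rows))
```

On a data set this small, deleting rows leaves no samples at all, which is one reason to prefer imputation here.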

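Convert the data into tensors
import torch

# to_numpy(dtype=float) casts every column to float so that torch.tensor accepts the values
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.to_numpy(dtype=float))
print(x, y)
At this point the data has been converted into tensors, a form the computer can process.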
Complete code:
import os

import pandas as pd
import torch

os.makedirs(os.path.join('.', 'data'), exist_ok=True)
data_file = os.path.join('.', 'data', 'house_tiny.csv')
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # Column names
    f.write('NA,Pave,127500\n')  # Each row represents one data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')

data = pd.read_csv(data_file)
print(data)

inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]  # The first two columns are the inputs, the third column is the output
inputs = inputs.fillna(inputs.mean(numeric_only=True))  # Fill missing numeric values with the column mean
inputs = pd.get_dummies(inputs, dummy_na=True)  # Split Alley into indicator columns Alley_Pave and Alley_nan
print(inputs)

# to_numpy(dtype=float) casts every column to float so that torch.tensor accepts the values
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.to_numpy(dtype=float))
print(x, y)