Data Processing for Deep Learning
2022-07-06 16:54:00 【彭祥.】
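Data manipulation
- Data type: the one we use most often is the array.
Creating an array requires:
- Shape: how many rows and columns
- Element type: int or float
- Element values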
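The three ingredients listed above can be sketched as follows. This is a minimal illustration that assumes PyTorch tensors (the library the post uses later); the names `zeros` and `ints` are made up for the example.

```python
import torch

# Shape: 3 rows and 4 columns; element type: 32-bit float; values: all zeros.
zeros = torch.zeros((3, 4), dtype=torch.float32)

# Values given explicitly; the shape (2, 3) is inferred from the nested list.
ints = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.int64)

print(zeros.shape, zeros.dtype)
print(ints.shape, ints.dtype)
```

Passing `dtype` explicitly is optional; without it PyTorch infers the element type from the values.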
Ways to access array elements:
Code:
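A minimal sketch of common access patterns, assuming a PyTorch tensor (the original does not show which library was used at this point, and all variable names are illustrative):

```python
import torch

x = torch.arange(12).reshape(3, 4)  # values 0..11 arranged in 3 rows, 4 columns

element = x[1, 2]    # single element at row 1, column 2 (zero-based)
last_row = x[-1]     # the last row
column = x[:, 1]     # every row, column 1
block = x[1:3, 1:]   # rows 1 and 2, columns 1 onward

print(element, last_row, column, block, sep='\n')
```

Indices are zero-based, and a slice such as `1:3` includes the start index but excludes the end index.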
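Broadcasting works as follows: first, one or both arrays are expanded by copying elements as needed so that, after the expansion, the two tensors have the same shape. Second, the element-wise operation is performed on the resulting arrays.
Because a is a column matrix and b is a row matrix, their shapes do not match if we add them directly. Broadcasting expands both into one larger matrix: a's column is copied across the columns, b's row is copied down the rows, and the two are then added element-wise.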
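The broadcasting step can be sketched as below. The shapes 3×1 and 1×2 are an assumption, since the original dimensions were lost from the text; PyTorch is assumed as the array library.

```python
import torch

a = torch.arange(3).reshape(3, 1)  # assumed shape 3x1: [[0], [1], [2]]
b = torch.arange(2).reshape(1, 2)  # assumed shape 1x2: [[0, 1]]

# Broadcasting treats a's column as repeated twice and b's row as repeated
# three times, then adds element-wise, giving a 3x2 result.
c = a + b

# The same result with the expansion written out explicitly.
explicit = a.expand(3, 2) + b.expand(3, 2)

print(c)
print(torch.equal(c, explicit))
```

Writing the expansion out by hand is only for illustration; in practice `a + b` is enough.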
Data preprocessing
Create a file and write data to it
import os
os.makedirs(os.path.join('.', 'data'), exist_ok=True)  # create the data folder in the current directory
data_file = os.path.join('.', 'data', 'house_tiny.csv')  # path of house_tiny.csv inside the data folder
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # column names
    f.write('NA,Pave,127500\n')  # each row is one data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')
Read the file. For CSV files, the pandas library is the usual choice.
import pandas as pd
data = pd.read_csv(data_file)  # 'NA' entries are parsed as missing values (NaN)
print(data)
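Handling missing values and converting the data
Missing values can be handled by imputation or by deletion. Imputation fills in a substitute value, while deletion discards the affected entries. Here we fill each missing numeric value with the mean of its column.
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]  # the first two columns are the inputs, the third is the output
inputs = inputs.fillna(inputs.mean(numeric_only=True))  # fill missing numeric values with the column mean
inputs = pd.get_dummies(inputs, dummy_na=True)  # Alley only takes the values Pave and NaN, so it becomes two 0/1 indicator columns: Alley_Pave and Alley_nan
print(inputs)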
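For comparison, the deletion approach mentioned above could look like the following sketch. It rebuilds the same sample data in memory so that it runs on its own; the names `rows_kept` and `cols_kept` are illustrative and do not appear in the original post.

```python
import io
import pandas as pd

csv_text = (
    "NumRooms,Alley,Price\n"
    "NA,Pave,127500\n"
    "2,NA,106000\n"
    "4,NA,178100\n"
    "NA,NA,140000\n"
)
data = pd.read_csv(io.StringIO(csv_text))

# Deletion by row: keep only the samples whose NumRooms value is present.
rows_kept = data.dropna(subset=["NumRooms"])

# Deletion by column: drop the column that has the most missing values.
missing_counts = data.isna().sum()
cols_kept = data.drop(columns=missing_counts.idxmax())

print(rows_kept)
print(cols_kept)
```

With only four samples, deleting rows throws away half of the data, which is one reason imputation is used here instead.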
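Converting the data to tensors
import torch
# to_numpy(dtype=float) gives a purely numeric array even when the indicator columns are boolean
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.to_numpy())
print(x, y)
At this point the data has been converted to tensors, a numeric format that PyTorch can compute on directly.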
Complete code:
import os
os.makedirs(os.path.join('.', 'data'), exist_ok=True)
data_file = os.path.join('.', 'data', 'house_tiny.csv')
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # column names
    f.write('NA,Pave,127500\n')  # each row is one data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')
import pandas as pd
data = pd.read_csv(data_file)
print(data)
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]  # the first two columns are the inputs, the third is the output
inputs = inputs.fillna(inputs.mean(numeric_only=True))  # fill missing numeric values with the column mean
inputs = pd.get_dummies(inputs, dummy_na=True)  # one 0/1 indicator column per Alley category, including NaN
print(inputs)
import torch
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.to_numpy())
print(x, y)