Data Processing for Deep Learning
2022-07-06 16:54:00 【彭祥.】
Data operations
- Data types: the one we use most often is the array


Creating an array requires:
- Shape: the number of rows and columns
- Element type: for example, int or float
- Element values
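The three ingredients listed above can be sketched with PyTorch tensors. This is a minimal illustration, not the original post's code; the variable names and values are made up for the example.

```python
import torch

# Shape: 3 rows and 4 columns; element type: 32-bit float; values: all zeros.
zeros = torch.zeros((3, 4), dtype=torch.float32)

# Shape and values given explicitly as nested lists; element type: 64-bit int.
ints = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.int64)

# Values 0..11 created as a 1-D tensor, then reshaped to 3 rows and 4 columns.
grid = torch.arange(12).reshape(3, 4)

print(zeros.shape, zeros.dtype)
print(ints.shape, ints.dtype)
print(grid)
```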
Ways to access array elements:
Code:
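A minimal sketch of common access patterns, assuming PyTorch tensors. The tensor `x` and its values are illustrative and are not taken from the original post.

```python
import torch

# A 3x4 tensor whose rows are [0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11].
x = torch.arange(12).reshape(3, 4)

element = x[1, 2]      # single element at row 1, column 2
row = x[1, :]          # all of row 1
column = x[:, 1]       # all of column 1
block = x[1:3, 1:]     # rows 1 and 2, columns 1 through 3
strided = x[::2, ::2]  # every second row and every second column

print(element, row, column, block, strided, sep="\n")
```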




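The broadcasting mechanism works as follows: first, one or both arrays are expanded by copying elements as needed so that, after the transformation, the two tensors have the same shape. Second, an element-wise operation is performed on the resulting arrays.
Because a and b are matrices of different shapes, their shapes do not match when we try to add them. Both are broadcast into a larger matrix: the columns of a are copied, the rows of b are copied, and the two are then added element-wise.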
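The two steps can be made explicit in PyTorch. This is a sketch under an assumption: the shapes 3x1 for `a` and 1x2 for `b` are chosen for illustration, since the shapes are not stated in the text above.

```python
import torch

# Assumed shapes for illustration: a is 3x1, b is 1x2.
a = torch.arange(3).reshape(3, 1)  # [[0], [1], [2]]
b = torch.arange(2).reshape(1, 2)  # [[0, 1]]

# Step 1: expand both operands to the common shape 3x2.
a_expanded = a.expand(3, 2)  # the single column of a is repeated twice
b_expanded = b.expand(3, 2)  # the single row of b is repeated three times

# Step 2: element-wise addition on the expanded tensors.
manual = a_expanded + b_expanded

# Broadcasting performs both steps implicitly.
auto = a + b

print(auto)
print(torch.equal(manual, auto))
```

Writing the expansion out by hand is only for illustration; in practice `a + b` is enough.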
Data preprocessing
Create a file and write data to it
import os

os.makedirs(os.path.join('.', 'data'), exist_ok=True)  # create a data folder in the current directory
data_file = os.path.join('.', 'data', 'house_tiny.csv')  # path of house_tiny.csv inside the data folder
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # column names
    f.write('NA,Pave,127500\n')  # each row is one data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')
Read the file; for CSV files, the pandas library is the usual choice
import pandas as pd

data = pd.read_csv(data_file)
print(data)

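Handling missing values and converting data
There are two ways to handle missing values: imputation and deletion. Imputation replaces a missing entry with a value we choose, while deletion drops the affected entries and ignores them. Here we fill each missing value with the mean.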
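inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]  # split by column position: the first two columns are inputs, the third is the output
inputs = inputs.fillna(inputs.mean(numeric_only=True))  # fill missing numeric values with the column mean; numeric_only skips the string column Alley
inputs = pd.get_dummies(inputs, dummy_na=True)  # Alley only takes the values Pave and NaN, so it becomes two indicator columns, Alley_Pave and Alley_nan
print(inputs)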
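For comparison, the deletion approach mentioned above can be sketched as follows. This is a minimal illustration that rebuilds the same small table in memory so it runs on its own; which rows or columns to drop is a choice made here for the example, not something the original post prescribes.

```python
import io

import pandas as pd

# The same four samples as in house_tiny.csv, read from memory.
csv_text = (
    "NumRooms,Alley,Price\n"
    "NA,Pave,127500\n"
    "2,NA,106000\n"
    "4,NA,178100\n"
    "NA,NA,140000\n"
)
data = pd.read_csv(io.StringIO(csv_text))

# Option 1: drop every row whose NumRooms value is missing.
rows_kept = data.dropna(subset=["NumRooms"])

# Option 2: count missing values per column and drop the worst column.
missing_per_column = data.isna().sum()
cols_kept = data.drop(columns=missing_per_column.idxmax())

print(rows_kept)
print(missing_per_column)
print(cols_kept)
```

Deletion is simple but discards information: on this tiny table, option 1 removes half of the samples.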

Convert the data to tensors
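import torch

# to_numpy(dtype=float) yields one numeric array even when get_dummies returns boolean columns
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.values)
print(x, y)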
At this point the data has been converted to tensors, a format the computer can process directly.
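One detail worth checking after the conversion is the element type. The sketch below assumes the features arrive as a float64 NumPy array (the values are made up); `torch.tensor` keeps that type unless told otherwise, while PyTorch's default floating-point type is float32.

```python
import numpy as np
import torch

# Illustrative feature matrix with dtype float64.
values = np.array([[3.0, 1.0, 0.0], [2.0, 0.0, 1.0]])

x64 = torch.tensor(values)                       # keeps float64
x32 = torch.tensor(values, dtype=torch.float32)  # converts to float32

print(x64.dtype, x32.dtype)
print(torch.get_default_dtype())
```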
Complete code:
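import os

import pandas as pd
import torch

# Create the data folder and write a small CSV file.
os.makedirs(os.path.join('.', 'data'), exist_ok=True)
data_file = os.path.join('.', 'data', 'house_tiny.csv')
print(data_file)
with open(data_file, 'w') as f:
    f.write('NumRooms,Alley,Price\n')  # column names
    f.write('NA,Pave,127500\n')  # each row is one data sample
    f.write('2,NA,106000\n')
    f.write('4,NA,178100\n')
    f.write('NA,NA,140000\n')

# Read the CSV file with pandas.
data = pd.read_csv(data_file)
print(data)

# Split by column position: the first two columns are inputs, the third is the output.
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]
inputs = inputs.fillna(inputs.mean(numeric_only=True))  # fill missing numeric values with the column mean
inputs = pd.get_dummies(inputs, dummy_na=True)  # encode Alley as the indicator columns Alley_Pave and Alley_nan
print(inputs)

# Convert to tensors; to_numpy(dtype=float) yields one numeric array even with boolean indicator columns.
x, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.values)
print(x, y)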