Overriding Dataset in PyTorch to build a self-made dataset
2022-07-07 17:41:00 【AI cannon fodder】
The previous blog post produced a data file like the following:
With that file in hand, the process for building a self-made dataset is:
(1) Generate a csv or txt file
See my previous post: Deep learning - Make your own dataset (AI Cannon Fodder's blog, CSDN)
(2) Override Dataset
(3) Generate DataLoader()
(4) Iterate over the data
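Step (1) is covered in the previous post. For orientation, here is a minimal sketch of one way such a file could be produced. It assumes a hypothetical layout with one subfolder per class under the image directory, and it writes the `filepath` and `label` columns that the Dataset code in this post reads. The function name `write_dataset_csv`, the folder layout, and the integer label assignment are assumptions made for illustration, not necessarily the method used in the previous post.

```python
import csv
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def write_dataset_csv(image_root, csv_path):
    """Write one row per image: its path and an integer label per class folder.

    Assumes a hypothetical layout image_root/<class_name>/<image file>.
    Returns the mapping from class name to integer label.
    """
    root = Path(image_root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_label = {name: i for i, name in enumerate(class_names)}
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["filepath", "label"])  # column names read by the Dataset
        for name in class_names:
            for img in sorted((root / name).iterdir()):
                if img.suffix.lower() in IMAGE_SUFFIXES:
                    writer.writerow([img.as_posix(), class_to_label[name]])
    return class_to_label


def demo():
    """Build a throwaway folder tree with empty placeholder files and index it."""
    with tempfile.TemporaryDirectory() as tmp:
        layout = {"run": ["a.jpg", "b.png"], "walk": ["c.jpg", "notes.txt"]}
        for cls, files in layout.items():
            (Path(tmp) / cls).mkdir()
            for fname in files:
                (Path(tmp) / cls / fname).touch()
        csv_path = Path(tmp) / "motion_data.csv"
        mapping = write_dataset_csv(tmp, csv_path)
        with open(csv_path, newline="", encoding="utf-8") as f:
            rows = list(csv.DictReader(f))
    return mapping, rows
```

Calling `demo()` returns the class-to-label mapping and the parsed rows. Non-image files such as `notes.txt` are skipped, so only image paths end up in the csv.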
The complete code for steps (2), (3), and (4) is as follows:
import pandas as pd
from torch.utils.data import Dataset, DataLoader, random_split
from torchvision import transforms
import cv2 as cv
class diff_motion_dataset(Dataset):
    def __init__(self, dataset_dir, csv_path, resize_shape):  # __init__ runs automatically when the object is created
        # __init__ usually sets up the data transformer and the basic parameters of the data
        self.dataset_dir = dataset_dir
        self.csv_path = csv_path
        self.shape = resize_shape
        # Read the csv file generated in step (1)
        self.df = pd.read_csv(self.csv_path, encoding='utf-8')
        self.transformer = transforms.Compose([
            transforms.ToPILImage(),  # cv.imread returns an np.ndarray, but Resize expects a PIL image or a Tensor
            transforms.Resize(self.shape),
            transforms.ToTensor(),  # Convert a PIL image or np.ndarray to a Tensor
        ])

    def __len__(self):  # Return the number of samples
        return len(self.df)

    def __getitem__(self, idx):  # idx is the index of the data sample
        # Note: select the column first and then index it with idx (df['filepath'][idx]); otherwise an error is raised
        x_train = cv.imread(self.df['filepath'][idx])  # Load the image in row idx of the 'filepath' column
        x_train = cv.cvtColor(x_train, cv.COLOR_BGR2RGB)  # OpenCV loads images in BGR order; convert to RGB
        x_train = self.transformer(x_train)  # Resize the image and convert it to a Tensor
        y_train = self.df['label'][idx]  # The DataLoader automatically converts numeric labels to tensors
        return x_train, y_train  # Return a single sample, not all the data in df

data_ds = diff_motion_dataset("F:/reshape_images", "F:/reshape_images/motion_data.csv", (256, 256))
# print(len(data_ds))

# Split the data into training and test sets
num_sample = len(data_ds)
train_percent = 0.8
train_num = int(train_percent * num_sample)
test_num = num_sample - train_num
train_ds, test_ds = random_split(data_ds, [train_num, test_num])
# print(len(train_ds))

# (3) Generate DataLoader(). This makes the data iterable and also supports batching, shuffling, and multi-process loading (num_workers)
train_dl = DataLoader(train_ds, batch_size=4, shuffle=True)
test_dl = DataLoader(test_ds, batch_size=4, shuffle=True)

# (4) Iterate over the data
for x_train, y_train in iter(train_dl):
    print(x_train.shape)
    print(y_train.shape)
    break
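The split sizes and the number of batches in the code above follow from simple arithmetic, sketched here in plain Python (no PyTorch needed). The helper names `split_sizes` and `num_batches` are hypothetical, and the sample count of 103 is only an example. The batch count assumes the DataLoader default `drop_last=False`, under which the last batch may be smaller than `batch_size`.

```python
import math


def split_sizes(num_sample, train_percent):
    """Return (train_num, test_num) computed the same way as in the script."""
    train_num = int(train_percent * num_sample)  # int() truncates toward zero
    test_num = num_sample - train_num  # remainder, so the sizes always sum to num_sample
    return train_num, test_num


def num_batches(num_items, batch_size):
    """Number of batches when the final partial batch is kept (drop_last=False)."""
    return math.ceil(num_items / batch_size)


train_num, test_num = split_sizes(103, 0.8)
print(train_num, test_num)        # 82 21
print(num_batches(train_num, 4))  # 21 (20 full batches plus one batch of 2)
print(num_batches(test_num, 4))   # 6 (5 full batches plus one batch of 1)
```

Computing `test_num` by subtraction guarantees that the two lengths sum to `len(data_ds)`, which `random_split` requires when it is given integer lengths.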
To train a self-defined model on the self-made dataset, call the defined model as follows:
For creating and loading datasets in other formats, see: