Mixed sampling and loading from multiple datasets with MindSpore
2022-06-10 04:36:00 【MSofficial】
Problem description:
【Function module】
MindSpore 1.5
【Steps and observed problem】
1. Contrastive learning needs both positive and negative examples. They come from two different datasets, PosDataSet and NegDataSet, whose sample counts differ. Each dataset is currently sampled with its own RandomSampler and then batched, giving PosBatchDataSet and NegBatchDataSet.
2. To use Model's multi-device parallel training, the two datasets must be merged into a single dataset. With a batch size of 64 for both PosBatchDataSet and NegBatchDataSet, the merged dataset should have a batch size of 128, where the first 64 samples come from PosBatchDataSet and the last 64 come from NegBatchDataSet.
3. Questions:
a. Is merging the datasets this way feasible? If so, could you provide a reference?
b. Can DistributedSampler be used to run this on multiple machines?
c. Because PosDataSet and NegDataSet differ in size, can NegDataSet be sampled with repetition until PosDataSet has been fully sampled without repetition?
【Screenshot information】
import numpy as np
from mindspore.dataset import GeneratorDataset


class MakeDataset:
    def __init__(self, length):
        self.length = length
        # Column A: one 2x2 array per sample; column B: the sample index
        self.A = [np.ones((2, 2)) for _ in range(length)]
        self.B = np.arange(length)

    def __getitem__(self, index):
        return self.A[index], self.B[index]

    def __len__(self):
        return self.length


batch_size = 64
pos_data_set = MakeDataset(1000)
neg_data_set = MakeDataset(1000)
pos_data_set = GeneratorDataset(pos_data_set, column_names=["A", "B"])
neg_data_set = GeneratorDataset(neg_data_set, column_names=["A", "B"])
pos_batch_data_set = pos_data_set.batch(batch_size)
neg_batch_data_set = neg_data_set.batch(batch_size)
# Wanted: Merge(pos_batch_data_set, neg_batch_data_set)
Answer:
We suggest putting the positive and negative data into a single MakeDataset and implementing the NegSample -> NegBatch and PosSample -> PosBatch logic inside it. Repeated sampling of the negatives can also be handled inside __getitem__. In other words, each item that MakeDataset returns is already a mixed batch. Because there is then only one MakeDataset, DistributedSampler can be used with it.
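The suggestion in the answer could be sketched roughly as follows. This is a minimal illustration, not the responder's actual code: the class name MixedBatchSource, the seed parameter, the wrap-around rule for negatives, and the demo sizes (1000 positives, 300 negatives) are all assumptions made for the example. It uses only NumPy so that the batching logic can be checked on its own.

```python
import numpy as np


class MixedBatchSource:
    """Random-access source in which every item is one already-mixed batch.

    Item i holds `batch_size` positive samples followed by `batch_size`
    negative samples. Each positive is used at most once; negatives wrap
    around (repeat) when the negative set is smaller.
    """

    def __init__(self, pos_a, pos_b, neg_a, neg_b, batch_size=64, seed=0):
        self.pos_a, self.pos_b = np.asarray(pos_a), np.asarray(pos_b)
        self.neg_a, self.neg_b = np.asarray(neg_a), np.asarray(neg_b)
        self.batch_size = batch_size
        # Fixed shuffles (hypothetical choice): sources built with the same
        # seed produce the same sample order.
        rng = np.random.default_rng(seed)
        self.pos_order = rng.permutation(len(self.pos_a))
        self.neg_order = rng.permutation(len(self.neg_a))

    def __len__(self):
        # Number of full positive batches; a trailing partial batch is dropped.
        return len(self.pos_order) // self.batch_size

    def __getitem__(self, index):
        start = int(index) * self.batch_size
        pos_idx = self.pos_order[start:start + self.batch_size]
        # Modulo makes the negative order repeat once it is exhausted.
        wrapped = (start + np.arange(self.batch_size)) % len(self.neg_order)
        neg_idx = self.neg_order[wrapped]
        a = np.concatenate([self.pos_a[pos_idx], self.neg_a[neg_idx]])
        b = np.concatenate([self.pos_b[pos_idx], self.neg_b[neg_idx]])
        return a, b


# Demo data: negative labels are offset by 10_000 so they are easy to tell apart.
pos_a, pos_b = np.ones((1000, 2, 2)), np.arange(1000)
neg_a, neg_b = np.zeros((300, 2, 2)), np.arange(300) + 10_000

source = MixedBatchSource(pos_a, pos_b, neg_a, neg_b, batch_size=64)
a, b = source[0]
print(len(source), a.shape, b.shape)
```

Because each item is already a batch of 128, a source like this would presumably be wrapped with GeneratorDataset(source, column_names=["A", "B"]) without a further .batch() call. Sharding across devices should then be possible through GeneratorDataset's sampler argument (for example a DistributedSampler) or its num_shards and shard_id arguments, since the source is random-accessible.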