MindSpore: [resnet_thor model] "Could not convert to" error when trying to run resnet_thor
2022-07-30 19:04:00 【小乐快乐】
Problem description:
[Functional module]
Running resnet_thor (repository: https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/resnet_thor) with mindspore-ascend-1.1.1 fails with an error.
[Steps and symptoms]
1. Extract the imagenet2012 dataset.
2. Comment out lines 160-162 of src/dataset_helper.py (otherwise an exception is raised there).
3. cd resnet_thor && python train.py --dataset_path=/home/ImageNet2012_origin
Error message:
WARNING: 'ControlDepend' is deprecated from version 1.1 and will be removed in a future version, use 'Depend' instead.
[ERROR] CORE(167346,python):2021-03-31-17:06:03.564.646 [mindspore/core/utils/status.cc:43] Status] Thread ID 281470327271920 Unexpected error. Could not convert to CV Tensor
Line of code : 142
File : /home/jenkins/agent-working-dir/workspace/Compile_Ascend_ARM_Ubuntu/mindspore/mindspore/ccsrc/minddata/dataset/kernels/image/image_utils.cc
Traceback (most recent call last):
File "train.py", line 143, in <module>
model.train(config.epoch_size, dataset, callbacks=cb)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/train/model.py", line 592, in train
sink_size=sink_size)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/train/model.py", line 391, in _train
self._train_dataset_sink_process(epoch, train_dataset, list_callback, cb_params, sink_size)
File "/home/resnet_thor/src/model_thor.py", line 183, in _train_dataset_sink_process
iter_first_order=iter_first_order)
File "/home/resnet_thor/src/model_thor.py", line 122, in _exec_preprocess
dataset_helper = DatasetHelper(dataset, dataset_sink_mode, sink_size, epoch_num, iter_first_order)
File "/home/resnet_thor/src/dataset_helper.py", line 72, in __init__
self.iter = iterclass(dataset, sink_size, epoch_num, iter_first_order)
File "/home/resnet_thor/src/dataset_helper.py", line 156, in __init__
super().__init__(dataset, sink_size, epoch_num)
File "/home/resnet_thor/src/dataset_helper.py", line 106, in __init__
dataset.transfer_dataset = _exec_datagraph(dataset, self.sink_size)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/train/_utils.py", line 62, in _exec_datagraph
dataset_types, dataset_shapes = _get_types_and_shapes(exec_dataset)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/train/_utils.py", line 51, in _get_types_and_shapes
dataset_types = _convert_type(dataset.output_types())
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/dataset/engine/datasets.py", line 1443, in output_types
self.saved_output_shapes = runtime_getter[0].GetOutputShapes()
RuntimeError: Thread ID 281470327271920 Unexpected error. Could not convert to CV Tensor
Line of code : 142
File : /home/jenkins/agent-working-dir/workspace/Compile_Ascend_ARM_Ubuntu/mindspore/mindspore/ccsrc/minddata/dataset/kernels/image/image_utils.cc
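Solution:
Judging from the error, the dataset is probably being used incorrectly: the dataset path most likely does not point down to the training-level directory. Check the dataset layout; you could try: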
python train.py --dataset_path=/home/ImageNet2012_origin/train
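Before rerunning, the directory layout can be checked with a small script. This is a minimal sketch, assuming the training script reads the data with an image-folder style loader that expects one subdirectory per class with image files directly inside it. The function name `check_image_folder` and the extension list are hypothetical, not part of the resnet_thor code.

```python
import os

# Assumption: ImageNet images use these extensions; adjust as needed.
IMAGE_EXTS = {".jpeg", ".jpg", ".png"}


def check_image_folder(root):
    """Return a list of layout problems; an empty list means the layout looks usable."""
    if not os.path.isdir(root):
        return [f"{root} is not a directory"]

    class_dirs = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    if not class_dirs:
        return [f"{root} has no class subdirectories"]

    problems = []
    for d in class_dirs:
        class_path = os.path.join(root, d)
        files = [
            e for e in os.listdir(class_path)
            if os.path.isfile(os.path.join(class_path, e))
        ]
        if not files:
            # Typical symptom of passing a path that is one level too high.
            problems.append(f"{d}: no files directly inside")
            continue
        bad = [f for f in files if os.path.splitext(f)[1].lower() not in IMAGE_EXTS]
        if bad:
            problems.append(f"{d}: {len(bad)} non-image file(s), e.g. {bad[0]}")
    return problems
```

Calling `check_image_folder("/home/ImageNet2012_origin")` and then `check_image_folder("/home/ImageNet2012_origin/train")` shows which of the two paths has class directories with images directly under it.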
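After making the change suggested by @zhaoting_731, the original problem was resolved, but a new error appeared.
It looks related to HCCL multi-card training, but the command I ran was: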
python train.py --dataset_path=/home/ImageNet2012_origin/ilsvrc
so run_distribute keeps its default value of False, and training should take the single-card path.
Error message:
WARNING: 'ControlDepend' is deprecated from version 1.1 and will be removed in a future version, use 'Depend' instead.
WARNING: 'ControlDepend' is deprecated from version 1.1 and will be removed in a future version, use 'Depend' instead.
[ERROR] HCCL_ADPT(78728,python):2021-04-06-20:10:05.673.721 [mindspore/ccsrc/runtime/hccl_adapter/hccl_adapter.cc:124] GenTask] : The pointer[ops_kernel_builder] is null.
Traceback (most recent call last):
File "train.py", line 143, in <module>
model.train(config.epoch_size, dataset, callbacks=cb)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/train/model.py", line 592, in train
sink_size=sink_size)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/train/model.py", line 391, in _train
self._train_dataset_sink_process(epoch, train_dataset, list_callback, cb_params, sink_size)
File "/home/thor/mindspore/model_zoo/official/cv/resnet_thor/src/model_thor.py", line 254, in _train_dataset_sink_process
outputs = self._train_network(*inputs)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/nn/cell.py", line 322, in __call__
out = self.compile_and_run(*inputs)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/nn/cell.py", line 578, in compile_and_run
self.compile(*inputs)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/nn/cell.py", line 565, in compile
_executor.compile(self, *inputs, phase=self.phase, auto_parallel_mode=self._auto_parallel_mode)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/common/api.py", line 505, in compile
result = self._executor.compile(obj, args_list, phase, use_vm)
RuntimeError: mindspore/ccsrc/runtime/hccl_adapter/hccl_adapter.cc:124 GenTask] : The pointer[ops_kernel_builder] is null.
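To double-check the single-card assumption from the shell side, one can inspect the environment variables that Ascend distributed launch scripts commonly export. This is only a diagnostic sketch: the variable list is an assumption based on typical MindSpore launch scripts, the helper name `distributed_env` is hypothetical, and an empty result does not rule out HCCL operators being built into the network itself.

```python
import os

# Assumption: variables typically exported by Ascend distributed launch scripts.
DISTRIBUTED_VARS = ("RANK_TABLE_FILE", "RANK_SIZE", "RANK_ID", "DEVICE_NUM")


def distributed_env(environ=None):
    """Return the distributed-training variables that are set in the environment."""
    if environ is None:
        environ = os.environ
    return {name: environ[name] for name in DISTRIBUTED_VARS if name in environ}


if __name__ == "__main__":
    found = distributed_env()
    if found:
        print("Distributed-related variables are set:", found)
    else:
        print("No distributed-related variables found in this shell.")
```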
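This example in the model zoo mainly targets the multi-card scenario. The resnet and resnet_thor scripts have since been merged into resnet. To run single-card training, we recommend using the code under the resnet directory: change the optimizer in src/config.py to Thor, then run training as described in the README. For example: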
python train.py --net=resnet50 --dataset=imagenet2012 --device_target=Ascend --dataset_path=[DATASET_PATH]