MindSpore: [resnet_thor model] Running resnet_thor fails with "Could not convert to"
2022-07-30 19:04:00 【小乐快乐】
Problem description:
[Functional module]
Running resnet_thor (repository: https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/resnet_thor) with mindspore-ascend-1.1.1 fails with an error.
[Steps and symptoms]
1. Extract the imagenet2012 dataset.
2. Comment out lines 160-162 of src/dataset_helper.py (otherwise an exception is raised there).
3. cd resnet_thor && python train.py --dataset_path=/home/ImageNet2012_origin
Error message:
WARNING: 'ControlDepend' is deprecated from version 1.1 and will be removed in a future version, use 'Depend' instead.
[ERROR] CORE(167346,python):2021-03-31-17:06:03.564.646 [mindspore/core/utils/status.cc:43] Status] Thread ID 281470327271920 Unexpected error. Could not convert to CV Tensor
Line of code : 142
File : /home/jenkins/agent-working-dir/workspace/Compile_Ascend_ARM_Ubuntu/mindspore/mindspore/ccsrc/minddata/dataset/kernels/image/image_utils.cc
Traceback (most recent call last):
File "train.py", line 143, in <module>
model.train(config.epoch_size, dataset, callbacks=cb)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/train/model.py", line 592, in train
sink_size=sink_size)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/train/model.py", line 391, in _train
self._train_dataset_sink_process(epoch, train_dataset, list_callback, cb_params, sink_size)
File "/home/resnet_thor/src/model_thor.py", line 183, in _train_dataset_sink_process
iter_first_order=iter_first_order)
File "/home/resnet_thor/src/model_thor.py", line 122, in _exec_preprocess
dataset_helper = DatasetHelper(dataset, dataset_sink_mode, sink_size, epoch_num, iter_first_order)
File "/home/resnet_thor/src/dataset_helper.py", line 72, in __init__
self.iter = iterclass(dataset, sink_size, epoch_num, iter_first_order)
File "/home/resnet_thor/src/dataset_helper.py", line 156, in __init__
super().__init__(dataset, sink_size, epoch_num)
File "/home/resnet_thor/src/dataset_helper.py", line 106, in __init__
dataset.transfer_dataset = _exec_datagraph(dataset, self.sink_size)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/train/_utils.py", line 62, in _exec_datagraph
dataset_types, dataset_shapes = _get_types_and_shapes(exec_dataset)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/train/_utils.py", line 51, in _get_types_and_shapes
dataset_types = _convert_type(dataset.output_types())
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/dataset/engine/datasets.py", line 1443, in output_types
self.saved_output_shapes = runtime_getter[0].GetOutputShapes()
RuntimeError: Thread ID 281470327271920 Unexpected error. Could not convert to CV Tensor
Line of code : 142
File : /home/jenkins/agent-working-dir/workspace/Compile_Ascend_ARM_Ubuntu/mindspore/mindspore/ccsrc/minddata/dataset/kernels/image/image_utils.cc
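Solution:
Judging from the error, the dataset is probably being used incorrectly: the dataset path most likely does not point down to the training-level directory. Check the dataset layout; you could try: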
python train.py --dataset_path=/home/ImageNet2012_origin/train
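Before rerunning training, the directory layout can be checked with a few lines of Python. The sketch below assumes the training script reads the dataset in an image-folder layout, that is, one subdirectory per class with the image files directly inside it. The function name summarize_image_folder and the extension list are hypothetical and chosen for illustration; nothing here calls MindSpore.

```python
import os

# Assumption: images carry one of these extensions (ImageNet uses .JPEG).
IMAGE_EXTENSIONS = (".jpeg", ".jpg", ".png")


def summarize_image_folder(dataset_path):
    """Return (class_dir_count, image_count, stray_entries) for dataset_path.

    A "class directory" is a direct subdirectory that holds at least one
    image file. Anything else at the top level is reported as stray.
    """
    class_dirs = 0
    images = 0
    stray = []
    for entry in sorted(os.listdir(dataset_path)):
        full = os.path.join(dataset_path, entry)
        if not os.path.isdir(full):
            # Loose files (archives, label lists, ...) at the top level.
            stray.append(entry)
            continue
        found = [
            name for name in os.listdir(full)
            if os.path.isfile(os.path.join(full, name))
            and name.lower().endswith(IMAGE_EXTENSIONS)
        ]
        if found:
            class_dirs += 1
            images += len(found)
        else:
            # A subdirectory with no images directly inside, e.g. "train".
            stray.append(entry)
    return class_dirs, images, stray
```

If summarize_image_folder("/home/ImageNet2012_origin") reports zero class directories and lists train as a stray entry, the path is one level too high, and the call on the train subdirectory should report the class directories instead.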
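After making the change suggested by @zhaoting_731, the original problem was resolved, but I ran into a new error.
It appears to be related to HCCL multi-device training, but the command I ran was:
python train.py --dataset_path=/home/ImageNet2012_origin/ilsvrc
so run_distribute keeps its default value of False, and training should run on a single device.
Error message: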
WARNING: 'ControlDepend' is deprecated from version 1.1 and will be removed in a future version, use 'Depend' instead.
WARNING: 'ControlDepend' is deprecated from version 1.1 and will be removed in a future version, use 'Depend' instead.
[ERROR] HCCL_ADPT(78728,python):2021-04-06-20:10:05.673.721 [mindspore/ccsrc/runtime/hccl_adapter/hccl_adapter.cc:124] GenTask] : The pointer[ops_kernel_builder] is null.
Traceback (most recent call last):
File "train.py", line 143, in <module>
model.train(config.epoch_size, dataset, callbacks=cb)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/train/model.py", line 592, in train
sink_size=sink_size)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/train/model.py", line 391, in _train
self._train_dataset_sink_process(epoch, train_dataset, list_callback, cb_params, sink_size)
File "/home/thor/mindspore/model_zoo/official/cv/resnet_thor/src/model_thor.py", line 254, in _train_dataset_sink_process
outputs = self._train_network(*inputs)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/nn/cell.py", line 322, in __call__
out = self.compile_and_run(*inputs)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/nn/cell.py", line 578, in compile_and_run
self.compile(*inputs)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/nn/cell.py", line 565, in compile
_executor.compile(self, *inputs, phase=self.phase, auto_parallel_mode=self._auto_parallel_mode)
File "/usr/local/python3.7.5/lib/python3.7/site-packages/mindspore/common/api.py", line 505, in compile
result = self._executor.compile(obj, args_list, phase, use_vm)
RuntimeError: mindspore/ccsrc/runtime/hccl_adapter/hccl_adapter.cc:124 GenTask] : The pointer[ops_kernel_builder] is null.
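This example in the model zoo mainly targets the multi-device scenario. We have since merged the resnet and resnet_thor scripts into resnet. To run single-device training, we recommend using the code in the resnet directory: change the optimizer in src/config.py to Thor, then run training as described in the README. For example: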
python train.py --net=resnet50 --dataset=imagenet2012 --device_target=Ascend --dataset_path=[DATASET_PATH]