Faster-ILOD and maskrcnn_benchmark: Training on the COCO Dataset and a Summary of Problems
2022-07-02 06:26:00 【chenf0】
loading annotations into memory...
Done (t=18.25s)
creating index...
index created!
number of images used for training: 31235
2022-06-05 03:17:10,191 maskrcnn_benchmark.trainer INFO: Start training
/data3/cf/papercpde/maskrcnn-benchmark/maskrcnn_benchmark/structures/segmentation_mask.py:422: UserWarning: This overload of nonzero is deprecated:
nonzero()
Consider using one of the following signatures instead:
nonzero(*, bool as_tuple) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
item = item.nonzero()
/home/earhian/anaconda3/envs/maskrcnn/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:123: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
"https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [1,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [15,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [17,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [23,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
File "tools/train_first_step.py", line 232, in <module>
main()
File "tools/train_first_step.py", line 224, in main
model = train(cfg, args.local_rank, args.distributed)
File "tools/train_first_step.py", line 103, in train
arguments,
File "/data3/cf/papercpde/maskrcnn-benchmark/maskrcnn_benchmark/engine/trainer.py", line 70, in do_train
loss_dict,_,_,_,_ = model(images, targets)
File "/home/earhian/anaconda3/envs/maskrcnn/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/earhian/anaconda3/envs/maskrcnn/lib/python3.7/site-packages/apex-0.1-py3.7-linux-x86_64.egg/apex/amp/_initialize.py", line 197, in new_fwd
**applier(kwargs, input_caster))
File "/data3/cf/papercpde/maskrcnn-benchmark/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 67, in forward
x, result, detector_losses = self.roi_heads(features, proposals, targets)
File "/home/earhian/anaconda3/envs/maskrcnn/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/data3/cf/papercpde/maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/roi_heads.py", line 27, in forward
x, detections, loss_box = self.box(features, proposals, targets)
File "/home/earhian/anaconda3/envs/maskrcnn/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/data3/cf/papercpde/maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/box_head/box_head.py", line 55, in forward
loss_classifier, loss_box_reg = self.loss_evaluator([class_logits], [box_regression])
File "/data3/cf/papercpde/maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/box_head/loss.py", line 151, in __call__
sampled_pos_inds_subset = torch.nonzero(labels > 0).squeeze(1)
RuntimeError: copy_if failed to synchronize: cudaErrorAssert: device-side assert triggered
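The cause: I was training on 70 base classes, while COCO has 80 classes in total.
coco.py states this clearly (I just had not read it before, haha): when training on 70 base classes, the first 70 categories correspond to category IDs 1 ~ 79. The labels can therefore be as large as 79, which is what trips the `t >= 0 && t < n_classes` assertion in the log above when num_classes is set too small. Changing num_classes to 81 made training run successfully. The relevant comments in coco.py: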
# first 40 categories: 1 ~ 44; first 70 categories: 1 ~ 79; first 75 categories: 1 ~ 85
# second 40 categories: 45 ~ 91; second 10 categories: 80 ~ 91; second 5 categories: 86 ~ 91
# totally 80 categories
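The arithmetic behind the fix can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names `min_num_classes` and `out_of_range_labels` are hypothetical and are not part of Faster-ILOD or maskrcnn_benchmark, label 0 is assumed to be reserved for background, and the labels are assumed to be the raw category IDs from the comments above. The value 71 (70 classes plus background) is an assumed example of a too-small setting, not a value quoted from the post.

```python
# Hypothetical helpers for illustration; not part of Faster-ILOD or maskrcnn_benchmark.

def min_num_classes(category_ids):
    """Smallest num_classes that can index every label, assuming 0 is background."""
    ids = list(category_ids)
    if not ids:
        raise ValueError("no category ids given")
    if min(ids) < 1:
        raise ValueError("category ids are assumed to start at 1 (0 is background)")
    # Labels are used as indices into the class dimension, so the largest id
    # must be strictly smaller than num_classes.
    return max(ids) + 1


def out_of_range_labels(category_ids, num_classes):
    """Return the ids that would violate `t >= 0 && t < n_classes`."""
    return sorted({c for c in category_ids if not (0 <= c < num_classes)})


if __name__ == "__main__":
    # Per the coco.py comment, the first 70 categories span ids 1 ~ 79.
    first_70_id_span = [1, 79]

    print(min_num_classes(first_70_id_span))          # lower bound on num_classes
    print(out_of_range_labels(first_70_id_span, 71))  # assumed too-small setting
    print(out_of_range_labels(first_70_id_span, 81))  # the setting reported to work
```

Under these assumptions any num_classes of at least 80 would keep every label in range; 81, the value used above, satisfies that bound.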