Faster-ILOD and maskrcnn_benchmark: training on the COCO dataset, and a summary of problems
2022-07-02 06:26:00 【chenf0】
loading annotations into memory...
Done (t=18.25s)
creating index...
index created!
number of images used for training: 31235
2022-06-05 03:17:10,191 maskrcnn_benchmark.trainer INFO: Start training
/data3/cf/papercpde/maskrcnn-benchmark/maskrcnn_benchmark/structures/segmentation_mask.py:422: UserWarning: This overload of nonzero is deprecated:
nonzero()
Consider using one of the following signatures instead:
nonzero(*, bool as_tuple) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
item = item.nonzero()
/home/earhian/anaconda3/envs/maskrcnn/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:123: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
"https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [1,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [15,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [17,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [23,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
File "tools/train_first_step.py", line 232, in <module>
main()
File "tools/train_first_step.py", line 224, in main
model = train(cfg, args.local_rank, args.distributed)
File "tools/train_first_step.py", line 103, in train
arguments,
File "/data3/cf/papercpde/maskrcnn-benchmark/maskrcnn_benchmark/engine/trainer.py", line 70, in do_train
loss_dict,_,_,_,_ = model(images, targets)
File "/home/earhian/anaconda3/envs/maskrcnn/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/earhian/anaconda3/envs/maskrcnn/lib/python3.7/site-packages/apex-0.1-py3.7-linux-x86_64.egg/apex/amp/_initialize.py", line 197, in new_fwd
**applier(kwargs, input_caster))
File "/data3/cf/papercpde/maskrcnn-benchmark/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 67, in forward
x, result, detector_losses = self.roi_heads(features, proposals, targets)
File "/home/earhian/anaconda3/envs/maskrcnn/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/data3/cf/papercpde/maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/roi_heads.py", line 27, in forward
x, detections, loss_box = self.box(features, proposals, targets)
File "/home/earhian/anaconda3/envs/maskrcnn/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/data3/cf/papercpde/maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/box_head/box_head.py", line 55, in forward
loss_classifier, loss_box_reg = self.loss_evaluator([class_logits], [box_regression])
File "/data3/cf/papercpde/maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/box_head/loss.py", line 151, in __call__
sampled_pos_inds_subset = torch.nonzero(labels > 0).squeeze(1)
RuntimeError: copy_if failed to synchronize: cudaErrorAssert: device-side assert triggered

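The cause: I was training on 70 base categories, while COCO has 80 categories in total, and the COCO category IDs are not contiguous.
coco.py spells this out clearly (I just hadn't read it before, haha): when training on 70 base categories, the first 70 categories correspond to IDs 1 ~ 79. The labels therefore go up to 79, and a classifier sized for fewer classes receives out-of-range targets, which is exactly what the assertion `t >= 0 && t < n_classes` reports. Changing num_classes to 81 made training run successfully.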
# first 40 categories: 1 ~ 44; first 70 categories: 1 ~ 79; first 75 categories: 1 ~ 85
# second 40 categories: 45 ~ 91; second 10 categories: 80 ~ 91; second 5 categories: 86 ~ 91
# totally 80 categories
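
The range check behind the assertion can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names `min_num_classes` and `out_of_range_labels` are hypothetical (they are not part of maskrcnn_benchmark or Faster-ILOD), the batch labels are made up, 0 is assumed to be the background class, and 71 is a hypothetical "70 classes + background" setting used for contrast.

```python
def min_num_classes(category_ids):
    """Smallest classifier size that can index every label (0 = background)."""
    return max(category_ids) + 1


def out_of_range_labels(labels, num_classes):
    """Return the labels that would violate `0 <= t < num_classes`."""
    return sorted({t for t in labels if not 0 <= t < num_classes})


# Per the coco.py comment, the first 70 categories use IDs in 1..79.
# Only the upper bound matters for this check.
max_id_first_70 = 79
labels = [1, 44, 79]  # hypothetical labels seen in one batch

print(min_num_classes([max_id_first_70]))  # 80
print(out_of_range_labels(labels, 71))     # [79]
print(out_of_range_labels(labels, 81))     # []
```

Under these assumptions any num_classes of at least 80 passes the check, so the value 81 used above is sufficient.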