Faster-ILOD and maskrcnn_benchmark: installation process and problems encountered
2022-07-02 07:34:00 【chenf0】
Paper
Faster ILOD: Incremental learning for object detectors based on faster RCNN (2020)
Paper: https://arxiv.org/abs/2003.03901
Code: https://github.com/CanPeng123/Faster-ILOD
Code
I. Requirements:
PyTorch 1.0 from a nightly release. It will not work with 1.0 nor 1.0.1. Installation instructions can be found in https://pytorch.org/get-started/locally/
torchvision from master
cocoapi
yacs
matplotlib
GCC >= 4.9
OpenCV
CUDA >= 9.0
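The minimum versions above can be checked before starting the build. The following is a minimal sketch, not part of the original instructions: the helper names (parse_version, nvcc_release, meets_minimum, tool_version) are hypothetical, and it assumes gcc and nvcc are on PATH.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def nvcc_release(nvcc_output):
    """Extract the toolkit release (e.g. (10, 2)) from the output of `nvcc --version`."""
    match = re.search(r"release (\d+(?:\.\d+)*)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' field found in nvcc output")
    return parse_version(match.group(1))


def meets_minimum(found, minimum):
    """Compare two version strings, e.g. meets_minimum('7.5.0', '4.9')."""
    return parse_version(found) >= parse_version(minimum)


def tool_version(command):
    """Run a version command and return its stdout, or None if the tool is missing."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, capture_output=True, text=True)
    return result.stdout.strip() or None


if __name__ == "__main__":
    gcc = tool_version(["gcc", "-dumpversion"])
    print("GCC >= 4.9:", bool(gcc) and meets_minimum(gcc, "4.9"))
    nvcc = tool_version(["nvcc", "--version"])
    print("CUDA >= 9.0:", bool(nvcc) and nvcc_release(nvcc) >= (9, 0))
```

A False result only means the tool was not found or is too old on this machine's PATH; it says nothing about other CUDA installations under /usr/local.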
II. Step-by-step installation
# first, make sure that your conda is setup properly with the right environment
# for that, check that `which conda`, `which pip` and `which python` points to the
# right path. From a clean conda env, this is what you need to do
conda create --name maskrcnn_benchmark -y
conda activate maskrcnn_benchmark
# this installs the right pip and dependencies for the fresh python
conda install ipython pip
# maskrcnn_benchmark and coco api dependencies
pip install ninja yacs cython matplotlib tqdm opencv-python
# follow PyTorch installation in https://pytorch.org/get-started/locally/
# we give the instructions for CUDA 9.0
conda install -c pytorch pytorch-nightly torchvision cudatoolkit=9.0
export INSTALL_DIR=$PWD
# install pycocotools
cd $INSTALL_DIR
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
python setup.py build_ext install
# install cityscapesScripts
cd $INSTALL_DIR
git clone https://github.com/mcordts/cityscapesScripts.git
cd cityscapesScripts/
python setup.py build_ext install
# install apex
cd $INSTALL_DIR
git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install --cuda_ext --cpp_ext
# install PyTorch Detection
cd $INSTALL_DIR
git clone https://github.com/facebookresearch/maskrcnn-benchmark.git
cd maskrcnn-benchmark
# the following will install the lib with
# symbolic links, so that you can modify
# the files if you want and won't need to
# re-build it
python setup.py build develop
unset INSTALL_DIR
# or if you are on macOS
# MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py build develop
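After the steps above finish, it can help to confirm that every package is visible to the interpreter of the active conda environment. This is a minimal sketch under the assumption that the packages keep their usual import names (listed in EXPECTED_MODULES below); the function missing_modules is a hypothetical helper.

```python
import importlib.util

# Import names assumed for the packages installed above.
EXPECTED_MODULES = [
    "torch",
    "torchvision",
    "yacs",
    "matplotlib",
    "cv2",
    "pycocotools",
    "cityscapesscripts",
    "apex",
    "maskrcnn_benchmark",
]


def missing_modules(names):
    """Return the subset of names that cannot be located by the import system."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    not_found = missing_modules(EXPECTED_MODULES)
    if not_found:
        print("Not importable in this environment:", ", ".join(not_found))
    else:
        print("All expected modules were found.")
```

The check only locates the modules; it does not verify that the compiled CUDA extensions load correctly.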
III. Faster-ILOD
After the maskrcnn_benchmark environment is installed, copy the Faster-ILOD code over the corresponding folders of maskrcnn-benchmark and run python setup.py build develop to recompile. Alternatively, download the Faster-ILOD code directly and build it.
IV. Running Faster-ILOD
Taking the 15+5 setting as an example:
1. Modify the dataset path
Edit Faster-ILOD/maskrcnn_benchmark/config/paths_catalog.py, find the paths for VOC, and change them to your own.
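Before pointing paths_catalog.py at a dataset directory, the directory can be checked for the standard PASCAL VOC layout. This is a minimal sketch: the folder names follow the usual VOC devkit structure, while the function name and the example path are hypothetical.

```python
import os

# Sub-folders found in a standard PASCAL VOC year directory (e.g. VOC2007).
VOC_SUBDIRS = ["Annotations", "ImageSets", "JPEGImages"]


def missing_voc_subdirs(voc_root):
    """Return the expected VOC sub-folders that do not exist under voc_root."""
    return [
        name
        for name in VOC_SUBDIRS
        if not os.path.isdir(os.path.join(voc_root, name))
    ]


if __name__ == "__main__":
    # Hypothetical location; replace with your own dataset path.
    root = "/path/to/VOCdevkit/VOC2007"
    print("Missing sub-folders:", missing_voc_subdirs(root) or "none")
```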
2. Modify the configuration file
/configs/e2e_faster_rcnn_R_50_C4_1x.yaml
The parameters can be adjusted as needed; this file is left unchanged for now.
3. Train the base network
Run python tools/train_first_step.py --config-file="./configs/e2e_faster_rcnn_R_50_C4_1x.yaml"
After a successful run, the training output can be found in /home/incremental_learning_ResNet50_C4/RPN_15_classes_40k_steps.
4. Incremental training
(1) Edit e2e_faster_rcnn_R_50_C4_1x_Source_model.yaml and e2e_faster_rcnn_R_50_C4_1x_Target_model.yaml: set the class lists (all classes, new classes, old classes), the path to the final model from the previous training stage, and the output path. Then run python tools/train_incremental.py; the final training results are written to the corresponding output directory.
- e2e_faster_rcnn_R_50_C4_1x_Source_model.yaml

- e2e_faster_rcnn_R_50_C4_1x_Target_model.yaml

- tools/train_incremental.py
source_model_config_file = "/home/chenfang/maskrcnn-benchmark/configs/e2e_faster_rcnn_R_50_C4_1x_Source_model.yaml"
target_model_config_file = "/home/chenfang/maskrcnn-benchmark/configs/e2e_faster_rcnn_R_50_C4_1x_Target_model.yaml"
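The old and new class lists for the 15+5 setting can be derived as follows. This is a minimal sketch that assumes the split follows the alphabetical order of the 20 PASCAL VOC classes (the first 15 as old classes, the last 5 as new); the helper names are hypothetical, and the exact key names in the YAML files should be taken from the Faster-ILOD configs themselves.

```python
# The 20 PASCAL VOC object classes in alphabetical order.
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def split_classes(classes, num_old):
    """Split the class list into (old classes, new classes)."""
    return classes[:num_old], classes[num_old:]


def num_classes_with_background(classes):
    """Detector heads count one extra class for the background."""
    return len(classes) + 1


if __name__ == "__main__":
    old, new = split_classes(VOC_CLASSES, 15)
    print("old classes:", old)
    print("new classes:", new)
    print("source model classes:", num_classes_with_background(old))
    print("target model classes:", num_classes_with_background(old + new))
```

With this split the source model has 16 outputs and the target model has 21, both counting the background class.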
V. Problems encountered
1. git clone fails because of network problems
The repository can be downloaded locally and then uploaded to the server.
Installation packages can likewise be downloaded locally, uploaded to the server, and installed with pip install followed by the file path.
2. RuntimeError: Error compiling objects for extension
The PyTorch version is not compatible.
My setup was CUDA 10.1 with PyTorch 1.7.1.
After looking at the solutions, downgrading PyTorch to 1.5 made the build succeed.
CUDA 10.1
Pytorch 1.4.0
torchvision 0.5.0
For more solutions, please refer to https://github.com/facebookresearch/maskrcnn-benchmark/issues/1236
3.RuntimeError: Output 0 of UnbindBackward is a view and its base or another view of its base has been modified inplace.
RuntimeError: Output 0 of UnbindBackward is a view and its base or another view of its base has been modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

Reference: https://blog.csdn.net/Ginomica_xyx/article/details/120491859
The cause is that self.bbox is modified in place more than once; at the second modification it is ambiguous whether the operation applies to the original self.bbox or to the already modified one.
Knowing the cause, I tried to fix it in the code: copying self.bbox to a new variable and operating on that variable did not work, and making a deep copy did not work either.
After checking related issues, this turned out to be a bug in PyTorch 1.7.0.
Downgrading PyTorch to 1.6.0 solved the problem.
4.unable to execute ‘usr/local/cuda-10.0/bin/nvcc‘: No such file or directory
References:
How to view and modify the PATH environment variable on Linux: https://blog.csdn.net/qq_41251963/article/details/110120386
https://blog.csdn.net/tailonh/article/details/120322932
https://blog.csdn.net/G_inkk/article/details/124584873
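When the build looks for nvcc under a CUDA directory that does not exist, it helps to list where nvcc is actually installed. This is a minimal sketch, assuming the toolkits live under /usr/local as cuda* directories; find_nvcc_candidates is a hypothetical helper.

```python
import glob
import os
import shutil


def find_nvcc_candidates(root="/usr/local"):
    """Return every existing <root>/cuda*/bin/nvcc file, sorted by path."""
    pattern = os.path.join(root, "cuda*", "bin", "nvcc")
    return sorted(path for path in glob.glob(pattern) if os.path.isfile(path))


if __name__ == "__main__":
    print("CUDA_HOME   :", os.environ.get("CUDA_HOME"))
    print("nvcc on PATH:", shutil.which("nvcc"))
    for candidate in find_nvcc_candidates():
        print("installed   :", candidate)
```

If the path in the error message is not among the installed candidates, CUDA_HOME or PATH points at a toolkit version that is not present on the machine.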
5. error: cannot call member function ‘void std::basic_string<_CharT, _Traits, _Alloc>::
Recompiling with python setup.py build develop fails with
RuntimeError: Error compiling objects for extension. The underlying error is
/usr/include/c++/7/bits/basic_string.tcc:1067:16: error: cannot call member function ‘void std::basic_string<_CharT, _Traits, _Alloc>::_Rep
Solution
References: https://blog.csdn.net/weixin_45328592/article/details/114646355
https://blog.csdn.net/qq_29695701/article/details/118548238
sudo gedit /usr/include/c++/7/bits/basic_string.tcc
Change
__p->_M_set_sharable()
to
(*__p)._M_set_sharable()
and save the file.
If saving the file fails with:
‘readonly’ option is set (add ! to override)
the current user does not have write permission. Either switch to root with sudo -i before editing, or open the file directly with sudo vim.
Reference:
https://blog.csdn.net/cheng_feng_xiao_zhan/article/details/53391474
RuntimeError: Error compiling objects for extension
can also have the following cause:
If the path in the error message contains a colon, the environment variable is set incorrectly. To fix it:
sudo vim ~/.bashrc
Change
export CUDA_HOME=$CUDA_HOME:/usr/local/cuda
to
export CUDA_HOME=/usr/local/cuda
then run
source ~/.bashrc
References:
https://blog.csdn.net/loovelj/article/details/110490986
https://www.codeleading.com/article/95735054818/
https://blog.csdn.net/zt1091574181/article/details/113611468
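The colon problem can be detected before running the build. This is a minimal sketch: check_cuda_home is a hypothetical helper, and it assumes CUDA_HOME must name a single directory that contains bin/nvcc.

```python
import os


def check_cuda_home(value):
    """Return 'ok' or a short description of what is wrong with a CUDA_HOME value."""
    if not value:
        return "CUDA_HOME is not set"
    if os.pathsep in value:
        # e.g. ':/usr/local/cuda', produced by CUDA_HOME=$CUDA_HOME:/usr/local/cuda
        return "CUDA_HOME contains a path separator; it must be a single directory"
    if not os.path.isfile(os.path.join(value, "bin", "nvcc")):
        return "no bin/nvcc found under CUDA_HOME"
    return "ok"


if __name__ == "__main__":
    print(check_cuda_home(os.environ.get("CUDA_HOME", "")))
```

Unlike PATH, CUDA_HOME is not a list of directories, so appending to it with a colon yields a path that does not exist.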
6. AttributeError: 'tuple' object has no attribute 'values'
Change loss_dict to loss_dict[0].
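The fix can be illustrated without PyTorch. This is a minimal sketch that assumes the model returns a tuple whose first element is the dictionary of losses; the helper names are hypothetical, and plain floats stand in for loss tensors.

```python
def unwrap_loss_dict(model_output):
    """Return the loss dictionary, taking element 0 if the model returned a tuple."""
    if isinstance(model_output, tuple):
        return model_output[0]
    return model_output


def total_loss(model_output):
    """Sum all individual losses, as a training loop does before calling backward()."""
    loss_dict = unwrap_loss_dict(model_output)
    return sum(loss for loss in loss_dict.values())


if __name__ == "__main__":
    # Floats stand in for tensors; the second tuple element is a placeholder.
    output = ({"loss_classifier": 1.0, "loss_box_reg": 0.5}, "other outputs")
    print(total_loss(output))
```

Calling .values() directly on the tuple is what raises the AttributeError; indexing element 0 first restores the dictionary the training loop expects.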
7.RuntimeError: The size of tensor a (16) must match the size of tensor b (21) at non-singleton dimension 0
This error occurs during incremental training. The problem appears to arise when loading the checkpoint from the base training stage; setting optimizer to None fixes it:
checkpointer_target = DetectronCheckpointer(
    cfg_target, model_target, optimizer=None, scheduler=scheduler,
    save_dir=output_dir_target, save_to_disk=save_to_disk, logger=logger_target)
8. Incremental training is skipped and the run goes straight to testing
The base model has already been trained for 40000 iterations, so arguments_target["iteration"] starts at 40000. Because incremental training was also configured for 40000 iterations, training is considered finished and the run goes straight to testing. Setting MAX_ITER: 80000 # number of iteration in e2e_faster_rcnn_R_50_C4_1x_Target_model.yaml fixes this.
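The arithmetic behind this behaviour is sketched below. It assumes, as described above, that the trainer resumes counting from the iteration stored in the loaded checkpoint and stops at MAX_ITER; remaining_iterations is a hypothetical helper, not a function from the code base.

```python
def remaining_iterations(start_iter, max_iter):
    """Number of training steps left when counting resumes at start_iter."""
    return max(0, max_iter - start_iter)


if __name__ == "__main__":
    base_iterations = 40000  # iteration counter restored from the base model
    for max_iter in (40000, 80000):
        steps = remaining_iterations(base_iterations, max_iter)
        print(f"MAX_ITER={max_iter}: {steps} incremental training steps")
```

With MAX_ITER at 40000 there are no steps left to run, while 80000 leaves 40000 steps for the incremental stage.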
P.S. Switching between multiple CUDA versions
List the CUDA versions installed in the /usr/local/ directory:
cd /usr/local
ls
bin cuda cuda-10.2 etc include man share
cud cuda-10.1 cuda-11.0 games lib sbin src
Check the current CUDA version:
nvcc -V
Or use stat cuda to see where the cuda symbolic link currently points:
File: cuda -> /usr/local/cuda-10.1
Size: 20 Blocks: 0 IO Block: 4096 symbolic link
Device: 812h/2066d Inode: 2757665 Links: 1
Access: (0777/lrwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2022-06-06 21:34:32.342489356 +0800
Modify: 2022-05-22 15:11:26.498549390 +0800
Change: 2022-05-22 15:11:26.498549390 +0800
Birth: -
To switch to version 10.2, delete the current link first and then point it at 10.2. It takes just two commands:
sudo rm -rf cuda
sudo ln -s /usr/local/cuda-10.2 /usr/local/cuda
Now check the CUDA version again:
nvcc -V
The version has been switched:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
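The same inspection and switch can be scripted. This is a minimal sketch with hypothetical helper names; it assumes /usr/local/cuda is a symbolic link, and changing it under /usr/local requires root privileges.

```python
import os


def cuda_link_target(link="/usr/local/cuda"):
    """Return the resolved target of the symlink, or None if link is not a symlink."""
    if not os.path.islink(link):
        return None
    return os.path.realpath(link)


def repoint_cuda_link(target, link="/usr/local/cuda"):
    """Point link at target by creating a temporary symlink and renaming it over the old one."""
    if not os.path.isdir(target):
        raise FileNotFoundError(f"CUDA directory not found: {target}")
    temp_link = link + ".tmp"
    if os.path.lexists(temp_link):
        os.remove(temp_link)
    os.symlink(target, temp_link)
    # Renaming replaces the old symlink itself, not the directory it points to.
    os.replace(temp_link, link)


if __name__ == "__main__":
    print("current target:", cuda_link_target())
```

Renaming a new link over the old one avoids the moment between rm and ln -s in which no cuda link exists; the two shell commands above achieve the same end result.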