End-to-End Semi-Supervised Object Detection with Soft Teacher
By Mengde Xu*, Zheng Zhang*, Han Hu, Jianfeng Wang, Lijuan Wang, Fangyun Wei, Xiang Bai, Zicheng Liu.
This repo is the official implementation of the ICCV 2021 paper "End-to-End Semi-Supervised Object Detection with Soft Teacher".
Citation
@article{xu2021end,
title={End-to-End Semi-Supervised Object Detection with Soft Teacher},
author={Xu, Mengde and Zhang, Zheng and Hu, Han and Wang, Jianfeng and Wang, Lijuan and Wei, Fangyun and Bai, Xiang and Liu, Zicheng},
journal={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
year={2021}
}
Main Results
Partial Labeled Data
Following STAC[1], we evaluate on 5 different data splits for each setting and report the average performance over the 5 splits. The results are shown below:
1% labeled data
Method | mAP | Model Weights | Config Files |
---|---|---|---|
Baseline | 10.0 | - | Config |
Ours (thr=5e-2) | 21.62 | Drive | Config |
Ours (thr=1e-3) | 22.64 | Drive | Config |
5% labeled data
Method | mAP | Model Weights | Config Files |
---|---|---|---|
Baseline | 20.92 | - | Config |
Ours (thr=5e-2) | 30.42 | Drive | Config |
Ours (thr=1e-3) | 31.7 | Drive | Config |
10% labeled data
Method | mAP | Model Weights | Config Files |
---|---|---|---|
Baseline | 26.94 | - | Config |
Ours (thr=5e-2) | 33.78 | Drive | Config |
Ours (thr=1e-3) | 34.7 | Drive | Config |
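Each mAP value in the partially labeled tables is an average over five data splits. As a minimal sketch of that averaging step, the helper below takes one mAP value per split and returns the mean. The function name `average_map` and the per-split numbers are hypothetical and serve only as an illustration; they are not results from the paper.

```python
from statistics import mean


def average_map(per_fold_map):
    """Return the mean mAP over the given data splits, rounded to two decimals."""
    if not per_fold_map:
        raise ValueError("expected at least one per-split mAP value")
    return round(mean(per_fold_map), 2)


# Hypothetical per-split values, for illustration only (not results from the paper).
example_folds = [20.0, 21.0, 22.0, 23.0, 24.0]
print(average_map(example_folds))
```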
Full Labeled Data
Faster R-CNN (ResNet-50)
Model | mAP | Model Weights | Config Files |
---|---|---|---|
Baseline | 40.9 | - | Config |
Ours (thr=5e-2) | 44.05 | Drive | Config |
Ours (thr=1e-3) | 44.6 | Drive | Config |
Ours* (thr=5e-2) | 44.5 | - | Config |
Ours* (thr=1e-3) | 44.9 | - | Config |
Faster R-CNN (ResNet-101)
Model | mAP | Model Weights | Config Files |
---|---|---|---|
Baseline | 43.8 | - | Config |
Ours* (thr=5e-2) | 46.8 | - | Config |
Ours* (thr=1e-3) | 47.3 | - | Config |
Notes
- Ours* means a longer training schedule was used.
- thr indicates model.test_cfg.rcnn.score_thr in the config files. This inference trick was first introduced by Instant-Teaching[2].
- All models are trained on 8*V100 GPUs.
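To illustrate what a score threshold such as thr does at test time, here is a plain-Python sketch. It is not mmdetection code: the function `filter_by_score` and the sample detections are assumptions made for illustration. A lower threshold keeps more low-confidence boxes, which is the setting that scores the higher mAP in the tables above.

```python
def filter_by_score(detections, score_thr):
    """Keep detections whose confidence score is strictly greater than score_thr."""
    return [det for det in detections if det["score"] > score_thr]


# Hypothetical detections, for illustration only.
detections = [
    {"label": "person", "score": 0.92},
    {"label": "dog", "score": 0.04},
    {"label": "cat", "score": 0.002},
]

kept_high = filter_by_score(detections, 5e-2)  # stricter threshold
kept_low = filter_by_score(detections, 1e-3)   # looser threshold
print(len(kept_high), len(kept_low))
```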
Usage
Requirements
Ubuntu 16.04
Anaconda3 with python=3.6
Pytorch=1.9.0
mmdetection=2.16.0+fe46ffe
mmcv=1.3.9
wandb=0.10.31
Notes
- We use wandb for visualization. If you don't want to use it, comment out lines 273-284 in configs/soft_teacher/base.py.
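As an alternative view of the same change, the sketch below removes a wandb logger entry from a logging config dictionary instead of commenting lines out. It assumes the lines in question define a hook entry of type `WandbLoggerHook` inside a `log_config` dictionary, following the usual mmcv hook convention; the helper `drop_wandb_hooks` and the sample dictionary are hypothetical, and the actual contents of base.py may differ.

```python
def drop_wandb_hooks(log_config):
    """Return a copy of log_config without hooks whose type is WandbLoggerHook."""
    hooks = [
        hook for hook in log_config.get("hooks", [])
        if hook.get("type") != "WandbLoggerHook"
    ]
    return {**log_config, "hooks": hooks}


# Hypothetical log_config, for illustration only; base.py may look different.
log_config = dict(
    interval=50,
    hooks=[
        dict(type="TextLoggerHook"),
        dict(type="WandbLoggerHook"),
    ],
)

cleaned = drop_wandb_hooks(log_config)
print(cleaned)
```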
Installation
make install
Data Preparation
- Download the COCO dataset
- Run the following commands to generate the dataset splits:
# YOUR_DATA should be a directory that contains the COCO dataset.
# For example:
# YOUR_DATA/
#   coco/
#     train2017/
#     val2017/
#     unlabeled2017/
#     annotations/
ln -s ${YOUR_DATA} data
bash tools/dataset/prepare_coco_data.sh conduct
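Before generating the splits, it can help to confirm that the data directory has the layout described above. The following is a minimal sketch of such a check; the helper `missing_coco_dirs` is hypothetical and not part of this repo, and it only tests that the four sub-directories exist.

```python
from pathlib import Path

# Sub-directories expected under <data_root>/coco, as listed in the layout above.
EXPECTED_SUBDIRS = ("train2017", "val2017", "unlabeled2017", "annotations")


def missing_coco_dirs(data_root):
    """Return the expected sub-directories that are absent under data_root/coco."""
    coco_root = Path(data_root) / "coco"
    return [name for name in EXPECTED_SUBDIRS if not (coco_root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_coco_dirs("data")
    if missing:
        print("Missing directories under data/coco:", ", ".join(missing))
    else:
        print("COCO layout looks complete.")
```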
Training
- To train a model in the partially labeled data setting:
# JOB_TYPE: 'baseline' or 'semi'; selects which kind of job to run
# PERCENT_LABELED_DATA: 1, 5, or 10. The percentage of labeled COCO data in the whole training dataset.
# GPU_NUM: number of GPUs used to run the job
for FOLD in 1 2 3 4 5;
do
bash tools/dist_train_partially.sh <JOB_TYPE> ${FOLD} <PERCENT_LABELED_DATA> <GPU_NUM>
done
For example, the following script trains our model on 10% labeled data with 8 GPUs:
for FOLD in 1 2 3 4 5;
do
bash tools/dist_train_partially.sh semi ${FOLD} 10 8
done
- To train a model in the fully labeled data setting:
bash tools/dist_train.sh <CONFIG_FILE_PATH> <NUM_GPUS>
For example, to train our R50 model with 8 GPUs:
bash tools/dist_train.sh configs/soft_teacher/soft_teacher_faster_rcnn_r50_caffe_fpn_coco_full_720k.py 8
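The per-fold loop for the partially labeled setting can also be generated programmatically, which makes the argument order explicit. This is a sketch only: the helper `partial_train_commands` is hypothetical, it builds the command lines for tools/dist_train_partially.sh without running them, and the allowed values are taken from the comments above.

```python
def partial_train_commands(job_type, percent, gpu_num, folds=(1, 2, 3, 4, 5)):
    """Build one dist_train_partially.sh command (as an argument list) per fold."""
    if job_type not in ("baseline", "semi"):
        raise ValueError("job_type must be 'baseline' or 'semi'")
    if percent not in (1, 5, 10):
        raise ValueError("percent must be 1, 5, or 10")
    return [
        ["bash", "tools/dist_train_partially.sh",
         job_type, str(fold), str(percent), str(gpu_num)]
        for fold in folds
    ]


# Print the commands for the 10% setting with 8 GPUs; nothing is executed here.
for command in partial_train_commands("semi", 10, 8):
    print(" ".join(command))
```

Each argument list can be passed to `subprocess.run(command, check=True)` from the repository root to launch the folds one after another.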
Evaluation
bash tools/dist_test.sh <CONFIG_FILE_PATH> <CHECKPOINT_PATH> <NUM_GPUS> --eval bbox --cfg-options model.test_cfg.rcnn.score_thr=<THR>
Inference
To run inference with a trained model and visualize the detection results:
# [IMAGE_FILE_PATH]: the path of your image file in the local file system
# [CONFIG_FILE]: the path of a config file
# [CHECKPOINT_PATH]: the path of a trained model that matches the provided config file
# [OUTPUT_PATH]: the directory in which to save the detection result
python demo/image_demo.py [IMAGE_FILE_PATH] [CONFIG_FILE] [CHECKPOINT_PATH] --output [OUTPUT_PATH]
For example:
- Inference on a single image with the provided R50 model:
python demo/image_demo.py /tmp/tmp.png configs/soft_teacher/soft_teacher_faster_rcnn_r50_caffe_fpn_coco_full_720k.py work_dirs/downloaded.model --output work_dirs/
After the program completes, an image with the same name as the input is saved to work_dirs.
- Inference on many images with the provided R50 model:
python demo/image_demo.py '/tmp/*.jpg' configs/soft_teacher/soft_teacher_faster_rcnn_r50_caffe_fpn_coco_full_720k.py work_dirs/downloaded.model --output work_dirs/
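In the multi-image command the pattern is quoted, so the shell passes it through unexpanded and the script has to resolve it. A minimal sketch of that expansion with Python's standard `glob` module follows; it assumes, without claiming to reproduce the demo script, that the pattern is matched this way. The helper `expand_image_pattern` and the temporary files are hypothetical.

```python
import glob
import os
import tempfile


def expand_image_pattern(pattern):
    """Return the sorted list of files matching a shell-style pattern."""
    return sorted(glob.glob(pattern))


# Demonstration on throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("a.jpg", "b.jpg", "c.png"):
        open(os.path.join(tmp_dir, name), "w").close()
    pattern = os.path.join(tmp_dir, "*.jpg")
    matched = [os.path.basename(path) for path in expand_image_pattern(pattern)]

print(matched)
```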
[1] A Simple Semi-Supervised Learning Framework for Object Detection
[2] Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework