Semi-Supervised Learning, Object Detection, ICCV2021

Last update: Dec 27, 2022

Overview

End-to-End Semi-Supervised Object Detection with Soft Teacher

By Mengde Xu*, Zheng Zhang*, Han Hu, Jianfeng Wang, Lijuan Wang, Fangyun Wei, Xiang Bai, Zicheng Liu.

This repo is the official implementation of the ICCV 2021 paper "End-to-End Semi-Supervised Object Detection with Soft Teacher".

Citation

@article{xu2021end,
  title={End-to-End Semi-Supervised Object Detection with Soft Teacher},
  author={Xu, Mengde and Zhang, Zheng and Hu, Han and Wang, Jianfeng and Wang, Lijuan and Wei, Fangyun and Bai, Xiang and Liu, Zicheng},
  journal={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021}
}

Main Results

Partial Labeled Data

Following STAC [1], we evaluate on 5 different data splits for each setting and report the average performance over the 5 splits. The results are as follows:

1% labeled data

Method	mAP	Model Weights	Config Files
Baseline	10.0	-	Config
Ours (thr=5e-2)	21.62	Drive	Config
Ours (thr=1e-3)	22.64	Drive	Config

5% labeled data

Method	mAP	Model Weights	Config Files
Baseline	20.92	-	Config
Ours (thr=5e-2)	30.42	Drive	Config
Ours (thr=1e-3)	31.7	Drive	Config

10% labeled data

Method	mAP	Model Weights	Config Files
Baseline	26.94	-	Config
Ours (thr=5e-2)	33.78	Drive	Config
Ours (thr=1e-3)	34.7	Drive	Config

Full Labeled Data

Faster R-CNN (ResNet-50)

Model	mAP	Model Weights	Config Files
Baseline	40.9	-	Config
Ours (thr=5e-2)	44.05	Drive	Config
Ours (thr=1e-3)	44.6	Drive	Config
Ours* (thr=5e-2)	44.5	-	Config
Ours* (thr=1e-3)	44.9	-	Config

Faster R-CNN (ResNet-101)

Model	mAP	Model Weights	Config Files
Baseline	43.8	-	Config
Ours* (thr=5e-2)	46.8	-	Config
Ours* (thr=1e-3)	47.3	-	Config

Notes

Ours* means we use a longer training schedule.
thr indicates model.test_cfg.rcnn.score_thr in the config files. This inference trick was first introduced by Instant-Teaching [2].
All models are trained on 8 V100 GPUs.
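To illustrate what the thr setting controls, here is a minimal sketch of score thresholding, assuming a detection is kept only when its confidence strictly exceeds the threshold. The function name and the sample detections are hypothetical; they are not taken from this repo or from mmdetection.

```python
from typing import List, Tuple

Detection = Tuple[str, float]  # (class label, confidence score)


def filter_by_score(detections: List[Detection], score_thr: float) -> List[Detection]:
    """Keep only detections whose score is strictly above score_thr."""
    return [det for det in detections if det[1] > score_thr]


# Hypothetical detections, made up purely for illustration.
sample = [("person", 0.91), ("dog", 0.04), ("kite", 0.002)]

kept_high = filter_by_score(sample, 5e-2)  # stricter threshold
kept_low = filter_by_score(sample, 1e-3)   # looser threshold keeps more boxes
print(len(kept_high), len(kept_low))
```

In the result tables, the lower threshold (1e-3) gives the higher mAP in every reported setting.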

Usage

Requirements

Ubuntu 16.04
Anaconda3 with python=3.6
Pytorch=1.9.0
mmdetection=2.16.0+fe46ffe
mmcv=1.3.9
wandb=0.10.31

Notes

We use wandb for visualization. If you don't want to use it, comment out lines 273-284 in configs/soft_teacher/base.py.

Installation

make install

Data Preparation

Download the COCO dataset
Execute the following command to generate data set splits:

# YOUR_DATA should be a directory that contains the COCO dataset.
# For example:
# YOUR_DATA/
#  coco/
#     train2017/
#     val2017/
#     unlabeled2017/
#     annotations/
ln -s ${YOUR_DATA} data
bash tools/dataset/prepare_coco_data.sh conduct
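Before generating the splits, it can help to confirm that the directory layout matches the one described above. The following is a minimal sketch, assuming the symlink `data` points at YOUR_DATA; the helper name `missing_coco_dirs` is hypothetical and not part of this repo.

```python
from pathlib import Path
from typing import List

# Sub-directories listed in the layout above, under YOUR_DATA/coco/.
EXPECTED_SUBDIRS = ["train2017", "val2017", "unlabeled2017", "annotations"]


def missing_coco_dirs(data_root: str) -> List[str]:
    """Return the expected sub-directories that are absent under data_root/coco."""
    coco_root = Path(data_root) / "coco"
    return [name for name in EXPECTED_SUBDIRS if not (coco_root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_coco_dirs("data")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("COCO layout looks complete.")
```

The check only looks at directory names; it does not verify the contents of the annotation files.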

Training

To train a model in the partially labeled data setting:

# JOB_TYPE: 'baseline' or 'semi', decides which kind of job to run
# PERCENT_LABELED_DATA: 1, 5, 10. The percentage of labeled COCO data in the whole training dataset.
# GPU_NUM: number of GPUs to run the job
for FOLD in 1 2 3 4 5;
do
  bash tools/dist_train_partially.sh <JOB_TYPE> ${FOLD} <PERCENT_LABELED_DATA> <GPU_NUM>
done

For example, run the following script to train our model on 10% labeled data with 8 GPUs:

for FOLD in 1 2 3 4 5;
do
  bash tools/dist_train_partially.sh semi ${FOLD} 10 8
done

To train a model in the fully labeled data setting:

bash tools/dist_train.sh <CONFIG_FILE_PATH> <NUM_GPUS>

For example, to train our R50 model with 8 GPUs:

bash tools/dist_train.sh configs/soft_teacher/soft_teacher_faster_rcnn_r50_caffe_fpn_coco_full_720k.py 8

Evaluation

bash tools/dist_test.sh <CONFIG_FILE_PATH> <CHECKPOINT_PATH> <NUM_GPUS> --eval bbox --cfg-options model.test_cfg.rcnn.score_thr=<THR>
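The --cfg-options flag addresses a nested config entry by a dotted key. As a rough illustration of that idea only, the sketch below applies a dotted-key override to a plain nested dict. The helper `set_by_dotted_key` is hypothetical and is not the mmcv implementation, and the starting value in the dict is a placeholder.

```python
from typing import Any, Dict


def set_by_dotted_key(cfg: Dict[str, Any], dotted_key: str, value: Any) -> None:
    """Set cfg[a][b][c] = value for dotted_key 'a.b.c', creating dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = cfg
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


# Placeholder config; only the key path comes from the command above.
cfg = {"model": {"test_cfg": {"rcnn": {"score_thr": 5e-2}}}}
set_by_dotted_key(cfg, "model.test_cfg.rcnn.score_thr", 1e-3)
print(cfg["model"]["test_cfg"]["rcnn"]["score_thr"])
```

Overriding the threshold on the command line this way avoids editing the config file for each evaluation run.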

Inference

To run inference with a trained model and visualize the detection results:

# [IMAGE_FILE_PATH]: the path of your image file in the local file system
# [CONFIG_FILE]: the path of a config file
# [CHECKPOINT_PATH]: the path of a trained model that matches the provided config file
# [OUTPUT_PATH]: the directory in which to save the detection result
python demo/image_demo.py [IMAGE_FILE_PATH] [CONFIG_FILE] [CHECKPOINT_PATH] --output [OUTPUT_PATH]

For example:

Inference on a single image with the provided R50 model:

python demo/image_demo.py /tmp/tmp.png configs/soft_teacher/soft_teacher_faster_rcnn_r50_caffe_fpn_coco_full_720k.py work_dirs/downloaded.model --output work_dirs/

After the program completes, an image with the same name as the input will be saved to work_dirs.

Inference on many images with the provided R50 model:

python demo/image_demo.py '/tmp/*.jpg' configs/soft_teacher/soft_teacher_faster_rcnn_r50_caffe_fpn_coco_full_720k.py work_dirs/downloaded.model --output work_dirs/

[1] A Simple Semi-Supervised Learning Framework for Object Detection

[2] Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework

Owner

Microsoft