Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation (ACM MM 2020)

Overview

Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation (ACM MM 2020)

Official implementation of:

Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation
Jialian Wu, Liangchen Song, Tiancai Wang, Qian Zhang and Junsong Yuan
In ACM International Conference on Multimedia , Seattle WA, October 12-16, 2020.

Many thanks to mmdetection authors for their great framework!

News

Mar 2, 2021 Update: We test Forest R-CNN on LVIS v1.0 set. Thanks for considering comparing with our method :)

Jan 1, 2021 Update: We propose Forest DetSeg, an extension of original Forest R-CNN. Forest DetSeg extends the proposed method to RetinaNet. While the new work is under review now, the code has been available. More details will come up along with the new paper.

Installation

Please refer to INSTALL.md for installation and dataset preparation.

Forest R-CNN

Inference

# Examples
# single-gpu testing
python tools/test.py configs/lvis/forest_rcnn_r50_fpn.py forest_rcnn_res50.pth --out out.pkl --eval bbox segm

# multi-gpu testing
./tools/dist_test.sh configs/lvis/forest_rcnn_r50_fpn.py forest_rcnn_res50.pth ${GPU_NUM} --out out.pkl --eval bbox segm

Training

# Examples
# single-gpu training
python tools/train.py configs/lvis/forest_rcnn_r50_fpn.py --validate

# multi-gpu training
./tools/dist_train.sh configs/lvis/forest_rcnn_r50_fpn.py ${GPU_NUM} --validate

(Note that we found in our experiments the best result comes up around the 20-th epoch instead of the end of training.)

Forest RetinaNet

Inference

# Examples  
# multi-gpu testing
./tools/dist_test.sh configs/lvis/forest_retinanet_r50_fpn_1x.py forest_retinanet_res50.pth ${GPU_NUM} --out out.pkl --eval bbox segm

Training

# Examples    
# multi-gpu training
./tools/dist_train.sh configs/lvis/forest_retinanet_r50_fpn_1x.py ${GPU_NUM} --validate

Main Results

Instance Segmentation on LVIS v0.5 val set

AP and AP.b denote the mask AP and box AP. r, c, f represent the rare, common, frequent contegoires.

Method Backbone AP AP.r AP.c AP.f AP.b AP.b.r AP.b.c AP.b.f download
MaskRCNN R50-FPN 21.7 6.8 22.6 26.4 21.8 6.5 21.6 28.0 model 
Forest R-CNN R50-FPN 25.6 18.3 26.4 27.6 25.9 16.9 26.1 29.2 model 
MaskRCNN R101-FPN 23.6 10.0 24.8 27.6 23.5 8.7 23.1 29.8 model 
Forest R-CNN R101-FPN 26.9 20.1 27.9 28.3 27.5 20.0 27.5 30.4 model 
MaskRCNN X-101-32x4d-FPN 24.8 10.0 26.4 28.6 24.8 8.6 25.0 30.9 model 
Forest R-CNN X-101-32x4d-FPN 28.5 21.6 29.7 29.7 28.8 20.6 29.2 31.7 model 

Instance Segmentation on LVIS v1.0 val set

Method Backbone AP AP.r AP.c AP.f AP.b
MaskRCNN R50-FPN 19.2 0.0 17.2 29.5 20.0
Forest R-CNN R50-FPN 23.2 14.2 22.7 27.7 24.6

Visualized Examples

Citation

If you find it useful in your research, please consider citing our paper as follows:

@inproceedings{wu2020forest,
title={Forest R-CNN: Large-vocabulary long-tailed object detection and instance segmentation},
author={Wu, Jialian and Song, Liangchen and Wang, Tiancai and Zhang, Qian and Yuan, Junsong},
booktitle={Proceedings of the 28th ACM International Conference on Multimedia},
pages={1570--1578},
year={2020}}
Owner
Jialian Wu
Ph.D. Candidate at SUNY Buffalo
Jialian Wu
The source code of the ICCV2021 paper "PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering"

The source code of the ICCV2021 paper "PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering"

Ren Yurui 261 Jan 09, 2023
EFENet: Reference-based Video Super-Resolution with Enhanced Flow Estimation

EFENet EFENet: Reference-based Video Super-Resolution with Enhanced Flow Estimation Code is a bit messy now. I woud clean up soon. For training the EF

Yaping Zhao 19 Nov 05, 2022
A simple software for capturing human body movements using the Kinect camera.

KinectMotionCapture A simple software for capturing human body movements using the Kinect camera. The software can seamlessly save joints and bones po

Aleksander Palkowski 5 Aug 13, 2022
This is the official implementation of our proposed SwinMR

SwinMR This is the official implementation of our proposed SwinMR: Swin Transformer for Fast MRI Please cite: @article{huang2022swin, title={Swi

A Yang Lab (led by Dr Guang Yang) 27 Nov 17, 2022
A TensorFlow Implementation of "Deep Multi-Scale Video Prediction Beyond Mean Square Error" by Mathieu, Couprie & LeCun.

Adversarial Video Generation This project implements a generative adversarial network to predict future frames of video, as detailed in "Deep Multi-Sc

Matt Cooper 704 Nov 26, 2022
Official Pytorch Implementation of Relational Self-Attention: What's Missing in Attention for Video Understanding

Relational Self-Attention: What's Missing in Attention for Video Understanding This repository is the official implementation of "Relational Self-Atte

mandos 43 Dec 07, 2022
Linear image-to-image translation

Linear (Un)supervised Image-to-Image Translation Examples for linear orthogonal transformations in PCA domain, learned without pairing supervision. Tr

Eitan Richardson 40 Aug 31, 2022
Online-compatible Unsupervised Non-resonant Anomaly Detection Repository

Online-compatible Unsupervised Non-resonant Anomaly Detection Repository Repository containing all scripts used in the studies of Online-compatible Un

0 Nov 09, 2021
The official repository for paper ''Domain Generalization for Vision-based Driving Trajectory Generation'' submitted to ICRA 2022

DG-TrajGen The official repository for paper ''Domain Generalization for Vision-based Driving Trajectory Generation'' submitted to ICRA 2022. Our Meth

Wang 25 Sep 26, 2022
Code samples for my book "Neural Networks and Deep Learning"

Code samples for "Neural Networks and Deep Learning" This repository contains code samples for my book on "Neural Networks and Deep Learning". The cod

Michael Nielsen 13.9k Dec 26, 2022
[IEEE TPAMI21] MobileSal: Extremely Efficient RGB-D Salient Object Detection [PyTorch & Jittor]

MobileSal IEEE TPAMI 2021: MobileSal: Extremely Efficient RGB-D Salient Object Detection This repository contains full training & testing code, and pr

Yu-Huan Wu 52 Jan 06, 2023
MBPO (paper: When to trust your model: Model-based policy optimization) in offline RL settings

offline-MBPO This repository contains the code of a version of model-based RL algorithm MBPO, which is modified to perform in offline RL settings Pape

LxzGordon 1 Oct 24, 2021
Official Repository for the ICCV 2021 paper "PixelSynth: Generating a 3D-Consistent Experience from a Single Image"

PixelSynth: Generating a 3D-Consistent Experience from a Single Image (ICCV 2021) Chris Rockwell, David F. Fouhey, and Justin Johnson [Project Website

Chris Rockwell 95 Nov 22, 2022
KinectFusion implemented in Python with PyTorch

KinectFusion implemented in Python with PyTorch This is a lightweight Python implementation of KinectFusion. All the core functions (TSDF volume, fram

Jingwen Wang 80 Jan 03, 2023
Training BERT with Compute/Time (Academic) Budget

Training BERT with Compute/Time (Academic) Budget This repository contains scripts for pre-training and finetuning BERT-like models with limited time

Intel Labs 263 Jan 07, 2023
TResNet: High Performance GPU-Dedicated Architecture

TResNet: High Performance GPU-Dedicated Architecture paperV2 | pretrained models Official PyTorch Implementation Tal Ridnik, Hussam Lawen, Asaf Noy, I

426 Dec 28, 2022
[ICSE2020] MemLock: Memory Usage Guided Fuzzing

MemLock: Memory Usage Guided Fuzzing This repository provides the tool and the evaluation subjects for the paper "MemLock: Memory Usage Guided Fuzzing

Cheng Wen 54 Jan 07, 2023
Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement

Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement In this project, we proposed a Domain Disentanglement Faster-RCNN (DDF)

19 Nov 24, 2022
Equivariant Imaging: Learning Beyond the Range Space

Equivariant Imaging: Learning Beyond the Range Space Equivariant Imaging: Learning Beyond the Range Space Dongdong Chen, Julián Tachella, Mike E. Davi

Dongdong Chen 46 Jan 01, 2023
ICCV2021 - Mining Contextual Information Beyond Image for Semantic Segmentation

Introduction The official repository for "Mining Contextual Information Beyond Image for Semantic Segmentation". Our full code has been merged into ss

55 Nov 09, 2022