QueryInst: Parallelly Supervised Mask Query for Instance Segmentation

Overview

QueryInst: Parallelly Supervised Mask Query for Instance Segmentation

  • TL;DR: QueryInst is a simple and effective query based instance segmentation method driven by parallel supervision on dynamic mask heads, which outperforms previous arts in terms of both accuracy and speed.

QueryInst: Parallelly Supervised Mask Query for Instance Segmentation,

by Yuxin Fang*, Shusheng Yang*, Xinggang Wang†, Yu Li, Chen Fang, Ying Shan, Bin Feng, Wenyu Liu.

(*) equal contribution, (†) corresponding author.

arXiv technical report (arXiv 2105.01928)

QueryInst

  • This repo serves as the official implementation for QueryInst, based on mmdetection and built upon Sparse R-CNN & DETR. Implantations based on Detectron2 will be released in the near future.

  • This project is under active development, we will extend QueryInst to a wide range of instance-level recognition tasks.

Updates

[06/05/2021] 🌟 QueryInst training and inference code has been released!

Getting Started

python setup.py develop
  • Prepare datasets:
mkdir data && cd data
ln -s /path/to/coco coco
  • Training QueryInst with single GPU:
python tools/train.py configs/queryinst/queryinst_r50_fpn_1x_coco.py
  • Training QueryInst with multi GPUs:
./tools/dist_train.sh configs/queryinst/queryinst_r50_fpn_1x_coco.py 8
  • Test QueryInst on COCO val set with single GPU:
python tools/test.py configs/queryinst/queryinst_r50_fpn_1x_coco.py PATH/TO/CKPT.pth --eval bbox segm
  • Test QueryInst on COCO val set with multi GPUs:
./tools/dist_test.sh configs/queryinst/queryinst_r50_fpn_1x_coco.py PATH/TO/CKPT.pth 8 --eval bbox segm

Main Results on COCO val

Configs Aug. Weights Box AP Mask AP
QueryInst_R50_3x_300_queries 480 ~ 800, w/ Crop - 46.9 41.4
QueryInst_R101_3x_300_queries 480 ~ 800, w/ Crop - 48.0 42.4
QueryInst_X101-DCN_3x_300_queries 480 ~ 800, w/ Crop - 50.3 44.2

Citation

If you find our paper and code useful in your research, please consider giving a star and citation ?? :

@article{QueryInst,
  title={QueryInst: Parallelly Supervised Mask Query for Instance Segmentation},
  author={Fang, Yuxin and Yang, Shusheng and Wang, Xinggang and Li, Yu and Fang, Chen and Shan, Ying and Feng, Bin and Liu, Wenyu},
  journal={arXiv preprint arXiv:2105.01928},
  year={2021}
}

TODO

  • QueryInst training and inference code.
  • QueryInst based on Detectron2 toolbox will be released in the near future.
  • QueryInst configurations for Cityscapes and YouTube-VIS.
  • QueryInst pretrain weights.
Owner
Hust Visual Learning Team
Hust Visual Learning Team belongs to the Artificial Intelligence Research Institute in the School of EIC in HUST
Hust Visual Learning Team
VR Viewport Pose Model for Quantifying and Exploiting Frame Correlations

This repository contains the introduction to the collected VRViewportPose dataset and the code for the IEEE INFOCOM 2022 paper: "VR Viewport Pose Model for Quantifying and Exploiting Frame Correlatio

0 Aug 10, 2022
Unofficial implementation of "Coordinate Attention for Efficient Mobile Network Design"

Unofficial implementation of "Coordinate Attention for Efficient Mobile Network Design". CoordAttention tensorflow slim

Billy 9 Aug 22, 2022
Official project repository for 'Normality-Calibrated Autoencoder for Unsupervised Anomaly Detection on Data Contamination'

NCAE_UAD Official project repository of 'Normality-Calibrated Autoencoder for Unsupervised Anomaly Detection on Data Contamination' Abstract In this p

Jongmin Andrew Yu 2 Feb 10, 2022
[ICCV2021] 3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds

3DVG-Transformer This repository is for the ICCV 2021 paper "3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds" Our method "3DV

22 Dec 11, 2022
Implementation of SwinTransformerV2 in TensorFlow.

SwinTransformerV2-TensorFlow A TensorFlow implementation of SwinTransformerV2 by Microsoft Research Asia, based on their official implementation of Sw

Phan Nguyen 2 May 30, 2022
The official TensorFlow implementation of the paper Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition

Action Transformer A Self-Attention Model for Short-Time Human Action Recognition This repository contains the official TensorFlow implementation of t

PIC4SeRCentre 20 Jan 03, 2023
An implementation of MobileFormer

MobileFormer An implementation of MobileFormer proposed by Yinpeng Chen, Xiyang Dai et al. Including [1] Mobile-Former proposed in:

slwang9353 62 Dec 28, 2022
Code of paper "Compositionally Generalizable 3D Structure Prediction"

Compositionally Generalizable 3D Structure Prediction In this work, We bring in the concept of compositional generalizability and factorizes the 3D sh

Songfang Han 30 Dec 17, 2022
Tzer: TVM Implementation of "Coverage-Guided Tensor Compiler Fuzzing with Joint IR-Pass Mutation (OOPSLA'22)“.

Artifact • Reproduce Bugs • Quick Start • Installation • Extend Tzer Coverage-Guided Tensor Compiler Fuzzing with Joint IR-Pass Mutation This is the s

12 Dec 29, 2022
Pytorch implementation of our method for regularizing nerual radiance fields for few-shot neural volume rendering.

InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering Pytorch implementation of our method for regularizing nerual radiance fields f

106 Jan 06, 2023
[ICCV 2021 Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos

Just Ask: Learning to Answer Questions from Millions of Narrated Videos Webpage • Demo • Paper This repository provides the code for our paper, includ

Antoine Yang 87 Jan 05, 2023
My tensorflow implementation of "A neural conversational model", a Deep learning based chatbot

Deep Q&A Table of Contents Presentation Installation Running Chatbot Web interface Results Pretrained model Improvements Upgrade Presentation This wor

Conchylicultor 2.9k Dec 28, 2022
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP

CLIP2Video: Mastering Video-Text Retrieval via Image CLIP The implementation of paper CLIP2Video: Mastering Video-Text Retrieval via Image CLIP. CLIP2

168 Dec 29, 2022
Official pytorch implementation of the AAAI 2021 paper Semantic Grouping Network for Video Captioning

Semantic Grouping Network for Video Captioning Hobin Ryu, Sunghun Kang, Haeyong Kang, and Chang D. Yoo. AAAI 2021. [arxiv] Environment Ubuntu 16.04 CU

Hobin Ryu 43 Nov 25, 2022
PECOS - Prediction for Enormous and Correlated Spaces

PECOS - Predictions for Enormous and Correlated Output Spaces PECOS is a versatile and modular machine learning (ML) framework for fast learning and i

Amazon 387 Jan 04, 2023
Code for EMNLP2021 paper "Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training"

VoCapXLM Code for EMNLP2021 paper Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training Environment DockerFile: dancingso

Bo Zheng 15 Jul 28, 2022
Code for the ICML 2021 paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"

ViLT Code for the paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision" Install pip install -r requirements.txt pip

Wonjae Kim 922 Jan 01, 2023
TLXZoo - Pre-trained models based on TensorLayerX

Pre-trained models based on TensorLayerX. TensorLayerX is a multi-backend AI fra

TensorLayer Community 13 Dec 07, 2022
Code & Data for the Paper "Time Masking for Temporal Language Models", WSDM 2022

Time Masking for Temporal Language Models This repository provides a reference implementation of the paper: Time Masking for Temporal Language Models

Guy Rosin 12 Jan 06, 2023
This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation".

Prompt-Based Multi-Modal Image Segmentation This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation". The sys

Timo Lüddecke 305 Dec 30, 2022