(IEEE TIP 2021) Regularized Densely-connected Pyramid Network for Salient Instance Segmentation

Overview

RDPNet

IEEE TIP 2021: Regularized Densely-connected Pyramid Network for Salient Instance Segmentation

PyTorch training and testing code are available. We have achieved SOTA performance on the salient instance segmentation (SIS) task.

If you run into any problems or feel any difficulties to run this code, do not hesitate to leave issues in this repository.

My e-mail is: wuyuhuan @ mail.nankai (dot) edu.cn

[Official Ver.] [PDF]

Citations

If you are using the code/model/data provided here in a publication, please consider citing:

@article{wu2021regularized,
   title={Regularized Densely-Connected Pyramid Network for Salient Instance Segmentation},
   volume={30},
   ISSN={1941-0042},
   DOI={10.1109/tip.2021.3065822},
   journal={IEEE Transactions on Image Processing},
   publisher={Institute of Electrical and Electronics Engineers (IEEE)},
   author={Wu, Yu-Huan and Liu, Yun and Zhang, Le and Gao, Wang and Cheng, Ming-Ming},
   year={2021},
   pages={3897–3907}
}

Requirements

  • PyTorch 1.1/1.0.1, Torchvision 0.2.2.post3, CUDA 9.0/10.0/10.1, apex
  • Validated on Ubuntu 16.04/18.04, PyTorch 1.1/1.0.1, CUDA 9.0/10.0/10.1, NVIDIA TITAN Xp

Installing

Please check INSTALL.md.

Note: we have provided an early tested apex version (url: here) and place it in our root folder (./apex/). You can also try other apex versions, which are not tested by us.

Data

Before training/testing our network, please download the data: [Google Drive, 0.7G], [Baidu Yun, yhwu].

The above zip file contains data of the ISOD and SOC dataset.

Note: if you are blocked by Google and Baidu services, you can contact me via e-mail and I will send you a copy of data and model weights.

We have processed the data to json format so you can use them without any preprocessing steps. After completion of downloading, extract the data and put them to ./datasets/ folder. Then, the ./datasets/ folder should contain two folders: isod/, soc/.

Train

It is very simple to train our network. We have prepared a script to run the training step. You can at first train our ResNet-50-based network on the ISOD dataset:

cd scripts
bash ./train_isod.sh

The training step should cost less than 1 hour for single GTX 1080Ti or TITAN Xp. This script will also store the network code, config file, log, and model weights.

We also provide ResNet-101 and ResNeXt-101 training scripts, and they are all in the scripts folder.

The default training code is for single gpu training since the training time is very low. You can also try multi gpus training by replacing --nproc_per_node=1 \ with --nproc_per_node=2 \ for 2-gpu training.

Test / Evaluation / Results

It is also very simple to test our network. First you need to download the model weights:

Taking the test on the ISOD dataset for example:

  1. Download the ISOD trained model weights, put it to model_zoo/ folder.
  2. cd the scripts folder, then run bash test_isod.sh.
  3. Testing step usually costs less than a minute. We use the official cocoapi for evaluation.

Note1: We strongly recommend to use cocoapi to evaluate the performance. Such evaluation is also automatically done with the testing process.

Note2: Default cocoapi evaluation outputs AP, AP50, AP75 peformance. To output the score of AP70, you need to change the cocoeval.py in cocoapi. See changes in this commitment:

BEFORE: stats[2] = _summarize(1, iouThr=.75, maxDets=self.params.maxDets[2])
AFTER:  stats[2] = _summarize(1, iouThr=.70, maxDets=self.params.maxDets[2])

Note3: If you are not familiar with the evalutation metric AP, AP50, AP75, you can refer to the introduction website here. Our official paper also introduces them in the Experiments section.

Visualize

We provide a simple python script to visualize the result: demo/visualize.py.

  1. Be sure that you have downloaded the ISOD pretrained weights [Google Drive, 0.14G].
  2. Put images to the demo/examples/ folder. I have prepared some images in this paper so do not worry that you have no images.
  3. cd demo, run python visualize.py
  4. Visualized images are generated in the same folder. You can change the target folder in visualize.py.

TODO

  1. Release the weights for real-world applications
  2. Add Jittor implementation
  3. Train with the enhanced base detector (FCOS TPAMI version) for better performance. Currently the base detector is the FCOS conference version with a bit lower performance.

Other Tips

I am free to answer your question if you are interested in salient instance segmentation. I also encourage everyone to contact me via my e-mail. My e-mail is: wuyuhuan @ mail.nankai (dot) edu.cn

Acknowlogdement

This repository is built under the help of the following three projects for academic use only:

Owner
Yu-Huan Wu
Ph.D. student at Nankai University
Yu-Huan Wu
Remote sensing change detection tool based on PaddlePaddle

PdRSCD PdRSCD(PaddlePaddle Remote Sensing Change Detection)是一个基于飞桨PaddlePaddle的遥感变化检测的项目,pypi包名为ppcd。目前0.2版本,最新支持图像列表输入的训练和预测,如多期影像、多源影像甚至多期多源影像。可以快速完

38 Aug 31, 2022
TrTr: Visual Tracking with Transformer

TrTr: Visual Tracking with Transformer We propose a novel tracker network based on a powerful attention mechanism called Transformer encoder-decoder a

趙 漠居(Zhao, Moju) 66 Dec 27, 2022
Image Super-Resolution by Neural Texture Transfer

SRNTT: Image Super-Resolution by Neural Texture Transfer Tensorflow implementation of the paper Image Super-Resolution by Neural Texture Transfer acce

Zhifei Zhang 413 Nov 30, 2022
A deep learning library that makes face recognition efficient and effective

Distributed Arcface Training in Pytorch This is a deep learning library that makes face recognition efficient, and effective, which can train tens of

Sajjad Aemmi 10 Nov 23, 2021
Torch implementation of various types of GAN (e.g. DCGAN, ALI, Context-encoder, DiscoGAN, CycleGAN, EBGAN, LSGAN)

gans-collection.torch Torch implementation of various types of GANs (e.g. DCGAN, ALI, Context-encoder, DiscoGAN, CycleGAN, EBGAN). Note that EBGAN and

Minchul Shin 53 Jan 22, 2022
Drone Task1 - Drone Task1 With Python

Drone_Task1 Matching Results 3.mp4 1.mp4

MLV Lab (Machine Learning and Vision Lab at Korea University) 11 Nov 14, 2022
CCPD: a diverse and well-annotated dataset for license plate detection and recognition

CCPD (Chinese City Parking Dataset, ECCV) UPdate on 10/03/2019. CCPD Dataset is now updated. We are confident that images in subsets of CCPD is much m

detectRecog 1.8k Dec 30, 2022
a Pytorch easy re-implement of "YOLOX: Exceeding YOLO Series in 2021"

A pytorch easy re-implement of "YOLOX: Exceeding YOLO Series in 2021" 1. Notes This is a pytorch easy re-implement of "YOLOX: Exceeding YOLO Series in

91 Dec 26, 2022
PyTorch implementation of "Debiased Visual Question Answering from Feature and Sample Perspectives" (NeurIPS 2021)

D-VQA We provide the PyTorch implementation for Debiased Visual Question Answering from Feature and Sample Perspectives (NeurIPS 2021). Dependencies P

Zhiquan Wen 19 Dec 22, 2022
Official Pytorch implementation of "Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video", CVPR 2021

TCMR: Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video Qualtitative result Paper teaser video Introduction This r

Hongsuk Choi 215 Jan 06, 2023
Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts

Face mask detection Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts in order to detect face masks in static im

Vaibhav Shukla 1 Oct 27, 2021
Super Resolution for images using deep learning.

Neural Enhance Example #1 — Old Station: view comparison in 24-bit HD, original photo CC-BY-SA @siv-athens. As seen on TV! What if you could increase

Alex J. Champandard 11.7k Dec 29, 2022
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training

ActNN : Activation Compressed Training This is the official project repository for ActNN: Reducing Training Memory Footprint via 2-Bit Activation Comp

UC Berkeley RISE 178 Jan 05, 2023
Hand gesture recognition model that can be used as a remote control for a smart tv.

Gesture_recognition The training data consists of a few hundred videos categorised into one of the five classes. Each video (typically 2-3 seconds lon

Pratyush Negi 1 Aug 11, 2022
Code for Low-Cost Algorithmic Recourse for Users With Uncertain Cost Functions

EMS-COLS-recourse Initial Code for Low-Cost Algorithmic Recourse for Users With Uncertain Cost Functions Folder structure: data folder contains raw an

Prateek Yadav 1 Nov 25, 2022
An implementation of EWC with PyTorch

EWC.pytorch An implementation of Elastic Weight Consolidation (EWC), proposed in James Kirkpatrick et al. Overcoming catastrophic forgetting in neural

Ryuichiro Hataya 166 Dec 22, 2022
PyTorch implementation for Partially View-aligned Representation Learning with Noise-robust Contrastive Loss (CVPR 2021)

2021-CVPR-MvCLN This repo contains the code and data of the following paper accepted by CVPR 2021 Partially View-aligned Representation Learning with

XLearning Group 33 Nov 01, 2022
converts nominal survey data into a numerical value based on a dictionary lookup.

SWAP RATE Converts nominal survey data into a numerical values based on a dictionary lookup. It allows the user to switch nominal scale data from text

Jake Rhodes 1 Jan 18, 2022
Probabilistic Tensor Decomposition of Neural Population Spiking Activity

Probabilistic Tensor Decomposition of Neural Population Spiking Activity Matlab (recommended) and Python (in developement) implementations of Soulat e

Hugo Soulat 6 Nov 30, 2022
Python Assignments for the Deep Learning lectures by Andrew NG on coursera with complete submission for grading capability.

Python Assignments for the Deep Learning lectures by Andrew NG on coursera with complete submission for grading capability.

Utkarsh Agiwal 1 Feb 03, 2022