Training PSPNet in Tensorflow. Reproduce the performance from the paper.

Overview

A reproduction of PSPNet training.

(Updated 2021/04/09. The authors of PSPNet have released a PyTorch implementation of PSPNet and their newer work, with Sync Batch Norm support; see https://github.com/hszhao/semseg.)

(Updated 2019/02/26. A major change to the code structure. For the previous version, check out v0.9: https://github.com/holyseven/PSPNet-TF-Reproduce/tree/v0.9.)

This is an implementation of PSPNet (from training to test) in pure TensorFlow (tested on TF1.12, Python 3).

  • Supported Backbones: ResNet-V1-50, ResNet-V1-101 and other ResNet-V1s can be easily added.
  • Supported Databases: ADE20K, SBD (Augmented Pascal VOC) and Cityscapes.
  • Supported Modes: training, validation and inference with multi-scale inputs.
  • Extras: L2-SP regularization and a sync batch normalization implementation.

L2-SP Regularization

L2-SP regularization is a variant of L2 regularization. Instead of penalizing the distance to the origin as L2 does, L2-SP uses the pre-trained model as the reference point, penalizing (w - w0)^2, where w0 denotes the pre-trained weights. Simple but effective. More details about L2-SP can be found in the paper and the code.
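As an illustration of the idea, here is a minimal NumPy sketch of the penalty, not the code used in this repo. The function name l2_sp_penalty, the dictionary layout and the coefficients alpha and beta are assumptions made for the example: layers that have a pre-trained counterpart are pulled towards w0, while newly added layers fall back to plain L2.

```python
import numpy as np


def l2_sp_penalty(weights, pretrained, alpha, beta):
    """Illustrative L2-SP penalty (hypothetical helper, not from this repo).

    weights:    dict name -> current weight array
    pretrained: dict name -> pre-trained weight array (w0)
    alpha:      strength of the pull towards the pre-trained weights
    beta:       strength of plain L2 on layers without a pre-trained counterpart
    """
    penalty = 0.0
    for name, w in weights.items():
        if name in pretrained:
            # Shared layer: penalize the distance to the starting point w0.
            penalty += 0.5 * alpha * np.sum((w - pretrained[name]) ** 2)
        else:
            # New layer (e.g. the segmentation head): ordinary L2 towards zero.
            penalty += 0.5 * beta * np.sum(w ** 2)
    return penalty


weights = {"conv": np.array([1.0, 2.0]), "head": np.array([3.0])}
pretrained = {"conv": np.array([1.0, 0.0])}
print(l2_sp_penalty(weights, pretrained, alpha=2.0, beta=4.0))
```

With plain L2 the "conv" term would instead be computed against zero, which lets fine-tuning drift arbitrarily far from the pre-trained solution.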

If you find L2-SP useful for your research (not limited to image segmentation), please consider citing our work:

@inproceedings{li2018explicit,
  author    = {Li, Xuhong and Grandvalet, Yves and Davoine, Franck},
  title     = {Explicit Inductive Bias for Transfer Learning with Convolutional Networks},
  booktitle = {International Conference on Machine Learning (ICML)},
  pages     = {2830--2839},
  year      = {2018}
}

Sync Batch Norm

In image segmentation, the batch size is usually limited by memory. A small batch size makes the gradients unstable and harms performance, especially for batch normalization layers. Multi-GPU settings do not help by default, because the statistics in a batch normalization layer are computed independently on each GPU. More discussion can be found here and here.

This repo resolves this problem in pure Python and pure TensorFlow by simply using a list as input. The main idea is implemented in model/utils_mg.py.
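The list-as-input idea can be sketched in NumPy as follows. This is only an illustration under simplifying assumptions, not the code in model/utils_mg.py: the function name sync_batch_norm is hypothetical, the learned scale and offset are omitted, and each list element stands for the activations held by one GPU.

```python
import numpy as np


def sync_batch_norm(inputs, eps=1e-5):
    """Normalize a list of per-device batches with shared statistics.

    inputs: list of arrays shaped (N_i, H, W, C), one per device.
    Returns a list of normalized arrays with the same shapes.
    """
    # Number of values per channel across all devices.
    total = sum(x.shape[0] * x.shape[1] * x.shape[2] for x in inputs)
    # Per-channel mean and variance over every device's data at once.
    mean = sum(x.sum(axis=(0, 1, 2)) for x in inputs) / total
    var = sum(((x - mean) ** 2).sum(axis=(0, 1, 2)) for x in inputs) / total
    # Every device normalizes with the same statistics.
    return [(x - mean) / np.sqrt(var + eps) for x in inputs]


rng = np.random.default_rng(0)
per_gpu = [rng.normal(3.0, 2.0, size=(2, 4, 4, 8)) for _ in range(2)]
outputs = sync_batch_norm(per_gpu)
print(np.concatenate(outputs).mean(axis=(0, 1, 2)))  # close to zero per channel
```

The result matches what a single device would compute on the concatenated batch, which is the property that per-GPU batch normalization lacks.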

I do not know if this is the first implementation of sync batch norm in Tensorflow, but there is already an implementation in PyTorch and some applications.

Update: There is another implementation that uses NCCL to gather statistics across GPUs; see tensorpack. However, TF1.1 does not support passing gradients through nccl_all_reduce. In addition, this code could not be run on ppc64le with tf1.10, cuda9.0 and nccl1.3.5. I do not know why and have not spent time investigating; nccl2 may solve this.

Results

Numerical Results

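  • Random scaling is used for all databases.
  • Random rotation is used for SBD.
  • Results are mIoU on the validation set, reported as single-scale/multi-scale (SS/MS).
  • Corrections and additions to the table are welcome.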

Database                                    Backbone     L2            L2-SP
Cityscapes (train set: 3K)                  ResNet-50    76.9/?        77.9/?
                                            ResNet-101   77.9/?        78.6/?
Cityscapes (coarse + train set: 20K + 3K)   ResNet-50    -             -
                                            ResNet-101   80.0/80.9     80.1/81.2*
SBD                                         ResNet-50    76.5/?        76.6/?
                                            ResNet-101   77.5/79.2     78.5/79.9
ADE20K                                      ResNet-50    41.92/43.09   -
                                            ResNet-101   42.80/?       -

*This model gets 80.3 on the Cityscapes test set (1525 images) without post-processing.

Qualitative Results on Cityscapes

Devil Details

Training and Evaluation

Download the databases with the links: ADE20K, SBD (Augmented Pascal VOC) and Cityscapes.

Prepare the Cityscapes database by generating the *labelTrainIds.png images with createTrainIdLabelImgs, and then either change the code in database/reader.py or move undesired images to another directory.

Download the pretrained models:

cd z_pretrained_weights
sh download_resnet_v1_101.sh

A script for training ResNet-50 on ADE20K, which reaches about 41.92 mIoU (single-scale test):

python ./run.py --network 'resnet_v1_50' --visible_gpus '0,1' --reader_method 'queue' --lrn_rate 0.01 --weight_decay_mode 0 --weight_decay_rate 0.0001 --weight_decay_rate2 0.001 --database 'ADE' --subsets_for_training 'train' --batch_size 8 --train_image_size 480 --snapshot 30000 --train_max_iter 90000 --test_image_size 480 --random_rotate 0 --fine_tune_filename './z_pretrained_weights/resnet_v1_50.ckpt'

Test and Infer

Test with multi-scale (set batch_size as large as you can to speed up).

python predict.py --visible_gpus '0' --network 'resnet_v1_101' --database 'ADE' --weights_ckpt './log/ADE/PSP-resnet_v1_101-gpu_num2-batch_size8-lrn_rate0.01-random_scale1-random_rotate1-480-60000-train-1-0.0001-0.001-0-0-1-1/snapshot/model.ckpt-60000' --test_subset 'val' --test_image_size 480 --batch_size 8 --ms 1 --mirror 1
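For readers unfamiliar with the --ms and --mirror options, the general technique of multi-scale, mirrored testing can be sketched as below. This is an assumption-laden illustration, not the logic of predict.py: the scale set, the nearest-neighbour resizing and the predict_fn callback (which maps an (H, W, 3) image to (H, W, num_classes) scores) are all hypothetical choices made for the example.

```python
import numpy as np


def resize_nearest(x, out_h, out_w):
    """Nearest-neighbour resize of an (H, W, C) array."""
    h, w = x.shape[:2]
    rows = np.minimum((np.arange(out_h) * h / out_h).astype(int), h - 1)
    cols = np.minimum((np.arange(out_w) * w / out_w).astype(int), w - 1)
    return x[rows][:, cols]


def multi_scale_predict(image, predict_fn, scales=(0.5, 1.0, 1.5), mirror=True):
    """Average class scores over rescaled (and optionally mirrored) inputs."""
    h, w = image.shape[:2]
    total, count = None, 0
    for s in scales:
        sh, sw = max(1, int(round(h * s))), max(1, int(round(w * s)))
        scaled = resize_nearest(image, sh, sw)
        variants = [(scaled, False)]
        if mirror:
            variants.append((scaled[:, ::-1], True))
        for inp, flipped in variants:
            scores = predict_fn(inp)
            if flipped:
                # Undo the horizontal flip so scores align with the input.
                scores = scores[:, ::-1]
            # Bring the scores back to the original resolution before averaging.
            scores = resize_nearest(scores, h, w)
            total = scores if total is None else total + scores
            count += 1
    return total / count


def toy_predict(img):
    # Stand-in for a network: two "classes" derived from the first channel.
    p = img[..., :1]
    return np.concatenate([1.0 - p, p], axis=-1)


image = np.full((4, 6, 3), 0.25)
labels = multi_scale_predict(image, toy_predict).argmax(axis=-1)
print(labels.shape)
```

Each extra scale and the mirrored pass add one forward pass per image, which is why a larger batch_size helps here.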

Infer one image (with multi-scale).

python demo_infer.py --database 'Cityscapes' --network 'resnet_v1_101' --weights_ckpt './log/Cityscapes/old/model.ckpt-50000' --test_image_size 864 --batch_size 4 --ms 1

Uncertainties for Training Details:

  1. (Cityscapes only) Should finely labeled data be included in the first training stage?
  2. (Cityscapes only) Should the (base) learning rate be reduced in the second training stage?
  3. Should logits be resized to the original size before computing the loss?
  4. Should new layers receive a larger learning rate?
  5. About the weird padding behavior of tf.image.resize_images(): should align_corners=True be set?
  6. What is the optimal decay hyperparameter for the statistics of batch normalization layers? (0.9, 0.95, 0.9997)
  7. There may be more, but it is not clear how much these small changes affect the results.
  8. Discussion is welcome!
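To give a feel for item 6, the sketch below applies the usual exponential moving average update, moving = decay * moving + (1 - decay) * batch_stat, and prints how far the moving statistic has travelled after a fixed number of steps for each candidate decay. The helper name and the step count are assumptions chosen for illustration; it does not answer which value is optimal.

```python
def moving_average_after(steps, decay, batch_stat=1.0, init=0.0):
    """Value of the moving statistic after `steps` identical updates."""
    moving = init
    for _ in range(steps):
        # Standard exponential moving average update for batch norm statistics.
        moving = decay * moving + (1.0 - decay) * batch_stat
    return moving


for decay in (0.9, 0.95, 0.9997):
    # Fraction of the way from init (0) to the batch statistic (1).
    print(decay, moving_average_after(1000, decay))
```

A decay close to 1 averages over many more iterations, so the moving statistics lag behind when the training run is short.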

Change Log

26 February, 2019

  • Code structure: on-the-fly evaluation during training.
  • Code structure: wrapping of the model.
  • Added tf.data support, but the queue-based reader is faster.
  • Print results by running python utils.py in the experiment_manager directory.
  • The default environment is Python 3 and TF1.12. OpenCV is needed for predict.py and demo_infer.py.
  • The previous version is now a branch of this repo named v0.9.

External links

Pyramid Scene Parsing Network: paper and official GitHub repository.

Owner
Li Xuhong
Researcher at Baidu Research, focus on interpretable deep learning and transfer learning.