Applying PVT to Semantic Segmentation

Last update: Nov 30, 2022

This example uses MMSegmentation v0.13.0 and applies PVTv2 as the backbone of Semantic FPN.

For details see Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions.

If you use this code in a paper, please cite:

@misc{wang2021pyramid,
      title={Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions}, 
      author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
      year={2021},
      eprint={2102.12122},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Usage

Install MMSegmentation.

Data preparation

First, prepare ADE20K according to the guidelines in MMSegmentation.

Then download the ImageNet-pretrained weights and place them in a folder named pretrained/.
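As a rough illustration of how the pretrained/ folder is typically referenced, the fragment below sketches a config override in the dict style that MMSegmentation v0.x configs use. The checkpoint file name and the backbone registry name are assumptions made for this example and are not taken from this README; check the shipped config (configs/sem_fpn/PVT/fpn_pvtv2_b2_ade20k_40k.py) for the actual values.

```python
# Hypothetical config override in the style of an MMSegmentation v0.x config.
# A real config would also inherit model, dataset and runtime settings
# from the framework's base config files.
model = dict(
    # Assumed file name: use whatever the downloaded checkpoint is called.
    pretrained='pretrained/pvt_v2_b2.pth',
    # Assumed registry name for the PVTv2-B2 backbone.
    backbone=dict(type='pvt_v2_b2'),
    # ADE20K defines 150 semantic classes.
    decode_head=dict(num_classes=150),
)
```

Because the path is relative, it resolves against the directory from which training or evaluation is launched, so pretrained/ should sit next to the configs/ folder in that setup.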

Results and models

Method	Iters	mIoU (%)	Config
PVTv2-B0 + Semantic FPN	40K	37.2	config
PVTv2-B1 + Semantic FPN	40K	42.5	config
PVTv2-B2 + Semantic FPN	40K	45.2	config
PVTv2-B3 + Semantic FPN	40K	47.3	config
PVTv2-B4 + Semantic FPN	40K	47.9	config
PVTv2-B5 + Semantic FPN	40K	48.7	config

Evaluation

To evaluate PVTv2-B2 + Semantic FPN on a single node with 8 GPUs, run:

dist_test.sh configs/sem_fpn/PVT/fpn_pvtv2_b2_ade20k_40k.py /path/to/checkpoint_file 8 --out results.pkl --eval mIoU
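For readers unfamiliar with the metric requested by --eval mIoU, the following is a minimal pure-Python sketch of mean intersection-over-union over flattened label arrays. It is not MMSegmentation's implementation; the function name mean_iou and the ignore_index default of 255 are assumptions made for this illustration.

```python
def mean_iou(pred, label, num_classes, ignore_index=255):
    """Mean IoU over classes that appear in the prediction or ground truth.

    pred and label are flat sequences of integer class ids of equal length.
    Pixels whose ground-truth label equals ignore_index are skipped.
    """
    intersect = [0] * num_classes   # pixels where pred == label == c
    pred_area = [0] * num_classes   # pixels predicted as c
    label_area = [0] * num_classes  # pixels labelled as c
    for p, t in zip(pred, label):
        if t == ignore_index:
            continue
        pred_area[p] += 1
        label_area[t] += 1
        if p == t:
            intersect[t] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + label_area[c] - intersect[c]
        if union > 0:  # skip classes absent from both pred and label
            ious.append(intersect[c] / union)
    return sum(ious) / len(ious) if ious else float('nan')


# Toy example with two classes and four pixels.
score = mean_iou(pred=[0, 0, 1, 1], label=[0, 1, 1, 1], num_classes=2)
print(f"mIoU: {100 * score:.1f}")
```

The results table reports this quantity as a percentage. In a full evaluation the per-class intersections and unions are accumulated over the whole validation set before dividing, rather than averaged image by image.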

Training

To train PVTv2-B2 + Semantic FPN on a single node with 8 GPUs, run:

dist_train.sh configs/sem_fpn/PVT/fpn_pvtv2_b2_ade20k_40k.py 8

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.
