Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

Last update: Jan 01, 2023

Overview

Vision Transformer with Progressive Sampling

This is the official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

Installation Instructions

Clone this repo:

git clone git@github.com:yuexy/PS-ViT.git
cd PS-ViT

Create a conda virtual environment and activate it:

conda create -n ps_vit python=3.7 -y
conda activate ps_vit

Install CUDA==10.1 with cuDNN 7, following the official installation instructions.

Install PyTorch==1.7.1 and torchvision==0.8.2 with CUDA==10.1:

conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.1 -c pytorch

Install timm==0.3.4, einops, pyyaml:

pip3 install timm==0.3.4 einops pyyaml

Install Apex:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Install PS-ViT (run from the PS-ViT repository root, not the apex directory):

python setup.py build_ext --inplace
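
Because the steps above pin exact versions, a quick check of the active environment can catch a mismatch before training or evaluation. The following is a minimal sketch and not part of the PS-ViT repository: the helper names `installed_version` and `check_versions` are hypothetical, and it assumes each package exposes a `__version__` attribute (true for torch, torchvision and timm as far as we know).

```python
import importlib

# Pinned versions taken from the installation steps above.
EXPECTED = {"torch": "1.7.1", "torchvision": "0.8.2", "timm": "0.3.4"}


def installed_version(module_name):
    """Return module.__version__, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # A broken install can raise more than ImportError (e.g. OSError).
        return None
    version = getattr(module, "__version__", None)
    return str(version) if version is not None else None


def check_versions(expected, lookup=installed_version):
    """Return a list of (name, expected, found) tuples for every mismatch."""
    mismatches = []
    for name, want in expected.items():
        found = lookup(name)
        # Drop a local build suffix such as "+cu101" before comparing.
        base = found.split("+")[0] if found else None
        if base != want:
            mismatches.append((name, want, found))
    return mismatches


if __name__ == "__main__":
    problems = check_versions(EXPECTED)
    for name, want, found in problems:
        print(f"{name}: expected {want}, found {found}")
    if not problems:
        print("All pinned packages match.")
```

Run it inside the activated ps_vit environment; any line it prints names a package whose version differs from the pinned one or that could not be imported.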

Results and Models

All models listed below are evaluated at an input size of 224x224.

Model	Top-1 Acc (%)	#Params	FLOPs	Download
PS-ViT-Ti/14	75.6	4.8M	1.6G	Coming Soon
PS-ViT-B/10	80.6	21.3M	3.1G	Coming Soon
PS-ViT-B/14	81.7	21.3M	5.4G	Google Drive
PS-ViT-B/18	82.3	21.3M	8.8G	Google Drive

Evaluation

To evaluate a pre-trained PS-ViT on ImageNet val, run:

python3 main.py <data-root> --model <model-name> -b <batch-size> --eval_checkpoint <path-to-checkpoint>
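
The metric reported for this evaluation is top-1 accuracy: the percentage of validation images whose highest-scoring class matches the ground-truth label. The sketch below illustrates that definition in plain Python. It is not the repository's evaluation code (which presumably operates on PyTorch tensors inside main.py), and the function name `top1_accuracy` is hypothetical.

```python
def top1_accuracy(logits, labels):
    """Percentage of samples whose highest-scoring class equals the label.

    logits: one list of per-class scores per sample.
    labels: the ground-truth class index for each sample.
    """
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for scores, label in zip(logits, labels):
        # Index of the largest score is the predicted class.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


if __name__ == "__main__":
    # Toy example with 3 samples and 2 classes; 2 of 3 predictions are right.
    toy_logits = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
    toy_labels = [1, 0, 0]
    print(f"top-1 accuracy: {top1_accuracy(toy_logits, toy_labels):.2f}%")
```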

Training from scratch

To train a PS-ViT on ImageNet from scratch, run:

bash ./scripts/train_distributed.sh <job-name> <config-path> <num-gpus>
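
When launching several runs, it can help to build the command above programmatically and validate the arguments first. The following is a small sketch under stated assumptions: the helpers `build_train_command` and `launch` are hypothetical and not part of the repository, and the job name and config path in the usage example are placeholders. Only the script path and the argument order come from the command shown above.

```python
import shlex
import subprocess


def build_train_command(job_name, config_path, num_gpus,
                        script="./scripts/train_distributed.sh"):
    """Return the argv list for the distributed training script."""
    if not job_name:
        raise ValueError("job_name must be a non-empty string")
    if not config_path:
        raise ValueError("config_path must be a non-empty string")
    if int(num_gpus) < 1:
        raise ValueError("num_gpus must be at least 1")
    # Argument order follows the README: <job-name> <config-path> <num-gpus>.
    return ["bash", script, job_name, config_path, str(int(num_gpus))]


def launch(command):
    """Run the command from the PS-ViT repository root; raise on failure."""
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder values; replace them with a real job name and config file.
    cmd = build_train_command("my_job", "path/to/config.yaml", 8)
    print(" ".join(shlex.quote(part) for part in cmd))
    # launch(cmd)  # uncomment to actually start training
```

Passing the command as a list avoids shell quoting problems if a job name or path contains spaces.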

Citing PS-ViT

@article{psvit,
  title={Vision Transformer with Progressive Sampling},
  author={Yue, Xiaoyu and Sun, Shuyang and Kuang, Zhanghui and Wei, Meng and Torr, Philip and Zhang, Wayne and Lin, Dahua},
  journal={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021}
}

Contact

If you have any questions, don't hesitate to contact Xiaoyu Yue. You can easily reach him by sending an email to [email protected].
