DeiT: Data-efficient Image Transformers

This repository contains PyTorch evaluation code, training code and pretrained models for DeiT (Data-Efficient Image Transformers).

The DeiT models obtain competitive trade-offs in terms of speed / precision:

[Figure: speed / accuracy trade-off of DeiT models]

For details see Training data-efficient image transformers & distillation through attention by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles and Hervé Jégou.

If you use this code for a paper please cite:

@article{touvron2020deit,
  title={Training data-efficient image transformers & distillation through attention},
  author={Hugo Touvron and Matthieu Cord and Matthijs Douze and Francisco Massa and Alexandre Sablayrolles and Herv\'e J\'egou},
  journal={arXiv preprint arXiv:2012.12877},
  year={2020}
}

Model Zoo

We provide baseline DeiT models pretrained on ImageNet 2012.

name                                    acc@1   acc@5   #params   url
DeiT-tiny                               72.2    91.1    5M        model
DeiT-small                              79.9    95.0    22M       model
DeiT-base                               81.8    95.6    86M       model
DeiT-tiny distilled                     74.5    91.9    6M        model
DeiT-small distilled                    81.2    95.4    22M       model
DeiT-base distilled                     83.4    96.5    87M       model
DeiT-base 384                           82.9    96.2    87M       model
DeiT-base distilled 384 (1000 epochs)   85.2    97.2    88M       model

The models are also available via torch hub. Before using them, make sure you have the pytorch-image-models package timm==0.3.2 by Ross Wightman installed. Note that our work relies on the augmentations proposed in this library. In particular, the RandAugment and RandErasing augmentations that we invoke are the improved versions from the timm library, which already led the timm authors to report up to 79.35% top-1 accuracy with ImageNet training for their best model, i.e., an improvement of about +1.5% compared to prior art.

To load DeiT-base with pretrained weights on ImageNet simply do:

import torch
# check you have the right version of timm
import timm
assert timm.__version__ == "0.3.2"

# now load it with torchhub
model = torch.hub.load('facebookresearch/deit:main', 'deit_base_patch16_224', pretrained=True)
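
Once loaded, the model can be used like any other torchvision classifier. Below is a minimal inference sketch; the image path and the preprocessing pipeline are illustrative defaults (standard ImageNet normalization), not necessarily the exact transform used in the repository — the Colab notebook mentioned below shows the full pipeline:

import torch
from PIL import Image
from torchvision import transforms

# entrypoints available on the hub can be listed with:
# print(torch.hub.list('facebookresearch/deit:main'))

# standard ImageNet-style preprocessing for 224x224 models (illustrative default)
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

model = torch.hub.load('facebookresearch/deit:main', 'deit_base_patch16_224', pretrained=True)
model.eval()

img = Image.open('cat.jpg').convert('RGB')   # any RGB image (path is illustrative)
batch = preprocess(img).unsqueeze(0)         # shape (1, 3, 224, 224)

with torch.no_grad():
    logits = model(batch)                    # shape (1, 1000)
print(logits.argmax(dim=1).item())           # predicted ImageNet class index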

Additionally, we provide a Colab notebook which goes over the steps needed to perform inference with DeiT.

Usage

First, clone the repository locally:

git clone https://github.com/facebookresearch/deit.git

Then, install PyTorch 1.7.0+ and torchvision 0.8.1+ and pytorch-image-models 0.3.2:

conda install -c pytorch pytorch torchvision
pip install timm==0.3.2

Data preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout expected by torchvision's datasets.ImageFolder: the training and validation data are expected to be in the train/ and val/ folders respectively:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
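
A quick way to sanity-check that the extracted data matches this layout is to point torchvision's ImageFolder at it (the paths and the placeholder transform below are illustrative):

from torchvision import datasets, transforms

# assumes the directory layout shown above
train_set = datasets.ImageFolder('/path/to/imagenet/train', transform=transforms.ToTensor())
val_set = datasets.ImageFolder('/path/to/imagenet/val', transform=transforms.ToTensor())

print(len(train_set), len(val_set))   # 1,281,167 and 50,000 for ImageNet-1k
print(len(train_set.classes))         # 1000 classes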

Evaluation

To evaluate a pre-trained DeiT-base on ImageNet val with a single GPU run:

python main.py --eval --resume https://dl.fbaipublicfiles.com/deit/deit_base_patch16_224-b5f2ef4d.pth --data-path /path/to/imagenet

This should give

* Acc@1 81.846 Acc@5 95.594 loss 0.820

For DeiT-small, run:

python main.py --eval --resume https://dl.fbaipublicfiles.com/deit/deit_small_patch16_224-cd65a155.pth --model deit_small_patch16_224 --data-path /path/to/imagenet

giving

* Acc@1 79.854 Acc@5 94.968 loss 0.881

Note that DeiT-small is not the same model as the small model in timm.

And for DeiT-tiny:

python main.py --eval --resume https://dl.fbaipublicfiles.com/deit/deit_tiny_patch16_224-a1311bcf.pth --model deit_tiny_patch16_224 --data-path /path/to/imagenet

which should give

* Acc@1 72.202 Acc@5 91.124 loss 1.219

Below are the command lines to reproduce the inference results for the distilled and fine-tuned models.

deit_base_distilled_patch16_224
python main.py --eval --model deit_base_distilled_patch16_224 --resume https://dl.fbaipublicfiles.com/deit/deit_base_distilled_patch16_224-df68dfff.pth

giving

* Acc@1 83.372 Acc@5 96.482 loss 0.685

deit_small_distilled_patch16_224
python main.py --eval --model deit_small_distilled_patch16_224 --resume https://dl.fbaipublicfiles.com/deit/deit_small_distilled_patch16_224-649709d9.pth

giving

* Acc@1 81.164 Acc@5 95.376 loss 0.752

deit_tiny_distilled_patch16_224
python main.py --eval --model deit_tiny_distilled_patch16_224 --resume https://dl.fbaipublicfiles.com/deit/deit_tiny_distilled_patch16_224-b40b3cf7.pth

giving

* Acc@1 74.476 Acc@5 91.920 loss 1.021

deit_base_patch16_384
python main.py --eval --model deit_base_patch16_384 --input-size 384 --resume https://dl.fbaipublicfiles.com/deit/deit_base_patch16_384-8de9b5d1.pth

giving

* Acc@1 82.890 Acc@5 96.222 loss 0.764

deit_base_distilled_patch16_384
python main.py --eval --model deit_base_distilled_patch16_384 --input-size 384 --resume https://dl.fbaipublicfiles.com/deit/deit_base_distilled_patch16_384-d0272ac0.pth

giving

* Acc@1 85.224 Acc@5 97.186 loss 0.636

Training

To train DeiT-small and DeiT-tiny on ImageNet on a single node with 4 GPUs for 300 epochs, run:

DeiT-small

python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --model deit_small_patch16_224 --batch-size 256 --data-path /path/to/imagenet --output_dir /path/to/save

DeiT-tiny

python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --model deit_tiny_patch16_224 --batch-size 256 --data-path /path/to/imagenet --output_dir /path/to/save
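
With these commands the effective batch size is 4 GPUs × 256 images = 1024. The training script scales the base learning rate linearly with the effective batch size; the following sketch illustrates that rule, assuming a base learning rate of 5e-4 per 512 images (the repository's default --lr):

# effective batch size for the single-node commands above
world_size = 4                                        # GPUs
batch_size_per_gpu = 256
effective_batch = world_size * batch_size_per_gpu     # 1024

# linear learning-rate scaling (assumed base of 5e-4 per 512 images)
base_lr = 5e-4
scaled_lr = base_lr * effective_batch / 512.0          # 1e-3
print(scaled_lr)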

Multinode training

Distributed training is available via Slurm and submitit:

pip install submitit

To train the DeiT-base model on ImageNet on 2 nodes with 8 GPUs each for 300 epochs:

python run_with_submitit.py --model deit_base_patch16_224 --data-path /path/to/imagenet

To train DeiT-base with hard distillation using a RegNetY-160 as the teacher, on 2 nodes with 8 GPUs (32GB each) for 300 epochs (make sure the teacher's model weights have been downloaded beforehand to the correct location, to avoid multiple workers writing to the same file):

python run_with_submitit.py --model deit_base_distilled_patch16_224 --distillation-type hard --teacher-model regnety_160 --teacher-path https://dl.fbaipublicfiles.com/deit/regnety_160-a5fe301d.pth --use_volta32
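
For reference, hard distillation as described in the paper combines the usual cross-entropy on the class-token output with a cross-entropy against the teacher's hard predictions on the distillation-token output, with equal weights. A minimal sketch of that objective (tensor names are illustrative; the repository has its own loss implementation):

import torch.nn.functional as F

def hard_distillation_loss(student_cls_logits, student_dist_logits, teacher_logits, labels):
    # supervised term on the class-token head
    ce_labels = F.cross_entropy(student_cls_logits, labels)
    # distillation term: the teacher's argmax serves as a hard label
    teacher_labels = teacher_logits.argmax(dim=1)
    ce_teacher = F.cross_entropy(student_dist_logits, teacher_labels)
    # equal weighting of both terms, as in the paper
    return 0.5 * ce_labels + 0.5 * ce_teacher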

To fine-tune a DeiT-base on 384-resolution images for 30 epochs, starting from a DeiT-base trained on 224-resolution images, run the following (make sure the weights of the original model have been downloaded beforehand, to avoid multiple workers writing to the same file):

python run_with_submitit.py --model deit_base_patch16_384 --batch-size 32 --finetune https://dl.fbaipublicfiles.com/deit/deit_base_patch16_224-b5f2ef4d.pth --input-size 384 --use_volta32 --nodes 2 --lr 5e-6 --weight-decay 1e-8 --epochs 30 --min-lr 5e-6
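
When fine-tuning at 384×384, the patch grid changes from 14×14 to 24×24, so the pretrained positional embeddings must be resized; the training script takes care of this when --finetune is given. For illustration only, the usual approach is bicubic interpolation of the patch-position embeddings (variable names are illustrative, and distilled models carry an extra token that would need the same treatment):

import torch
import torch.nn.functional as F

def interpolate_pos_embed(pos_embed, old_grid=14, new_grid=24):
    # pos_embed: (1, 1 + old_grid*old_grid, dim) -- class token followed by patch tokens
    cls_tok, patch_pos = pos_embed[:, :1], pos_embed[:, 1:]
    dim = patch_pos.shape[-1]
    patch_pos = patch_pos.reshape(1, old_grid, old_grid, dim).permute(0, 3, 1, 2)
    patch_pos = F.interpolate(patch_pos, size=(new_grid, new_grid),
                              mode='bicubic', align_corners=False)
    patch_pos = patch_pos.permute(0, 2, 3, 1).reshape(1, new_grid * new_grid, dim)
    return torch.cat([cls_tok, patch_pos], dim=1)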

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

Contributing

We actively welcome your pull requests! Please see CONTRIBUTING.md and CODE_OF_CONDUCT.md for more info.
