SwinTrack: A Simple and Strong Baseline for Transformer Tracking

Overview

SwinTrack

This is the official repo for SwinTrack.

banner

A Simple and Strong Baseline

performance

Prerequisites

Environment

conda (recommended)

conda create -y -n SwinTrack
conda activate SwinTrack
conda install -y anaconda
conda install -y pytorch torchvision cudatoolkit -c pytorch
conda install -y -c fvcore -c iopath -c conda-forge fvcore
pip install wandb
pip install timm

pip

pip install -r requirements.txt

Dataset

Download

Unzip

The paths should be organized as following:

lasot
├── airplane
├── basketball
...
├── training_set.txt
└── testing_set.txt

lasot_extension
├── atv
├── badminton
...
└── wingsuit

got-10k
├── train
│   ├── GOT-10k_Train_000001
│   ...
├── val
│   ├── GOT-10k_Val_000001
│   ...
└── test
    ├── GOT-10k_Test_000001
    ...
    
trackingnet
├── TEST
├── TRAIN_0
...
└── TRAIN_11

coco2017
├── annotations
│   ├── instances_train2017.json
│   └── instances_val2017.json
└── images
    ├── train2017
    │   ├── 000000000009.jpg
    │   ├── 000000000025.jpg
    │   ...
    └── val2017
        ├── 000000000139.jpg
        ├── 000000000285.jpg
        ...

Prepare path.yaml

Copy path.template.yaml as path.yaml and fill in the paths.

LaSOT_PATH: '/path/to/lasot'
LaSOT_Extension_PATH: '/path/to/lasot_ext'
GOT10k_PATH: '/path/to/got10k'
TrackingNet_PATH: '/path/to/trackingnet'
COCO_2017_PATH: '/path/to/coco2017'

Prepare dataset metadata cache (optional)

Download the metadata cache from google drive, and unzip it in datasets/cache/

datasets
└── cache
    ├── SingleObjectTrackingDataset_MemoryMapped
    │   └── filtered
    │       ├── got-10k-got10k_vot_train_split-train-3c1ffeb0c530522f0345d088b2f72168.np
    │       ...
    └── DetectionDataset_MemoryMapped
        └── filtered
            └── coco2017-nocrowd-train-bcd5bf68d4b87619ab451fe293098401.np

Login to wandb

Register an account at wandb, then login with command:

wandb login

Training & Evaluation

Train and evaluate on a single GPU

# Tiny
python main.py SwinTrack Tiny --output_dir /path/to/output -W $num_dataloader_workers

# Base
python main.py SwinTrack Base --output_dir /path/to/output -W $num_dataloader_workers

# Base-384
python main.py SwinTrack Base-384 --output_dir /path/to/output -W $num_dataloader_workers

--output_dir is optional, -W defaults to 4.

note: our code performs evaluation automatically when training is done, output is saved in /path/to/output/test_metrics.

Train and evaluate on multiple GPUs using DDP

# Tiny
python main.py SwinTrack Tiny --distributed_nproc_per_node $num_gpus --distributed_do_spawn_workers --output_dir /path/to/output -W $num_dataloader_workers

Train and evaluate on multiple nodes with multiple GPUs using DDP

# Tiny
python main.py SwinTrack Tiny --master_address $master_address --distributed_node_rank $node_rank distributed_nnodes $num_nodes --distributed_nproc_per_node $num_gpus --distributed_do_spawn_workers --output_dir /path/to/output -W $num_dataloader_workers 

Train and evaluate with run.sh helper script

# Train and evaluate on all GPUs
./run.sh SwinTrack Tiny --output_dir /path/to/output -W $num_dataloader_workers
# Train and evaluate on multiple nodes
NODE_RANK=$NODE_INDEX NUM_NODES=$NUM_NODES MASTER_ADDRESS=$MASTER_ADDRESS DATE_WITH_TIME=$DATE_WITH_TIME ./run.sh SwinTrack Tiny --output_dir /path/to/output -W $num_dataloader_workers 

Ablation study

The ablation study can be done by applying a small patch to the main config file.

Take the ResNet 50 backbone as the example, the rest parameters are the same as the above.

# Train and evaluate with resnet50 backbone
python main.py SwinTrack Tiny --mixin_config resnet.yaml
# or with run.sh
./run.sh SwinTrack Tiny --mixin resnet.yaml

All available config patches are listed in config/SwinTrack/Tiny/mixin.

Train and evaluate with GOT-10k dataset

python main.py SwinTrack Tiny --mixin_config got10k.yaml

Submit $output_dir/test_metrics/got10k/submit/*.zip to the GOT-10k evaluation server to get the result of GOT-10k test split.

Evaluate Existing Model

Download the pretrained model from google drive, then type:

python main.py SwinTrack Tiny --weight_path /path/to/weigth_file.pth --mixin_config evaluation.yaml --output_dir /path/to/output

Our code can evaluate the model on multiple GPUs in parallel, so all parameters above are also available.

Tracking results

Touch here google drive

Citation

@misc{lin2021swintrack,
      title={SwinTrack: A Simple and Strong Baseline for Transformer Tracking}, 
      author={Liting Lin and Heng Fan and Yong Xu and Haibin Ling},
      year={2021},
      eprint={2112.00995},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Owner
LitingLin
LitingLin
Multiple Object Tracking with Yolov5!

Tracking with yolov5 This implementation is for who need to tracking multi-object only with detector. You can easily track mult-object with your well

9 Nov 08, 2022
End-To-End Crowdsourcing

End-To-End Crowdsourcing Comparison of traditional crowdsourcing approaches to a state-of-the-art end-to-end crowdsourcing approach LTNet on sentiment

Andreas Koch 1 Mar 06, 2022
SpeechNAS Better Trade off between Latency and Accuracy for Large Scale Speaker Verification

SpeechNAS Better Trade off between Latency and Accuracy for Large Scale Speaker Verification

Wentao Zhu 24 May 20, 2022
This is an official implementation of our CVPR 2021 paper "Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression" (https://arxiv.org/abs/2104.02300)

Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression Introduction In this paper, we are interested in the bottom-up paradigm of estima

HRNet 367 Dec 27, 2022
BasicRL: easy and fundamental codes for deep reinforcement learning。It is an improvement on rainbow-is-all-you-need and OpenAI Spinning Up.

BasicRL: easy and fundamental codes for deep reinforcement learning BasicRL is an improvement on rainbow-is-all-you-need and OpenAI Spinning Up. It is

RayYoh 12 Apr 28, 2022
Self-Supervised Learning with Kernel Dependence Maximization

Self-Supervised Learning with Kernel Dependence Maximization This is the code for SSL-HSIC, a self-supervised learning loss proposed in the paper Self

DeepMind 29 Dec 29, 2022
An example to implement a new backbone with OpenMMLab framework.

Backbone example on OpenMMLab framework English | 简体中文 Introduction This is an template repo about how to use OpenMMLab framework to develop a new bac

Ma Zerun 22 Dec 29, 2022
Distinguishing Commercial from Editorial Content in News

Distinguishing Commercial from Editorial Content in News In this repository you can find the following: An anonymized version of the data used for my

Timo Kats 3 Sep 26, 2022
Performance Analysis of Multi-user NOMA Wireless-Powered mMTC Networks: A Stochastic Geometry Approach

Performance Analysis of Multi-user NOMA Wireless-Powered mMTC Networks: A Stochastic Geometry Approach Thanh Luan Nguyen, Tri Nhu Do, Georges Kaddoum

Thanh Luan Nguyen 2 Oct 10, 2022
Official PyTorch implementation of "Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics".

Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics This repository is the official PyTorch implementation of "Physics-aware Differ

USC-Melady 46 Nov 20, 2022
The official PyTorch implementation of recent paper - SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training

This repository is the official PyTorch implementation of SAINT. Find the paper on arxiv SAINT: Improved Neural Networks for Tabular Data via Row Atte

Gowthami Somepalli 284 Dec 21, 2022
Export CenterPoint PonintPillars ONNX Model For TensorRT

CenterPoint-PonintPillars Pytroch model convert to ONNX and TensorRT Welcome to CenterPoint! This project is fork from tianweiy/CenterPoint. I impleme

CarkusL 149 Dec 13, 2022
Bridging Vision and Language Model

BriVL BriVL (Bridging Vision and Language Model) 是首个中文通用图文多模态大规模预训练模型。BriVL模型在图文检索任务上有着优异的效果,超过了同期其他常见的多模态预训练模型(例如UNITER、CLIP)。 BriVL论文:WenLan: Bridgi

235 Dec 27, 2022
Source code of "Hold me tight! Influence of discriminative features on deep network boundaries"

Hold me tight! Influence of discriminative features on deep network boundaries This is the source code to reproduce the experiments of the NeurIPS 202

EPFL LTS4 19 Dec 10, 2021
CAPRI: Context-Aware Interpretable Point-of-Interest Recommendation Framework

CAPRI: Context-Aware Interpretable Point-of-Interest Recommendation Framework This repository contains a framework for Recommender Systems (RecSys), a

RecSys Lab 8 Jul 03, 2022
Train Yolov4 using NBX-Jobs

yolov4-trainer-nbox Train Yolov4 using NBX-Jobs. Use the powerfull functionality available in nbox-SDK repo to train a tiny-Yolo v4 model on Pascal VO

Yash Bonde 1 Jan 12, 2022
Code and data for ACL2021 paper Cross-Lingual Abstractive Summarization with Limited Parallel Resources.

Multi-Task Framework for Cross-Lingual Abstractive Summarization (MCLAS) The code for ACL2021 paper Cross-Lingual Abstractive Summarization with Limit

Yu Bai 43 Nov 07, 2022
ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs

(Comet-) ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs Paper Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jeff Da, Keisuke Sa

AI2 152 Dec 27, 2022
Official implementation of GraphMask as presented in our paper Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking.

GraphMask This repository contains an implementation of GraphMask, the interpretability technique for graph neural networks presented in our ICLR 2021

Michael Schlichtkrull 29 Sep 02, 2022
Scaling and Benchmarking Self-Supervised Visual Representation Learning

FAIR Self-Supervision Benchmark is deprecated. Please see VISSL, a ground-up rewrite of benchmark in PyTorch. FAIR Self-Supervision Benchmark This cod

Meta Research 584 Dec 31, 2022