[CVPR-2021] UnrealPerson: An adaptive pipeline for costless person re-identification

Overview

UnrealPerson: An Adaptive Pipeline for Costless Person Re-identification

In our paper (arxiv), we propose a novel pipeline, UnrealPerson, that decreases the costs in both the training and deployment stages of person ReID. We develop an automatic data synthesis toolkit and use synthesized data in mutiple ReID tasks, including (i) Direct transfer, (ii) Unsupervised domain adaptation, and (iii) Supervised fine-tuning.

The repo contains the synthesized data we use in the paper and presents examples of how to use synthesized data in various down-stream tasks to boost the ReID performance.

The codes are based on CBN (ECCV 2020) and JVTC (ECCV 2020).

Highlights:

  1. In direct transfer evaluation, we achieve 38.5% rank-1 accuracy on MSMT17 and 79.0% on Market-1501 using our unreal data.
  2. In unsupervised domain adaptation, we achieve 68.2% rank-1 accuracy on MSMT17 and 93.0% on Market-1501 using our unreal data.
  3. We obtain a better pre-trained ReID model with our unreal data.

Demonstration

Data Details

Our synthesized data (named Unreal in the paper) is generated with Makehuman, Mixamo, and UnrealEngine 4. We provide 1.2M images of 6.8K identities, captured from 4 unreal environments.

Beihang Netdisk: Download Link valid until: 2024-01-01

BaiduPan: Download Link password: abcd

The image path is formulated as: unreal_v{X}.{Y}/images/{P}_c{D}_{F}.jpg, for example, unreal_v3.1/images/333_c001_78.jpg.

X represents the ID of unreal environment; Y is the version of human models; P is the person identity label; D is the camera label; F is the frame number.

We provide three types of human models: version 1 is the basic type; version 2 contains accessories, like handbags, hats and backpacks; version 3 contains hard samples with similar global appearance. Four virtual environments are used in our synthesized data: the first three are city environments and the last one is a supermarket. Note that cameras under different virtual environments may have the same label and persons of different versions may also have the same identity label. Therefore, images with the same (Y, P) belong to the same virtual person; images with the same (X, D) belong to the same camera.

The data synthesis toolkit, including Makehuman plugin, several UE4 blueprints and data annotation scripts, will be published soon.

UnrealPerson Pipeline

Direct Transfer and Supervised Fine-tuning

We use Camera-based Batch Normalization baseline for direct transfer and supervised fine-tuning experiments.

1. Clone this repo and change directory to CBN

git clone https://github.com/FlyHighest/UnrealPerson.git
cd UnrealPerson/CBN

2. Download Market-1501, DukeMTMC-reID, MSMT17, UnrealPerson data and organize them as follows:

.
+-- data
|   +-- market
|       +-- bounding_box_train
|       +-- query
|       +-- bounding_box_test
|   +-- duke
|       +-- bounding_box_train
|       +-- query
|       +-- bounding_box_test
|   +-- msmt17
|       +-- train
|       +-- test
|       +-- list_train.txt
|       +-- list_val.txt
|       +-- list_query.txt
|       +-- list_gallery.txt
|   +-- unreal_vX.Y
|       +-- images
+ -- other files in this repo

3. Install the required packages

pip install -r requirements.txt

4. Put the official PyTorch ResNet-50 pretrained model to your home folder: '~/.torch/models/'

5. Train a ReID model with our synthesized data

Reproduce the results in our paper:

CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0,1 \
python train_model.py train --trainset_name unreal --datasets='unreal_v1.1,unreal_v2.1,unreal_v3.1,unreal_v4.1,unreal_v1.2,unreal_v2.2,unreal_v3.2,unreal_v4.2,unreal_v1.3,unreal_v2.3,unreal_v3.3,unreal_v4.3' --save_dir='unreal_4678_v1v2v3_cambal_3000' --save_step 15  --num_pids 3000 --cam_bal True --img_per_person 40

We also provide the trained weights of this experiment in the data download links above.

Configs: When trainset_name is unreal, datasets contains the directories of unreal data that will be used. num_pids is the number of humans and cam_bal denotes the camera balanced sampling strategy is adopted. img_per_person controls the size of the training set.

More configurations are in config.py.

6.1 Direct transfer to real datasets

CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 \
python test_model.py test --testset_name market --save_dir='unreal_4678_v1v2v3_cambal_3000'

6.2 Fine-tuning

CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1,0 \
python train_model.py train --trainset_name market --save_dir='market_unrealpretrain_demo' --max_epoch 60 --decay_epoch 40 --model_path pytorch-ckpt/current/unreal_4678_v1v2v3_cambal_3000/model_best.pth.tar


CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 \
python test_model.py test --testset_name market --save_dir='market_unrealpretrain_demo'

Unsupervised Domain Adaptation

We use joint visual and temporal consistency (JVTC) framework. CBN is also implemented in JVTC.

1. Clone this repo and change directory to JVTC

git clone https://github.com/FlyHighest/UnrealPerson.git
cd UnrealPerson/JVTC

2. Prepare data

Basicly, it is the same as CBN, except for an extra directory bounding_box_train_camstyle_merge, which can be downloaded from ECN. We suggest using ln -s to save disk space.

.
+-- data
|   +-- market
|       +-- bounding_box_train
|       +-- query
|       +-- bounding_box_test
|       +-- bounding_box_train_camstyle_merge
+ -- other files in this repo

3. Install the required packages

pip install -r ../CBN/requirements.txt

4. Put the official PyTorch ResNet-50 pretrained model to your home folder: '~/.torch/models/'

5. Train and test

(Unreal to MSMT)

python train_cbn.py --gpu_ids 0,1,2 --src unreal --tar msmt --num_cam 6 --name unreal2msmt --max_ep 60

python test_cbn.py --gpu_ids 1 --weights snapshot/unreal2msmt/resnet50_unreal2market_epoch60_cbn.pth --name 'unreal2msmt' --tar market --num_cam 6 --joint True 

The unreal data used in JVTC is defined in list_unreal/list_unreal_train.txt. The CBN codes support generating this file (see CBN/io_stream/datasets/unreal.py).

More details can be seen in JVTC.

References

  • [1] Rethinking the Distribution Gap of Person Re-identification with Camera-Based Batch Normalization. ECCV 2020.

  • [2] Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive Person Re-Identification. ECCV 2020.

Cite our paper

If you find our work useful in your research, please kindly cite:

@misc{zhang2020unrealperson,
      title={UnrealPerson: An Adaptive Pipeline towards Costless Person Re-identification}, 
      author={Tianyu Zhang and Lingxi Xie and Longhui Wei and Zijie Zhuang and Yongfei Zhang and Bo Li and Qi Tian},
      year={2020},
      eprint={2012.04268},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

If you have any questions about the data or paper, please leave an issue or contact me: [email protected]

Owner
ZhangTianyu
ZhangTianyu
Facial Action Unit Intensity Estimation via Semantic Correspondence Learning with Dynamic Graph Convolution

FAU Implementation of the paper: Facial Action Unit Intensity Estimation via Semantic Correspondence Learning with Dynamic Graph Convolution. Yingruo

Evelyn 78 Nov 29, 2022
Multi-Anchor Active Domain Adaptation for Semantic Segmentation (ICCV 2021 Oral)

Multi-Anchor Active Domain Adaptation for Semantic Segmentation Munan Ning*, Donghuan Lu*, Dong Wei†, Cheng Bian, Chenglang Yuan, Shuang Yu, Kai Ma, Y

Munan Ning 36 Dec 07, 2022
This repository contains the code for the binaural-detection model used in the publication arXiv:2111.04637

This repository contains the code for the binaural-detection model used in the publication arXiv:2111.04637 Dependencies The model depends on the foll

Jörg Encke 2 Oct 14, 2022
Jingju baseline - A baseline model of our project of Beijing opera script generation

Jingju Baseline It is a baseline of our project about Beijing opera script gener

midon 1 Jan 14, 2022
Unofficial PyTorch implementation of Attention Free Transformer (AFT) layers by Apple Inc.

aft-pytorch Unofficial PyTorch implementation of Attention Free Transformer's layers by Zhai, et al. [abs, pdf] from Apple Inc. Installation You can i

Rishabh Anand 184 Dec 12, 2022
(under submission) Bayesian Integration of a Generative Prior for Image Restoration

BIGPrior: Towards Decoupling Learned Prior Hallucination and Data Fidelity in Image Restoration Authors: Majed El Helou, and Sabine Süsstrunk {Note: p

Majed El Helou 22 Dec 17, 2022
SAMO: Streaming Architecture Mapping Optimisation

SAMO: Streaming Architecture Mapping Optimiser The SAMO framework provides a method of optimising the mapping of a Convolutional Neural Network model

Alexander Montgomerie-Corcoran 20 Dec 10, 2022
This is the official source code of "BiCAT: Bi-Chronological Augmentation of Transformer for Sequential Recommendation".

BiCAT This is our TensorFlow implementation for the paper: "BiCAT: Sequential Recommendation with Bidirectional Chronological Augmentation of Transfor

John 15 Dec 06, 2022
salabim - discrete event simulation in Python

Object oriented discrete event simulation and animation in Python. Includes process control features, resources, queues, monitors. statistical distrib

181 Dec 21, 2022
Visualizer for neural network, deep learning, and machine learning models

Netron is a viewer for neural network, deep learning and machine learning models. Netron supports ONNX (.onnx, .pb, .pbtxt), Keras (.h5, .keras), Tens

Lutz Roeder 21k Jan 06, 2023
Implementation of the paper "Language-agnostic representation learning of source code from structure and context".

Code Transformer This is an official PyTorch implementation of the CodeTransformer model proposed in: D. Zügner, T. Kirschstein, M. Catasta, J. Leskov

Daniel Zügner 131 Dec 13, 2022
Deep learning model for EEG artifact removal

DeepSeparator Introduction Electroencephalogram (EEG) recordings are often contaminated with artifacts. Various methods have been developed to elimina

23 Dec 21, 2022
A Dynamic Residual Self-Attention Network for Lightweight Single Image Super-Resolution

DRSAN A Dynamic Residual Self-Attention Network for Lightweight Single Image Super-Resolution Karam Park, Jae Woong Soh, and Nam Ik Cho Environments U

4 May 10, 2022
EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling

Frustratingly Simple Pretraining Alternatives to Masked Language Modeling This is the official implementation for "Frustratingly Simple Pretraining Al

Atsuki Yamaguchi 31 Nov 18, 2022
PINN(s): Physics-Informed Neural Network(s) for von Karman vortex street

PINN(s): Physics-Informed Neural Network(s) for von Karman vortex street This is

ShotaDEGUCHI 2 Apr 18, 2022
competitions-v2

Codabench (formerly Codalab Competitions v2) Installation $ cp .env_sample .env $ docker-compose up -d $ docker-compose exec django ./manage.py migrat

CodaLab 21 Dec 02, 2022
Deep Markov Factor Analysis (NeurIPS2021)

Deep Markov Factor Analysis (DMFA) Codes and experiments for deep Markov factor analysis (DMFA) model accepted for publication at NeurIPS2021: A. Farn

Sarah Ostadabbas 2 Dec 16, 2022
DeepLab is a state-of-art deep learning system for semantic image segmentation built on top of Caffe.

DeepLab Introduction DeepLab is a state-of-art deep learning system for semantic image segmentation built on top of Caffe. It combines densely-compute

Ali 234 Nov 14, 2022
Compact Bidirectional Transformer for Image Captioning

Compact Bidirectional Transformer for Image Captioning Requirements Python 3.8 Pytorch 1.6 lmdb h5py tensorboardX Prepare Data Please use git clone --

YE Zhou 19 Dec 12, 2022
Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network)

Deep Daze mist over green hills shattered plates on the grass cosmic love and attention a time traveler in the crowd life during the plague meditative

Phil Wang 4.4k Jan 03, 2023