Multi-Object Tracking in Satellite Videos with Graph-Based Multi-Task Modeling

Related tags

Deep LearningTGraM
Overview

TGraM

Multi-Object Tracking in Satellite Videos with Graph-Based Multi-Task Modeling,
Qibin He, Xian Sun, Zhiyuan Yan, Beibei Li, Kun Fu

Abstract

Recently, satellite video has become an emerging means of earth observation, providing the possibility of tracking moving objects. However, the existing multi-object trackers are commonly designed for natural scenes without considering the characteristics of remotely sensed data. In addition, most trackers are composed of two independent stages of detection and re-identification (ReID), which means that they cannot be mutually promoted. To this end, we propose an end-to-end online framework, which is called TGraM, for multi-object tracking in satellite videos. It models multi-object tracking as a graph information reasoning procedure from the multi-task learning perspective. Specifically, a graph-based spatiotemporal reasoning module is presented to mine the potential high-order correlations between video frames. Furthermore, considering the inconsistency of optimization objectives between detection and ReID, a multi-task gradient adversarial learning strategy is designed to regularize each task-specific network. Additionally, aiming at the data scarcity in this field, a large-scale and high-resolution Jilin1 satellite video dataset for multi-object tracking (AIR-MOT) is built for the experiments. Compared with state-of-the-art multi-object trackers, TGraM achieves efficient collaborative learning between detection and ReID, improving the tracking accuracy by 1.2 MOTA.

Paper

Please cite our paper if you find the code or dataset useful for your research.

@ARTICLE{He-TGRS-TGraM-2022,
  author={Q. {He} and X. {Sun} and Z. {Yan} and B. {Li} and K. {Fu}},
  journal={IEEE Transactions on Geoscience and Remote Sensing}, 
  title={Multi-Object Tracking in Satellite Videos with Graph-Based Multi-Task Modeling}, 
  year={2022},
  volume={},
  number={},
  pages={1-14},
  doi={}}

Installation

  • Clone this repo, and we'll call the directory that you cloned as ${TGRAM_ROOT}
  • Install dependencies. We use python 3.7 and pytorch >= 1.2.0
conda create -n TGraM
conda activate TGraM
conda install pytorch==1.2.0 torchvision==0.4.0 cudatoolkit=10.0 -c pytorch
cd ${TGRAM_ROOT}
pip install -r requirements.txt
  • We use DCNv2 in our backbone network and more details can be found in their repo.
git clone https://github.com/CharlesShang/DCNv2
cd DCNv2
./make.sh
  • In order to run the code for demos, you also need to install ffmpeg.

Data preparation

AIR-MOT
   |——————images
   |        └——————train
   |        └——————test
   └——————labels_with_ids
            └——————train(empty)

Then, you can change the seq_root and label_root in src/gen_labels_airmot.py and run:

cd src
python gen_labels_airmot.py

to generate the labels of AIR-MOT.

Training

  • Download the training data
  • Change the dataset root directory 'root' in src/lib/cfg/data.json and 'data_dir' in src/lib/opts.py
  • Train on AIR-MOT:
sh experiments/airmot.sh

Tracking

  • The default settings run tracking on the testing dataset from AIR-MOT. Using the trained model, you can run:
cd src
CUDA_VISIBLE_DEVICES=0 python track_half_air.py mot --load_model ../exp/airmot/210529_airmot_tgrammbseg/model_last.pth --conf_thres 0.4 --val_mot17 True --gpus 5 --data_dir '/workspace/tgram/src/data/' --arch tgrammbseg  --num_frames 3 --num_workers 2 --output_dir '/workspace/tgram/result/' --save_images --down_ratio 4 --exp_name 210526_tgrammbseg_cam

to obtain the tracking results. You can also set save_images=True in src/track.py to save the visualization results of each frame.

Train on custom dataset

You can train TGraM on custom dataset by following several steps bellow:

  1. Generate one txt label file for one image. Each line of the txt label file represents one object. The format of the line is: "class id x_center/img_width y_center/img_height w/img_width h/img_height". You can modify src/gen_labels_16.py to generate label files for your custom dataset.
  2. Generate files containing image paths. The example files are in src/data/. Some similar code can be found in src/gen_labels_crowd.py
  3. Create a json file for your custom dataset in src/lib/cfg/. You need to specify the "root" and "train" keys in the json file. You can find some examples in src/lib/cfg/.
  4. Add --data_cfg '../src/lib/cfg/your_dataset.json' when training.

Acknowledgement

A large part of the code is borrowed from Zhongdao/Towards-Realtime-MOT and xingyizhou/CenterNet. Thanks for their wonderful works.

Owner
Qibin He
Qibin He
Code for "Layered Neural Rendering for Retiming People in Video."

Layered Neural Rendering in PyTorch This repository contains training code for the examples in the SIGGRAPH Asia 2020 paper "Layered Neural Rendering

Google 154 Dec 16, 2022
The Adapter-Bot: All-In-One Controllable Conversational Model

The Adapter-Bot: All-In-One Controllable Conversational Model This is the implementation of the paper: The Adapter-Bot: All-In-One Controllable Conver

CAiRE 37 Nov 04, 2022
PyTorch Lightning implementation of Automatic Speech Recognition

lasr Lightening Automatic Speech Recognition An MIT License ASR research library, built on PyTorch-Lightning, for developing end-to-end ASR models. In

Soohwan Kim 40 Sep 19, 2022
Implementation of the Point Transformer layer, in Pytorch

Point Transformer - Pytorch Implementation of the Point Transformer self-attention layer, in Pytorch. The simple circuit above seemed to have allowed

Phil Wang 501 Jan 03, 2023
AdamW optimizer and cosine learning rate annealing with restarts

AdamW optimizer and cosine learning rate annealing with restarts This repository contains an implementation of AdamW optimization algorithm and cosine

Maksym Pyrozhok 133 Dec 20, 2022
PyGCL: A PyTorch Library for Graph Contrastive Learning

PyGCL is a PyTorch-based open-source Graph Contrastive Learning (GCL) library, which features modularized GCL components from published papers, standa

PyGCL 588 Dec 31, 2022
T-LOAM: Truncated Least Squares Lidar-only Odometry and Mapping in Real-Time

T-LOAM: Truncated Least Squares Lidar-only Odometry and Mapping in Real-Time The first Lidar-only odometry framework with high performance based on tr

Pengwei Zhou 183 Dec 01, 2022
This project provides a stock market environment using OpenGym with Deep Q-learning and Policy Gradient.

Stock Trading Market OpenAI Gym Environment with Deep Reinforcement Learning using Keras Overview This project provides a general environment for stoc

Kim, Ki Hyun 769 Dec 25, 2022
Face Detection and Alignment using Multi-task Cascaded Convolutional Networks (MTCNN)

Face-Detection-with-MTCNN Face detection is a computer vision problem that involves finding faces in photos. It is a trivial problem for humans to sol

Chetan Hirapara 3 Oct 07, 2022
Retinal vessel segmentation based on GT-UNet

Retinal vessel segmentation based on GT-UNet Introduction This project is a retinal blood vessel segmentation code based on UNet-like Group Transforme

Kent0n 27 Dec 18, 2022
Keepsake is a Python library that uploads files and metadata (like hyperparameters) to Amazon S3 or Google Cloud Storage

Keepsake Version control for machine learning. Keepsake is a Python library that uploads files and metadata (like hyperparameters) to Amazon S3 or Goo

Replicate 1.6k Dec 29, 2022
Roadmap to becoming a machine learning engineer in 2020

Roadmap to becoming a machine learning engineer in 2020, inspired by web-developer-roadmap.

Chris Hoyean Song 1.7k Dec 29, 2022
Graph Regularized Residual Subspace Clustering Network for hyperspectral image clustering

Graph Regularized Residual Subspace Clustering Network for hyperspectral image clustering

Yaoming Cai 5 Jul 18, 2022
Code for "Retrieving Black-box Optimal Images from External Databases" (WSDM 2022)

Retrieving Black-box Optimal Images from External Databases (WSDM 2022) We propose how a user retreives an optimal image from external databases of we

joisino 5 Apr 13, 2022
A tensorflow implementation of Fully Convolutional Networks For Semantic Segmentation

##A tensorflow implementation of Fully Convolutional Networks For Semantic Segmentation. #USAGE To run the trained classifier on some images: python w

Alex Seewald 13 Nov 17, 2022
Boostcamp AI Tech 3rd / Basic Paper reading w.r.t Embedding

Boostcamp AI Tech 3rd : Basic Paper Reading w.r.t Embedding TL;DR 1992년부터 2018년도까지 이루어진 word/sentence embedding의 중요한 줄기를 이루는 기초 논문 스터디를 진행하고자 합니다. 논

Soyeon Kim 14 Nov 14, 2022
Optical Character Recognition + Instance Segmentation for russian and english languages

Распознавание рукописного текста в школьных тетрадях Соревнование, проводимое в рамках олимпиады НТО, разработанное Сбером. Платформа ODS. Результаты

Gerasimov Maxim 21 Dec 19, 2022
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR2021)

NExT-QA We reproduce some SOTA VideoQA methods to provide benchmark results for our NExT-QA dataset accepted to CVPR2021 (with 1 'Strong Accept' and 2

Junbin Xiao 50 Nov 24, 2022
Fairness Metrics: All you need to know

Fairness Metrics: All you need to know Testing machine learning software for ethical bias has become a pressing current concern. Recent research has p

Anonymous2020 1 Jan 17, 2022
Single cell current best practices tutorial case study for the paper:Luecken and Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial"

Scripts for "Current best-practices in single-cell RNA-seq: a tutorial" This repository is complementary to the publication: M.D. Luecken, F.J. Theis,

Theis Lab 968 Dec 28, 2022