A general python framework for single object tracking in LiDAR point clouds, based on PyTorch Lightning.

Overview

Open3DSOT

A general python framework for single object tracking in LiDAR point clouds, based on PyTorch Lightning.

The official code release of BAT and MM Track.

Features

  • Modular design. It is easy to config the model and training/testing behaviors through just a .yaml file.
  • DDP support for both training and testing.
  • Support all common tracking datasets (KITTI, NuScenes, Waymo Open Dataset).

📣 One tracking paper is accepted by CVPR2022 (Oral)! 👇

Trackers

This repository includes the implementation of the following models:

MM-Track (CVPR2022 Oral)

[Paper] [Project Page]

MM-Track is the first motion-centric tracker in LiDAR SOT, which robustly handles distractors and drastic appearance changes in complex driving scenes. Unlike previous methods, MM-Track is a matching-free two-stage tracker which localizes the targets by explicitly modeling the "relative target motion" among frames.

BAT (ICCV2021)

[Paper] [Results]

Official implementation of BAT. BAT uses the BBox information to compensate the information loss of incomplete scans. It augments the target template with box-aware features that efficiently and effectively improve appearance matching.

P2B (CVPR2020)

[Paper] [Official implementation]

Third party implementation of P2B. Our implementation achieves better results than the official code release. P2B adapts SiamRPN to 3D point clouds by integrating a pointwise correlation operator with a point-based RPN (VoteNet).

Setup

Installation

  • Create the environment

    git clone https://github.com/Ghostish/Open3DSOT.git
    cd Open3DSOT
    conda create -n Open3DSOT  python=3.6
    conda activate Open3DSOT
    
  • Install pytorch

    conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
    

    Our code is well tested with pytorch 1.4.0 and CUDA 10.1. But other platforms may also work. Follow this to install another version of pytorch. Note: In order to reproduce the reported results with the provided checkpoints, please use CUDA 10.x.

  • Install other dependencies:

    pip install -r requirement.txt
    

    Install the nuscenes-devkit if you use want to use NuScenes dataset:

    pip install nuscenes-devkit
    

KITTI dataset

  • Download the data for velodyne, calib and label_02 from KITTI Tracking.
  • Unzip the downloaded files.
  • Put the unzipped files under the same folder as following.
    [Parent Folder]
    --> [calib]
        --> {0000-0020}.txt
    --> [label_02]
        --> {0000-0020}.txt
    --> [velodyne]
        --> [0000-0020] folders with velodynes .bin files
    

NuScenes dataset

  • Download the dataset from the download page
  • Extract the downloaded files and make sure you have the following structure:
    [Parent Folder]
      samples	-	Sensor data for keyframes.
      sweeps	-	Sensor data for intermediate frames.
      maps	        -	Folder for all map files: rasterized .png images and vectorized .json files.
      v1.0-*	-	JSON tables that include all the meta data and annotations. Each split (trainval, test, mini) is provided in a separate folder.
    

Note: We use the train_track split to train our model and test it with the val split. Both splits are officially provided by NuScenes. During testing, we ignore the sequences where there is no point in the first given bbox.

Waymo dataset

  • Download and prepare dataset by the instruction of CenterPoint.
    [Parent Folder]
      tfrecord_training	                    
      tfrecord_validation	                 
      train 	                                    -	all training frames and annotations 
      val   	                                    -	all validation frames and annotations 
      infos_train_01sweeps_filter_zero_gt.pkl
      infos_val_01sweeps_filter_zero_gt.pkl
    
  • Prepare SOT dataset. Data from specific category and split will be merged (e.g., sot_infos_vehicle_train.pkl).
  python datasets/generate_waymo_sot.py

Quick Start

Training

To train a model, you must specify the .yaml file with --cfg argument. The .yaml file contains all the configurations of the dataset and the model. Currently, we provide four .yaml files under the cfgs directory. Note: Before running the code, you will need to edit the .yaml file by setting the path argument as the correct root of the dataset.

python main.py --gpu 0 1 --cfg cfgs/BAT_Car.yaml  --batch_size 50 --epoch 60 --preloading

After you start training, you can start Tensorboard to monitor the training process:

tensorboard --logdir=./ --port=6006

By default, the trainer runs a full evaluation on the full test split after training every epoch. You can set --check_val_every_n_epoch to a larger number to speed up the training. The --preloading flag is used to preload the training samples into the memory to save traning time. Remove this flag if you don't have enough memory.

Testing

To test a trained model, specify the checkpoint location with --checkpoint argument and send the --test flag to the command.

python main.py --gpu 0 1 --cfg cfgs/BAT_Car.yaml  --checkpoint /path/to/checkpoint/xxx.ckpt --test

Reproduction

Model Category Success Precision Checkpoint
BAT-KITTI Car 65.37 78.88 pretrained_models/bat_kitti_car.ckpt
BAT-NuScenes Car 40.73 43.29 pretrained_models/bat_nuscenes_car.ckpt
BAT-KITTI Pedestrian 45.74 74.53 pretrained_models/bat_kitti_pedestrian.ckpt

Three trained BAT models for KITTI and NuScenes datasets are provided in the pretrained_models directory. To reproduce the results, simply run the code with the corresponding .yaml file and checkpoint. For example, to reproduce the tracking results on KITTI Car, just run:

python main.py --gpu 0 1 --cfg cfgs/BAT_Car.yaml  --checkpoint ./pretrained_models/bat_kitti_car.ckpt --test

Acknowledgment

  • This repo is built upon P2B and SC3D.
  • Thank Erik Wijmans for his pytorch implementation of PointNet++

License

This repository is released under MIT License (see LICENSE file for details).

Owner
Kangel Zenn
Ph.D. Student in CUHKSZ.
Kangel Zenn
VOGUE: Try-On by StyleGAN Interpolation Optimization

VOGUE is a StyleGAN interpolation optimization algorithm for photo-realistic try-on. Top: shirt try-on automatically synthesized by our method in two different examples.

Wei ZHANG 66 Dec 09, 2022
Generate pixel-style avatars with python.

face2pixel Generate pixel-style avatars with python. Run: Clone the project: git clone https://github.com/theodorecooper/face2pixel install requiremen

Theodore Cooper 2 May 11, 2022
Learning to See by Looking at Noise

Learning to See by Looking at Noise This is the official implementation of Learning to See by Looking at Noise. In this work, we investigate a suite o

Manel Baradad Jurjo 82 Dec 24, 2022
WaveFake: A Data Set to Facilitate Audio DeepFake Detection

WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper

Chair for Sys­tems Se­cu­ri­ty 27 Dec 22, 2022
Posterior predictive distributions quantify uncertainties ignored by point estimates.

Posterior predictive distributions quantify uncertainties ignored by point estimates.

DeepMind 177 Dec 06, 2022
Repo for "Event-Stream Representation for Human Gaits Identification Using Deep Neural Networks"

Summary This is the code for the paper Event-Stream Representation for Human Gaits Identification Using Deep Neural Networks by Yanxiang Wang, Xian Zh

zhangxian 54 Jan 03, 2023
A little Python application to auto tag your photos with the power of machine learning.

Tag Machine A little Python application to auto tag your photos with the power of machine learning. Report a bug or request a feature Table of Content

Florian Torres 14 Dec 21, 2022
Reducing Information Bottleneck for Weakly Supervised Semantic Segmentation (NeurIPS 2021)

Reducing Information Bottleneck for Weakly Supervised Semantic Segmentation (NeurIPS 2021) The implementation of Reducing Infromation Bottleneck for W

Jungbeom Lee 81 Dec 16, 2022
unet-family: Ultimate version

unet-family: Ultimate version 基于之前my-unet代码,我整理出来了这一份终极版本unet-family,方便其他人阅读。 相比于之前的my-unet代码,代码分类更加规范,有条理 对于clone下来的代码不需要修改各种复杂繁琐的路径问题,直接就可以运行。 并且代码有

2 Sep 19, 2022
Simple machine learning library / 簡單易用的機器學習套件

FukuML Simple machine learning library / 簡單易用的機器學習套件 Installation $ pip install FukuML Tutorial Lesson 1: Perceptron Binary Classification Learning Al

Fukuball Lin 279 Sep 15, 2022
Research on Tabular Deep Learning (Python package & papers)

Research on Tabular Deep Learning For paper implementations, see the section "Papers and projects". rtdl is a PyTorch-based package providing a user-f

Yura Gorishniy 510 Dec 30, 2022
Automatically measure the facial Width-To-Height ratio and get facial analysis results provided by Microsoft Azure

fwhr-calc-website This project is to automatically measure the facial Width-To-Height ratio and get facial analysis results provided by Microsoft Azur

SoohyunPark 1 Feb 07, 2022
The Codebase for Causal Distillation for Language Models.

Causal Distillation for Language Models Zhengxuan Wu*,Atticus Geiger*, Josh Rozner, Elisa Kreiss, Hanson Lu, Thomas Icard, Christopher Potts, Noah D.

Zen 20 Dec 31, 2022
Learning Time-Critical Responses for Interactive Character Control

Learning Time-Critical Responses for Interactive Character Control Abstract This code implements the paper Learning Time-Critical Responses for Intera

Movement Research Lab 227 Dec 31, 2022
Code for "ATISS: Autoregressive Transformers for Indoor Scene Synthesis", NeurIPS 2021

ATISS: Autoregressive Transformers for Indoor Scene Synthesis This repository contains the code that accompanies our paper ATISS: Autoregressive Trans

138 Dec 22, 2022
Extracts data from the database for a graph-node and stores it in parquet files

subgraph-extractor Extracts data from the database for a graph-node and stores it in parquet files Installation For developing, it's recommended to us

Cardstack 0 Jan 10, 2022
Classifying cat and dog images using Kaggle dataset

PyTorch Image Classification Classifies an image as containing either a dog or a cat (using Kaggle's public dataset), but could easily be extended to

Robert Coleman 74 Nov 22, 2022
Pytorch-3dunet - 3D U-Net model for volumetric semantic segmentation written in pytorch

pytorch-3dunet PyTorch implementation 3D U-Net and its variants: Standard 3D U-Net based on 3D U-Net: Learning Dense Volumetric Segmentation from Spar

Adrian Wolny 1.3k Dec 28, 2022
Multi-query Video Retreival

Multi-query Video Retreival

Princeton Visual AI Lab 17 Nov 22, 2022
Source code of the paper "Deep Learning of Latent Variable Models for Industrial Process Monitoring".

Source code of the paper "Deep Learning of Latent Variable Models for Industrial Process Monitoring".

Xiangyin Kong 7 Nov 08, 2022