LiDAR R-CNN: An Efficient and Universal 3D Object Detector

Introduction

This is the official code for LiDAR R-CNN: An Efficient and Universal 3D Object Detector. In this work, we present LiDAR R-CNN, a second-stage detector that can generally improve any existing 3D detector. We identify a common problem in point-based R-CNN approaches: the learned features ignore the size of the proposals. We propose several methods to remedy it. Evaluated on the Waymo Open Dataset (WOD) benchmarks, our method significantly outperforms the previous state of the art.
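To make the size problem concrete, here is a minimal sketch in plain Python. It assumes a proposal is a tuple (cx, cy, cz, length, width, height, heading) and that points are moved into the proposal's frame before being fed to a point-based network. The helper names and the boundary-offset feature are illustrative assumptions; this README does not say which remedies the repository implements or how.

```python
import math

def to_canonical(points, box):
    """Express points in the proposal frame: origin at the box centre, x-axis along the heading."""
    cx, cy, cz, _, _, _, heading = box
    cos_h, sin_h = math.cos(heading), math.sin(heading)
    out = []
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        # Rotate by -heading around the z-axis.
        out.append((dx * cos_h + dy * sin_h, -dx * sin_h + dy * cos_h, dz))
    return out

def add_boundary_offsets(canonical_points, box):
    """Append the signed distance from each point to the six faces of the box."""
    _, _, _, length, width, height, _ = box
    hl, hw, hh = length / 2, width / 2, height / 2
    return [
        (x, y, z, hl - x, hl + x, hw - y, hw + y, hh - z, hh + z)
        for x, y, z in canonical_points
    ]

points = [(1.0, 0.5, 0.2), (-0.8, -0.3, 0.1)]
small_box = (0.0, 0.0, 0.0, 4.0, 1.8, 1.6, 0.0)  # hypothetical proposal
large_box = (0.0, 0.0, 0.0, 6.0, 2.4, 2.0, 0.0)  # same centre and heading, larger size

# Plain canonical coordinates are identical for both proposals: the size is lost.
same_plain = to_canonical(points, small_box) == to_canonical(points, large_box)

# With boundary offsets appended, the two proposals produce different features.
aware_small = add_boundary_offsets(to_canonical(points, small_box), small_box)
aware_large = add_boundary_offsets(to_canonical(points, large_box), large_box)
same_aware = aware_small == aware_large

print(same_plain, same_aware)
```

The first comparison shows that two proposals of different sizes around the same points yield identical inputs, which is the ambiguity described above. The second shows one possible way to remove it.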

Introduction in Chinese: https://zhuanlan.zhihu.com/p/359800738

Requirements

All code has been tested in the following environment:

  • Linux (tested on Ubuntu 16.04)
  • Python 3.6+
  • PyTorch 1.5 or higher (tested on PyTorch 1.5, 1.6, and 1.7)
  • CUDA 10.1
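
A quick way to compare a local setup against the list above is sketched below. The helpers parse_version and meets_minimum are hypothetical names introduced here, not part of the repository. The PyTorch check is skipped if PyTorch is not installed.

```python
import re
import sys

def parse_version(version):
    """Turn a string such as '1.7.0+cu101' into (1, 7, 0); stop at the first non-numeric part."""
    parts = []
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)

def meets_minimum(version, minimum):
    """Return True if the parsed version is at least the given minimum tuple."""
    return parse_version(version) >= minimum

print("Python OK:", sys.version_info[:2] >= (3, 6))

try:
    import torch
    print("PyTorch OK:", meets_minimum(torch.__version__, (1, 5)))
    # torch.version.cuda is the CUDA version PyTorch was built with (None for CPU-only builds).
    print("CUDA build:", torch.version.cuda)
except ImportError:
    print("PyTorch is not installed")
```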

To install pybind11:

git clone git@github.com:pybind/pybind11.git
cd pybind11
mkdir build && cd build
cmake .. && make -j 
sudo make install

To install requirements:

pip install -r requirements.txt
apt-get install ninja-build libeigen3-dev

Install LiDAR_RCNN library:

python setup.py develop --user

Preparing Data

Please refer to the data processor to generate the proposal data.

Training

After preparing the WOD data, train the vehicle-only model from the paper with this command:

python -m torch.distributed.launch --nproc_per_node=4 tools/train.py --cfg config/lidar_rcnn.yaml --name lidar_rcnn

For the 3-class model on WOD:

python -m torch.distributed.launch --nproc_per_node=8 tools/train.py --cfg config/lidar_rcnn_all_cls.yaml --name lidar_rcnn_all

The models and logs will be saved to work_dirs/outputs.

Evaluation

To evaluate, run distributed testing with 4 GPUs:

python -m torch.distributed.launch --nproc_per_node=4 tools/test.py --cfg config/lidar_rcnn.yaml --checkpoint outputs/lidar_rcnn/checkpoint_lidar_rcnn_59.pth.tar
python tools/create_results.py --cfg config/lidar_rcnn.yaml

Note that the nGPUS value in the config must equal nproc_per_node. This generates a val.bin file in work_dir/results. You can create a submission to the Waymo server using the waymo-open-dataset code by following the instructions here.
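
One way to keep the GPU count consistent is to build the launch command from a single variable. The sketch below is illustrative only: build_test_command is a hypothetical helper, not part of the repository. It only assembles the test command shown above. The nGPUS entry in the config file still has to be set to the same number by hand.

```python
def build_test_command(n_gpus, cfg, checkpoint):
    """Assemble the distributed test command with --nproc_per_node set to n_gpus."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={n_gpus}",
        "tools/test.py",
        "--cfg", cfg,
        "--checkpoint", checkpoint,
    ]

n_gpus = 4  # must match the nGPUS value in the config file
command = build_test_command(
    n_gpus,
    "config/lidar_rcnn.yaml",
    "outputs/lidar_rcnn/checkpoint_lidar_rcnn_59.pth.tar",
)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run, or the printed line can be pasted into a shell.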

Results

Our model achieves the following performance on:

Waymo Open Dataset Challenges (3D Detection)

Proposals from | Class   | Channel | 3D AP L1 Vehicle | 3D AP L1 Pedestrian | 3D AP L1 Cyclist
PointPillars   | Vehicle | 1x      | 75.6             | -                   | -
PointPillars   | Vehicle | 2x      | 75.6             | -                   | -
PointPillars   | 3 Class | 1x      | 73.4             | 70.7                | 67.4
PointPillars   | 3 Class | 2x      | 73.8             | 71.9                | 69.4

Proposals from | Class   | Channel | 3D AP L2 Vehicle | 3D AP L2 Pedestrian | 3D AP L2 Cyclist
PointPillars   | Vehicle | 1x      | 66.8             | -                   | -
PointPillars   | Vehicle | 2x      | 67.9             | -                   | -
PointPillars   | 3 Class | 1x      | 64.8             | 62.4                | 64.8
PointPillars   | 3 Class | 2x      | 65.1             | 63.5                | 66.8

Citation

If you find our paper or repository useful, please consider citing:

@article{li2021lidar,
  title={LiDAR R-CNN: An Efficient and Universal 3D Object Detector},
  author={Li, Zhichao and Wang, Feng and Wang, Naiyan},
  journal={CVPR},
  year={2021},
}

Acknowledgement

Comments
  • How is the PP model trained

    This model file checkpoints/hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-car-9fa20624.pth in the docs cannot be found in mmdet3d official repo (they only have the interval-5 pretrained models). Are the proposals extracted with interval-1 models: 3d-car and 3d-3class? If I want to reproduce your results, do I need to first train with these two configs? Thanks.

    opened by haotian-liu 21
  • checkpoint shape error

    Hi Zhichao Li, Feng Wang, and Naiyan Wang,

    I am very interested in your work on LiDAR R-CNN. When I use the pretrained model you provided, checkpoint_lidar_rcnn_59.pth.tar (MD5: 6416c502af3cb73f0c39dd0cabdee2cb), I find that its weights expect 9 input dimensions, but your input data has 12 dimensions.

    Could you provide a pretrained model whose dimensions match?

    In one of your commits the input dimension was increased from 9 to 12, but the latest pretrained model still uses 9 dimensions.

    opened by hutao568 11
  • Transferred to nuScenes dataset, performance decline

    When I transferred it to CenterpointNet and the nuScenes dataset and then evaluated on nuScenes, it did not seem to work. I don't know what went wrong. I look forward to your suggestions and comments.

    opened by Suodislie 9
  • Run inference on single GPU

    Hi, I was able to complete the setup by following the instructions in the README. In the evaluation step,

    python -m torch.distributed.launch --nproc_per_node=4 tools/test.py --cfg config/lidar_rcnn.yaml --checkpoint outputs/lidar_rcnn/checkpoint_lidar_rcnn_59.pth.tar
    python tools/create_results.py --cfg config/lidar_rcnn.yaml
    

    I have the following questions about running the evaluation:

    1. How should I change the command to run on a single GPU? I assume nproc_per_node needs to be 1.
    2. What should MODEL.Frame number be for checkpoint_lidar_rcnn_59.pth.tar? I am trying to understand the evaluation, so any help would be appreciated.
    opened by kamalasubha 7
  • The cls scores are useless on my own dataset

    Thanks for your awesome work. When I use LiDAR R-CNN on my own dataset, the refined scores are useless: most objects are classified as background. In addition, the average refined center error is reduced by only 1 cm. Is this normal?

    opened by xiuzhizheng 6
  • What processes in LIDAR-RCNN are specific for waymo dataset?

    Hello, as the title says, I wonder which processing steps are specific to WOD, that is, which steps I have to do differently if I want to use LiDAR R-CNN on my own dataset. I have already changed data_processor and everything I could think of in the loader and create_results that relates to the Waymo dataset, and then used the refined results to run the evaluation on my own dataset. However, I got NaN for the rotation error, and the mAP is pretty low.

    Therefore, I'm confused about subtle processing steps that are performed only for Waymo and not for other datasets. For example, is computing the heading residual necessary for using LiDAR R-CNN? Did you use rotation in some subtle way? (In my dataset the rotation is about the y axis, while in your code it is the x axis, but the way of computing rotZ is the same; I have already changed it.)

    This bug has been driving me crazy, which is why my issue description above is a bit messy; please forgive me. I would be grateful if you could give me some hints. Thank you a lot. Save this almost desperate kid, please.🥺

    opened by QingXIA233 6
  • The num of boxes of matching_gt_bbox is more than that of valid_gt?

    Hello, sorry, I am back with another question. Recently I've been using LiDAR R-CNN to refine the results of a CenterPoint-PP model on my own dataset. During data processing, I noticed that my CenterPoint-PP model detects more bboxes than there are in the ground truth (false detections). When running the get_matching_by_iou function in LiDAR R-CNN, the resulting matching_gt_bbox has the same number of bboxes as the model predictions rather than the ground truth. I'm a bit confused by this. Since we are doing refinement, shouldn't we remove the falsely detected bboxes from the results and keep to the ground truth? If so, why are the matching bboxes indexed by the predictions instead of the ground truth?

    Maybe I have misunderstood something here. It would be a great help if you could give me some hints. Thanks in advance.

    opened by QingXIA233 6
  • The pretrained model

    Hi, I am very interested in your paper and am reproducing it. The pretrained PointPillars model provided in mmdetection3d does not reach the performance shown in Table 2, so could you please provide the pretrained PointPillars model from Table 2? Thank you very much!

    opened by SSY-1276 6
  • About train one iter data

    Hi, sorry to bother you again! Is my understanding right? The predicted boxes of all frames are extracted at once and then shuffled globally, which means that a training batch for LiDAR R-CNN contains boxes from different frames. With a batch size of 256, in the extreme case a batch may contain boxes from up to 256 frames, one box per frame. Here is my idea: if I train on two frames at a time, extract proposals with the frozen one-stage network, and then train LiDAR R-CNN end to end, is that OK? Do you have any idea how to design the RoI sampler ratio?

    opened by DongfeiJi 6
  • Collaboration with MMDetection3D

    Hi developers of LiDAR R-CNN,

    Congrats on the acceptance of the paper!

    LiDAR R-CNN achieves new state-of-the-art results through simple yet effective improvement, which is very insightful to the community. We also found that the baseline is based on the implementations in MMDetection3D.

    Therefore, I am coming to ask, as we believe LiDAR R-CNN might have a great impact on the community, would you like to also contribute an implementation of LiDAR R-CNN to MMDetection3D? If so, maybe we could have a more detailed discussion about that? MMDetection3D welcomes any kind of contribution. Please feel free to ask if there is anything from the MMDet3D team that could help.

    On behalf of the MMDet3D Development Team

    BR,

    Wenwei

    opened by ZwwWayne 6
  • checkpoint shape error

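    Hi Zhichao Li, Feng Wang, and Naiyan Wang, I am very interested in your work on LiDAR R-CNN. However, when using the pretrained model you provided, checkpoint_lidar_rcnn_59.pth.tar (MD5: 6416c502af3cb73f0c39dd0cabdee2cb), I found that the weights of the pretrained model are 9-dimensional while your input data is 12-dimensional. Could you provide a pretrained model whose dimensions match?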

    opened by hutao568 4
Releases(v0.1.1)
Owner
TuSimple