An efficient 3D semantic segmentation framework for Urban-scale point clouds like SensatUrban, Campus3D, etc.

Overview

SensatUrban-BEV-Seg3D

This is the official implementation of our BEV-Seg3D-Net, an efficient 3D semantic segmentation framework for Urban-scale point clouds like SensatUrban, Campus3D, etc.

Features of our framework/model:

  • leveraging various proven methods in 2D segmentation for 3D tasks
  • achieve competitive performance in the SensatUrban benchmark
  • fast inference process, about 1km^2 area per minute with RTX 3090.

To be done:

  • add more complex/efficient fusion models
  • add more backbone like ResNeXt, HRNet, DenseNet, etc.
  • add more novel projection methods like pointpillars

For technical details, please refer to:

Efficient Urban-scale Point Clouds Segmentation with BEV Projection
Zhenhong Zou, Yizhe Li, Xinyu Zhang

(1) Setup

This code has been tested with Python 3.7, PyTorch 1.8, CUDA 11.0 on Ubuntu 16.04. PyTorch of earlier versions should be supported.

  • Clone the repository
git clone https://github.com/zouzhenhong98/SensatUrban-BEV-Seg3D.git & cd SensatUrban-BEV-Seg3D
  • Setup python environment
conda create -n bevseg python=3.7
source activate bevseg
pip install -r requirements.txt

(2) Preprocess

We provide various data analysis and preprocess methods for the SensatUrban dataset. (Part of the following steps are optional)

  • Before data generation, change the path_to_your_dataset in preprocess/point_EDA_31.py by:
Sensat = SensatUrbanEDA()
Sensat.root_dir = 'path_to_your_dataset'
Sensat.split = 'train' # change to 'test' for inference
  • Initialize the BEV projection arguments. We provide our optimal setting below, but you can set other values for analysis:
Sensat.grids_scale = 0.05
Sensat.grids_size = 25
Sensat.grids_step = 25
  • (Optional) If you want to test the sliding window points generator:
data_dir = os.path.join(self.root_dir, self.split)
ply_list = sorted(os.listdir(data_dir))[0]
ply_path = os.path.join(data_dir, ply_name)
ply_data = self.load_points(ply_path, reformat=True)
grids_data = self.grid_generator(ply_data, self.grids_size, self.grids_step, False) # return an Iterator
  • Calculating spatial overlap ratio in BEV projection:
Sensat.single_ply_analysis(Sensat.exp_point_overlay_count) # randomly select one ply file
Sensat.batch_ply_analysis(Sensat.exp_point_overlay_count) # for all ply files in the path
  • Calculating class overlap ratio in BEV projection, that means we ignore overlapped points belonging to the same category:
Sensat.single_ply_analysis(Sensat.exp_class_overlay_count) # randomly select one ply file
Sensat.batch_ply_analysis(Sensat.exp_class_overlay_count) # for all ply files in the path
  • Test BEV projection and 3D remapping with IoU index test (reflecting the consistency in 3D Segmentation and BEV Segmentation tasks):
Sensat.evaluate('offline', Sensat.map_offline_img2pts)
  • BEV data generation:
Sensat.batch_ply_analysis(Sensat.exp_gen_bev_projection)
  • Point Spatial Overlap Ratio Statistics at different projection scales

  • More BEV projection testing results refers to our sample images: completion test at imgs/completion_test, edge detection with different CV operators at imgs/edge_detection, rgb and label projection samples at imgs/projection_sample

(3) Training & Inference

We provide two basic multimodal fusion network developped from U-Net in the modeling folder, unet.py is the basic feature fusion, and uneteca.py is the attention fusion.

  • Change the path_to_your_dataset in mypath.py and dataloaders/init.py >>> 'cityscapes'

  • Train from sratch

python train.py --use-balanced-weights --batch-size 8 --base-size 500 --crop-size 500 --loss-type focal --epochs 200 --eval-interval 1
  • Change the save_dir in inference.py

  • Inference on test data

python inference.py --batch-size 8
  • Prediction Results Visualization (RGB, altitude, label, prediction)

(4) Evaluation

  • Remap your BEV prediction to 3D and evaluate in 3D benchmark in preprocess/point_EDA_31.py (following the prvious initialization steps):
Sensat.evaluate_batch(Sensat.evaluate_batch_nn(Sensat.eval_offline_img2pts))

(5) Citation

If you find our work useful in your research, please consider citing: (Information is coming soon! We are asking the open-access term of the conference!)

(6) Acknowledgment

  • Part of our data processing code (read_ply and metrics) is developped based on https://github.com/QingyongHu/SensatUrban
  • Our code of neural network is developped based on a U-Net repo from the github, but unfortunately we are unable to recognize the raw github repo. Please tell us if you can help.

(7) Related Work

To learn more about our fusion segmentation methods, please refers to our previous work:

@article{Zhang2021ChannelAI,
    title={Channel Attention in LiDAR-camera Fusion for Lane Line Segmentation},
    author={Xinyu Zhang and Zhiwei Li and Xin Gao and Dafeng Jin and Jun Li},
    journal={Pattern Recognit.},
    year={2021},
    volume={118},
    pages={108020}
}

@article{Zou2021ANM,
    title={A novel multimodal fusion network based on a joint coding model for lane line segmentation},
    author={Zhenhong Zou and Xinyu Zhang and Huaping Liu and Zhiwei Li and A. Hussain and Jun Li},
    journal={ArXiv},
    year={2021},
    volume={abs/2103.11114}
}
Simple SN-GAN to generate CryptoPunks

CryptoPunks GAN Simple SN-GAN to generate CryptoPunks. Neural network architecture and training code has been modified from the PyTorch DCGAN example.

Teddy Koker 66 Dec 15, 2022
Multi-atlas segmentation (MAS) is a promising framework for medical image segmentation

Multi-atlas segmentation (MAS) is a promising framework for medical image segmentation. Generally, MAS methods register multiple atlases, i.e., medical images with corresponding labels, to a target i

NanYoMy 13 Oct 09, 2022
When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset of 53,000+ Legal Holdings

When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset of 53,000+ Legal Holdings This is the repository for t

RegLab 39 Jan 07, 2023
GemNet model in PyTorch, as proposed in "GemNet: Universal Directional Graph Neural Networks for Molecules" (NeurIPS 2021)

GemNet: Universal Directional Graph Neural Networks for Molecules Reference implementation in PyTorch of the geometric message passing neural network

Data Analytics and Machine Learning Group 124 Dec 30, 2022
CRF-RNN for Semantic Image Segmentation - PyTorch version

This repository contains the official PyTorch implementation of the "CRF-RNN" semantic image segmentation method, published in the ICCV 2015

Sadeep Jayasumana 170 Dec 13, 2022
Implementing DropPath/StochasticDepth in PyTorch

%load_ext memory_profiler Implementing Stochastic Depth/Drop Path In PyTorch DropPath is available on glasses my computer vision library! Introduction

Francesco Saverio Zuppichini 13 Jan 05, 2023
NFNets and Adaptive Gradient Clipping for SGD implemented in PyTorch

PyTorch implementation of Normalizer-Free Networks and SGD - Adaptive Gradient Clipping Paper: https://arxiv.org/abs/2102.06171.pdf Original code: htt

Vaibhav Balloli 320 Jan 02, 2023
Neural Architecture Search Powered by Swarm Intelligence 🐜

Neural Architecture Search Powered by Swarm Intelligence 🐜 DeepSwarm DeepSwarm is an open-source library which uses Ant Colony Optimization to tackle

288 Oct 28, 2022
Parameterising Simulated Annealing for the Travelling Salesman Problem

Parameterising Simulated Annealing for the Travelling Salesman Problem

Gary Sun 55 Jun 15, 2022
A memory-efficient implementation of DenseNets

efficient_densenet_pytorch A PyTorch =1.0 implementation of DenseNets, optimized to save GPU memory. Recent updates Now works on PyTorch 1.0! It uses

Geoff Pleiss 1.4k Dec 25, 2022
NLG evaluation via Statistical Measures of Similarity: BaryScore, DepthScore, InfoLM

NLG evaluation via Statistical Measures of Similarity: BaryScore, DepthScore, InfoLM Automatic Evaluation Metric described in the papers BaryScore (EM

Pierre Colombo 28 Dec 28, 2022
ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection This repository contains implementation of the

Visual Understanding Lab @ Samsung AI Center Moscow 190 Dec 30, 2022
Nvdiffrast - Modular Primitives for High-Performance Differentiable Rendering

Nvdiffrast – Modular Primitives for High-Performance Differentiable Rendering Modular Primitives for High-Performance Differentiable Rendering Samuli

NVIDIA Research Projects 675 Jan 06, 2023
source code and pre-trained/fine-tuned checkpoint for NAACL 2021 paper LightningDOT

LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval This repository contains source code and pre-trained/fine-tun

Siqi 65 Dec 26, 2022
CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

SMPLify-XMC This repo is part of our project: On Self-Contact and Human Pose. [Project Page] [Paper] [MPI Project Page] License Software Copyright Lic

Lea Müller 83 Dec 14, 2022
Official code for UnICORNN (ICML 2021)

UnICORNN (Undamped Independent Controlled Oscillatory RNN) [ICML 2021] This repository contains the implementation to reproduce the numerical experime

Konstantin Rusch 21 Dec 22, 2022
A tensorflow model that predicts if the image is of a cat or of a dog.

Quick intro Hello and thank you for your interest in my project! This is the backend part of a two-repo application. The other part can be found here

Tudor Matei 0 Mar 08, 2022
How to Learn a Domain Adaptive Event Simulator? ACM MM, 2021

LETGAN How to Learn a Domain Adaptive Event Simulator? ACM MM 2021 Running Environment: pytorch=1.4, 1 NVIDIA-1080TI. More details can be found in pap

CVTEAM 4 Sep 20, 2022
Pytorch implementation for "Implicit Semantic Response Alignment for Partial Domain Adaptation"

Implicit-Semantic-Response-Alignment Pytorch implementation for "Implicit Semantic Response Alignment for Partial Domain Adaptation" Prerequisites pyt

4 Dec 19, 2022