PyTorch implementation of SQN based on CloserLook3D's encoder

Overview

SQN_pytorch

This repo is an implementation of Semantic Query Network (SQN) using CloserLook3D's encoder in Pytorch. For TensorFlow implementation, check our SQN_tensorflow repo.

Caution: currently, this repo does not achieve a satisfactory result as the SQN paper reports. For performance details, check performance section.

The repo is still under development, with the aim of reaching the level of performance reported in the SQN paper.(Note: our SQN_tensorflow repo has slightly higher performance than this pytorch repo.)

Please open an issue, if you have any comments and suggestions for improving the model performance.

TODOs

  • implement the training strategy mentioned in the Appendix of the paper.
  • ablation study
  • benchmark weak supervision

Install python packages

The latest codes are tested on two Ubuntu settings:

  • Ubuntu 18.04, Nvidia 1080, CUDA 10.1, PyTorch 1.4 and Python 3.6
  • Ubuntu 18.04, Nvidia 3090, CUDA 11.3, PyTorch 1.4 and Python 3.6

For details setting up the development environment, check CloserLook3D Pytorch version. To facilitate settings, below I also provide my own bash script( install.sh ) to create a conda environment from scratch for this repo. (You may need tailor this script according to your own system)

#!/bin/bash
ENV_NAME='closerlook'
conda create –n $ENV_NAME python=3.6.10 -y
source activate $ENV_NAME
conda install -c anaconda pillow=6.2 -y
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch -y
conda install -c conda-forge opencv -y
pip3 install termcolor tensorboard h5py easydict

Datasets

Take S3DIS as an example.

Scene Segmentation on S3DIS

You can download the S3DIS dataset from here (4.8 GB). You only need to download the file named Stanford3dDataset_v1.2.zip, unzip and move (or link) it to data/S3DIS/Stanford3dDataset_v1.2. (same as the CloserLook3D repo setting.)

The file structure should look like:

<root>
├── cfgs
│   └── s3dis
├── data
│   └── S3DIS
│       └── Stanford3dDataset_v1.2
│           ├── Area_1
│           ├── Area_2
│           ├── Area_3
│           ├── Area_4
│           ├── Area_5
│           └── Area_6
├── init.sh
├── datasets
├── function
├── models
├── ops
└── utils

run prepare-s3dis-sqn.sh to preprocess the S3DIS dataset. This script will generate a processed folder with the below structure with five types of data, including: raw, sub-sampled point clouds for each area, KDtrees for each sub-sampled area, projection indices for each raw point over the sub-sampled area and weak labels for raw and sub-sampled point clouds (involving different weak proportion of the dataset, e.g., 0.1, 0.01, 0.001, etc.. Details check datasets/S3DIS_sqn.py and my summary notes in this file.

The processed folder is organized as follows:

<root>
├── data
│   └── S3DIS
│       └── Stanford3dDataset_v1.2
│           ├── Area_1
│           ├── Area_2
│           ├── Area_3
│           ├── Area_4
│           ├── Area_5
│           ├── Area_6
│           └── processed
│             ├── weak_label_0.01
│             ├── weak_label_1.0
│             ├── Area_1_0.040_sub.pkl
│             ├── Area_1.pkl
│             ├── ...(many other pkl files)

Compile custom CUDA operators

sh init.sh

Run

use the run-sqn.sh script for training or evaluation.

The core training script is as follows:

python -m torch.distributed.launch \
--master_port 1234567 \
--nproc_per_node ${num_gpu} \
function/train_s3dis_dist_sqn.py \
--dataset_name ${dataset_name} \
--cfg cfgs/${dataset_name}/pospool_xyz_avg_sqn.yaml \
--num_points ${num_points} \
--batch_size ${batch_size} \
--val_freq 20 \
--weak_ratio ${weak_ratio}

The core evaluation script is as follows:

python -m torch.distributed.launch \
--master_port 12346 \
--nproc_per_node 1 \
function/evaluate_s3dis_dist_sqn.py \
--cfg cfgs/s3dis/pospool_xyz_avg_sqn.yaml \
--load_path <checkpoint>
[--log_dir <log directory>]

Performance on S3DIS

The experiments are still in progress due to my slow GPU.

Model Weak ratio Performance (mIoU, %) Description
Official RandLA-Net 100% 63.0 Fully supervised method trained with full labels.
Official SQN 1/1000 61.4 This official SQN uses additional techniques to improve the performance, our replicaed SQN currently does not investigate this yet. Official SQN does not provide results of S3DIS under the weak ratio of 1/10 and 1/100
Our replicated SQN 1/10 51.4 Use PosPool (s) as the encoder whose width=36, due to limited GPU usage and active learning is currently not used.
Our replicated SQN 1/100 25.22 Use PosPool (s) as the encoder whose width=36, due to limited GPU usage and active learning is currently not used.
Our replicated SQN 1/1000 21.10 Use PosPool (s) as the encoder whose width=36, due to limited GPU usage and active learning is currently not used.

Acknowledgements

Our pytorch codes borrowed a lot from CloserLook3D and the custom trilinear interoplation CUDA ops are modified from erikwijmans's Pointnet2_PyTorch.

Citation

If you find our work useful in your research, please consider citing:

@article{pytorchpointnet++,
    Author = {YIN, Chao},
    Title = {SQN Pytorch implementation based on CloserLook3D's encoder},
    Journal = {https://github.com/PointCloudYC/SQN_pytorch},
    Year = {2021}
   }

@article{hu2021sqn,
    title={SQN: Weakly-Supervised Semantic Segmentation of Large-Scale 3D Point Clouds with 1000x Fewer Labels},
    author={Hu, Qingyong and Yang, Bo and Fang, Guangchi and Guo, Yulan and Leonardis, Ales and Trigoni, Niki and Markham, Andrew},
    journal={arXiv preprint arXiv:2104.04891},
    year={2021}
  }
Owner
PointCloudYC
Ph.D candidate at HKUST, focus on point cloud processing and deep learning.
PointCloudYC
Official implementation of "One-Shot Voice Conversion with Weight Adaptive Instance Normalization".

One-Shot Voice Conversion with Weight Adaptive Instance Normalization By Shengjie Huang, Yanyan Xu*, Dengfeng Ke*, Mingjie Chen, Thomas Hain. This rep

31 Dec 07, 2022
BookMyShowPC - Movie Ticket Reservation App made with Tkinter

Book My Show PC What is this? Movie Ticket Reservation App made with Tkinter. Tk

The Nithin Balaji 3 Dec 09, 2022
Codes for ACL-IJCNLP 2021 Paper "Zero-shot Fact Verification by Claim Generation"

Zero-shot-Fact-Verification-by-Claim-Generation This repository contains code and models for the paper: Zero-shot Fact Verification by Claim Generatio

Liangming Pan 47 Jan 01, 2023
Code and data for "TURL: Table Understanding through Representation Learning"

TURL This Repo contains code and data for "TURL: Table Understanding through Representation Learning". Environment and Setup Data Pretraining Finetuni

SunLab-OSU 63 Nov 23, 2022
Code for our ICASSP 2021 paper: SA-Net: Shuffle Attention for Deep Convolutional Neural Networks

SA-Net: Shuffle Attention for Deep Convolutional Neural Networks (paper) By Qing-Long Zhang and Yu-Bin Yang [State Key Laboratory for Novel Software T

Qing-Long Zhang 199 Jan 08, 2023
VolumeGAN - 3D-aware Image Synthesis via Learning Structural and Textural Representations

VolumeGAN - 3D-aware Image Synthesis via Learning Structural and Textural Representations 3D-aware Image Synthesis via Learning Structural and Textura

GenForce: May Generative Force Be with You 116 Dec 26, 2022
Simple API for UCI Machine Learning Dataset Repository (search, download, analyze)

A simple API for working with University of California, Irvine (UCI) Machine Learning (ML) repository Table of Contents Introduction About Page of the

Tirthajyoti Sarkar 223 Dec 05, 2022
The UI as a mobile display for OP25

OP25 Mobile Control Head A 'remote' control head that interfaces with an OP25 instance. We take advantage of some data end-points left exposed for the

Sarah Rose Giddings 13 Dec 28, 2022
[CVPR 2021] Monocular depth estimation using wavelets for efficiency

Single Image Depth Prediction with Wavelet Decomposition Michaël Ramamonjisoa, Michael Firman, Jamie Watson, Vincent Lepetit and Daniyar Turmukhambeto

Niantic Labs 205 Jan 02, 2023
This repository contains the code for the paper ``Identifiable VAEs via Sparse Decoding''.

Sparse VAE This repository contains the code for the paper ``Identifiable VAEs via Sparse Decoding''. Data Sources The datasets used in this paper wer

Gemma Moran 17 Dec 12, 2022
Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers [CVPR 2021]

Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers [BCNet, CVPR 2021] This is the official pytorch implementation of BCNet built on

Lei Ke 434 Dec 01, 2022
LIMEcraft: Handcrafted superpixel selectionand inspection for Visual eXplanations

LIMEcraft LIMEcraft: Handcrafted superpixel selectionand inspection for Visual eXplanations The LIMEcraft algorithm is an explanatory method based on

MI^2 DataLab 4 Aug 01, 2022
Selene is a Python library and command line interface for training deep neural networks from biological sequence data such as genomes.

Selene is a Python library and command line interface for training deep neural networks from biological sequence data such as genomes.

Troyanskaya Laboratory 323 Jan 01, 2023
CvT-ASSD: Convolutional vision-Transformerbased Attentive Single Shot MultiBox Detector (ICTAI 2021 CCF-C 会议)The 33rd IEEE International Conference on Tools with Artificial Intelligence

CvT-ASSD including extra CvT, CvT-SSD, VGG-ASSD models original-code-website: https://github.com/albert-jin/CvT-SSD new-code-website: https://github.c

金伟强 -上海大学人工智能小渣渣~ 5 Mar 07, 2022
Really awesome semantic segmentation

really-awesome-semantic-segmentation A list of all papers on Semantic Segmentation and the datasets they use. This site is maintained by Holger Caesar

Holger Caesar 400 Nov 28, 2022
Deep Reinforcement Learning with pytorch & visdom

Deep Reinforcement Learning with pytorch & visdom Sample testings of trained agents (DQN on Breakout, A3C on Pong, DoubleDQN on CartPole, continuous A

Jingwei Zhang 783 Jan 04, 2023
Omniscient Video Super-Resolution

Omniscient Video Super-Resolution This is the official code of OVSR (Omniscient Video Super-Resolution, ICCV 2021). This work is based on PFNL. Datase

36 Oct 27, 2022
PyTorch reimplementation of minimal-hand (CVPR2020)

Minimal Hand Pytorch Unofficial PyTorch reimplementation of minimal-hand (CVPR2020). you can also find in youtube or bilibili bare hand youtube or bil

Hao Meng 228 Dec 29, 2022
DenseNet Implementation in Keras with ImageNet Pretrained Models

DenseNet-Keras with ImageNet Pretrained Models This is an Keras implementation of DenseNet with ImageNet pretrained weights. The weights are converted

Felix Yu 568 Oct 31, 2022
Mercer Gaussian Process (MGP) and Fourier Gaussian Process (FGP) Regression

Mercer Gaussian Process (MGP) and Fourier Gaussian Process (FGP) Regression We provide the code used in our paper "How Good are Low-Rank Approximation

Aristeidis (Ares) Panos 0 Dec 13, 2021