Implementation of CVPR'21: RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction

Overview

RfD-Net [Project Page] [Paper] [Video]

RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction
Yinyu Nie, Ji Hou, Xiaoguang Han, Matthias Nießner
In CVPR, 2021.

points.png pred.png

From an incomplete point cloud of a 3D scene (left), our method learns to jointly understand the 3D objects and reconstruct instance meshes as the output (right).


Install

  1. This implementation uses Python 3.6, Pytorch1.7.1, cudatoolkit 11.0. We recommend to use conda to deploy the environment.

    • Install with conda:
    conda env create -f environment.yml
    conda activate rfdnet
    
    • Install with pip:
    pip install -r requirements.txt
    
  2. Next, compile the external libraries by

    python setup.py build_ext --inplace
    
  3. Install PointNet++ by

    cd external/pointnet2_ops_lib
    pip install .
    

Demo

The pretrained model can be downloaded here. Put the pretrained model in the directory as below

out/pretrained_models/pretrained_weight.pth

A demo is illustrated below to see how our method works.

cd RfDNet
python main.py --config configs/config_files/ISCNet_test.yaml --mode demo --demo_path demo/inputs/scene0549_00.off

VTK is used here to visualize the 3D scenes. The outputs will be saved under 'demo/outputs'. You can also play with your toy with this script.

If everything goes smooth, there will be a GUI window popped up and you can interact with the scene as below. screenshot_demo.png

If you run it on machines without X display server, you can use the offscreen mode by setting offline=True in demo.py. The rendered image will be saved in demo/outputs/some_scene_id/pred.png.


Prepare Data

In our paper, we use the input point cloud from the ScanNet dataset, and the annotated instance CAD models from the Scan2CAD dataset. Scan2CAD aligns the object CAD models from ShapeNetCore.v2 to each object in ScanNet, and we use these aligned CAD models as the ground-truth.

Preprocess ScanNet and Scan2CAD data

You can either directly download the processed samples [link] to the directory below (recommended)

datasets/scannet/processed_data/

or

  1. Ask for the ScanNet dataset and download it to
    datasets/scannet/scans
    
  2. Ask for the Scan2CAD dataset and download it to
    datasets/scannet/scan2cad_download_link
    
  3. Preprocess the ScanNet and Scan2CAD dataset for training by
    cd RfDNet
    python utils/scannet/gen_scannet_w_orientation.py
    
Preprocess ShapeNet data

You can either directly download the processed data [link] and extract them to datasets/ShapeNetv2_data/ as below

datasets/ShapeNetv2_data/point
datasets/ShapeNetv2_data/pointcloud
datasets/ShapeNetv2_data/voxel
datasets/ShapeNetv2_data/watertight_scaled_simplified

or

  1. Download ShapeNetCore.v2 to the path below

    datasets/ShapeNetCore.v2
    
  2. Process ShapeNet models into watertight meshes by

    python utils/shapenet/1_fuse_shapenetv2.py
    
  3. Sample points on ShapeNet models for training (similar to Occupancy Networks).

    python utils/shapenet/2_sample_mesh.py --resize --packbits --float16
    
  4. There are usually 100K+ points per object mesh. We simplify them to speed up our testing and visualization by

    python utils/shapenet/3_simplify_fusion.py --in_dir datasets/ShapeNetv2_data/watertight_scaled --out_dir datasets/ShapeNetv2_data/watertight_scaled_simplified
    
Verify preprocessed data

After preprocessed the data, you can run the visualization script below to check if they are generated correctly.

  • Visualize ScanNet+Scan2CAD+ShapeNet samples by

    python utils/scannet/visualization/vis_gt.py
    

    A VTK window will be popped up like below.

    verify.png

Training, Generating and Evaluation

We use the configuration file (see 'configs/config_files/****.yaml') to fully control the training/testing/generating process. You can check a template at configs/config_files/ISCNet.yaml.

Training

We firstly pretrain our detection module and completion module followed by a joint refining. You can follow the process below.

  1. Pretrain the detection module by

    python main.py --config configs/config_files/ISCNet_detection.yaml --mode train
    

    It will save the detection module weight at out/iscnet/a_folder_with_detection_module/model_best.pth

  2. Copy the weight path of detection module (see 1.) into configs/config_files/ISCNet_completion.yaml as

    weight: ['out/iscnet/a_folder_with_detection_module/model_best.pth']
    

    Then pretrain the completion module by

    python main.py --config configs/config_files/ISCNet_completion.yaml --mode train
    

    It will save the completion module weight at out/iscnet/a_folder_with_completion_module/model_best.pth

  3. Copy the weight path of completion module (see 2.) into configs/config_files/ISCNet.yaml as

    weight: ['out/iscnet/a_folder_with_completion_module/model_best.pth']
    

    Then jointly finetune RfD-Net by

    python main.py --config configs/config_files/ISCNet.yaml --mode train
    

    It will save the trained model weight at out/iscnet/a_folder_with_RfD-Net/model_best.pth

Generating

Copy the weight path of RfD-Net (see 3. above) into configs/config_files/ISCNet_test.yaml as

weight: ['out/iscnet/a_folder_with_RfD-Net/model_best.pth']

Run below to output all scenes in the test set.

python main.py --config configs/config_files/ISCNet_test.yaml --mode test

The 3D scenes for visualization are saved in the folder of out/iscnet/a_folder_with_generated_scenes/visualization. You can visualize a triplet of (input, pred, gt) following a demo below

python utils/scannet/visualization/vis_for_comparison.py 

If everything goes smooth, there will be three windows (corresponding to input, pred, gt) popped up by sequence as

Input Prediction Ground-truth

Evaluation

You can choose each of the following ways for evaluation.

  1. You can export all scenes above to calculate the evaluation metrics with any external library (for researchers who would like to unify the benchmark). Lower the dump_threshold in ISCNet_test.yaml in generation to enable more object proposals for mAP calculation (e.g. dump_threshold=0.05).

  2. In our evaluation, we voxelize the 3D scenes to keep consistent resolution with the baseline methods. To enable this,

    1. make sure the executable binvox are downloaded and configured as an experiment variable (e.g. export its path in ~/.bashrc for Ubuntu). It will be deployed by Trimesh.

    2. Change the ISCNet_test.yaml as below for evaluation.

       test:
         evaluate_mesh_mAP: True
       generation:
         dump_results: False
    

    Run below to report the evaluation results.

    python main.py --config configs/config_files/ISCNet_test.yaml --mode test
    

    The log file will saved in out/iscnet/a_folder_named_with_script_time/log.txt


Differences to the paper

  1. The original paper was implemented with Pytorch 1.1.0, and we reconfigure our code to fit with Pytorch 1.7.1.
  2. A post processing step to align the reconstructed shapes to the input scan is supported. We have verified that it can improve the evaluation performance by a small margin. You can switch on/off it following demo.py.
  3. A different learning rate scheduler is adopted. The learning rate decreases to 0.1x if there is no gain within 20 steps, which is much more efficient.

Citation

If you find our work helpful, please consider citing

@inproceedings{Nie_2021_CVPR,
    title={RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction},
    author={Nie, Yinyu and Hou, Ji and Han, Xiaoguang and Nie{\ss}ner, Matthias},
    booktitle={Proc. Computer Vision and Pattern Recognition (CVPR), IEEE},
    year={2021}
}


License

RfD-Net is relased under the MIT License. See the LICENSE file for more details.

Owner
Yinyu Nie
currently a Ph.D. student at NCCA, Bournemouth University.
Yinyu Nie
HAT: Hierarchical Aggregation Transformers for Person Re-identification

HAT: Hierarchical Aggregation Transformers for Person Re-identification

11 Sep 05, 2022
SPTAG: A library for fast approximate nearest neighbor search

SPTAG: A library for fast approximate nearest neighbor search SPTAG SPTAG (Space Partition Tree And Graph) is a library for large scale vector approxi

Microsoft 4.3k Jan 01, 2023
Pytorch implementation of face attention network

Face Attention Network Pytorch implementation of face attention network as described in Face Attention Network: An Effective Face Detector for the Occ

Hooks 312 Dec 09, 2022
OCR-D wrapper for detectron2 based segmentation models

ocrd_detectron2 OCR-D wrapper for detectron2 based segmentation models Introduction Installation Usage OCR-D processor interface ocrd-detectron2-segm

Robert Sachunsky 13 Dec 06, 2022
This is the repository for the NeurIPS-21 paper [Contrastive Graph Poisson Networks: Semi-Supervised Learning with Extremely Limited Labels].

CGPN This is the repository for the NeurIPS-21 paper [Contrastive Graph Poisson Networks: Semi-Supervised Learning with Extremely Limited Labels]. Req

10 Sep 12, 2022
[ACL 20] Probing Linguistic Features of Sentence-level Representations in Neural Relation Extraction

REval Table of Contents Introduction Overview Requirements Installation Probing Usage Citation License 🎓 Introduction REval is a simple framework for

13 Jan 06, 2023
A PyTorch Implementation of PGL-SUM from "Combining Global and Local Attention with Positional Encoding for Video Summarization", Proc. IEEE ISM 2021

PGL-SUM: Combining Global and Local Attention with Positional Encoding for Video Summarization PyTorch Implementation of PGL-SUM From "PGL-SUM: Combin

Evlampios Apostolidis 35 Dec 22, 2022
NaturalProofs: Mathematical Theorem Proving in Natural Language

NaturalProofs: Mathematical Theorem Proving in Natural Language NaturalProofs: Mathematical Theorem Proving in Natural Language Sean Welleck, Jiacheng

Sean Welleck 83 Jan 05, 2023
[NeurIPS 2021] Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training

Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training Code for NeurIPS 2021 paper "Better Safe Than Sorry: Preventing Delu

Lue Tao 29 Sep 20, 2022
Playing around with FastAPI and streamlit to create a YoloV5 object detector

FastAPI-Streamlit-based-YoloV5-detector Playing around with FastAPI and streamlit to create a YoloV5 object detector It turns out that a User Interfac

2 Jan 20, 2022
PyTorch implementation of Pay Attention to MLPs

gMLP PyTorch implementation of Pay Attention to MLPs. Quickstart Clone this repository. git clone https://github.com/jaketae/g-mlp.git Navigate to th

Jake Tae 34 Dec 13, 2022
3D-aware GANs based on NeRF (arXiv).

CIPS-3D This repository will contain the code of the paper, CIPS-3D: A 3D-Aware Generator of GANs Based on Conditionally-Independent Pixel Synthesis.

Peterou 563 Dec 31, 2022
Starter code for the ICCV 2021 paper, 'Detecting Invisible People'

Detecting Invisible People [ICCV 2021 Paper] [Website] Tarasha Khurana, Achal Dave, Deva Ramanan Introduction This repository contains code for Detect

Tarasha Khurana 28 Sep 16, 2022
Object-Centric Learning with Slot Attention

Slot Attention This is a re-implementation of "Object-Centric Learning with Slot Attention" in PyTorch (https://arxiv.org/abs/2006.15055). Requirement

Untitled AI 72 Jan 02, 2023
Implementation of Pix2Seq in PyTorch

pix2seq-pytorch Implementation of Pix2Seq paper Different from the paper image input size 1280 bin size 1280 LambdaLR scheduler used instead of Linear

Tony Shin 9 Dec 15, 2022
Gray Zone Assessment

Gray Zone Assessment Get started Clone github repository git clone https://github.com/andreanne-lemay/gray_zone_assessment.git Build docker image dock

1 Jan 08, 2022
Pytorch implementation of CVPR2020 paper “VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation”

VectorNet Re-implementation This is the unofficial pytorch implementation of CVPR2020 paper "VectorNet: Encoding HD Maps and Agent Dynamics from Vecto

120 Jan 06, 2023
Code for Fold2Seq paper from ICML 2021

[ICML2021] Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design Environment file: environment.yml Data and Feat

International Business Machines 43 Dec 04, 2022
The official repo of the CVPR 2021 paper Group Collaborative Learning for Co-Salient Object Detection .

GCoNet The official repo of the CVPR 2021 paper Group Collaborative Learning for Co-Salient Object Detection . Trained model Download final_gconet.pth

Qi Fan 46 Nov 17, 2022
SGoLAM - Simultaneous Goal Localization and Mapping

SGoLAM - Simultaneous Goal Localization and Mapping PyTorch implementation of the MultiON runner-up entry, SGoLAM: Simultaneous Goal Localization and

10 Jan 05, 2023