Towards Part-Based Understanding of RGB-D Scans

Last update: Nov 23, 2022

Overview

Towards Part-Based Understanding of RGB-D Scans (CVPR 2021)

We propose the task of part-based scene understanding of real-world 3D environments: from an RGB-D scan of a scene, we detect objects, and for each object predict its decomposition into geometric part masks, which composed together form the complete geometry of the observed object.
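
To make the idea of part masks composing into the complete object concrete, here is a minimal sketch in plain Python. It is not the repository's data structure: the `PartNode` class, the part labels, and the set-of-voxel-coordinates representation are all assumptions chosen for illustration. It only shows that the union of the leaf part masks in a part tree gives the full object geometry.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple

Voxel = Tuple[int, int, int]  # (x, y, z) index of an occupied voxel


@dataclass
class PartNode:
    """Hypothetical part-tree node; leaves carry a voxel mask."""
    label: str
    voxels: FrozenSet[Voxel] = frozenset()  # occupied voxels (used by leaf nodes)
    children: List["PartNode"] = field(default_factory=list)

    def geometry(self) -> FrozenSet[Voxel]:
        """Return the union of all leaf masks below this node."""
        if not self.children:
            return self.voxels
        occupied = set()
        for child in self.children:
            occupied |= child.geometry()
        return frozenset(occupied)


# A toy "chair" with one seat part and a base made of two legs.
chair = PartNode("chair", children=[
    PartNode("seat", voxels=frozenset({(0, 1, 0), (1, 1, 0)})),
    PartNode("base", children=[
        PartNode("leg_left", voxels=frozenset({(0, 0, 0)})),
        PartNode("leg_right", voxels=frozenset({(1, 0, 0)})),
    ]),
])

full_shape = chair.geometry()  # union of the seat and both leg masks
```

Calling `geometry()` on an inner node such as the base returns only the voxels of its own sub-parts, which mirrors how a decomposition can be read at different levels of the tree.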

Download Paper (.pdf)

Demo samples

Get started

The core of this repository is a network that takes preprocessed scan voxel crops as input and produces voxelized part trees. Data preparation, however, is a substantial step that has to be completed before training or inference, so we release already prepared data for training and a checkpoint for inference. To launch training with our data, follow the steps below:

Clone repo: git clone https://github.com/alexeybokhovkin/part-based-scan-understanding.git
Download data and/or checkpoint:
ScanNet MLCVNet crops (finetune) [894M]
ScanNet clean crops (pretraining) [995M]
PartNet GT trees [103M]
Parts priors [169M]
Checkpoint [19M]
For training, prepare an augmented version of the ScanNet crops with the script dataproc/prepare_rot_aug_data.py. After this, create a folder with all necessary dataset metadata using the script dataproc/gather_all_shapes.py
Create a config file similar to configs/config_gnn_scannet_allshapes.yaml (you need to provide paths to some directories and files)
Launch training with train_gnn_scannet.py
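
Since the config file has to point at several directories and files, it can help to check those paths before launching training. The sketch below is not part of the repository: the function name `find_missing_paths` and the config keys are hypothetical, and the real keys should be taken from configs/config_gnn_scannet_allshapes.yaml. It assumes the config has already been loaded into a flat dictionary of path strings.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def find_missing_paths(config: Dict[str, str]) -> List[str]:
    """Return the config keys whose paths do not exist on disk."""
    return [key for key, value in config.items() if not Path(value).exists()]


# Demo with a temporary directory standing in for downloaded data.
# Both keys are hypothetical placeholders, not the repository's config keys.
with tempfile.TemporaryDirectory() as tmp:
    example_config = {
        "scannet_crops_dir": tmp,                            # exists
        "checkpoint_path": str(Path(tmp) / "missing.ckpt"),  # does not exist
    }
    missing = find_missing_paths(example_config)

if missing:
    print("Paths not found for config keys:", ", ".join(missing))
```

Running such a check right before train_gnn_scannet.py surfaces a mistyped path immediately instead of partway through data loading.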

Citation

If you use this framework, please cite:

@article{Bokhovkin2020TowardsPU,
  title={Towards Part-Based Understanding of RGB-D Scans},
  author={Alexey Bokhovkin and V. Ishimtsev and Emil Bogomolov and D. Zorin and A. Artemov and Evgeny Burnaev and Angela Dai},
  journal={ArXiv},
  year={2020},
  volume={abs/2012.02094}
}

Comments

scannet_shape_ids files and part segmentation
First of all, thanks for the great work! I have two questions about this repo and your paper:

It seems that txt files for scannet_shape_ids are required for prepare_rot_aug_data.py. But I cannot find them in the provided dataset files.

Could you explain more details about part segmentation on 3D scans? I'm confused if the part segmentation labels for 3d scans are generated by 1) aligning PartNet data, 2) assigning part labels to overlapped regions. Do you provide point-wise (or voxel-wise) part segmentation annotation?
opened by jeonghyunkeem

Releases

v0.1 (Jun 18, 2021)
