MonoScene: Monocular 3D Semantic Scene Completion

Overview

MonoScene: Monocular 3D Semantic Scene Completion

MonoScene: Monocular 3D Semantic Scene Completion] [arXiv + supp] | [Project page]
Anh-Quan Cao, Raoul de Charette
Inria, Paris, France

If you find this work useful, please cite our paper:

@misc{cao2021monoscene,
      title={MonoScene: Monocular 3D Semantic Scene Completion}, 
      author={Anh-Quan Cao and Raoul de Charette},
      year={2021},
      eprint={2112.00726},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Code and models will be released soon. Please watch this repo for updates.

Demo

SemanticKITTI KITTI-360
(Trained on SemanticKITTI)

NYUv2

Comments
  • TypeError: 'int' object is not subscriptable

    TypeError: 'int' object is not subscriptable

    (monoscene) [email protected]:~/workplace/MonoScene$ python monoscene/scripts/train_monoscene.py dataset=kitti enable_log=true kitti_root=$KITTI_ROOT kitti_preprocess_root=$KITTI_PREPROCESS kitti_logdir=$KITTI_LOG n_gpus=2 batch_size=2 ^[[Dexp_kitti_1_FrusSize_8_nRelations4_WD0.0001_lr0.0001_CEssc_geoScalLoss_semScalLoss_fpLoss_CERel_3DCRP_Proj_2_4_8 n_relations (32, 32, 4) Traceback (most recent call last): File "monoscene/scripts/train_monoscene.py", line 118, in main class_weights=class_weights, File "/home/ruidong/workplace/MonoScene/monoscene/models/monoscene.py", line 80, in init context_prior=context_prior, File "/home/ruidong/workplace/MonoScene/monoscene/models/unet3d_kitti.py", line 62, in init self.feature * 4, self.feature * 4, size_l3, bn_momentum=bn_momentum File "/home/ruidong/workplace/MonoScene/monoscene/models/CRP3D.py", line 15, in init self.flatten_size = size[0] * size[1] * size[2] TypeError: 'int' object is not subscriptable

    Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

    opened by DipDipPotatoChips 21
  • Questions about cross-entropy loss

    Questions about cross-entropy loss

    Dear authors, thanks for your great works! In your paper, you say that "the losses are computed only where y is defined". I wonder if this means you do not add supervision on non-occupied voxels and only use multi-class classification loss on occupied voxels ? If this holds true, why the model can identify which voxels are occupied ?

    opened by weiyithu 13
  • about test

    about test

    FileNotFoundError: [Errno 2] No such file or directory: '/home/ruidong/workplace/MonoScene/trained_models/monoscene_kitti.ckpt'

    the last printing of trainning is: Epoch 29: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 2325/2325 [1:06:52<00:00, 1.73s/it, loss=3.89, v_num=]

    opened by DipDipPotatoChips 13
  • Cuda out of memory

    Cuda out of memory

    Dear author, you said that Use smaller 2D backbone by chaning the basemodel_name and num_features The pretrained model name is here. You can try the efficientnet B5 can reduces the memory, I want to know the B5 weight and the value of num_features?

    opened by lulianLiu 12
  • Pretrained models on other dataset: NuScenes

    Pretrained models on other dataset: NuScenes

    Hi @anhquancao,

    Thanks so much for your paper and your implementation. Do you have your pretrained model on the NuScenes? If yes, could you share it? The reason is that I want to build upon your work on the NuScenes dataset but there exists a large domain gap between the two (SemanticKITTI and NuScenes) so the pretrained on SemanticKITTI works does not well on the NuScenes.

    Thanks!

    opened by ducminhkhoi 11
  • failed to run test

    failed to run test

    When I try to run this script, it crashed without giving any information: python monoscene/scripts/generate_output.py +output_path=$MONOSCENE_OUTPUT dataset=kitti_360 +kitti_360_root=$KITTI_360_ROOT +kitti_360_sequence=2013_05_28_drive_0028_sync n_gpus=1 batch_size=1

    image

    Any suggestion will be much appreciated.

    opened by ChiyuanFeng 9
  • cannot find calib

    cannot find calib

    PS F:\Studying\CY-Workspace\MonoScene-master> python monoscene/scripts/eval_monoscene.py dataset=kitti kitti_root=$KITTI_ROOT kitti_preprocess_root=$KITTI_PREPROCESS n_gpus=1 batch_size= 1 GPU available: True, used: True TPU available: False, using: 0 TPU cores IPU available: False, using: 0 IPUs n_relations 4 Using cache found in C:\Users\DELL/.cache\torch\hub\rwightman_gen-efficientnet-pytorch_master Loading base model ()...Done. Removing last two layers (global_pool & classifier). Building Encoder-Decoder model..Done. Traceback (most recent call last): File "monoscene/scripts/eval_monoscene.py", line 71, in main data_module.setup() File "F:\anaconda\envs\monoscene\lib\site-packages\pytorch_lightning\core\datamodule.py", line 440, in wrapped_fn fn(*args, **kwargs) File "F:\Studying\CY-Workspace\MonoScene-master\monoscene\scripts/../..\monoscene\data\semantic_kitti\kitti_dm.py", line 34, in setup color_jitter=(0.4, 0.4, 0.4), File "F:\Studying\CY-Workspace\MonoScene-master\monoscene\scripts/../..\monoscene\data\semantic_kitti\kitti_dataset.py", line 60, in init os.path.join(self.root, "dataset", "sequences", sequence, "calib.txt") File "F:\Studying\CY-Workspace\MonoScene-master\monoscene\scripts/../..\monoscene\data\semantic_kitti\kitti_dataset.py", line 193, in read_calib with open(calib_path, "r") as f: FileNotFoundError: [Errno 2] No such file or directory: 'dataset\sequences\00\calib.txt'

    opened by cyaccpect 9
  • about visualization

    about visualization

    (monoscene) [email protected]:~/workplace/MonoScene$ python monoscene/scripts/visualization/kitti_vis_pred.py +file=/home/ruidong/workplace/MonoScene/outputs/kitti/08/000000.pkl +dataset=kitt monoscene/scripts/visualization/kitti_vis_pred.py:23: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations coords_grid = coords_grid.astype(np.float) Traceback (most recent call last): File "monoscene/scripts/visualization/kitti_vis_pred.py", line 196, in main d=7, File "monoscene/scripts/visualization/kitti_vis_pred.py", line 75, in draw grid_coords = np.vstack([grid_coords.T, voxels.reshape(-1)]).T AttributeError: 'tuple' object has no attribute 'T'

    Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

    opened by DipDipPotatoChips 9
  • Porting the work of this paper to a new dataset

    Porting the work of this paper to a new dataset

    Hello author, first of all thank you for your great work. I want to directly apply your work to the nuscenes dataset, is it possible? Does the nuscenes dataset need point cloud data to assist in generating voxel data?

    opened by yukaizhou 8
  • Can you help me in another paper?

    Can you help me in another paper?

    Hello! Last year, when you reproduced the code SISC(https://github.com/OPEN-AIR-SUN/SISC), you found a bug and solve it! Now, I get the same problem too,can you tell me how to solve it ! Thank you very much!

    opened by WkangLiu 8
  • ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data'

    ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data'

    there is something wrong with my machine and I reinstall my ubuntu. I re-gitclone the code and just keep the data.but when I follow the readme to do installation,it print:

    (monoscene) [email protected]:~/workplace/MonoScene$ pip install -e ./ Obtaining file:///home/potato/workplace/MonoScene Installing collected packages: monoscene Running setup.py develop for monoscene Successfully installed monoscene-0.0.0 (monoscene) [email protected]:~/workplace/MonoScene$ python monoscene/scripts/train_monoscene.py dataset=kitti enable_log=true kitti_root=$KITTI_ROOT kitti_preprocess_root=$KITTI_PREPROCESS kitti_logdir=$KITTI_LOG n_gpus=1 batch_size=1 sem_scal_loss=False Traceback (most recent call last): File "monoscene/scripts/train_monoscene.py", line 1, in from monoscene.data.semantic_kitti.kitti_dm import KittiDataModule File "/home/potato/workplace/MonoScene/monoscene/data/semantic_kitti/kitti_dm.py", line 3, in import pytorch_lightning as pl File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/init.py", line 20, in from pytorch_lightning import metrics # noqa: E402 File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/init.py", line 15, in from pytorch_lightning.metrics.classification import ( # noqa: F401 File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/classification/init.py", line 14, in from pytorch_lightning.metrics.classification.accuracy import Accuracy # noqa: F401 File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in from pytorch_lightning.metrics.utils import deprecated_metrics, void File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/utils.py", line 22, in from torchmetrics.utilities.data import get_num_classes as _get_num_classes ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/torchmetrics/utilities/data.py)

    opened by DipDipPotatoChips 7
Releases(v0.1)
Owner
Codes from Computer Vision group of RITS Team, Inria
Supervised multi-SNE (S-multi-SNE): Multi-view visualisation and classification

S-multi-SNE Supervised multi-SNE (S-multi-SNE): Multi-view visualisation and classification A repository containing the code to reproduce the findings

Theodoulos Rodosthenous 3 Apr 15, 2022
Attentional Focus Modulates Automatic Finger‑tapping Movements

"Attentional Focus Modulates Automatic Finger‑tapping Movements", in Scientific Reports

Xingxun Jiang 1 Dec 02, 2021
This repository contains datasets and baselines for benchmarking Chinese text recognition.

Benchmarking-Chinese-Text-Recognition This repository contains datasets and baselines for benchmarking Chinese text recognition. Please see the corres

FudanVI Lab 254 Dec 30, 2022
Pytorch implementation of OCNet series and SegFix.

openseg.pytorch News 2021/09/14 MMSegmentation has supported our ISANet and refer to ISANet for more details. 2021/08/13 We have released the implemen

openseg-group 1.1k Dec 23, 2022
ML models and internal tensors 3D visualizer

The free Zetane Viewer is a tool to help understand and accelerate discovery in machine learning and artificial neural networks. It can be used to ope

Zetane Systems 787 Dec 30, 2022
PyTorch implementation of MulMON

MulMON This repository contains a PyTorch implementation of the paper: Learning Object-Centric Representations of Multi-object Scenes from Multiple Vi

NanboLi 16 Nov 03, 2022
ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees

ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees This repository is the official implementation of the empirica

Kuan-Lin (Jason) Chen 2 Oct 02, 2022
A MNIST-like fashion product database. Benchmark

Fashion-MNIST Table of Contents Why we made Fashion-MNIST Get the Data Usage Benchmark Visualization Contributing Contact Citing Fashion-MNIST License

Zalando Research 10.5k Jan 08, 2023
Depth-Aware Video Frame Interpolation (CVPR 2019)

DAIN (Depth-Aware Video Frame Interpolation) Project | Paper Wenbo Bao, Wei-Sheng Lai, Chao Ma, Xiaoyun Zhang, Zhiyong Gao, and Ming-Hsuan Yang IEEE C

Wenbo Bao 7.7k Dec 31, 2022
Tools for the Cleveland State Human Motion and Control Lab

Introduction This is a collection of tools that are helpful for gait analysis. Some are specific to the needs of the Human Motion and Control Lab at C

CSU Human Motion and Control Lab 88 Dec 16, 2022
This source code is implemented using keras library based on "Automatic ocular artifacts removal in EEG using deep learning"

CSP_Deep_EEG This source code is implemented using keras library based on "Automatic ocular artifacts removal in EEG using deep learning" {https://www

Seyed Mahdi Roostaiyan 2 Nov 08, 2022
[CVPR'21] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

TransFuser This repository contains the code for the CVPR 2021 paper Multi-Modal Fusion Transformer for End-to-End Autonomous Driving. If you find our

695 Jan 05, 2023
Lightweight, Python library for fast and reproducible experimentation :microscope:

Steppy What is Steppy? Steppy is a lightweight, open-source, Python 3 library for fast and reproducible experimentation. Steppy lets data scientist fo

minerva.ml 134 Jul 10, 2022
CM building dataset Timisoara

CM_building_dataset_Timisoara Date created: Febr-2020 The Timi\c{s}oara Building Dataset - TMBuD - is composed of 160 images with the resolution of 76

Orhei Ciprian 5 Sep 07, 2022
Data and code from COVID-19 machine learning paper

Machine learning approaches for localized lockdown, subnotification analysis and cases forecasting in São Paulo state counties during COVID-19 pandemi

Sara Malvar 4 Dec 22, 2022
Face-Recognition-Attendence-System - This face recognition Attendence system using Python

Face-Recognition-Attendence-System I have developed this face recognition Attend

Riya Gupta 4 May 10, 2022
Utility tools for the "Divide and Remaster" dataset, introduced as part of the Cocktail Fork problem paper

Divide and Remaster Utility Tools Utility tools for the "Divide and Remaster" dataset, introduced as part of the Cocktail Fork problem paper The DnR d

Darius Petermann 46 Dec 11, 2022
My 1st place solution at Kaggle Hotel-ID 2021

1st place solution at Kaggle Hotel-ID My 1st place solution at Kaggle Hotel-ID to Combat Human Trafficking 2021. https://www.kaggle.com/c/hotel-id-202

Kohei Ozaki 18 Aug 19, 2022
Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis

Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis Website | ICCV paper | arXiv | Twitter This repository contains the official i

Ajay Jain 73 Dec 27, 2022
Official Code for "Non-deep Networks"

Non-deep Networks arXiv:2110.07641 Ankit Goyal, Alexey Bochkovskiy, Jia Deng, Vladlen Koltun Overview: Depth is the hallmark of DNNs. But more depth m

Ankit Goyal 567 Dec 12, 2022