Repository for "Toward Practical Monocular Indoor Depth Estimation" (CVPR 2022)

Last update: Dec 13, 2022


Overview

Toward Practical Monocular Indoor Depth Estimation

Cho-Ying Wu, Jialiang Wang, Michael Hall, Ulrich Neumann, Shuochen Su

[arXiv] [project site]

DistDepth

Our DistDepth is a highly robust monocular depth estimation approach for generic indoor scenes.

Trained on stereo sequences without ground-truth depth
Structured and metric-accurate
Runs at an interactive rate on a laptop GPU
Sim-to-real: trained on simulation and transferable to real scenes

Single Image Inference Demo

We test on Ubuntu 20.04 LTS with a laptop NVIDIA 2080 GPU (only GPU mode is supported).

Install packages

Use conda

conda create --name distdepth python=3.8
conda activate distdepth
Install the common prerequisite packages. Go to https://pytorch.org/get-started/locally/ and install a PyTorch build that is compatible with your machine. We test on PyTorch v1.9.0 and cudatoolkit-11.1. (The code should work with other v1.0+ versions.)

conda install pytorch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0 cudatoolkit=11.3 -c pytorch -c conda-forge
Install other dependencies: opencv-python and matplotlib.

pip install opencv-python matplotlib
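
After the install steps above, a quick sanity check can confirm that the packages are visible from the distdepth environment and that a CUDA device is usable, since only GPU mode is supported. The following is a minimal sketch, not part of the repository: the function names check_environment and cuda_available are hypothetical, and the module list assumes the packages named above (opencv-python is imported as cv2).

```python
import importlib.util

# Import names for the packages installed above. Note that the
# opencv-python distribution is imported as "cv2".
REQUIRED_MODULES = ("torch", "torchvision", "cv2", "matplotlib")


def check_environment(modules=REQUIRED_MODULES):
    """Return {module_name: bool} without actually importing the modules."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


def cuda_available():
    """Return True only if torch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the check works even when torch is missing
    return torch.cuda.is_available()


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
    print("CUDA available:", cuda_available())
```

If any module is reported as missing, or CUDA is reported as unavailable, revisiting the conda and pip steps above is likely the first thing to try.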

Download pretrained models

Download pretrained models [here] (ResNet152, 246MB).
Move the downloaded archive into this folder and unzip it. You should then see a new folder 'ckpts' that contains the pretrained models.
Run

python demo.py
Results will be stored under results/
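
The unzip step above can also be scripted. This is a minimal sketch under stated assumptions: extract_checkpoints is a hypothetical helper, not part of the repository, and the archive file name is whatever the download produced. The only detail taken from the instructions above is that unzipping should yield a 'ckpts' folder.

```python
import zipfile
from pathlib import Path


def extract_checkpoints(archive_path, repo_root="."):
    """Unzip archive_path into repo_root and return the files found under ckpts/."""
    repo_root = Path(repo_root)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(repo_root)
    ckpt_dir = repo_root / "ckpts"
    # The README states that unzipping should create a 'ckpts' folder.
    if not ckpt_dir.is_dir():
        raise FileNotFoundError(f"expected a 'ckpts' folder in {repo_root.resolve()}")
    return sorted(p for p in ckpt_dir.rglob("*") if p.is_file())
```

For example, calling extract_checkpoints("ckpts.zip") from the repository root (where "ckpts.zip" is a placeholder for the downloaded file name) returns the extracted model files, or raises an error if the expected folder is absent.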

Data

Download SimSIN [here]. For UniSIN and VA, download them from the [project site].

Depth-aware AR effects

Virtual object insertion:

Dragging objects along a trajectory:

Citation

@inproceedings{wu2022toward,
title={Toward Practical Monocular Indoor Depth Estimation},
author={Wu, Cho-Ying and Wang, Jialiang and Hall, Michael and Neumann, Ulrich and Su, Shuochen},
booktitle={CVPR},
year={2022}
}

License

DistDepth is CC-BY-NC licensed, as found in the LICENSE file.

Owner

Meta Research
