Official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo'

Overview

IterMVS

Introduction

IterMVS is a novel learning-based MVS method that combines the highest efficiency with competitive reconstruction quality. We propose a novel GRU-based estimator that encodes pixel-wise probability distributions of depth in its hidden state. Ingesting multi-scale matching information, our model refines these distributions over multiple iterations and infers depth and confidence. Extensive experiments on DTU, Tanks & Temples and ETH3D show the highest efficiency in both memory and run-time, and better generalization ability than many state-of-the-art learning-based methods.

If you find this project useful for your research, please cite:

@misc{wang2021itermvs,
      title={IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo}, 
      author={Fangjinhua Wang and Silvano Galliani and Christoph Vogel and Marc Pollefeys},
      year={2021},
      eprint={2112.05126},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Installation

Requirements

  • Python 3.6
  • CUDA 10.1

pip install -r requirements.txt

Reproducing Results

Each dataset is organized as follows:

root_directory
├──scan1 (scene_name1)
├──scan2 (scene_name2) 
      ├── images                 
      │   ├── 00000000.jpg       
      │   ├── 00000001.jpg       
      │   └── ...                
      ├── cams_1                   
      │   ├── 00000000_cam.txt   
      │   ├── 00000001_cam.txt   
      │   └── ...                
      └── pair.txt  
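
As a quick sanity check before running the evaluation scripts, the layout above can be verified with a few lines of Python. This is a minimal sketch and not part of the repository: the function name check_scan_layout is hypothetical, and it assumes that every NNNNNNNN.jpg in images has a matching NNNNNNNN_cam.txt in cams_1, as the tree suggests.

```python
from pathlib import Path


def check_scan_layout(scan_dir):
    """Return a list of problems found in one scan folder (empty list means OK).

    Hypothetical helper: it only checks the folder and file names shown in the
    tree above and does not validate file contents.
    """
    scan = Path(scan_dir)
    problems = []
    images_dir = scan / "images"
    cams_dir = scan / "cams_1"

    if not images_dir.is_dir():
        problems.append("missing folder: images")
    if not cams_dir.is_dir():
        problems.append("missing folder: cams_1")
    if not (scan / "pair.txt").is_file():
        problems.append("missing file: pair.txt")

    # Assumption: each image has a camera file with the same numeric stem.
    if images_dir.is_dir() and cams_dir.is_dir():
        for image in sorted(images_dir.glob("*.jpg")):
            cam = cams_dir / f"{image.stem}_cam.txt"
            if not cam.is_file():
                problems.append(f"missing camera file: {cam.name}")
    return problems
```

Calling check_scan_layout("root_directory/scan1") returns an empty list when the folder matches the expected layout, and a list of human-readable messages otherwise.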

The camera file cam.txt stores the camera parameters: the extrinsic matrix, the intrinsic matrix, and the minimum and maximum depth:

extrinsic
E00 E01 E02 E03
E10 E11 E12 E13
E20 E21 E22 E23
E30 E31 E32 E33

intrinsic
K00 K01 K02
K10 K11 K12
K20 K21 K22

DEPTH_MIN DEPTH_MAX 
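
The camera file layout above can be read with a short parser. The following is a minimal sketch, not the repository's own loader: the function name parse_cam_text and the sample values are made up for illustration, and it assumes the file contains exactly the keywords and numbers shown above, separated by whitespace.

```python
def parse_cam_text(text):
    """Parse the cam.txt layout described above into plain Python values.

    Hypothetical helper: returns (extrinsic 4x4, intrinsic 3x3, depth_min, depth_max).
    """
    tokens = text.split()
    e = tokens.index("extrinsic")
    k = tokens.index("intrinsic")

    # 16 numbers follow 'extrinsic', 9 numbers follow 'intrinsic'.
    ext_vals = [float(t) for t in tokens[e + 1 : e + 17]]
    int_vals = [float(t) for t in tokens[k + 1 : k + 10]]
    extrinsic = [ext_vals[r * 4 : r * 4 + 4] for r in range(4)]
    intrinsic = [int_vals[r * 3 : r * 3 + 3] for r in range(3)]

    # The last line holds DEPTH_MIN and DEPTH_MAX.
    depth_min = float(tokens[k + 10])
    depth_max = float(tokens[k + 11])
    return extrinsic, intrinsic, depth_min, depth_max


# Illustrative values only; they are not taken from any dataset.
SAMPLE_CAM = """extrinsic
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1

intrinsic
100 0 50
0 100 40
0 0 1

425.0 935.0
"""

extrinsic, intrinsic, depth_min, depth_max = parse_cam_text(SAMPLE_CAM)
```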

pair.txt stores the view selection result. For each reference image, the 10 best source views are stored together with their scores:

TOTAL_IMAGE_NUM
IMAGE_ID0                       # index of reference image 0 
10 ID0 SCORE0 ID1 SCORE1 ...    # 10 best source images for reference image 0 
IMAGE_ID1                       # index of reference image 1
10 ID0 SCORE0 ID1 SCORE1 ...    # 10 best source images for reference image 1 
...
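
To make the pair.txt layout concrete, here is a minimal parsing sketch. It is not the repository's own reader: the function name parse_pair_text is hypothetical, and it assumes that any text after a # is an annotation (as in the listing above) and that each reference image occupies exactly two lines.

```python
def parse_pair_text(text):
    """Parse the pair.txt layout described above.

    Hypothetical helper: returns {reference_id: [(source_id, score), ...]}.
    """
    # Drop '#' annotations and blank lines (assumption: '#' starts a comment).
    lines = [line.split("#")[0].strip() for line in text.splitlines()]
    lines = [line for line in lines if line]

    total_images = int(lines[0])
    pairs = {}
    for i in range(total_images):
        ref_id = int(lines[1 + 2 * i])
        fields = lines[2 + 2 * i].split()
        num_sources = int(fields[0])
        # After the count, the line alternates between source ID and score.
        pairs[ref_id] = [
            (int(fields[1 + 2 * j]), float(fields[2 + 2 * j]))
            for j in range(num_sources)
        ]
    return pairs


# Illustrative file with two images and one source view each.
SAMPLE_PAIR = "2\n0\n1 1 0.9\n1\n1 0 0.9\n"
pairs = parse_pair_text(SAMPLE_PAIR)
```

Returning a dictionary keyed by reference index keeps the lookup of source views for a given reference image simple; the order of the source views in the file is preserved in each list.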

Evaluation on DTU:

  • For DTU's evaluation set, first download our processed camera parameters from here. Unzip the archive and, for every scan, replace the old camera files in the cams_1 folder with the new ones.
  • In eval_dtu.sh, set DTU_TESTING to the root directory of the dataset and --outdir to the directory in which to store the reconstructed point clouds.
  • CKPT_FILE is the path to the checkpoint file. It defaults to our pretrained model trained on DTU, checkpoints/dtu/model_000015.ckpt.
  • Test on a GPU by running bash eval_dtu.sh. The code performs depth map estimation and depth fusion. The outputs are point clouds in ply format.
  • For quantitative evaluation, download SampleSet and Points from DTU's website. Unzip them and place the Points folder in SampleSet/MVS Data/. The resulting structure looks like:
SampleSet
├──MVS Data
      └──Points

In evaluations/dtu/BaseEvalMain_web.m, set dataPath to the path of SampleSet/MVS Data/, plyPath to the directory that stores the reconstructed point clouds, and resultsPath to the directory in which to store the evaluation results. Then run evaluations/dtu/BaseEvalMain_web.m in MATLAB.

The results look like:

Acc. (mm)   Comp. (mm)   Overall (mm)
0.373       0.354        0.363

Evaluation on Tanks & Temples:

  • In eval_tanks.sh, set TANK_TESTING to the root directory of the dataset and --outdir to the directory in which to store the reconstructed point clouds.
  • CKPT_FILE is the path to the checkpoint file. It defaults to our pretrained model trained on DTU, checkpoints/dtu/model_000015.ckpt. We also provide a pretrained model trained on BlendedMVS (checkpoints/blendedmvs/model_000015.ckpt).
  • Test on a GPU by running bash eval_tanks.sh. The code performs depth map estimation and depth fusion. The outputs are point clouds in ply format.
  • For our detailed quantitative results on Tanks & Temples, please check the leaderboards (Tanks & Temples: trained on DTU, Tanks & Temples: trained on BlendedMVS).

Evaluation on ETH3D:

  • In eval_eth.sh, set ETH3D_TESTING to the root directory of the dataset and --outdir to the directory in which to store the reconstructed point clouds.
  • CKPT_FILE is the path to the checkpoint file. It defaults to our pretrained model trained on DTU, checkpoints/dtu/model_000015.ckpt. We also provide a pretrained model trained on BlendedMVS (checkpoints/blendedmvs/model_000015.ckpt).
  • Test on a GPU by running bash eval_eth.sh. The code performs depth map estimation and depth fusion. The outputs are point clouds in ply format.
  • For our detailed quantitative results on ETH3D, please check the leaderboards (ETH3D: trained on DTU, ETH3D: trained on BlendedMVS).

Evaluation on custom dataset:

  • A custom dataset can be prepared from COLMAP's results. The script colmap_input.py (adapted from the corresponding MVSNet script) converts COLMAP's sparse reconstruction results into the same format as the datasets that we provide.
  • Test on a GPU by running bash eval_custom.sh.

Training

DTU

  • Download the pre-processed DTU training set (provided by PatchmatchNet). The dataset is already organized as follows:
root_directory
├──Cameras_1
├──Rectified
└──Depths_raw
  • Download our processed camera parameters from here. Unzip all the camera folders into root_directory/Cameras_1.
  • In train_dtu.sh, set MVS_TRAINING to the root directory of the dataset and --logdir to the directory in which to store the checkpoints.
  • Train the model by running bash train_dtu.sh.

BlendedMVS

  • Download the dataset.
  • In train_blend.sh, set MVS_TRAINING to the root directory of the dataset and --logdir to the directory in which to store the checkpoints.
  • Train the model by running bash train_blend.sh.

Acknowledgements

Thanks to Yao Yao for open-sourcing his excellent work MVSNet. Thanks to Xiaoyang Guo for open-sourcing his PyTorch implementation of MVSNet, MVSNet-pytorch.
