CVPR 2021 | Multi-view stereo matching based on self-supervised learning

2022-07-29 07:13:00 【51CTO】

Self-supervised Learning of Depth Inference for Multi-view Stereo (CVPR2021)

Code: https://github.com/JiayuYANG/Self-supervised-CVP-MVSNet

A brief introduction to Self-sup CVP-MVSNet

Although deep-learning-based multi-view stereo (MVS) has made remarkable progress in recent years, these methods usually rely on large amounts of labeled data, and ground-truth depth for multi-view depth estimation is hard to acquire. This paper therefore proposes a self-supervised learning framework for multi-view stereo that uses a two-stage training strategy. The first stage trains the network without supervision, using an image reconstruction loss. The second stage performs self-supervised learning on pseudo labels, which are generated by fusing the multi-view depth estimates. Experiments show that Self-sup. CVP-MVSNet achieves the best reconstruction performance among unsupervised methods and is comparable to supervised methods.

Background 1: A review of the CVP-MVSNet network

Cost Volume Pyramid Based Depth Inference for Multi-view Stereo (CVPR2020)

Figure 1: CVP-MVSNet network architecture

CVP-MVSNet is a coarse-to-fine network for multi-view depth estimation. The reference and source images are first downsampled to build an image pyramid. A feature extraction network with shared weights then extracts features at every scale, and the extracted features are warped by homographies to build cost volumes. A low-resolution depth map is regressed from the cost volume at the coarsest level, and each subsequent level uses its cost volume to predict a residual that refines the depth map. The 3D ConvNet used for cost volume regularization is shared across scales, which allows the network to be trained on low-resolution images and tested on high-resolution ones.
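The coarse-to-fine idea can be illustrated with a minimal NumPy sketch. This is not the CVP-MVSNet implementation: the function names are hypothetical, the pyramid uses simple 2x2 average pooling, and the depth map is upsampled with nearest-neighbor repetition, all of which are simplifying assumptions made here for illustration.

```python
import numpy as np

def build_image_pyramid(image, levels):
    """Return [full-res, 1/2-res, 1/4-res, ...] for an (H, W, C) image.

    Downsampling by 2x2 average pooling is an assumption for this sketch.
    """
    pyramid = [image]
    for _ in range(levels - 1):
        h, w = pyramid[-1].shape[:2]
        h2, w2 = h // 2, w // 2
        cropped = pyramid[-1][:h2 * 2, :w2 * 2]          # drop an odd last row/column
        pooled = cropped.reshape(h2, 2, w2, 2, -1).mean(axis=(1, 3))
        pyramid.append(pooled)
    return pyramid

def upsample_nearest(depth, factor=2):
    """Nearest-neighbor upsampling of an (H, W) depth map (a simplification)."""
    return np.repeat(np.repeat(depth, factor, axis=0), factor, axis=1)

def refine_depth(coarse_depth, residual):
    """Upsample the coarse depth map and add the residual predicted at the finer level."""
    return upsample_nearest(coarse_depth) + residual
```

In this sketch the residual would come from the network at the finer pyramid level; here it is simply an array with the upsampled shape.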

Background 2: A review of unsupervised and self-supervised MVS methods

Learning Unsupervised Multi-view Stereopsis via Robust Photometric Consistency (CVPRW2019)

Take UnsupMVS as an example. Unsupervised multi-view stereo supervises training with the photometric consistency between views when no ground-truth depth maps are available. Specifically, training uses an image reconstruction loss: using the depth map predicted by the network, each source image is warped into the reference view, and the photometric difference between the reference image and the warped source image is computed. This multi-view photometric difference is the supervision signal for training. Because some regions may be occluded in some views, the photometric loss uses a top-K strategy: the loss is computed against M=6 source images, but only the K=3 terms with the smallest error contribute to the final loss.
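The top-K selection can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the UnsupMVS code: it assumes the source images have already been warped into the reference view, that the selection is done per pixel, and that the error is a plain L1 difference. The function name and array layout are hypothetical.

```python
import numpy as np

def topk_photometric_loss(ref_image, warped_sources, k=3):
    """Robust photometric loss that keeps only the K best views per pixel.

    ref_image:      (H, W, C) reference image
    warped_sources: (M, H, W, C) source images already warped into the reference view
    k:              number of lowest-error views kept per pixel (K <= M)
    """
    # Per-view, per-pixel L1 error averaged over channels: shape (M, H, W)
    errors = np.abs(warped_sources - ref_image[None]).mean(axis=-1)
    # After partitioning along the view axis, the first k slices hold the k smallest errors
    smallest_k = np.partition(errors, k - 1, axis=0)[:k]
    return smallest_k.mean()
```

Discarding the views with the largest error at each pixel keeps occluded views, where the warped source cannot match the reference, from dominating the loss.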

Figure 2: Robust photometric consistency loss calculation

Background 3: DTU dataset preprocessing

The DTU dataset provides point clouds with normal information. In its data preprocessing stage, MVSNet reconstructs a surface from the point cloud to obtain a surface mesh. Because DTU also provides the camera pose for each view, the mesh can be rendered into each view to obtain the corresponding depth map. These rendered depth maps serve as the ground truth for network training.

Figure 3: The self-supervised CVP-MVSNet framework

The Self-sup CVP-MVSNet method

Stage 1: Initializing the network with unsupervised learning

In the first stage of self-supervised learning, initial depth maps are estimated using the photometric consistency between views. With CVP-MVSNet as the backbone, the image reconstruction loss is computed with a probability-based image synthesis method: the reference view is synthesized from the probability volume P and the image intensity volume, and the photometric difference between the synthesized image and the reference image is computed. This view synthesis loss enforces both photometric consistency between the reference image and the synthesized image and smoothness of the predicted depth map. Written with generic symbols, the unsupervised loss is a weighted sum of four terms:

$$L = \lambda_1 L_g + \lambda_2 L_{ssim} + \lambda_3 L_p + \lambda_4 L_s$$

where $L_g$ is the image gradient loss, $L_{ssim}$ is the structural similarity loss, $L_p$ is the perceptual loss, $L_s$ is the depth map smoothness loss, and the $\lambda_i$ adjust the weights of the individual loss terms.
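A minimal sketch of the synthesis step and the weighted loss follows. It assumes the synthesized image is the probability-weighted sum of the warped source intensities over the depth hypotheses; the function names, array layouts, and dictionary keys are hypothetical, and the individual loss terms are taken as already-computed scalars.

```python
import numpy as np

def synthesize_reference(prob_volume, intensity_volume):
    """Probability-weighted view synthesis (assumed form).

    prob_volume:      (D, H, W) per-pixel probabilities over D depth hypotheses
    intensity_volume: (D, H, W, C) source image warped into the reference view
                      at each depth hypothesis
    Returns the (H, W, C) synthesized reference image.
    """
    return (prob_volume[..., None] * intensity_volume).sum(axis=0)

def view_synthesis_loss(terms, weights):
    """Weighted sum of scalar loss terms; keys such as 'gradient' are illustrative."""
    return sum(weights[name] * value for name, value in terms.items())
```

Because the synthesis is a sum weighted by the probability volume, the reconstruction error can be backpropagated to the probabilities of all depth hypotheses rather than only to the single regressed depth value.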

Figure 4: Probability-based image synthesis

Stage 2: Iterative self-supervised training with pseudo labels

Figure 5: Pseudo-label generation process

Depth maps are first predicted from high-resolution images. The initial depth maps are filtered using multi-view geometric consistency, and the filtered depth maps are fused into a point cloud in 3D space. Poisson surface reconstruction then produces a surface mesh from the point cloud, and the mesh is rendered into each view to obtain a depth map. These rendered depth maps are the pseudo labels for self-supervised training of the network.
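The geometric consistency filtering step can be sketched as below. This is a simplified illustration, not the authors' pipeline: it assumes the reprojection of each reference pixel through every source view has already been computed, and the thresholds (1 pixel, 1% relative depth error, at least 3 agreeing views) are assumed defaults for this example rather than values taken from the paper.

```python
import numpy as np

def geometric_consistency_mask(ref_depth, reproj_depths, reproj_pixel_err,
                               max_pixel_err=1.0, max_rel_depth_err=0.01,
                               min_views=3):
    """Mark pixels whose depth agrees with enough source views.

    ref_depth:        (H, W) depth predicted for the reference view
    reproj_depths:    (N, H, W) depth obtained by projecting each reference pixel
                      into a source view and back (precomputed, hypothetical input)
    reproj_pixel_err: (N, H, W) pixel distance between each pixel and its
                      reprojected location
    """
    rel_err = np.abs(reproj_depths - ref_depth[None]) / np.maximum(ref_depth[None], 1e-8)
    consistent = (reproj_pixel_err < max_pixel_err) & (rel_err < max_rel_depth_err)
    # A pixel is kept only if at least `min_views` source views agree with it
    return consistent.sum(axis=0) >= min_views

def filter_depth(ref_depth, mask):
    """Zero out depth values that fail the consistency check."""
    return np.where(mask, ref_depth, 0.0)
```

Only the depth values that survive this mask would go on to point cloud fusion, Poisson surface reconstruction, and rendering.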

Experimental results

The Self-sup. CVP-MVSNet model obtained with the two-stage self-supervised training strategy performs on par with supervised methods. Its reconstruction results on the DTU and Tanks and Temples datasets are shown in the following tables.

[Tables: reconstruction results on the DTU and Tanks and Temples datasets]

References

1. Self-supervised Learning of Depth Inference for Multi-view Stereo, CVPR 2021

2. Cost Volume Pyramid Based Depth Inference for Multi-view Stereo, CVPR 2020

3. Learning Unsupervised Multi-view Stereopsis via Robust Photometric Consistency, CVPRW 2019

4. MVSNet: Depth Inference for Unstructured Multi-view Stereo, ECCV 2018

A summary of deep-learning-based multi-view stereo (MVS) algorithms:
https://github.com/XYZ-qiyh/Awesome-Learning-MVS


Copyright notice: This article was created by [51CTO]. Please include a link to the original when reposting.
https://yzsam.com/2022/210/202207290625333052.html
