CVPR2021 | Multi-view stereo matching based on self-supervised learning
2022-07-29 07:13:00 【51CTO】
Self-supervised Learning of Depth Inference for Multi-view Stereo (CVPR2021)
Code: https://github.com/JiayuYANG/Self-supervised-CVP-MVSNet
Introduction to Self-sup CVP-MVSNet
Deep-learning-based multi-view stereo (MVS) has made remarkable progress in recent years, but these methods usually rely on large amounts of labeled data, and ground-truth depth for multi-view depth estimation is difficult to acquire. This paper therefore proposes a self-supervised learning framework for multi-view stereo that adopts a two-stage training strategy. The first stage trains the network without supervision using an image reconstruction loss. The second stage is self-supervised learning on pseudo labels, which are generated by fusing the multi-view depth estimates. Experiments show that Self-sup. CVP-MVSNet achieves the best reconstruction performance among unsupervised methods and is comparable to supervised methods.
Background 1: Review of the CVP-MVSNet network
Cost Volume Pyramid Based Depth Inference for Multi-view Stereo (CVPR2020)
Figure 1: CVP-MVSNet network architecture
CVP-MVSNet is a coarse-to-fine multi-view depth estimation network. The reference image and the source images are first downsampled to build image pyramids. A feature extraction network with shared weights then extracts features from the images at every scale, and the extracted features are warped by homographies to build cost volumes. A low-resolution depth map is regressed from the cost volume at the lowest resolution, and each subsequent stage predicts a depth residual from its cost volume. The 3D ConvNet used for cost volume regularization is applied across the different scales, and the network supports training on low-resolution images and testing on high-resolution images.
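The coarse-to-fine regression described above can be illustrated with a small NumPy sketch. It assumes the soft-argmax (expected depth) formulation common to MVSNet-style networks. The function names, array shapes, and the nearest-neighbour upsampling are illustrative assumptions and are not taken from the CVP-MVSNet code.

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax along the depth-hypothesis axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def regress_depth(scores, hypotheses):
    """Expected value per pixel over a set of hypotheses.

    scores:     (D, H, W) regularized cost volume, higher = more likely
                (assumed convention).
    hypotheses: (D,) or (D, H, W) depth (or residual) value per hypothesis.
    """
    prob = softmax(scores, axis=0)
    if hypotheses.ndim == 1:
        hypotheses = hypotheses[:, None, None]
    return (prob * hypotheses).sum(axis=0)

def upsample2x(depth):
    # Nearest-neighbour upsampling to the next (finer) pyramid level.
    return depth.repeat(2, axis=0).repeat(2, axis=1)

def refine_depth(coarse_depth, residual_scores, residual_hypotheses):
    """Upsample the coarse depth map and add the expected depth residual."""
    residual = regress_depth(residual_scores, residual_hypotheses)
    return upsample2x(coarse_depth) + residual
```

At the coarsest level `regress_depth` yields the initial depth map. Each finer level calls `refine_depth` with a cost volume built around the upsampled estimate.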
Background 2: Review of unsupervised / self-supervised MVS methods
Learning Unsupervised Multi-view Stereopsis via Robust Photometric Consistency (CVPRW2019)
Taking UnsupMVS as an example, unsupervised multi-view stereo uses the photometric consistency between views as supervision when no ground-truth depth maps are available. Concretely, training uses an image reconstruction loss: given the depth map predicted by the network, each source image is warped into the reference view, and the photometric difference between the reference image and the warped source image is computed. This photometric difference across views is the supervision signal for training. Because some regions may be occluded in some views, a top-K strategy is used when computing the photometric loss: the loss is computed against M=6 source images, but only the K=3 terms with the smallest error are used.
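A minimal NumPy sketch of the top-K selection follows. It assumes the source images have already been warped into the reference view and uses a plain per-pixel L1 error. The function name, the validity masks, and the use of infinity to exclude out-of-view pixels are illustrative assumptions, not the UnsupMVS implementation.

```python
import numpy as np

def topk_photometric_loss(ref, warped_srcs, valid_masks, k=3):
    """Robust photometric loss that keeps the k best source views per pixel.

    ref:         (H, W, C) reference image.
    warped_srcs: (M, H, W, C) source images warped into the reference view.
    valid_masks: (M, H, W) bool, True where the warp lands inside the source.
    """
    # Per-view, per-pixel L1 error averaged over channels: (M, H, W).
    err = np.abs(warped_srcs - ref[None]).mean(axis=-1)
    # Invalid pixels get an infinite error so they are selected last.
    err = np.where(valid_masks, err, np.inf)
    # Keep the k smallest errors at every pixel: (k, H, W).
    smallest = np.sort(err, axis=0)[:k]
    finite = np.isfinite(smallest)
    if not finite.any():
        return 0.0
    return float(smallest[finite].mean())
```

With M=6 views and k=3, a pixel occluded in up to three views can still be supervised by the three views that see it.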
Figure 2: Robust photometric consistency loss calculation
Background 3: DTU dataset preprocessing
The DTU dataset provides point clouds with normal information. In the data preprocessing stage, MVSNet applies surface reconstruction to the point cloud to obtain a surface mesh. Because DTU also provides the camera pose for each view, the mesh can be rendered into each view to obtain the corresponding depth map. These rendered depth maps serve as the ground truth for network training.
Figure 3: Self-supervised CVP-MVSNet algorithm framework
The Self-sup CVP-MVSNet method
Stage 1: Initializing the network with unsupervised learning
In the first stage of self-supervised learning, the photometric consistency between views is used to estimate the initial depth maps. With CVP-MVSNet as the backbone, the image reconstruction loss is computed with a probability-based image synthesis method: view synthesis is performed from the probability volume P and the image intensity volume, and the photometric difference between the synthesized image and the reference image is computed. The view synthesis loss enforces photometric consistency between the reference image and the synthesized image as well as smoothness of the predicted depth map. The unsupervised loss is a weighted sum of four terms: an image gradient loss, an image structural similarity loss, an image perceptual loss, and a depth map smoothness loss, each with a coefficient that adjusts its weight.
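The probability-based synthesis can be sketched in NumPy as follows. The sketch assumes that the synthesized image is the per-pixel expectation of the intensity volume under P, and that the final loss is a plain weighted sum. All function names, dictionary keys, and shapes are hypothetical, and the individual loss terms are left as precomputed numbers.

```python
import numpy as np

def synthesize_reference(prob_volume, intensity_volume):
    """Probability-weighted view synthesis.

    prob_volume:      (D, H, W), sums to 1 over the D depth hypotheses.
    intensity_volume: (D, H, W, C), source image warped into the reference
                      view once per depth hypothesis.
    """
    return (prob_volume[..., None] * intensity_volume).sum(axis=0)

def photometric_difference(ref, synthesized):
    # Mean absolute difference between reference and synthesized image.
    return float(np.abs(ref - synthesized).mean())

def weighted_loss(terms, weights):
    """Weighted sum of named loss terms, e.g. gradient, ssim, perceptual,
    smoothness (the key names are placeholders)."""
    return sum(weights[name] * value for name, value in terms.items())
```

Because the synthesized image depends on every entry of P, the reconstruction error can supervise the whole probability volume rather than only a single regressed depth value per pixel.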
Figure 4: Probability-based image synthesis
Stage 2: Iterative self-supervised training on pseudo labels
Figure 5: Pseudo label generation process
Depth maps are first predicted from high-resolution images. The initial depth maps are filtered using multi-view geometric consistency, and the filtered depth maps are fused into a point cloud in 3D space. Poisson surface reconstruction then produces a surface mesh from the point cloud, and rendering this mesh into each view yields a depth map per view. These rendered depth maps are the pseudo labels for self-supervised training of the network.
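The geometric consistency filtering step can be sketched as a forward-backward reprojection check between the reference view and one source view. This is an assumption-laden NumPy illustration: the thresholds (1 pixel, 1% relative depth) follow common MVSNet-style practice and are not stated in this article, and the function names, nearest-neighbour depth lookup, and pose convention (X_src = R X_ref + t) are choices made for the example.

```python
import numpy as np

def backproject(pix, depth, K):
    # pix: (N, 2) as (x, y); depth: (N,). Returns 3D points of shape (3, N).
    homog = np.hstack([pix, np.ones((pix.shape[0], 1))]).T
    return (np.linalg.inv(K) @ homog) * depth

def geometric_mask(depth_ref, depth_src, K_ref, K_src, R, t,
                   pix_thresh=1.0, depth_thresh=0.01):
    """True where the reference depth agrees with the source depth map."""
    H, W = depth_ref.shape
    Hs, Ws = depth_src.shape
    ys, xs = np.mgrid[0:H, 0:W]
    pix_ref = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    d_ref = depth_ref.ravel()

    # Reference pixel -> 3D point -> source image.
    proj = K_src @ (R @ backproject(pix_ref, d_ref, K_ref) + t[:, None])
    front = proj[2] > 1e-8
    uv_src = (proj[:2] / np.where(front, proj[2], 1.0)).T
    u, v = np.rint(uv_src[:, 0]), np.rint(uv_src[:, 1])
    inside = front & (u >= 0) & (u < Ws) & (v >= 0) & (v < Hs)

    # Nearest-neighbour lookup of the source depth (indices clipped so the
    # lookup is safe; out-of-view pixels are rejected by `inside`).
    ui = np.clip(u, 0, Ws - 1).astype(int)
    vi = np.clip(v, 0, Hs - 1).astype(int)
    d_src = depth_src[vi, ui]

    # Source pixel -> 3D point -> back into the reference image.
    back = K_ref @ (R.T @ (backproject(uv_src, d_src, K_src) - t[:, None]))
    d_back = back[2]
    ok = d_back > 1e-8
    uv_back = (back[:2] / np.where(ok, d_back, 1.0)).T

    pix_err = np.linalg.norm(uv_back - pix_ref, axis=1)
    depth_err = np.abs(d_back - d_ref) / np.maximum(d_ref, 1e-8)
    mask = inside & ok & (pix_err < pix_thresh) & (depth_err < depth_thresh)
    return mask.reshape(H, W)
```

In a full pipeline, one such mask would be computed per source view, and a pixel would be kept only if enough views agree before its depth is fused into the point cloud.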
Experimental results
Self-sup. CVP-MVSNet, trained with the two-stage self-supervised strategy of this paper, performs on par with supervised methods. The reconstruction results on the DTU and Tanks and Temples datasets are shown in the following table.
References
1. Self-supervised Learning of Depth Inference for Multi-view Stereo, CVPR 2021
2. Cost Volume Pyramid Based Depth Inference for Multi-view Stereo, CVPR 2020
3. Learning Unsupervised Multi-view Stereopsis via Robust Photometric Consistency, CVPRW 2019
4. MVSNet: Depth Inference for Unstructured Multi-view Stereo, ECCV 2018
A summary of deep-learning-based multi-view stereo (MVS) algorithms:
https://github.com/XYZ-qiyh/Awesome-Learning-MVS