CVPR 2022: Stereo matching of asymmetric-resolution images
2022-06-26 23:23:00 【ScarLeTzzz】

0. Abstract
- Research question: stereo matching for a pair of images with different resolutions;
- Method: unsupervised learning, feature-metric consistency, and a self-reinforcing optimization strategy;
- Validation: experiments on simulated datasets with a variety of degradations and on a self-collected real dataset show that the proposed method outperforms existing solutions.
1. Introduction
Research background and significance: Telephoto camera systems composed of two (or more) lenses with different focal lengths are now widely used in smartphones. In a single shot, such a system usually produces a pair (or a group) of images with different resolutions, which enables many desirable applications, such as continuous optical zoom and image quality enhancement. For these applications, correspondence estimation between stereo images with asymmetric resolution is a key step. It is usually carried out by combining a traditional symmetric stereo matching algorithm (such as SGM) with image upsampling. However, this approach is easily affected by the artifacts that upsampling introduces, and the effect becomes more pronounced when the upsampling factor is large.
Summary of contributions:
- The first use of unsupervised learning to estimate correspondences from resolution-asymmetric stereo pairs;
- Feature-metric consistency is realized, which avoids the photometric inconsistency caused by unknown degradation;
- A self-reinforcing strategy that strengthens feature-metric consistency by progressively updating the loss;
- Clear performance improvements over the compared methods on both simulated and real datasets.
Overview of the method:
- Unsupervised learning: In the asymmetric-resolution case, supervised methods require not only ground-truth disparity but also high-quality versions of the degraded views as labels. They also need the form of the degradation to be defined in advance in order to learn the network parameters, which makes them hard to apply to the wide range of complex real-world systems. The paper therefore turns to unsupervised learning.
- Feature-metric consistency: For unsupervised stereo matching, the most widely used assumption is photometric consistency.
  - As shown in Figure 1(a), corresponding pixels in a symmetric stereo pair satisfy $I_{L}[p_{L}]=I_{R}[p_{R}]$;
  - As shown in Figure 1(b), corresponding pixels in an asymmetric stereo pair may not have the same intensity or color, and this photometric inconsistency makes correspondence learning harder;
  - The existing solution is to restore the LR view to an HR view with super-resolution (SR). However, most existing SR methods are degradation-specific, so performance drops when the actual degradation differs from the assumed one;
  - This paper proposes imposing consistency between the two views in feature space rather than in image space, which is called feature-metric consistency. A feature extractor can produce features that are consistent across the two views, i.e., $F_{L}[p_{L}]=F_{R}[p_{R}]$ in Figure 1(b). These features can then be used to formulate a feature-metric loss that avoids the photometric inconsistency;
  - Feature-metric consistency was discovered experimentally: although training the network with the photometric loss is not optimal, the feature extractor trained this way can still extract consistent features.
- Self-reinforcing optimization strategy: When the stereo matching network is optimized with the feature-metric loss, the feature extractor is optimized as well, which can further strengthen feature-metric consistency. A self-reinforcing strategy is therefore introduced to optimize the feature extractor iteratively. Specifically, the feature extractor learned in the previous stage is used to form a new feature-metric loss for the current stage; that loss then drives the next round of network training, which yields a new feature extractor, and so on. In this way, the method remains effective even under severe degradation.
2. Resolution-asymmetric stereo matching method
Flowchart of the proposed method:

2.1 Photometric consistency learning
Taking the aligned stereo pair $I_{L}$ and $I_{r \uparrow}$ (the upsampled low-resolution right view) as input, the unsupervised stereo matching network $\Phi(\cdot ; \theta)$ predicts the disparity map relative to the left view, $d_{L}=\Phi(I_{L},I_{r \uparrow} ; \theta)$. Training relies on the photometric consistency of corresponding points, i.e.:
$$I_{L}[p_{L}]=I_{r \uparrow}[p_{r \uparrow}] \tag{1}$$
If the disparity $d_{L}[p_{L}]$ is estimated accurately, the left image $I_{L}$ can be reconstructed by warping $I_{r \uparrow}$ with the disparity, i.e.:
$$I_{r \uparrow \rightarrow L}[p_{L}]=I_{r \uparrow}[p_{L}-d_{L}[p_{L}]] \tag{2}$$
The photometric loss can therefore be defined as the error between $I_{L}$ and its reconstruction $I_{r \uparrow \rightarrow L}$. It is usually a combination of the $L_{1}$ and SSIM distances, weighted by $\alpha$, i.e.:
$$\mathcal{L}_{pm}=\| I_{L}-I_{r \uparrow \rightarrow L} \|_{1}+\alpha (1-SSIM(I_{L},I_{r \uparrow \rightarrow L})) \tag{3}$$
SSIM: Structural Similarity Index Measure.
First, the photometric consistency loss $\mathcal{L}_{pm}$ is used to train an initial network $\Phi^{0}$, which consists of a feature extraction network $\Phi^{0}_{F}$ and a matching network $\Phi^{0}_{M}$.
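Equations (2) and (3) can be sketched on a single rectified scanline as follows. This is only a minimal pure-Python illustration, not the paper's implementation: the function names, the linear interpolation, the border clamping, the mean-normalised $L_1$ term, the single global SSIM window, and the value `alpha=0.85` are all assumptions made for the example.

```python
import math

def warp_right_to_left(right_row, disp_row):
    """Eq. (2): sample the upsampled right scanline at x - d_L[x].

    Linear interpolation and border clamping are assumptions of this sketch.
    """
    n = len(right_row)
    warped = []
    for x, d in enumerate(disp_row):
        xs = min(max(x - d, 0.0), n - 1.0)  # clamp to the image border
        x0 = int(math.floor(xs))
        x1 = min(x0 + 1, n - 1)
        w = xs - x0
        warped.append((1.0 - w) * right_row[x0] + w * right_row[x1])
    return warped

def ssim_global(a, b, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM over the whole signal (simplification: SSIM is usually windowed)."""
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((v - mu_a) * (v - mu_a) for v in a) / n
    var_b = sum((v - mu_b) * (v - mu_b) for v in b) / n
    cov = sum((u - mu_a) * (v - mu_b) for u, v in zip(a, b)) / n
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a * mu_a + mu_b * mu_b + c1) * (var_a + var_b + c2)
    return num / den

def photometric_loss(left_row, warped_row, alpha=0.85):
    """Eq. (3): L1 term plus alpha * (1 - SSIM); alpha is a hypothetical value."""
    l1 = sum(abs(u - v) for u, v in zip(left_row, warped_row)) / len(left_row)
    return l1 + alpha * (1.0 - ssim_global(left_row, warped_row))

# Toy scanline: the left view is the right view shifted by a disparity of 2.
right = [0.1, 0.2, 0.9, 0.4, 0.3, 0.2, 0.1, 0.0]
left = [right[max(x - 2, 0)] for x in range(len(right))]

loss_correct = photometric_loss(left, warp_right_to_left(right, [2.0] * 8))
loss_wrong = photometric_loss(left, warp_right_to_left(right, [0.0] * 8))
print(loss_correct, loss_wrong)
```

In this toy case the two views have identical intensities, so the correct disparity drives the loss to zero. In the asymmetric setting the right view is degraded, and the loss no longer vanishes at the correct disparity, which is the inconsistency that Section 2.2 addresses.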
2.2 Feature-metric consistency learning
Given a stereo pair $I_{L}$ and $I_{r \uparrow}$, the feature extractor $\Phi_{F}(\cdot ;\theta_{F})$ extracts the features $F_{L}=\Phi_{F}(I_{L};\theta_{F})$ and $F_{r \uparrow}=\Phi_{F}(I_{r \uparrow};\theta_{F})$. These two features are consistent at corresponding points of the asymmetric pair, i.e.:
$$F_{L}[p_{L}]=F_{r \uparrow}[p_{r \uparrow}] \tag{4}$$
The features $F_{L}$ and $F_{r \uparrow}$ are concatenated into a cost volume, which is regularized by $\Phi_{M}(\cdot ; \theta_{M})$ to regress the disparity map $d_{L}$. After the right view has been warped to the left view according to $d_{L}$, giving $I_{r \uparrow \rightarrow L}$, the feature extractor $\Phi_{F}(\cdot ;\theta_{F})$ projects $I_{L}$ and $I_{r \uparrow \rightarrow L}$ into feature space, yielding $F_{L}$ and $F_{r \uparrow \rightarrow L}=\Phi_{F}(I_{r \uparrow \rightarrow L};\theta_{F})$.
The feature-metric loss $\mathcal{L}_{fm}$ can be modeled on the photometric consistency loss:
$$\mathcal{L}_{fm}=\| F_{L}-F_{r \uparrow \rightarrow L} \|_{1}+\alpha (1-SSIM(F_{L},F_{r \uparrow \rightarrow L})) \tag{5}$$
The network is then retrained with the feature-metric consistency loss $\mathcal{L}_{fm}$ to obtain $\Phi^{1}$, which consists of a feature extraction network $\Phi^{1}_{F}$ and a matching network $\Phi^{1}_{M}$.
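The structure of Equation (5) can be illustrated with a toy example. Everything specific here is an assumption made for illustration: `toy_features` is a hand-crafted stand-in for the learned extractor $\Phi_{F}$, the "degradation" is a simple gain and offset change rather than a resolution loss, SSIM is computed over a single global window, and `alpha=0.85` is a hypothetical value.

```python
import math

def ssim_global(a, b, c1=1e-4, c2=9e-4):
    """SSIM over the whole signal (simplification: SSIM is usually windowed)."""
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((v - mu_a) * (v - mu_a) for v in a) / n
    var_b = sum((v - mu_b) * (v - mu_b) for v in b) / n
    cov = sum((u - mu_a) * (v - mu_b) for u, v in zip(a, b)) / n
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a * mu_a + mu_b * mu_b + c1) * (var_a + var_b + c2)
    return num / den

def l1_ssim(a, b, alpha=0.85):
    """Shared form of Eqs. (3) and (5): mean L1 plus alpha * (1 - SSIM)."""
    l1 = sum(abs(u - v) for u, v in zip(a, b)) / len(a)
    return l1 + alpha * (1.0 - ssim_global(a, b))

def toy_features(row):
    """Hypothetical stand-in for Phi_F: zero-mean, unit-variance normalisation."""
    n = len(row)
    mu = sum(row) / n
    std = math.sqrt(sum((v - mu) * (v - mu) for v in row) / n) + 1e-8
    return [(v - mu) / std for v in row]

def feature_metric_loss(left_row, warped_row, feature_fn=toy_features, alpha=0.85):
    """Eq. (5): compare F_L with the features of the warped right view."""
    return l1_ssim(feature_fn(left_row), feature_fn(warped_row), alpha)

# A perfectly warped right view whose intensities differ from the left view.
left = [0.1, 0.2, 0.9, 0.4, 0.3, 0.2, 0.1, 0.0]
warped = [0.5 * v + 0.2 for v in left]

print("photometric loss:   ", l1_ssim(left, warped))
print("feature-metric loss:", feature_metric_loss(left, warped))
```

Even though the disparity is correct, the photometric loss stays clearly above zero because the intensities differ, whereas the loss in the (toy) feature space is close to zero. The learned extractor in the source plays the same role for resolution degradations.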
2.3 Self-reinforcing strategy
As shown in Figure 3(b), given a resolution-asymmetric stereo dataset:
- First, $\mathcal{L}_{pm}$ is used to train a stereo matching network $\Phi(\cdot ;\theta^{0}_{F} ; \theta^{0}_{M})$ (abbreviated $\Phi^{0}$), whose feature extractor $\Phi^{0}_{F}$ forms the feature-metric loss $\mathcal{L}^{0}_{fm}$.
- Then $\mathcal{L}^{0}_{fm}$ is used to optimize a new stereo matching network $\Phi^{1}$. While $\Phi^{1}$ is being tuned, the feature extractor used to compute $\mathcal{L}^{0}_{fm}$ is kept fixed.
- After tuning, the enhanced $\Phi^{1}_{F}$ can in turn form a better feature-metric loss $\mathcal{L}^{1}_{fm}$ (used in the next reinforcement step). The process continues, using $\mathcal{L}^{k-1}_{fm}$ to tune $\Phi^{k}$, where $k \in \{1,\dots,K\}$.
Note that a new loss $\mathcal{L}^{k}_{fm}$ is built only once the network $\Phi^{k}$ has converged under $\mathcal{L}^{k-1}_{fm}$, because changing the loss space frequently may make training unstable.
With this self-reinforcing strategy, a progressively optimized network with gradually strengthened feature-metric consistency is obtained.
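The staged schedule above can be sketched as a plain loop. This is a schematic only: `train_until_converged` and `make_feature_loss` are hypothetical callables supplied by the caller, the dictionary layout of a "network" is invented for the example, and initialising $\Phi^{k}$ from $\Phi^{k-1}$ is an assumption that the source does not spell out.

```python
def self_reinforcing_training(train_until_converged, photometric_loss,
                              make_feature_loss, num_stages=3):
    """Return [Phi^0, Phi^1, ..., Phi^K] for K = num_stages."""
    # Stage 0: train the initial network Phi^0 with the photometric loss L_pm.
    network = train_until_converged(loss_fn=photometric_loss, init=None)
    history = [network]
    for _ in range(num_stages):
        # Freeze the previous extractor Phi_F^{k-1} and build L_fm^{k-1} from it.
        frozen_extractor = network["feature_extractor"]
        loss_fn = make_feature_loss(frozen_extractor)
        # Train Phi^k to convergence before the loss is switched again.
        network = train_until_converged(loss_fn=loss_fn, init=network)
        history.append(network)
    return history

# Placeholder callables so the schedule can be run and inspected.
def fake_train(loss_fn, init):
    stage = 0 if init is None else init["stage"] + 1
    return {"stage": stage,
            "feature_extractor": f"Phi_F^{stage}",
            "trained_with": loss_fn}

def fake_feature_loss(extractor):
    return f"L_fm built from {extractor}"

for net in self_reinforcing_training(fake_train, "L_pm", fake_feature_loss):
    print(net["stage"], net["trained_with"])
```

The key design point carried over from the text is that the loss is rebuilt only between stages, never during one, so each $\Phi^{k}$ is optimized against a fixed target.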
3. Experiments
3.1 Experiments on simulated datasets
Datasets:
- Middlebury and KITTI2015; Inria_SLFD and HCI
- Degradation modes:
  - Bicubic downsampling (BIC)
  - Isotropic Gaussian kernel downsampling (IG)
  - Anisotropic Gaussian kernel downsampling (AG)
  - Isotropic Gaussian kernel downsampling with JPEG compression (IG JPEG)
  - Anisotropic Gaussian kernel downsampling with JPEG compression (AG JPEG)
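The IG and AG degradations can be simulated roughly as follows. This is a minimal pure-Python sketch under stated assumptions: the kernel size, the standard deviations, the rotation angle, the border replication, and the function names are all hypothetical, and the bicubic and JPEG variants are not covered.

```python
import math

def gaussian_kernel(size, sigma_x, sigma_y, theta=0.0):
    """Normalised 2-D Gaussian kernel (odd `size`).

    sigma_x == sigma_y gives the isotropic case; different values, optionally
    rotated by `theta` radians, give an anisotropic kernel.
    """
    c, s = math.cos(theta), math.sin(theta)
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            u = c * x + s * y    # coordinates in the rotated frame
            v = -s * x + c * y
            row.append(math.exp(-0.5 * ((u / sigma_x) ** 2 + (v / sigma_y) ** 2)))
        kernel.append(row)
    total = sum(map(sum, kernel))
    return [[w / total for w in row] for row in kernel]

def blur_and_downsample(image, kernel, factor):
    """Blur with border replication and keep every `factor`-th pixel."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            acc = 0.0
            for ky, kernel_row in enumerate(kernel):
                for kx, weight in enumerate(kernel_row):
                    yy = min(max(y + ky - half, 0), h - 1)
                    xx = min(max(x + kx - half, 0), w - 1)
                    acc += weight * image[yy][xx]
            row.append(acc)
        out.append(row)
    return out

# A 6x6 horizontal ramp degraded by a factor of 3 with both kernel types.
image = [[x / 5.0 for x in range(6)] for _ in range(6)]
iso = gaussian_kernel(5, 1.0, 1.0)
aniso = gaussian_kernel(5, 2.0, 0.5, theta=math.pi / 6)
print(blur_and_downsample(image, iso, 3))
print(blur_and_downsample(image, aniso, 3))
```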
Evaluation metrics:
- 3-pixel error (3PE): the percentage of outliers over all regions, i.e., pixels whose disparity error exceeds 3 pixels and also exceeds 5% of the ground-truth value;
- End-point error (EPE): the mean absolute difference between the estimated disparity and the ground-truth disparity.
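Both metrics are easy to state in code. The sketch below follows the definitions given above on flat lists of disparities; it assumes every pixel has valid ground truth (real benchmarks mask out invalid pixels), and the function names are chosen for this example.

```python
def end_point_error(pred, gt):
    """EPE: mean absolute difference between estimated and true disparity."""
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(gt)

def three_pixel_error(pred, gt, abs_thresh=3.0, rel_thresh=0.05):
    """3PE: percentage of pixels whose error exceeds 3 px AND 5% of the truth."""
    outliers = sum(
        1 for p, g in zip(pred, gt)
        if abs(p - g) > abs_thresh and abs(p - g) > rel_thresh * abs(g)
    )
    return 100.0 * outliers / len(gt)

gt = [10.0, 20.0, 100.0, 50.0]
pred = [10.5, 24.0, 104.0, 50.0]
print(end_point_error(pred, gt))    # 2.125
print(three_pixel_error(pred, gt))  # 25.0
```

In the example, the second and third pixels are both off by 4 px, but only the second counts as an outlier: for the third pixel, 4 px is below 5% of its ground-truth disparity of 100.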
Compared methods:
- SGM
- SR preprocessing + BaseNet
  - RCAN+BaseNet
  - DAN+BaseNet
- BaseNet + other feature extractors
  - BaseNet+CL
  - BaseNet+AE
SR: super-resolution restoration, with the non-blind SR method RCAN and the blind SR method DAN.
Other feature extractors: a feature network trained with a contrastive loss (CL), and an auto-encoder (AE) used as the feature network.
BaseNet is the popular PSMNet in all cases. The network is optimized with the ADAM solver ($\beta_{1}=0.9$, $\beta_{2}=0.999$), and the learning rate is set to 0.001. The smoothness constraint on the disparity is imposed through a weighted smoothness loss, i.e.:
[Equation (6): weighted smoothness loss $\mathcal{L}_{sm}$, not recoverable from the source]
The total loss function for all learning-based solutions can therefore be written as:
$$\mathcal{L}=\mathcal{L}_{pm/fm}+\lambda \mathcal{L}_{sm} \tag{7}$$
where $\lambda$ is the weighting factor, and $\mathcal{L}_{pm/fm}$ is the photometric loss for the first class of methods, or the corresponding feature-metric loss for the second class and for our method. The number of stages $K$ in the self-reinforcing strategy is set to 3.
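Since the exact smoothness formula is not preserved in this post, the sketch below substitutes the edge-aware first-order smoothness term that is common in unsupervised stereo: disparity gradients are penalised, but less so where the image itself has an edge. That choice, the value `lam=0.1`, and the function names are assumptions; only the optimizer settings are taken from the text.

```python
import math

def edge_aware_smoothness(disp_row, image_row):
    """Assumed smoothness term on one scanline:
    mean of |d[x+1] - d[x]| * exp(-|I[x+1] - I[x]|)."""
    terms = [
        abs(disp_row[x + 1] - disp_row[x])
        * math.exp(-abs(image_row[x + 1] - image_row[x]))
        for x in range(len(disp_row) - 1)
    ]
    return sum(terms) / len(terms)

def total_loss(data_loss, smooth_loss, lam=0.1):
    """Data term (L_pm or L_fm) plus lambda times the smoothness term.

    lam=0.1 is a hypothetical value, not taken from the source.
    """
    return data_loss + lam * smooth_loss

# Optimizer settings stated in the text (ADAM, beta1/beta2, learning rate).
ADAM_CONFIG = {"lr": 1e-3, "betas": (0.9, 0.999)}

disp = [1.0, 1.0, 5.0, 5.0]            # disparity jump in the middle
flat_image = [0.5, 0.5, 0.5, 0.5]      # no image edge at the jump
edge_image = [0.0, 0.0, 1.0, 1.0]      # image edge aligned with the jump

print(edge_aware_smoothness(disp, flat_image))
print(edge_aware_smoothness(disp, edge_image))
print(total_loss(0.2, edge_aware_smoothness(disp, edge_image)))
```

The same disparity jump costs less when it coincides with an image edge, which is the behaviour a weighted smoothness loss is meant to provide.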
Results
Quantitative results

Qualitative results

3.2 Experiments on real datasets
Dataset: The asymmetric stereo pairs were captured with a Huawei P30 smartphone. The asymmetry factor is approximately 3. After camera calibration and stereo rectification, 30 asymmetric stereo pairs of indoor and outdoor scenes were collected. Five pairs were randomly selected as the test set, and the rest were used as the training set.
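A random 25/5 split of this kind can be made reproducible with a seeded generator. The sketch below is only an illustration: the integer pair identifiers and the seed are hypothetical, and the source does not say how its split was drawn.

```python
import random

def split_pairs(pair_ids, num_test=5, seed=0):
    """Randomly hold out `num_test` stereo pairs for testing."""
    rng = random.Random(seed)   # local generator, so the split is repeatable
    shuffled = list(pair_ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[num_test:]), sorted(shuffled[:num_test])

train_ids, test_ids = split_pairs(range(30))
print(len(train_ids), len(test_ids))
print(test_ids)
```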
Results:

4. Limitations and conclusions
- Limitations:
  - Besides resolution, other types of asymmetry may exist (such as color and brightness). Whether these other asymmetry problems can be solved by directly extending the proposed method remains an open question.
- Conclusions:
  - This paper shows that the main challenge in unsupervised correspondence estimation from resolution-asymmetric stereo images is photometric inconsistency. To overcome this challenge, feature-metric consistency is realized in an efficient way, and a self-reinforcing strategy is introduced to strengthen that consistency. Comprehensive experiments verify that the method performs well in practice across the various degradations between the two views.