A Detailed Look at Light Field Depth Estimation
2022-06-09 11:35:00 【3D vision workshop】
Source: New Machine Vision. Topic: light field imaging (LightField)
Depth estimation is a key problem in 3D scene reconstruction. For conventional images, depth estimation methods fall broadly into monocular, binocular, and multi-view approaches. Monocular depth estimation relies mainly on cues such as texture, gradient, and defocus/blur; binocular and multi-view depth estimation (also called stereo vision) are based on the principle of triangulation and extract depth information from multiple 2D images of the same scene taken from different viewpoints. Light field depth estimation methods inherit and extend these traditional image-based approaches.
Classification of light field depth estimation algorithms
A light field image contains multiple viewpoints of the scene, which makes depth estimation with a light field camera possible and, compared with conventional image depth estimation, less sensitive to motion. However, the baseline between the views is short, which can cause mismatching. Current light field depth estimation methods, both domestic and international, can be roughly divided into the following categories:

Figure 1. Classification of depth estimation methods
Optimization-based methods
Depth estimation based on multi-view matching
Multi-view matching methods evolved from traditional stereo matching on 2D images, so the underlying theory is not repeated here. Traditional stereo matching requires two or more cameras to photograph a scene and must contend with camera jitter and operator error. A single light field exposure is equivalent to multiple cameras capturing the same scene simultaneously and is largely unaffected by jitter, giving it great potential for depth estimation. Light field cameras nevertheless have their own difficulties: as mentioned earlier, their short baseline easily causes mismatching. For depth computed by multi-view matching, the key is to obtain accurate correspondences, and researchers have proposed many methods to improve matching accuracy. The following table lists some representative work:
| Title | Authors | DOI |
| --- | --- | --- |
| Accurate depth map estimation from a lenslet light field camera | Jeon, Hae-Gon, et al. | 10.1109/CVPR.2015.7298762 |
| Line Assisted Light Field Triangulation and Stereo Matching | Yu, Zhan, et al. | 10.1109/ICCV.2013.347 |
| Shape from Light Field Meets Robust PCA | Heber, Stefan, and T. Pock | 10.1007/978-3-319-10599-4_48 |
| Variational Shape from Light Field | Heber, Stefan, R. Ranftl, and T. Pock | 10.1007/978-3-642-40395-8_6 |
| Light Field Stereo Matching Using Bilateral Statistics of Surface Cameras | Chen, Can, et al. | 10.1109/CVPR.2014.197 |
Among these, Yu et al. studied the geometric relationships in baseline space to improve light field stereo matching [1]; but because this approach relies only on traditional matching, it struggles to estimate depth in scenes with small parallax. To overcome the mismatching caused by the small baseline, Jeon et al. then proposed a multi-view cost-volume matching algorithm that evaluates the matching cost across labels with different disparity values [2]. Heber et al. proposed a new PCA-based matching term for generating view warps, addressing the light field's insensitivity to pixel-level differences caused by the narrow multi-view baseline [3].
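To make the cost-volume idea concrete, here is a minimal sketch, not Jeon et al.'s actual algorithm: a sum-of-absolute-differences cost volume over candidate disparity labels, where each sub-aperture view is shifted back toward the reference in proportion to its angular offset. The function name, the integer-shift simplification, and the circular boundary handling are all illustrative assumptions:

```python
import numpy as np

def disparity_cost_volume(views, positions, center, disparities):
    """Winner-takes-all disparity from a sum-of-absolute-differences
    cost volume over a set of sub-aperture views.

    views       : dict mapping angular position (u, v) -> 2-D grayscale image
    positions   : angular positions to match against the reference view
    center      : angular position (u, v) of the reference view
    disparities : candidate disparity labels, in pixels per angular step
    """
    ref = views[center].astype(np.float64)
    cost = np.zeros((len(disparities), *ref.shape))
    for i, d in enumerate(disparities):
        for (u, v) in positions:
            if (u, v) == center:
                continue
            du, dv = u - center[0], v - center[1]
            # Shift the neighbor back toward the reference in proportion to
            # its angular offset (integer shifts only in this sketch).
            shifted = np.roll(views[(u, v)].astype(np.float64),
                              shift=(round(-dv * d), round(-du * d)),
                              axis=(0, 1))
            cost[i] += np.abs(ref - shifted)
    # Per-pixel winner-takes-all over the disparity labels.
    return np.asarray(disparities)[np.argmin(cost, axis=0)]
```

Real methods replace the winner-takes-all step with the global or iterative optimization discussed above, and use sub-pixel rather than integer shifts.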
EPI-based light field depth estimation
Unlike multi-view stereo matching, EPI-based methods estimate depth by analyzing the structure of the light field data, as shown in the figure below:

Figure 2. Projection of a point in space
In the figure above, P is a point in space, Π is the camera plane, and Ω is the image plane. By fixing one angular coordinate and one spatial coordinate of the light field image, we obtain its EPI image; the bottom and right sides of the figure below are EPI sketches:

Figure 3. EPI sketch

Figure 4. EPI slope diagram
Further, once we have the EPI image, depth can be estimated from the slope d of the lines in it, as follows:
d = Δu/Δx
According to the principles of epipolar geometry, the relationship between depth Z and disparity d can be derived as follows:
Z = -f/d
where f is the focal length. It also follows from this formula and the figure above that target points at different depths correspond to different slopes in the EPI image.
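As a minimal illustration of these two formulas, the sketch below fits the slope d = Δu/Δx of an EPI line from sampled (x, u) points by least squares and converts it to depth with Z = -f/d. The function name and sampling scheme are illustrative, not from any cited paper:

```python
import numpy as np

def epi_line_depth(x_coords, u_coords, f):
    """Fit u = d*x + c to samples lying on one EPI line by least squares;
    the slope d = Δu/Δx is the disparity cue, and depth follows as Z = -f/d."""
    A = np.stack([x_coords, np.ones_like(x_coords)], axis=1)
    (d, _), *_ = np.linalg.lstsq(A, u_coords, rcond=None)
    return -f / d
```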
The following table lists representative EPI-based light field depth estimation methods:
| Title | Authors | DOI |
| --- | --- | --- |
| Robust depth estimation for light field via spinning parallelogram operator | Zhang S, Sheng H, et al. | 10.1016/j.cviu.2015.12.007 |
| Variational light field analysis for disparity estimation and super-resolution | Wanner S, Goldluecke B. | 10.1109/TPAMI.2013.147 |
| Continuous depth map reconstruction from light fields | Li J, Lu M, Li Z N. | 10.1109/TIP.2015.2440760 |
| A Bidirectional Light Field - Hologram Transform | Ziegler R, Bucheli S, et al. | 10.1111/j.1467-8659.2007.01066.x |
| Epipolar-plane image analysis: An approach to determining structure from motion | Bolles R C, Baker H H, Marimont D H. | 10.1007/BF00128525 |
An EPI image is a rearrangement of the four-dimensional light field coordinates: one angular coordinate is combined with one spatial coordinate to form a two-dimensional image. Before light field images were available, traditional methods used the epipolar geometry between images taken from different viewpoints to reconstruct the scene. The earliest use of EPIs for depth estimation was the structure-from-motion approach proposed by Bolles et al. in 1987; it relies on a color-consistency assumption and is therefore not robust to occlusion and noise [7]. Later, Zhang et al. estimated EPI slopes with a spinning parallelogram operator, improving the robustness of EPI-based methods to strong occlusion and noise [8]; the operator partitions the 2D EPI so that only the distribution distance between the two parts of the window needs to be measured. Wanner et al. used the structure tensor in EPI space to estimate the local direction of each line, then introduced smoothness optimization to build a global depth map [9]. Building on that work, Li et al. used structural information to construct an EPI reliability map [10]. These methods treat the EPI slope piecewise and therefore tend to make wrong predictions at segment boundaries. Ziegler et al. extended the 2D EPI information to four dimensions and achieved excellent depth estimation results [11]. EPI-based methods work very well when depth varies continuously along a line in space, but mispredictions occur when the line is interrupted by occlusion or noise; introducing segment-wise processing or additional constraints, however, increases algorithmic complexity.
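A much-simplified version of the structure-tensor slope cue used by Wanner et al. can be sketched as follows. Averaging the tensor over the whole patch and reading the line direction from an eigenvector are simplifications of the actual method, which smooths the tensor locally and optimizes globally:

```python
import numpy as np

def epi_local_slope(epi):
    """Estimate the dominant line slope d = du/dx in a small EPI patch
    from the 2-D structure tensor of its intensity gradients.
    Axis 0 is the angular coordinate u, axis 1 the spatial coordinate x."""
    grad_u, grad_x = np.gradient(epi.astype(np.float64))
    # Drop the border so only central differences enter the averages.
    gu, gx = grad_u[1:-1, 1:-1], grad_x[1:-1, 1:-1]
    J = np.array([[np.mean(gu * gu), np.mean(gu * gx)],
                  [np.mean(gu * gx), np.mean(gx * gx)]])
    # The eigenvector of the smallest eigenvalue points along the line
    # of constant intensity; its component ratio is the slope du/dx.
    w, V = np.linalg.eigh(J)
    du, dx = V[:, 0]
    return du / dx
```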
Depth estimation method based on refocusing
A major feature of light field imaging is shoot first, focus later. Objects at different depths have different disparity values across the multi-view images, so the sub-aperture images can be shifted and superimposed according to certain rules to produce different focusing effects. Thinking in reverse: objects on the focal plane appear sharp while objects on other planes blur, so depth can be inferred from a focal stack.
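The shift-and-add refocusing just described can be sketched as follows; the integer shifts and circular boundary handling are simplifying assumptions, and the parameter alpha is the refocusing slope in pixels per angular step:

```python
import numpy as np

def refocus(views, center, alpha):
    """Synthetic refocusing by shift-and-add: shift each sub-aperture view
    in proportion to its angular offset and average. Objects whose
    disparity equals alpha are brought into focus; others blur."""
    acc = np.zeros_like(next(iter(views.values())), dtype=np.float64)
    for (u, v), img in views.items():
        du, dv = u - center[0], v - center[1]
        acc += np.roll(img.astype(np.float64),
                       shift=(round(-dv * alpha), round(-du * alpha)),
                       axis=(0, 1))
    return acc / len(views)
```

Sweeping alpha over a range of values produces the focal stack that the methods below analyze.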
Current refocusing-based depth estimation methods generally follow this route:

Figure 5. Flow of refocusing-based depth estimation
Depth cues and an initial depth estimate are obtained from the focal stack, with the amount of defocus (sometimes combined with other cues) used to estimate the initial depth. Because the initial depth cannot fully reflect the true scene depth, a new control variable, confidence, is introduced to constrain the depth optimization. For example, Tao et al. estimate the response curve between defocus and depth and use the ratio of its primary and secondary peaks as the confidence to optimize the depth values, followed by a final depth fusion [5]. Other estimation methods have also been proposed; the following table lists some representative ones:
| Title | Authors | DOI |
| --- | --- | --- |
| Depth from combining defocus and correspondence using light-field cameras | Tao M W, Hadap S, et al. | 10.1109/ICCV.2013.89 |
| Robust light field depth estimation for noisy scene with occlusion | Williem W, Kyu Park I, et al. | 10.1109/CVPR.2016.476 |
| Occlusion-Aware Depth Estimation Using Light-Field Cameras | Wang, Ting-Chun, et al. | 10.1109/ICCV.2015.398 |
| Depth from shading, defocus, and correspondence using light-field angular coherence | Tao, Michael W, et al. | 10.1109/CVPR.2015.7298804 |
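The defocus-cue-plus-confidence idea above can be sketched as follows. This is not Tao et al.'s exact formulation: the focus response here is simply local Laplacian energy, and the confidence is the ratio of the best response to the second best, loosely mirroring their primary-to-secondary peak ratio:

```python
import numpy as np

def depth_from_focus_stack(stack, alphas, eps=1e-8):
    """Per-pixel depth label from a refocused focal stack plus a simple
    confidence map. stack: array of refocused slices, one per alpha."""
    responses = []
    for sl in stack:
        # Focus response: magnitude of the discrete Laplacian (circular edges).
        lap = (np.roll(sl, 1, 0) + np.roll(sl, -1, 0)
               + np.roll(sl, 1, 1) + np.roll(sl, -1, 1) - 4.0 * sl)
        responses.append(np.abs(lap))
    resp = np.array(responses)
    order = np.sort(resp, axis=0)
    # Confidence: best response relative to the runner-up, per pixel.
    confidence = (order[-1] + eps) / (order[-2] + eps)
    depth = np.asarray(alphas)[np.argmax(resp, axis=0)]
    return depth, confidence
```

Low-confidence pixels are the ones the subsequent optimization and fusion stages are meant to correct.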
Current refocusing-based depth estimation research focuses mainly on improving the depth cues and on making the algorithms more robust to occlusion and noise. These methods work well in scenes with repeated textures and strong noise, but further trade-offs between runtime and accuracy are still needed.
Learning-based depth estimation methods
With the rise of deep learning, more and more researchers have applied it to depth estimation. Compared with optimization-based approaches, a deep network only needs to be trained in advance and can then predict scene depth quickly; it is also more robust to noise and occlusion, and at inference it is not limited by algorithmic complexity or computation time. However, few algorithms have yet been applied to light fields, and the training stage is easily constrained by the available training data and the network architecture. The following table lists some representative algorithms:
| Title | Authors | DOI |
| --- | --- | --- |
| Neural EPI-volume networks for shape from light field | Heber S, Yu W, Pock T. | 10.1109/ICCV.2017.247 |
| What sparse light field coding reveals about scene structure | Johannsen O, Sulc A, Goldluecke B. | 10.1109/CVPR.2016.355 |
| EPINET: A fully-convolutional neural network using epipolar geometry for depth from light field images | Shin C, Jeon H G, et al. | 10.1109/CVPR.2018.00499 |
| Convolutional networks for shape from light field | Heber S, Pock T. | 10.1109/CVPR.2016.407 |
Although deep learning can effectively learn to predict light field depth, it is currently constrained by hardware compute and by the scarcity of light field datasets, so there is still room to improve the final results. Given limited data, many researchers work on improving network learning efficiency to cope with the large information redundancy in light fields, for example by preprocessing the light field data to reduce the number of parameters the network must learn. In addition, because the information in a light field is complex, effectively selecting the important parts and suppressing the unimportant ones is also key to network performance and efficiency.
Results of some representative algorithms
Multi-view stereo matching
Here we show the sub-pixel multi-view stereo matching algorithm based on phase-shift theory proposed by Jeon et al. [2]. The idea of the method is:
transform the images to the frequency domain for matching (to overcome the short baseline) -> build the matching cost -> build an iterative optimization model -> optimize the depth map
The authors design two optimization models, a multi-label model and an iterative model; the figure below shows the results of the different variants:

Figure 6. Multi-view stereo matching results
(a) center sub-aperture view; (b) depth map from the initial cost volume; (c) depth map optimized by weighted mean filtering; (d) depth map from the multi-label optimization model; (e) depth map after iterative optimization
Method based on refocusing
Here we show the method proposed by Tao et al. [5]. The algorithm does two main things: first it designs two depth cues and estimates the initial depth; then it performs confidence analysis and depth fusion. The algorithm framework and stage-by-stage results are shown below:

Figure 7. Structure of the refocusing-based algorithm

Figure 8. Results of the refocusing-based method
EPI-based method
Here we show the method proposed by Wanner et al. [9]. Building on EPI-based disparity estimation, it designs a continuous total-variation framework: a structure tensor estimates the slopes in the EPI image to generate a disparity map. The method assumes a fixed sub-pixel shift between consecutive sub-aperture images and can generate disparity maps quickly, but its estimation error at edges is large. The figure below shows disparity estimation results on different datasets:

Figure 9. EPI-based results
((a) ground-truth disparity; (b) disparity map before optimization; (c) disparity map optimized by the total-variation framework; (d), (e) pixel error ratios)
Learning-based approach
Here we show the network framework and results of a representative method, EPINET. The distinguishing feature of the architecture is that the input light field views are stacked along four angular directions; the features of each direction are then stacked and convolved so that the network can learn the relationships between features from different directions, and a final Conv-ReLU-Conv structure is used to reach sub-pixel estimation accuracy [14].

Figure 10. Network architecture

Figure 11. Comparison of EPINET with other methods
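The four-direction input stacking EPINET uses can be sketched as follows, assuming a square U x V angular grid with the reference view at its center; the exact slicing and ordering in the paper may differ:

```python
import numpy as np

def epinet_input_stacks(lf):
    """Assemble four directional view stacks from a light field array
    lf of shape (U, V, H, W): the horizontal row, vertical column, and
    the two diagonals through the central view, each a (N, H, W) stack.
    Assumes U == V (square angular grid)."""
    U, V, H, W = lf.shape
    c = U // 2
    horizontal = lf[c, :, :, :]                       # views along the center row
    vertical = lf[:, c, :, :]                         # views along the center column
    diag_main = lf[np.arange(U), np.arange(V)]        # main diagonal (needs U == V)
    diag_anti = lf[np.arange(U), V - 1 - np.arange(V)]  # anti-diagonal
    return horizontal, vertical, diag_main, diag_anti
```

Each stack is fed to its own convolutional stream before the per-direction features are concatenated, which is how the network exposes the four EPI slopes to the shared layers.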
Conclusion
This installment of light field news introduced the mainstream methods of light field depth estimation at home and abroad, and showed results from a representative method in each category. Depth estimation methods based on light field images are still relatively few, and the existing methods each have their own strengths and weaknesses. We hope this overview is of some help in your future research.
References
[1] Yu Z, Guo X, Lin H, et al. Line assisted light field triangulation and stereo matching[C]//Proceedings of the IEEE International Conference on Computer Vision. 2013: 2792-2799.
[2] Jeon H G, Park J, Choe G, et al. Accurate depth map estimation from a lenslet light field camera[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 1547-1555.
[3] Heber S, Pock T. Shape from light field meets robust PCA[C]//European Conference on Computer Vision. 2014: 751-767.
[4] Ng R, Levoy M, Duval G, et al. Light field photography with a hand-held plenoptic camera[R]. Stanford University CSTR, 2005.
[5] Tao M W, Hadap S, Malik J, et al. Depth from combining defocus and correspondence using light-field cameras[C]//Proceedings of the IEEE International Conference on Computer Vision. 2013: 673-680.
[6] Williem W, Kyu Park I. Robust light field depth estimation for noisy scene with occlusion[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 4396-4404.
[7] Bolles R C, Baker H H, Marimont D H. Epipolar-plane image analysis: An approach to determining structure from motion[J]. International journal of computer vision, 1987, 1(1): 7-55.
[8] Zhang S, Sheng H, Li C, et al. Robust depth estimation for light field via spinning parallelogram operator[J]. Computer Vision and Image Understanding, 2016, 145: 148-159.
[9] Wanner S, Goldluecke B. Variational light field analysis for disparity estimation and super-resolution[J]. IEEE transactions on pattern analysis and machine intelligence, 2013, 36(3): 606-619.
[10] Li J, Lu M, Li Z N. Continuous depth map reconstruction from light fields[J]. IEEE Transactions on Image Processing, 2015, 24(11): 3257-3265.
[11] Ziegler R, Bucheli S, Ahrenberg L, et al. A bidirectional light field - hologram transform[C]//Computer Graphics Forum. Oxford, UK: Blackwell Publishing Ltd, 2007, 26(3): 435-446.
[12] Heber S, Yu W, Pock T. Neural EPI-volume networks for shape from light field[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 2252-2260.
[13] Johannsen O, Sulc A, Goldluecke B. What sparse light field coding reveals about scene structure[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 3262-3270.
[14] Shin C, Jeon H G, Yoon Y, et al. Epinet: A fully-convolutional neural network using epipolar geometry for depth from light field images[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 4748-4757.
[15] Heber S, Pock T. Convolutional networks for shape from light field[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 3746-3754.