
A Detailed Introduction to Light Field Depth Estimation

2022-06-09 11:35:00 3D vision workshop

Source: New Machine Vision | Light field imaging (LightField)

Depth estimation is a key problem in 3D scene reconstruction. For conventional images, depth estimation methods fall broadly into monocular, binocular, and multi-view approaches. Monocular depth estimation relies mainly on cues such as texture, gradients, and defocus/blur; binocular and multi-view depth estimation (also called stereo vision) is based on the triangulation principle, extracting depth information from multiple 2D images of the same scene taken from different viewpoints. Light field depth estimation methods inherit from and extend these traditional image-based approaches.

Classification of light field depth estimation algorithms

A light field image contains information from multiple viewpoints of a scene. This makes depth estimation with a light field camera possible and, compared with traditional image depth estimation, less susceptible to motion and similar effects. However, the baseline between the views is short, which can cause mismatching. Current light field depth estimation methods can be roughly divided into the following categories:


Figure 1. Taxonomy of light field depth estimation methods

Optimization-based methods

Depth estimation based on multi-view matching

Multi-view matching methods evolved from traditional stereo matching on 2D images, so the underlying theory is not repeated here. Traditional stereo matching requires two or more cameras to photograph a scene, during which camera shake and operator error must be overcome. A light field camera captures, in a single exposure, the equivalent of multiple cameras photographing the same scene, and is hardly affected by camera shake, giving it great potential for depth estimation. But light field cameras also have problems of their own: as mentioned earlier, the short baseline easily causes mismatching. When computing depth by multi-view matching, the key is obtaining accurate matches. To address the mismatching problem, researchers have proposed many methods to improve accuracy; the following table lists some representative ones:

| Title | Authors | DOI |
| --- | --- | --- |
| Accurate depth map estimation from a lenslet light field camera | Jeon, Hae-Gon, et al. | 10.1109/CVPR.2015.7298762 |
| Line Assisted Light Field Triangulation and Stereo Matching | Yu, Zhan, et al. | 10.1109/ICCV.2013.347 |
| Shape from Light Field Meets Robust PCA | Heber, Stefan, and T. Pock | 10.1007/978-3-319-10599-4_48 |
| Variational Shape from Light Field | Heber, Stefan, R. Ranftl, and T. Pock | 10.1007/978-3-642-40395-8_6 |
| Light Field Stereo Matching Using Bilateral Statistics of Surface Cameras | Chen, Can, et al. | 10.1109/CVPR.2014.197 |

Among these, Yu et al. studied the geometric relationships in baseline space to improve light field stereo matching [1]; however, because it relies only on traditional matching, this method struggles to estimate depth in scenes with small parallax. To overcome the mismatching caused by the small baseline, Jeon et al. then proposed a multi-view cost volume matching algorithm that evaluates the matching cost across labels with different disparity values [2]. Heber et al. proposed a new PCA-based matching term used to create view warping, addressing the light field's insensitivity to pixel-level differences caused by the narrow multi-view baseline [3].

EPI-based light field depth estimation

Unlike multi-view stereo matching, EPI-based methods estimate depth by analyzing the structure of the light field data itself.

As shown in the figure below:


Figure 2. Projection of a point in space

In the figure above, P is a point in space, Π is the camera plane, and Ω is the image plane. By fixing one angular coordinate and one spatial coordinate of the light field image, we obtain its EPI (epipolar-plane image); the lower and right margins of the figure below are EPI sketches:


Figure 3. EPI sketch


Figure 4. EPI slope diagram
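Extracting an EPI as described above amounts to a simple slice of the 4D light field array. Below is a minimal NumPy sketch; the array layout `lf[v, u, y, x]` and the function name `extract_epi` are our own assumptions, not a standard API:

```python
import numpy as np

def extract_epi(lf, v_fixed, y_fixed):
    """Extract a horizontal EPI from a 4D light field lf[v, u, y, x]:
    fix one angular coordinate (v) and one spatial coordinate (y),
    keeping the 2D (u, x) slice."""
    return lf[v_fixed, :, y_fixed, :]

# Synthetic example: a single bright point with disparity 2 px per
# angular step, so its EPI trace is a straight line of slope Δx/Δu = 2.
V, U, Y, X = 1, 5, 1, 16
lf = np.zeros((V, U, Y, X))
for u in range(U):
    lf[0, u, 0, 4 + 2 * u] = 1.0      # point shifts by 2 px per view

epi = extract_epi(lf, v_fixed=0, y_fixed=0)    # shape (U, X)
trace = [int(np.argmax(epi[u])) for u in range(U)]
print(trace)   # the point traces a straight line: [4, 6, 8, 10, 12]
```

In a real light field the EPI contains one such line per scene point, and its slope encodes that point's depth, as the formulas below make precise.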

Further, once the EPI image is obtained, the slope d of a line in the EPI can be estimated as follows:

d = Δu / Δx

According to epipolar geometry, the relationship between the depth Z and the disparity d can be derived as follows:

Z = -f/d

In the formula above, f is the focal length. In addition, as the formulas and figure show, target points at different depths correspond to different slopes in the EPI.
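Converting a measured EPI slope into depth with these two formulas is a pair of one-liners. The function names and the numbers in the example are purely illustrative; sign conventions depend on the chosen two-plane parameterization:

```python
def epi_slope(delta_u, delta_x):
    """Line slope in the EPI, d = Δu / Δx (the article's convention)."""
    return delta_u / delta_x

def depth_from_slope(f, d):
    """Depth Z = -f / d, where f is the focal length.  The sign depends
    on the orientation chosen for the two-plane parameterization."""
    return -f / d

# A point that moves Δx = -0.5 px per Δu = 1 step in the camera plane:
d = epi_slope(1.0, -0.5)         # slope d = -2.0
Z = depth_from_slope(50.0, d)    # with f = 50, depth Z = 25.0
print(d, Z)
```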

The following table lists representative EPI-based light field depth estimation methods:

| Title | Authors | DOI |
| --- | --- | --- |
| Robust depth estimation for light field via spinning parallelogram operator | Zhang S, Sheng H, et al. | 10.1016/j.cviu.2015.12.007 |
| Variational light field analysis for disparity estimation and super-resolution | Wanner S, Goldluecke B. | 10.1109/TPAMI.2013.147 |
| Continuous depth map reconstruction from light fields | Li J, Lu M, Li Z N | 10.1109/TIP.2015.2440760 |
| A Bidirectional Light Field-Hologram Transform | Ziegler R, Bucheli S, et al. | 10.1111/j.1467-8659.2007.01066.x |
| Epipolar-plane image analysis: An approach to determining structure from motion | Bolles R C, Baker H H, Marimont D H. | 10.1007/BF00128525 |

An EPI is a 2D re-arrangement of the 4D light field coordinates, combining one angular coordinate with one spatial coordinate in the form of a two-dimensional image. Before light field images came into use, traditional methods reconstructed scenes using the epipolar geometric relationships between images taken from different viewpoints. The earliest use of EPIs for depth estimation is Bolles et al.'s 1987 method for estimating structure from motion against a moving background; since it rests on a color-consistency assumption, it is not robust to occlusion or noise [7]. Later, Zhang et al. measured the EPI slope with a spinning parallelogram operator, improving the robustness of EPI-based methods against strong occlusion and noise [8]; the operator is integrated into the 2D EPI so that only the distribution distance between the two parts of the window needs to be measured. Wanner et al. used the structure tensor in EPI space to estimate the local direction of lines, then introduced smoothing optimization to build a global depth map [9]. Building on that work, Li et al. used structural information to construct an EPI reliability map [10]. Methods of this kind treat the EPI slope piecewise and therefore easily mispredict at segment boundaries. Ziegler et al. expanded 2D EPI information to four dimensions and achieved excellent depth estimation results [11]. EPI methods work very well where depth varies continuously along a line in space, but mispredict where the line is interrupted by occlusion or noise; introducing segmentation or other constraints, however, increases algorithmic complexity.

Depth estimation based on refocusing

A major feature of light field imaging is that it allows shooting first and focusing later. Since objects at different depths have different disparities across the multi-view images, the sub-aperture images can be shifted and summed according to certain rules to produce different focusing effects. Thinking in reverse: objects on the focal plane appear sharp while objects on other planes blur, so depth can be inferred from a focal stack.
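The shift-and-sum refocusing and depth-from-focus idea can be sketched in a few lines of NumPy on a synthetic 1D example. The function names `refocus` and `sharpness`, and the variance focus measure, are our own choices for illustration, not any specific paper's formulation:

```python
import numpy as np

def refocus(views, offsets, shift):
    """Shift-and-average 1D sub-aperture scanlines to refocus at
    disparity `shift`: each view is shifted back by (angular offset *
    candidate disparity), so content at that disparity re-aligns and
    stays sharp while other depths blur."""
    out = np.zeros_like(views[0], dtype=float)
    for img, off in zip(views, offsets):
        out += np.roll(img, -off * shift)
    return out / len(views)

def sharpness(img):
    """Simple focus measure: variance of the refocused signal."""
    return float(np.var(img))

# Synthetic focal stack: an impulse at disparity 3 seen from 5 views.
offsets = [-2, -1, 0, 1, 2]
base = np.zeros(32)
base[10] = 1.0
views = [np.roll(base, off * 3) for off in offsets]

scores = [sharpness(refocus(views, offsets, s)) for s in range(6)]
best = int(np.argmax(scores))
print(best)   # 3: the candidate shift at which the impulse re-aligns
```

Sweeping the candidate shift and taking, per pixel, the shift that maximizes a focus measure is exactly the "depth from the focal stack" idea described above, here reduced to a single global maximum.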

Current refocusing-based depth estimation methods generally follow this route:


Figure 5. Pipeline of refocusing-based depth estimation

Depth cues and an initial depth estimate are obtained from the focal stack, with the amount of defocus (sometimes combined with other cues) used to estimate the initial depth. Because the initial depth cannot fully reflect the true scene depth, a further control variable, a confidence measure, is introduced to constrain the depth optimization. For example, Tao et al. estimate the response curve between defocus and depth and use the ratio of the primary to the secondary peak as the confidence to optimize the depth values before a final depth fusion [5]. Researchers have proposed various other estimation methods; the following table lists some representative ones:

| Title | Authors | DOI |
| --- | --- | --- |
| Depth from combining defocus and correspondence using light-field cameras | Tao M W, Hadap S, et al. | 10.1109/ICCV.2013.89 |
| Robust light field depth estimation for noisy scene with occlusion | Williem W, Kyu Park I, et al. | 10.1109/CVPR.2016.476 |
| Occlusion-Aware Depth Estimation Using Light-Field Cameras | Wang, Ting-Chun, et al. | 10.1109/ICCV.2015.398 |
| Depth from shading, defocus, and correspondence using light-field angular coherence | Tao, Michael W, et al. | 10.1109/CVPR.2015.7298804 |
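The confidence step in the spirit of Tao et al.'s primary-to-secondary peak ratio can be sketched as follows. `peak_ratio_confidence` is a hypothetical helper illustrating the idea, not the paper's exact formulation:

```python
import numpy as np

def peak_ratio_confidence(costs):
    """Compare the best (lowest) matching cost across depth candidates
    with the second best.  A clear single minimum gives a high ratio;
    two near-equal minima (an ambiguous match) give a ratio near 1."""
    c = np.sort(np.asarray(costs, dtype=float))
    return float(c[1] / max(c[0], 1e-12))

clear = peak_ratio_confidence([0.1, 0.5, 0.9])        # unambiguous: high
ambiguous = peak_ratio_confidence([0.10, 0.11, 0.9])  # ambiguous: near 1
print(clear, ambiguous)
```

Pixels with low confidence can then be down-weighted during the global depth optimization and fusion stage.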

Current refocusing-based depth estimation work focuses mainly on improving the depth cues and on making the algorithms more robust to occlusion and noise. Refocusing-based methods perform well in scenes with repeated textures and strong noise, but further trade-offs between runtime and accuracy are still needed.

Learning-based depth estimation

With the rise of deep learning, more and more researchers have applied it to depth estimation. Compared with optimization-based approaches, a deep network only needs to be trained in advance and can then predict scene depth quickly; it is also more robust to noise and occlusion, and at inference time it is not limited by algorithmic complexity or computation time. However, few such algorithms have yet been applied to the light field, and the training stage is easily constrained by the available training data and the network architecture. The following table lists some representative algorithms:

| Title | Authors | DOI |
| --- | --- | --- |
| Neural EPI-volume networks for shape from light field | Heber S, Yu W, Pock T. | 10.1109/ICCV.2017.247 |
| What sparse light field coding reveals about scene structure | Johannsen O, Sulc A, Goldluecke B. | 10.1109/CVPR.2016.355 |
| EPINET: A fully-convolutional neural network using epipolar geometry for depth from light field images | Shin C, Jeon H G, et al. | 10.1109/CVPR.2018.00499 |
| Convolutional networks for shape from light field | Heber S, Pock T. | 10.1109/CVPR.2016.407 |

Although deep learning can effectively learn to predict light field depth, it is currently constrained by hardware compute and the scarcity of light field datasets, so there is still room to improve the final results. Given limited data, many researchers work on improving the network's learning efficiency to cope with the large amount of redundant information in the light field, for example by preprocessing the light field data to reduce the number of parameters the network must learn. In addition, light field information is complex; whether the important parts can be effectively selected and the unimportant parts suppressed is also key to network performance and efficiency.

Results of representative algorithms

Multi-view stereo matching

First we show the sub-pixel multi-view stereo matching algorithm based on the phase-shift theorem proposed by Jeon et al. [2]. The idea of the method is:

Transform the images to the frequency domain for matching (to overcome the short-baseline problem) → construct the matching cost → build an iterative optimization model → optimize the depth map
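The first step, matching in the frequency domain, relies on the Fourier phase-shift theorem: a spatial shift becomes a linear phase ramp in the frequency domain, so images can be compared at sub-pixel disparities. A minimal 1D sketch follows; the function names `phase_shift` and `matching_cost` and the synthetic signal are ours, not the paper's:

```python
import numpy as np

def phase_shift(signal, delta):
    """Translate a 1D signal by `delta` samples (fractions allowed)
    via the Fourier phase-shift theorem, enabling sub-pixel shifts
    without interpolation in the spatial domain."""
    freqs = np.fft.fftfreq(len(signal))
    spectrum = np.fft.fft(signal)
    return np.fft.ifft(spectrum * np.exp(-2j * np.pi * freqs * delta)).real

def matching_cost(ref, target, delta):
    """Sum of absolute differences between `ref` and `target` shifted
    by the candidate (possibly sub-pixel) disparity `delta`."""
    return float(np.sum(np.abs(ref - phase_shift(target, delta))))

# A band-limited signal and a copy shifted right by 3 samples:
x = np.sin(np.linspace(0, 4 * np.pi, 64, endpoint=False))
y = np.roll(x, 3)
costs = {d: matching_cost(y, x, d) for d in (2.5, 3.0, 3.5)}
best = min(costs, key=costs.get)
print(best)   # 3.0: the candidate disparity with the lowest cost
```

In the actual algorithm this cost is accumulated over all sub-aperture views into a cost volume before the optimization stages.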

In this method, the authors design two optimization models: a multi-label optimization model and an iterative optimization model. The figure below compares the results of the different variants:


Figure 6. Multi-view stereo matching results

(a) central sub-aperture view; (b) depth map from the initial cost volume; (c) depth map optimized with a weighted mean filter; (d) depth map optimized with the multi-label optimization model; (e) depth map after iterative optimization

Refocusing-based method

Next we show the method proposed by Tao et al. The algorithm mainly does two things: first, it designs two depth cues and estimates an initial depth; then it performs confidence analysis and depth fusion [5]. The framework and stage-by-stage results are shown below:


Figure 7. Structure of the refocusing-based algorithm


Figure 8. Results of the refocusing-based method

EPI-based method

Here we show the method of Wanner et al., which designs a continuous total-variation framework for disparity estimation on top of the EPI. The paper estimates the line slope in the EPI with the structure tensor to generate a disparity map. Assuming a fixed sub-pixel shift between consecutive sub-aperture images, the method can generate disparity maps quickly, but its estimation error is large at edges [9]. The figure below shows disparity estimation results on different datasets:


Figure 9. EPI-based results

((a) reference disparity map; (b) disparity map before optimization; (c) disparity map optimized with the total-variation framework; (d), (e) pixel deviation ratios)
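The slope estimate at the heart of this kind of method can be illustrated with a simplified global version; the per-pixel smoothed structure tensor and the variational optimization of the actual paper are omitted, and `epi_disparity` is our own illustrative name:

```python
import numpy as np

def epi_disparity(epi):
    """Estimate the dominant EPI line slope (disparity, in px per
    angular step).  For an EPI of the form I(u, x) = g(x - d*u) the
    gradients satisfy I_u = -d * I_x, so a least-squares fit over the
    structure-tensor entries J_xu and J_xx gives d = -J_xu / J_xx.
    Wanner et al. use the full smoothed tensor per pixel; this reduces
    the same idea to one global estimate."""
    grad_u, grad_x = np.gradient(epi.astype(float))   # axes: (u, x)
    return -np.sum(grad_u * grad_x) / np.sum(grad_x * grad_x)

# Synthetic EPI: a Gaussian line translating by 1 px per angular step.
U, X, d_true = 9, 64, 1.0
x = np.arange(X)
epi = np.stack([np.exp(-0.5 * ((x - 20 - d_true * u) / 3.0) ** 2)
                for u in range(U)])
print(round(epi_disparity(epi), 2))   # close to 1.0
```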

Learning-based method

Finally, we show the framework and results of the representative EPINET network. Its characteristic is that the input light field views are stacked along four angular directions; the features of each direction are then stacked and convolved so the network can learn relationships between features from different directions. A Conv-ReLU-Conv structure is used at the end to reach sub-pixel estimation accuracy [14].
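As a toy illustration of why a final Conv-ReLU-Conv stage can regress continuous (sub-pixel) values, here is a single-channel NumPy sketch. EPINET itself is a multi-stream CNN with learned weights; everything below, including the weights, is purely illustrative:

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Plain 'valid' 2D cross-correlation, single channel."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def conv_relu_conv(x, k1, k2):
    """Regression head in the style of EPINET's last stage: Conv ->
    ReLU -> Conv with no activation after the final layer, so the
    output is an unbounded continuous value (a disparity) rather than
    a discrete class label."""
    hidden = np.maximum(conv2d_valid(x, k1), 0.0)   # ReLU
    return conv2d_valid(hidden, k2)

features = np.ones((5, 5))
k1 = np.full((2, 2), 0.25)   # toy weights; real weights are learned
k2 = np.ones((2, 2))
disp_map = conv_relu_conv(features, k1, k2)
print(disp_map.shape, disp_map[0, 0])   # (3, 3) 4.0
```

Omitting the final activation is the design choice that lets the network output any real-valued disparity instead of snapping to discrete labels.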


Figure 10. EPINET network structure


Figure 11. Comparison of EPINET with other methods

Conclusion

This installment introduced the mainstream light field depth estimation methods and showed results from a representative method of each category. Depth estimation methods based on light field images are still relatively few, and the existing methods each have their strengths and weaknesses. We hope this overview will help you in your future research.

References

[1] Yu Z, Guo X, Lin H, et al. Line assisted light field triangulation and stereo matching[C]//Proceedings of the IEEE International Conference on Computer Vision. 2013: 2792-2799.

[2] Jeon H G, Park J, Choe G, et al. Accurate depth map estimation from a lenslet light field camera[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 1547-1555.

[3] Heber S, Pock T. Shape from light field meets robust PCA[C]//European Conference on Computer Vision. 2014: 751-767.

[4] Ng R, Levoy M, Duval G, et al. Light field photography with a hand-held plenoptic camera[R]. Stanford University CSTR, 2005.

[5] Tao M W, Hadap S, Malik J, et al. Depth from combining defocus and correspondence using light-field cameras[C]//Proceedings of the IEEE International Conference on Computer Vision. 2013: 673-680.

[6] Williem W, Kyu Park I. Robust light field depth estimation for noisy scene with occlusion[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 4396-4404.

[7] Bolles R C, Baker H H, Marimont D H. Epipolar-plane image analysis: An approach to determining structure from motion[J]. International Journal of Computer Vision, 1987, 1(1): 7-55.

[8] Zhang S, Sheng H, Li C, et al. Robust depth estimation for light field via spinning parallelogram operator[J]. Computer Vision and Image Understanding, 2016, 145: 148-159.

[9] Wanner S, Goldluecke B. Variational light field analysis for disparity estimation and super-resolution[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 36(3): 606-619.

[10] Li J, Lu M, Li Z N. Continuous depth map reconstruction from light fields[J]. IEEE Transactions on Image Processing, 2015, 24(11): 3257-3265.

[11] Ziegler R, Bucheli S, Ahrenberg L, et al. A bidirectional light field-hologram transform[C]//Computer Graphics Forum. 2007, 26(3): 435-446.

[12] Heber S, Yu W, Pock T. Neural EPI-volume networks for shape from light field[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 2252-2260.

[13] Johannsen O, Sulc A, Goldluecke B. What sparse light field coding reveals about scene structure[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 3262-3270.

[14] Shin C, Jeon H G, Yoon Y, et al. EPINET: A fully-convolutional neural network using epipolar geometry for depth from light field images[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 4748-4757.

[15] Heber S, Pock T. Convolutional networks for shape from light field[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 3746-3754.


Copyright notice: this article was created by [3D vision workshop]; please include the original link when reposting: https://yzsam.com/2022/160/202206091046398609.html