
Binocular 3D Perception (I): A Preliminary Understanding of Binocular Vision

2022-06-25 15:36:00 【anthony-36】


Advantages:

Monocular 3D perception depends on prior knowledge and geometric constraints
Deep learning algorithms for it depend heavily on the size, quality, and diversity of the dataset
A binocular system resolves the ambiguity caused by perspective projection
Binocular perception does not depend on object detection results, so it works for any obstacle

Disadvantages:

Hardware: the two cameras must be accurately registered (calibrated), and that registration must stay correct while the vehicle is operating
Software: the algorithm has to process data from both cameras at the same time, so the computational complexity is high

Binocular depth estimation


The basic principle

1. Concepts and formulas

B: baseline length (the distance between the two cameras)
f: focal length of the camera
d: disparity (the distance between the projections of the same 3D point in the left and right images)
f and B are constants, so to solve for the depth z we only need to estimate the disparity $d = x_l - x_r$

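$\begin{cases} \dfrac{f}{z} = \dfrac{x_l}{x} \\ \dfrac{f}{z} = \dfrac{x_r}{x - B} \end{cases}$

Here $x_l$ and $x_r$ are the horizontal image coordinates of the point in the left and right images, and $x$ is the lateral position of the 3D point relative to the left camera. Only $x$ and $z$ are unknown.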

Eliminating $x$ from the two equations gives:

$z = f B / d$
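
As a quick numeric check of the formula, here is a minimal sketch. The focal length (in pixels) and the baseline used below are illustrative assumptions, not values taken from this article.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Return depth z = f * B / d in metres.

    disparity_px: d = x_l - x_r, in pixels (must be positive)
    focal_px:     focal length f, expressed in pixels
    baseline_m:   baseline B, in metres
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


# Assumed example values: f = 700 px, B = 0.5 m.
for d in (70.0, 35.0, 7.0):
    print(f"d = {d:5.1f} px  ->  z = {depth_from_disparity(d, 700.0, 0.5):5.1f} m")
```

Halving the disparity doubles the depth, which is why small disparity errors matter much more for distant points.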

2. Disparity estimation: for each pixel in the left image, find the matching point in the right image.

For each candidate disparity (within a limited range), compute a matching error. The resulting three-dimensional array of errors (image height × image width × disparity) is called the Cost Volume.
When computing the matching error, consider the local area around the pixel, for example by summing the differences of all corresponding pixel values in that area.
From the Cost Volume, the disparity at each pixel is the one with the minimum matching error, and from the disparity we obtain the depth value.
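
The three steps above can be sketched with classical block matching. This is a minimal illustration, assuming rectified grayscale images stored as lists of rows and a sum of absolute differences (SAD) as the matching error; the function names and the synthetic test pair are made up for this example.

```python
def sad_cost_volume(left, right, max_disp, radius=1):
    """Build cost[y][x][d]: sum of absolute differences over a local window.

    left, right: rectified grayscale images as lists of rows (same size)
    max_disp:    number of disparity candidates to test (0 .. max_disp - 1)
    radius:      window half-size; radius=1 means a 3x3 window
    """
    h, w = len(left), len(left[0])
    inf = float("inf")
    cost = [[[inf] * max_disp for _ in range(w)] for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            for d in range(max_disp):
                if x - d - radius < 0:
                    break  # the window would leave the right image
                total = 0
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        total += abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
                cost[y][x][d] = total
    return cost


def winner_takes_all(cost):
    """Pick, per pixel, the disparity with the smallest matching error."""
    return [[min(range(len(c)), key=c.__getitem__) for c in row] for row in cost]


# Synthetic rectified pair: a strictly increasing texture, with the left view
# equal to the right view shifted by a true disparity of 2 pixels.
H, W, TRUE_DISP = 5, 10, 2
right = [[x * x + 10 * y for x in range(W)] for y in range(H)]
left = [[right[y][x - TRUE_DISP] if x >= TRUE_DISP else 0 for x in range(W)]
        for y in range(H)]

cost = sad_cost_volume(left, right, max_disp=4)
disp = winner_takes_all(cost)
print(disp[2])  # interior pixels with a fully valid window recover TRUE_DISP
```

Border pixels where no window fits keep an infinite cost and default to disparity 0 here; a real pipeline would mark them invalid instead.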


The key to binocular depth estimation is computing the matching error, and the key to computing the matching error is feature extraction.

PSMNet

1. A shared (weight-tied) convolutional network extracts features from the left and right images

This includes downsampling; a pyramid structure and dilated (atrous) convolution are used to extract multi-resolution information and enlarge the receptive field
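
To illustrate why dilated convolution enlarges the receptive field without adding weights, here is a minimal 1D sketch. It is a generic illustration of the operation, not PSMNet's actual layers; the function names are made up for this example.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span of input samples covered by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D cross-correlation with gaps of `dilation` between kernel taps."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(k * signal[start + i * dilation] for i, k in enumerate(kernel)))
    return out


signal = list(range(10))
kernel = [1, 1, 1]  # always three weights, whatever the dilation
for dilation in (1, 2, 4):
    out = dilated_conv1d(signal, kernel, dilation)
    print(dilation, effective_kernel_size(len(kernel), dilation), out)
```

With the same three weights, dilations of 1, 2, and 4 cover 3, 5, and 9 input samples respectively.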

2. The Cost Volume is built from the left and right feature maps
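
One common way to build a learned Cost Volume, and the one the PSMNet paper describes as far as I can tell, is to concatenate each left feature vector with the right feature vector shifted by every candidate disparity. The sketch below assumes feature maps stored as nested lists indexed `[channel][y][x]` and zero-fills positions where the shifted right pixel falls outside the image; the function name is hypothetical.

```python
def concat_cost_volume(left_feat, right_feat, max_disp):
    """Return a volume indexed [channel][disparity][y][x] with 2*C channels.

    For disparity d and pixel (y, x) with x >= d, the first C channels hold the
    left feature at (y, x) and the last C channels hold the right feature at
    (y, x - d). Positions with x < d stay zero.
    """
    c = len(left_feat)
    h, w = len(left_feat[0]), len(left_feat[0][0])
    volume = [[[[0.0] * w for _ in range(h)] for _ in range(max_disp)]
              for _ in range(2 * c)]
    for d in range(max_disp):
        for y in range(h):
            for x in range(d, w):
                for ch in range(c):
                    volume[ch][d][y][x] = left_feat[ch][y][x]
                    volume[c + ch][d][y][x] = right_feat[ch][y][x - d]
    return volume


# Tiny example: C = 2 channels, H = 1 row, W = 4 columns.
left_feat = [[[1, 2, 3, 4]], [[5, 6, 7, 8]]]
right_feat = [[[10, 20, 30, 40]], [[50, 60, 70, 80]]]
vol = concat_cost_volume(left_feat, right_feat, max_disp=3)
print(len(vol), len(vol[0]), len(vol[0][0]), len(vol[0][0][0]))  # 2C, D, H, W
print(vol[0][1][0], vol[2][1][0])  # left channel 0 and right channel 0 at d = 1
```

Unlike the hand-crafted SAD error, this volume stores no error value at all; the network that follows learns how to compare the two halves.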

3. 3D convolution aggregates information across the left and right feature maps and across disparity levels

4. The result is upsampled to the original resolution, and the disparity with the smallest matching error is selected
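
Selecting the minimum with a hard argmin is not differentiable and only yields integer disparities. To my understanding, the PSMNet paper instead uses disparity regression (a "soft argmin"): the expected disparity under a softmax over the negated costs. A minimal sketch for a single pixel, with made-up cost values:

```python
import math


def hard_argmin(costs):
    """Integer disparity with the smallest matching error."""
    return min(range(len(costs)), key=costs.__getitem__)


def soft_argmin(costs):
    """Expected disparity under softmax(-cost); gives sub-pixel values."""
    m = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - m)) for c in costs]
    total = sum(weights)
    return sum(d * w for d, w in enumerate(weights)) / total


costs = [5.0, 1.0, 1.0, 5.0]  # illustrative costs for disparities 0..3
print(hard_argmin(costs))     # first of the two tied minima
print(soft_argmin(costs))     # lies between disparities 1 and 2
```

Because the costs here are symmetric around 1.5, the soft estimate lands halfway between the two tied candidates, while the hard argmin has to pick one of them.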

5. Overall pipeline

[Figure: PSMNet pipeline]

6. Result analysis (KITTI dataset)

Errors at the boundary between objects and the background

Cause analysis: Although the features contain neighborhood information, training lacks a supervisory signal carrying high-level semantic information, so the network cannot understand the scene.

How to improve: Use the results of object detection and semantic segmentation for post-processing, or train with multi-task learning

Errors at long range


| Distance | 0-10 m | 10-30 m | 30-60 m | 60 m-inf | 0-inf |
| --- | --- | --- | --- | --- | --- |
| Depth error (RMSE) | 0.268 | 1.203 | 6.056 | 16.604 | 2.605 |

Cause analysis: The disparity at long range is small, so depth differences are hard to resolve on the discrete pixel grid. This follows from
$z = f B / d$
How to improve: ① Increase the spatial resolution of the image (for example, with a longer focal length) so that distant objects cover more pixels

② Increase the baseline length, which increases the disparity range
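
To see why the error grows so quickly with distance, differentiate $z = fB/d$ with respect to $d$: to first order, a disparity error $\Delta d$ causes a depth error of about $z^2 / (fB) \cdot \Delta d$. The sketch below evaluates this; the focal length and baseline are assumed values chosen for illustration, not the parameters behind the table above.

```python
def depth_error_per_pixel(z_m, focal_px, baseline_m, disp_err_px=1.0):
    """First-order depth error |dz| ~= z^2 / (f * B) * |dd|, in metres."""
    return z_m ** 2 / (focal_px * baseline_m) * disp_err_px


F_PX, B_M = 720.0, 0.5  # assumed focal length (pixels) and baseline (metres)
for z in (10.0, 30.0, 60.0):
    err = depth_error_per_pixel(z, F_PX, B_M)
    print(f"z = {z:4.0f} m  ->  about {err:5.2f} m of depth error per pixel of disparity error")
```

The error grows with the square of the distance and shrinks in proportion to both $f$ and $B$, which matches the two improvements listed above.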

Large depth errors in low-texture or low-light regions


Cause analysis: In these regions, the features needed to compute the matching error cannot be extracted effectively

How to improve: Increase the dynamic range of the camera, or use a sensor that measures distance directly

The hands-on simulation is covered in the next part of this series.

Original site

Copyright notice
This article was written by [anthony-36]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/176/202206251455190980.html
