Binocular 3D perception (I): preliminary understanding of binocular
2022-06-25 15:36:00 【anthony-36】
Advantages (relative to monocular 3D perception):
- Monocular 3D perception depends on prior knowledge and geometric constraints.
- Deep learning algorithms depend heavily on the size, quality, and diversity of the dataset.
- A binocular (stereo) system resolves the ambiguity caused by perspective projection.
- Binocular perception does not depend on object detection results, so it is effective for any obstacle.
Disadvantages:
- Hardware: the two cameras must be accurately registered (calibrated) against each other, and the registration must remain correct while the vehicle is operating.
- Software: the algorithm has to process data from two cameras at the same time, so the computational complexity is high.
Binocular depth estimation

Basic principle
1. Concepts and formulas
B: baseline length (the distance between the two cameras)
f: focal length of the camera
d: disparity (the distance between the image positions of the same 3D point in the left and right images)
f and B are constants, so to solve for the depth z we only need to estimate the disparity d = x_l - x_r.
From similar triangles, where x is the lateral position of the 3D point relative to the left camera and x_l, x_r are its image coordinates in the left and right images:

$$
\begin{cases} f/z = x_l/x \\ f/z = x_r/(x-B) \end{cases}
$$

Only x and z are unknown. Solving the two equations for x_l and x_r and subtracting eliminates x, because d = x_l - x_r = fB/z. We get the following formula:

$$
z = fB/d
$$
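As a quick check of the formula, here is a minimal sketch that converts disparities to depths with NumPy. The function name `disparity_to_depth` and the camera values (a 720-pixel focal length and a 0.54 m baseline) are illustrative assumptions, not values taken from this article.

```python
import numpy as np

def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Convert disparity (pixels) to depth (metres) using z = f * B / d."""
    disparity_px = np.asarray(disparity_px, dtype=np.float64)
    # A disparity of zero means the point is at infinity (or has no match).
    depth = np.full(disparity_px.shape, np.inf)
    valid = disparity_px > 0
    depth[valid] = focal_px * baseline_m / disparity_px[valid]
    return depth

# Hypothetical camera parameters, chosen only for illustration.
focal_px = 720.0     # focal length expressed in pixels
baseline_m = 0.54    # distance between the two cameras in metres

depths = disparity_to_depth([64.0, 8.0, 1.0, 0.0], focal_px, baseline_m)
print(depths)
```

With these assumed parameters, disparities of 64, 8, and 1 pixels map to roughly 6.1 m, 48.6 m, and 388.8 m, which shows how quickly depth grows as the disparity shrinks.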
2. Disparity estimation: for each pixel in the left image, find the matching point in the right image.
- For each candidate disparity (within a limited range), compute the matching error. The resulting three-dimensional array of errors is called the Cost Volume.
- When computing the matching error, consider a local window around the pixel, for example by summing the differences of all corresponding pixel values in the window.
- From the Cost Volume, take the disparity with the minimum matching error at each pixel, which then gives the depth value.
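The three steps above can be sketched as classical block matching. This is a minimal NumPy illustration, assuming rectified grayscale images and a sum-of-absolute-differences (SAD) window cost; the helper names `box_sum`, `sad_cost_volume`, and `winner_takes_all` are hypothetical, and real systems use faster and more robust variants.

```python
import numpy as np

def box_sum(img, radius):
    """Sum each pixel's (2r+1) x (2r+1) neighbourhood, padding edges by replication."""
    k = 2 * radius + 1
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out

def sad_cost_volume(left, right, max_disp, radius=2):
    """Return cost[d, y, x]: windowed SAD between left(y, x) and right(y, x - d)."""
    left = left.astype(np.float64)
    right = right.astype(np.float64)
    h, w = left.shape
    # Positions with no valid match (x < d) keep an infinite cost.
    cost = np.full((max_disp + 1, h, w), np.inf)
    for d in range(max_disp + 1):
        diff = np.abs(left[:, d:] - right[:, :w - d])
        cost[d, :, d:] = box_sum(diff, radius)
    return cost

def winner_takes_all(cost):
    """Pick the disparity with the minimum matching error at each pixel."""
    return np.argmin(cost, axis=0)

# Synthetic test pair: the left image is the right image shifted by 3 pixels.
rng = np.random.default_rng(0)
right_img = rng.random((12, 40))
left_img = np.roll(right_img, 3, axis=1)

cost_volume = sad_cost_volume(left_img, right_img, max_disp=8)
disparity_map = winner_takes_all(cost_volume)
print(disparity_map[0, 3:10])
```

On this synthetic pair every pixel with a valid match recovers the true shift of 3 pixels. On real images the minimum is far less clear-cut, which is the motivation for learned features and 3D convolution in the network described next.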

PSMNet
1. A weight-sharing convolutional network extracts features from the left and right images
- It includes downsampling, and uses a pyramid structure and dilated (atrous) convolution to extract multi-resolution information and enlarge the receptive field
2. The Cost Volume is built from the left and right feature maps
3. 3D convolution extracts information across the left and right feature maps and across disparity levels
4. The result is upsampled to the original resolution, and the disparity with the smallest matching error is selected
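To make step 2 more concrete, the sketch below builds a feature-level Cost Volume by concatenating left features with right features shifted by each candidate disparity. It assumes the concatenation scheme, which is my reading of how PSMNet forms its volume; the function name `concat_cost_volume`, the shapes, and the use of NumPy instead of a deep learning framework are illustrative choices, not the authors' implementation.

```python
import numpy as np

def concat_cost_volume(feat_left, feat_right, max_disp):
    """Build a (2C, D, H, W) volume from two (C, H, W) feature maps.

    For disparity level d, the left feature at column x is paired with
    the right feature at column x - d. Columns without a valid partner
    (x < d) are left as zeros.
    """
    c, h, w = feat_left.shape
    volume = np.zeros((2 * c, max_disp, h, w), dtype=feat_left.dtype)
    for d in range(max_disp):
        volume[:c, d, :, d:] = feat_left[:, :, d:]
        volume[c:, d, :, d:] = feat_right[:, :, :w - d]
    return volume

# Random stand-ins for the feature maps produced by the shared network.
rng = np.random.default_rng(0)
feat_left = rng.random((4, 6, 10))
feat_right = rng.random((4, 6, 10))

volume = concat_cost_volume(feat_left, feat_right, max_disp=3)
print(volume.shape)
```

The extra disparity axis is what makes 3D convolution applicable in step 3: the kernels slide over disparity, height, and width together, so the network learns the matching cost instead of using a fixed one.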
5. The process




6. Result analysis (KITTI dataset)
- Errors at the boundaries between objects and the background

Cause: although the features contain neighborhood information, there is no supervision from high-level semantic information, so the network cannot understand the scene.
Improvement: use object detection and semantic segmentation results for post-processing, or train with multi-task learning.
- Errors at long range

| Distance | 0-10m | 10-30m | 30-60m | 60-inf | 0-inf |
|---|---|---|---|---|---|
| Depth error (RMSE) | 0.268 | 1.203 | 6.056 | 16.604 | 2.605 |
Cause: the disparity at long range is small, so distant depths are hard to tell apart with discrete image pixels, since a one-pixel change in disparity corresponds to a large change in depth.

$$
z = fB/d
$$

Improvement: ① Increase the spatial resolution of the image (for example with a longer focal length), so that distant objects cover more pixels
② Increase the baseline length, which increases the range of disparity
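The effect of range on error follows directly from z = fB/d. The sketch below computes how much the depth changes when the disparity is off by one pixel. The function name `depth_step` and the camera values (720-pixel focal length, 0.54 m baseline) are illustrative assumptions, not the configuration behind the table above.

```python
def depth_step(z_m, focal_px, baseline_m, disp_step_px=1.0):
    """Depth increase caused by underestimating disparity by disp_step_px at depth z_m."""
    d = focal_px * baseline_m / z_m          # true disparity at this depth
    if d <= disp_step_px:
        return float("inf")                  # disparity would drop to zero or below
    return focal_px * baseline_m / (d - disp_step_px) - z_m

# Hypothetical camera parameters, chosen only for illustration.
focal_px = 720.0
baseline_m = 0.54

for z in (10.0, 30.0, 60.0):
    base = depth_step(z, focal_px, baseline_m)
    wide = depth_step(z, focal_px, 2 * baseline_m)
    print(f"z = {z:5.1f} m: step {base:6.2f} m, with doubled baseline {wide:6.2f} m")
```

Under these assumptions a one-pixel disparity error costs about 0.26 m at 10 m but about 10.9 m at 60 m. The error grows roughly with the square of the distance, and doubling the baseline (or the focal length in pixels) roughly halves it, which matches the two improvements listed above.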
- Regions with low texture or low light have large depth estimation errors

Cause: effective features for computing the matching error cannot be extracted in these regions.
Improvement: increase the dynamic range of the camera, or use a sensor that measures distance directly.
The detailed simulation procedure is covered in the next part of this series.