A close reading of the Stereo R-CNN paper: experimental details and result analysis
2022-07-06 10:57:00 【Is it Wei Xiaobai】
When reading papers I used to focus on the method section and then skim the results on the test data. While writing my own thesis recently, I realized that how the experiments are designed matters just as much, so from now on I will pay more attention to that part of the papers I read.
I. Experimental details
This part describes the settings used for the experiments in detail.
Network
The network uses anchors with five scales {32, 64, 128, 256, 512} and three aspect ratios {0.5, 1, 2}. The original image is resized so that its shorter edge is 600 pixels. In the Stereo-RPN, the left and right feature maps are concatenated, so the final classification and regression layers take 1024 input channels rather than 512. Similarly, the R-CNN regression head takes 512 input channels. On a Titan Xp GPU, Stereo R-CNN needs about 0.28 s of inference time per stereo pair.
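As a rough illustration of these settings, here is a minimal sketch that enumerates the 15 anchor shapes and resizes an image size to a 600-pixel shorter edge. The convention that ratio means height / width with a constant area per scale, the function names, and the 1242 x 375 example size are assumptions made for illustration, not details taken from the paper.

```python
import math


def make_anchor_shapes(scales=(32, 64, 128, 256, 512), ratios=(0.5, 1, 2)):
    """Return a (width, height) tuple for every scale/ratio combination.

    Assumed convention: ratio = height / width, and every anchor of a given
    scale keeps the area scale * scale.
    """
    shapes = []
    for scale in scales:
        area = float(scale * scale)
        for ratio in ratios:
            width = math.sqrt(area / ratio)
            height = width * ratio
            shapes.append((width, height))
    return shapes


def resize_shorter_edge(width, height, target=600):
    """Scale an image size so that its shorter edge equals `target` pixels."""
    factor = target / min(width, height)
    return round(width * factor), round(height * factor), factor


anchors = make_anchor_shapes()
new_width, new_height, factor = resize_shorter_edge(1242, 375)
```

Five scales times three ratios gives 15 anchor shapes per feature-map location. Real implementations may round or center the anchors differently.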
Training
This part mainly explains the loss.
In the loss, the superscripts distinguish the RPN terms from the R-CNN terms, and the subscripts box, α, dim, and key denote the stereo box loss, the viewpoint loss, the dimension loss, and the keypoint loss, respectively.
To augment the data set during training, the left and right images are flipped and exchanged, with the viewpoint angle and keypoints mirrored accordingly. Each training mini-batch keeps one stereo pair and 512 sampled RoIs.
Other settings: SGD with a weight decay of 0.0005 and a momentum of 0.9. The learning rate is initialized to 0.001 and multiplied by 0.1 every 5 epochs. Training runs for 20 epochs in total.
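The equation itself did not survive in this copy of the post. A weighted multi-task loss consistent with the description above could be written as follows. The superscripts p (RPN) and r (R-CNN), the cls and reg terms, and the per-task weights w are notation assumed here, so check the paper for the exact form.

```latex
L = w^{p}_{cls} L^{p}_{cls} + w^{p}_{reg} L^{p}_{reg}
  + w^{r}_{cls} L^{r}_{cls} + w^{r}_{box} L^{r}_{box}
  + w^{r}_{\alpha} L^{r}_{\alpha} + w^{r}_{dim} L^{r}_{dim}
  + w^{r}_{key} L^{r}_{key}
```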
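The flip-and-exchange step can be sketched as below. This is a simplified illustration, not the authors' code: images are plain lists of pixel rows, boxes are (x1, y1, x2, y2) tuples, and the function names are hypothetical. Mirroring of the viewpoint angle and keypoints is left out because it depends on the angle convention used.

```python
def flip_box(box, image_width):
    """Mirror an (x1, y1, x2, y2) box horizontally inside an image."""
    x1, y1, x2, y2 = box
    return (image_width - x2, y1, image_width - x1, y2)


def flip_and_exchange(left_rows, right_rows, left_boxes, right_boxes):
    """Build a new stereo pair by mirroring both images and swapping them.

    The mirrored right image becomes the new left image (and vice versa),
    which keeps the disparity x_left - x_right non-negative.
    """
    width = len(left_rows[0])
    new_left = [row[::-1] for row in right_rows]
    new_right = [row[::-1] for row in left_rows]
    new_left_boxes = [flip_box(b, width) for b in right_boxes]
    new_right_boxes = [flip_box(b, width) for b in left_boxes]
    return new_left, new_right, new_left_boxes, new_right_boxes


# Toy example: one-row images of width 10, one object with disparity 3.
left_image = [list(range(10))]
right_image = [list(range(10, 20))]
aug = flip_and_exchange(left_image, right_image, [(6, 0, 8, 2)], [(3, 0, 5, 2)])
aug_left, aug_right, aug_left_boxes, aug_right_boxes = aug
```

The swap is what makes the result a valid stereo pair: after mirroring, the object appears further to the right in what used to be the right image, so that image has to play the role of the left view.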
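The learning-rate schedule described above is an ordinary step decay. A minimal sketch, with a hypothetical function name and 0-indexed epochs as an assumption:

```python
def learning_rate(epoch, base_lr=0.001, gamma=0.1, step=5):
    """Step decay: multiply the rate by `gamma` once every `step` epochs."""
    return base_lr * gamma ** (epoch // step)


# Learning rate for each of the 20 training epochs (epochs counted from 0).
schedule = [learning_rate(epoch) for epoch in range(20)]
```

In PyTorch, the same settings would typically be expressed with `torch.optim.SGD(params, lr=0.001, momentum=0.9, weight_decay=0.0005)` together with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)`. This post does not say which framework the authors used, so treat that mapping as an assumption.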
II. Result analysis
Stereo Recall and Stereo Detection
The goal of Stereo R-CNN is to detect and associate objects in the left and right images simultaneously. In addition to the 2D average recall (AR) and 2D average precision (AP) on the left and right images, the paper defines stereo AR and stereo AP metrics, under which a query stereo box counts as a true positive (TP) only when all of the following conditions are met:
1. The maximum IoU between the left box and the left GT boxes is greater than the given threshold;
2. The maximum IoU between the right box and the right GT boxes is greater than the given threshold;
3. The selected left and right GT boxes belong to the same object.
As shown in Table 1, Stereo R-CNN has proposal recall and detection precision on a single image similar to those of Faster R-CNN, while also producing high-quality data association between the left and right images without additional computation.
Although the stereo AR in the RPN is slightly lower than the left AR, the left, right, and stereo AP after R-CNN are almost the same. This indicates that detection performance is consistent across the left and right images, and that almost every true positive box in the left image has a corresponding true positive box in the right image.
Two strategies for fusing the left and right features are also tested: element-wise averaging and channel concatenation. As Table 1 shows, channel concatenation performs better because it retains all of the information.
Together, these results show that accurate stereo detection and association provide sufficient box-level constraints.
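The three conditions above can be sketched as a small check. This is an illustration under assumptions rather than the official evaluation code: boxes are (x1, y1, x2, y2) tuples, `left_gts[i]` and `right_gts[i]` are assumed to describe the same object, and the default threshold of 0.5 is a placeholder value.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def best_match(box, gt_boxes):
    """Return (index, IoU) of the GT box with the highest IoU."""
    scores = [iou(box, gt) for gt in gt_boxes]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]


def is_stereo_true_positive(left_box, right_box, left_gts, right_gts,
                            threshold=0.5):
    """Apply the three stereo true-positive conditions to one query box pair."""
    if not left_gts or not right_gts:
        return False
    left_index, left_iou = best_match(left_box, left_gts)
    right_index, right_iou = best_match(right_box, right_gts)
    return (left_iou > threshold            # condition 1
            and right_iou > threshold       # condition 2
            and left_index == right_index)  # condition 3


# Two objects; index i in both lists refers to the same object.
left_gts = [(5, 0, 15, 10), (30, 0, 40, 10)]
right_gts = [(2, 0, 12, 10), (27, 0, 37, 10)]
matched = is_stereo_true_positive((5, 0, 15, 10), (2, 0, 12, 10), left_gts, right_gts)
mismatched = is_stereo_true_positive((5, 0, 15, 10), (27, 0, 37, 10), left_gts, right_gts)
```

The second query shows why condition 3 matters: both boxes overlap a GT box perfectly, but they belong to different objects, so the pair is not a stereo true positive.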
3D Detection and 3D Localization
Average precision for bird's eye view (APbv) and for 3D boxes (AP3d) is used to evaluate 3D detection and localization accuracy. The results are given in Table 2. I will not repeat the detailed comparison here; you can read it directly in the paper.
It is worth noting that the KITTI 3D detection benchmark is difficult for image-based methods, whose 3D performance tends to degrade as the distance to the target object increases. This can be seen intuitively in Figure 7. Although the method in the paper achieves sub-pixel disparity estimation (error below 0.5 pixels), disparity is inversely proportional to depth, so the depth error grows as the object distance increases. For objects with clear disparity, the paper achieves high-precision depth estimation based on strict geometric constraints. This explains why the higher the IoU threshold and the easier the difficulty regime an object belongs to, the larger the improvement over other methods.
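The effect of the inverse relation between disparity and depth can be made concrete with the standard stereo relation z = f · b / d. The sketch below uses hypothetical function names, and the focal length (721 px) and baseline (0.54 m) are illustrative values assumed here, not numbers from this post.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Stereo depth in meters: z = f * b / d."""
    return focal_px * baseline_m / disparity_px


def depth_error(depth_m, disparity_error_px, focal_px, baseline_m):
    """Depth error caused by underestimating disparity by a fixed amount."""
    true_disparity = focal_px * baseline_m / depth_m
    measured_disparity = true_disparity - disparity_error_px
    measured_depth = depth_from_disparity(measured_disparity, focal_px, baseline_m)
    return measured_depth - depth_m


FOCAL_PX = 721.0   # assumed focal length in pixels
BASELINE_M = 0.54  # assumed stereo baseline in meters

# The same 0.5 px disparity error at increasing object distances.
errors = {z: depth_error(z, 0.5, FOCAL_PX, BASELINE_M) for z in (10.0, 20.0, 40.0)}
```

For a fixed disparity error, the depth error grows roughly with the square of the distance, which is why far objects are much harder to localize than near ones even with sub-pixel matching.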