A close reading of the Stereo R-CNN paper: experimental details and result analysis
2022-07-06 10:57:00 【Is it Wei Xiaobai】
I used to read only the method section of a paper and then look at the results on the test data. While writing my own thesis recently, I realized that how the experiments are designed is just as important, so I will pay more attention to that part when reading papers from now on.
I. Experimental details
This section describes the experimental setup in detail.
Network
The network uses anchors with five scales {32, 64, 128, 256, 512} and three aspect ratios {0.5, 1, 2}. The original image is resized so that its shorter side is 600 pixels. Because the left and right feature maps are concatenated, the Stereo RPN takes 1024 input channels instead of the 512 used in the single-image implementation. Similarly, the R-CNN regression head has 512 input channels. On a Titan Xp GPU, Stereo R-CNN needs about 0.28 s of inference time per stereo pair.
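The anchor configuration above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this post, the ratio is assumed to mean height / width, and the 1242 x 375 image size in the usage note is just a typical KITTI frame, not a value from the paper.

```python
import math

SCALES = [32, 64, 128, 256, 512]   # anchor side lengths in pixels
RATIOS = [0.5, 1, 2]               # assumed to mean height / width


def anchor_shapes(scales=SCALES, ratios=RATIOS):
    """Return one (width, height) pair per scale/ratio combination.

    Each anchor keeps an area of roughly scale * scale.
    """
    shapes = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            shapes.append((w, h))
    return shapes


def resize_shorter_side(width, height, target=600):
    """Scale an image size so that its shorter side equals `target` pixels."""
    scale = target / min(width, height)
    return round(width * scale), round(height * scale)
```

With five scales and three ratios, `anchor_shapes()` returns 15 anchor shapes per location. Calling `resize_shorter_side(1242, 375)` shows how a wide KITTI-style frame grows once its shorter side is brought to 600 pixels.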
Training
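This part mainly explains the loss. The multi-task loss in the paper is a weighted sum of the form

$$L = w^{p}_{cls} L^{p}_{cls} + w^{p}_{reg} L^{p}_{reg} + w^{r}_{cls} L^{r}_{cls} + w^{r}_{box} L^{r}_{box} + w^{r}_{\alpha} L^{r}_{\alpha} + w^{r}_{dim} L^{r}_{dim} + w^{r}_{key} L^{r}_{key}$$

Here the superscripts p and r denote the RPN and the R-CNN respectively, and the subscripts box, α, dim and key denote the loss for the stereo boxes, the viewpoint, the dimensions and the keypoints respectively.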
During training, the left and right images are also flipped and exchanged (with the viewpoint angle and keypoints mirrored accordingly) to expand the dataset. Each training mini-batch contains one stereo pair and 512 sampled RoIs.
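A minimal sketch of the flip-and-exchange augmentation, assuming images are stored as lists of pixel rows and boxes as (x1, y1, x2, y2) with inclusive pixel coordinates. All function names are hypothetical, and mirroring of the viewpoint angle and keypoints is left out for brevity.

```python
def hflip(image):
    """Mirror an image (a list of pixel rows) horizontally."""
    return [row[::-1] for row in image]


def hflip_box(box, width):
    """Mirror an (x1, y1, x2, y2) box inside an image of the given width."""
    x1, y1, x2, y2 = box
    return (width - 1 - x2, y1, width - 1 - x1, y2)


def flip_and_swap(left_img, right_img, left_box, right_box):
    """Build a new stereo pair by flipping both images and exchanging them.

    The flipped right image becomes the new left image and vice versa,
    so an object still appears further to the right in the new left view.
    """
    width = len(left_img[0])
    new_left = hflip(right_img)
    new_right = hflip(left_img)
    new_left_box = hflip_box(right_box, width)
    new_right_box = hflip_box(left_box, width)
    return new_left, new_right, new_left_box, new_right_box
```

The exchange is what keeps the pair geometrically valid: flipping alone would reverse the sign of the disparity, while flipping and swapping preserves it.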
Other settings: the network is trained with SGD, using a weight decay of 0.0005 and a momentum of 0.9. The learning rate is initialized to 0.001 and multiplied by 0.1 every 5 epochs. Training runs for 20 epochs in total.
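The learning-rate schedule is a plain step decay and can be written as a one-line function. The function and parameter names below are illustrative, and the decay is interpreted as multiplying the rate by 0.1 every 5 epochs.

```python
def learning_rate(epoch, base_lr=0.001, gamma=0.1, step=5):
    """Step decay: multiply the base rate by `gamma` once every `step` epochs."""
    return base_lr * gamma ** (epoch // step)


if __name__ == "__main__":
    # Print the rate used in each of the 20 training epochs.
    for epoch in range(20):
        print(epoch, learning_rate(epoch))
```

Over 20 epochs this gives four plateaus: epochs 0 to 4, 5 to 9, 10 to 14 and 15 to 19, each one a tenth of the previous.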
II. Result analysis
Stereo Recall and Stereo Detection
The goal of Stereo R-CNN is to detect and associate objects in the left and right images simultaneously. Besides evaluating 2D average recall (AR) and 2D average precision (AP) on the left and right images, the paper also defines stereo AR and stereo AP metrics, under which a queried stereo box counts as a true positive (TP) only if all of the following conditions hold:
1. The maximum IoU between the left box and the left GT boxes is greater than the given threshold;
2. The maximum IoU between the right box and the right GT boxes is greater than the given threshold;
3. The selected left and right GT boxes belong to the same object.
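The three conditions above can be sketched as a small check. This is not the paper's evaluation code: the function names are hypothetical, boxes are assumed to be (x1, y1, x2, y2) tuples, `gt_left[i]` and `gt_right[i]` are assumed to describe the same object, and the default threshold of 0.7 is only a placeholder.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def best_match(box, gt_boxes):
    """Return (index, IoU) of the GT box with the highest IoU, or (-1, 0.0)."""
    best_idx, best_iou = -1, 0.0
    for i, gt in enumerate(gt_boxes):
        value = iou(box, gt)
        if value > best_iou:
            best_idx, best_iou = i, value
    return best_idx, best_iou


def is_stereo_true_positive(left_box, right_box, gt_left, gt_right, thresh=0.7):
    """Apply the three stereo true-positive conditions to one stereo box."""
    left_idx, left_iou = best_match(left_box, gt_left)
    right_idx, right_iou = best_match(right_box, gt_right)
    return (
        left_iou > thresh          # condition 1: left box matches a left GT box
        and right_iou > thresh     # condition 2: right box matches a right GT box
        and left_idx == right_idx  # condition 3: both GT boxes are the same object
    )
```

A stereo box whose left half matches one car and whose right half matches a different car fails condition 3 even though each half would be a true positive on its own image, which is why stereo AP also measures association quality.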

As Table 1 shows, Stereo R-CNN achieves proposal recall and detection precision on a single image similar to Faster R-CNN, while also producing high-quality data association between the left and right images without additional computation.
Although the stereo AR in the RPN is slightly lower than the left AR, the left, right and stereo AP observed after the R-CNN are almost the same. This shows that detection performance is consistent across the left and right images, and that almost every true-positive box in the left image has a corresponding true-positive box in the right image.
Two strategies for fusing the left and right features were also tested: element-wise averaging and channel concatenation. As Table 1 reports, channel concatenation performs better because it retains all the information.
Together, these results show that accurate stereo detection and association provide sufficient box-level constraints.
3D Detection and 3D Localization
3D detection and localization accuracy are evaluated with average precision for bird's eye view (APbv) and for 3D boxes (AP3d). The results are given in Table 2. I will not repeat the detailed comparison here; you can read the paper directly.

It is worth noting that the KITTI 3D detection benchmark is difficult for image-based methods, whose 3D performance tends to decline as the distance to the object increases. This can be observed intuitively in Figure 7: although the method achieves sub-pixel disparity estimation (error below 0.5 pixels), disparity is inversely proportional to depth, so the depth error grows as the object gets farther away. For objects with clear disparity, the method achieves highly accurate depth estimation based on strict geometric constraints. This explains why the higher the IoU threshold and the easier the regime an object belongs to, the larger the improvement this method obtains over other methods.
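The growth of depth error with distance follows from the stereo relation z = f * b / d (depth z, focal length f, baseline b, disparity d). The sketch below illustrates it for a fixed 0.5-pixel disparity error. The focal length and baseline are assumed, approximate KITTI-like values chosen for illustration and do not come from the paper or this post.

```python
# Assumed, approximate KITTI-like stereo parameters (illustrative only).
FOCAL_PX = 721.5    # focal length in pixels
BASELINE_M = 0.54   # stereo baseline in metres


def disparity_from_depth(z, f=FOCAL_PX, b=BASELINE_M):
    """Disparity in pixels of a point at depth z metres."""
    return f * b / z


def depth_from_disparity(d, f=FOCAL_PX, b=BASELINE_M):
    """Depth in metres of a point with disparity d pixels."""
    return f * b / d


def depth_error(z, disp_err=0.5, f=FOCAL_PX, b=BASELINE_M):
    """Depth overestimate when the disparity is underestimated by disp_err pixels."""
    d = disparity_from_depth(z, f, b)
    if d <= disp_err:
        raise ValueError("disparity error is as large as the disparity itself")
    return depth_from_disparity(d - disp_err, f, b) - z


if __name__ == "__main__":
    # The same 0.5 px disparity error costs far more depth accuracy far away.
    for z in (10, 30, 50):
        print(z, round(depth_error(z), 2))
```

To first order the error is about z * z * disp_err / (f * b), so it grows roughly with the square of the distance. That is consistent with the observation that nearby objects with large disparity are localized much more accurately.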
