Detailed reading of the Stereo R-CNN paper: experimental details and result analysis
2022-07-06 10:57:00 【Is it Wei Xiaobai】
When reading papers, I used to focus on the method section and then look at the results on the test data. While writing my own thesis recently, I realized that how the experiments are designed matters just as much, so I will pay more attention to that part from now on.
I. Experimental details
This section describes the experimental setup in detail.
Network
The network uses anchors with five scales {32, 64, 128, 256, 512} and three aspect ratios {0.5, 1, 2}. The original image is resized so that its shorter side is 600 pixels. Because the left and right feature maps are concatenated, Stereo-RPN needs 1024 input channels instead of 512. Similarly, the R-CNN regression head has 512 input channels. On a Titan Xp GPU, inference for one stereo pair takes about 0.28 s.
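To make the anchor setup concrete, here is a minimal sketch that enumerates the anchor shapes from the scales and ratios listed above. The function name `make_anchors` and the convention that a ratio means height / width with the area fixed at scale² are my assumptions, not something taken from the paper's code.

```python
import math

def make_anchors(scales=(32, 64, 128, 256, 512), ratios=(0.5, 1, 2)):
    """Return (width, height) anchor shapes, one per scale/ratio pair.

    Assumption: ratio = height / width, and each anchor keeps area scale**2.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((w, h))
    return anchors

anchors = make_anchors()
print(len(anchors))  # 5 scales x 3 ratios
```

With five scales and three ratios, every feature-map location gets 15 anchor shapes.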
Training
This part mainly explains the loss.
In the multi-task loss, the two superscripts denote RPN and R-CNN, and the subscripts box, α, dim, and key denote the losses for the stereo boxes, the viewpoint, the dimensions, and the keypoints, respectively.
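For reference, the multi-task loss can be written out roughly as below. This is a reconstruction from the description above, so the exact notation should be checked against the paper: I assume the superscripts p and r stand for RPN and R-CNN, that the RPN has a classification term (cls) and a regression term (reg), and that each term carries its own weight w.

```latex
L = w^{p}_{cls} L^{p}_{cls} + w^{p}_{reg} L^{p}_{reg}
  + w^{r}_{cls} L^{r}_{cls} + w^{r}_{box} L^{r}_{box}
  + w^{r}_{\alpha} L^{r}_{\alpha} + w^{r}_{dim} L^{r}_{dim}
  + w^{r}_{key} L^{r}_{key}
```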
During training, the left and right images will also be flipped and exchanged ( Correspondingly, it will viewpoint angle and keypoint Mirror image ) To expand the data set . One per training batch Keep one in stereo and 512 individual RoIs.
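The flip-and-swap augmentation can be sketched as follows. This is only an illustration under my own assumptions: images are plain lists of rows, boxes are (x1, y1, x2, y2) in continuous pixel coordinates, and the helper names are hypothetical. The viewpoint angle and keypoints would need to be mirrored as well, which is omitted here because the details depend on the angle convention.

```python
def hflip(image):
    """Mirror an image (a list of rows) left to right."""
    return [row[::-1] for row in image]

def flip_box(box, width):
    """Mirror an (x1, y1, x2, y2) box inside an image of the given width."""
    x1, y1, x2, y2 = box
    return (width - x2, y1, width - x1, y2)

def flip_and_swap(left, right, left_boxes, right_boxes):
    """Flip both images and swap their roles.

    The flipped right image becomes the new left image and vice versa,
    so the result is still a valid stereo pair with positive disparity.
    """
    width = len(left[0])
    new_left = hflip(right)
    new_right = hflip(left)
    new_left_boxes = [flip_box(b, width) for b in right_boxes]
    new_right_boxes = [flip_box(b, width) for b in left_boxes]
    return new_left, new_right, new_left_boxes, new_right_boxes

# Toy example: one-row images, one object with a disparity of 1 pixel.
left = [[1, 2, 3, 4]]
right = [[5, 6, 7, 8]]
result = flip_and_swap(left, right, [(2, 0, 3, 1)], [(1, 0, 2, 1)])
```

Swapping after flipping is what keeps the geometry consistent: flipping alone would reverse the sign of the disparity.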
Other settings: SGD with a weight decay of 0.0005 and a momentum of 0.9. The learning rate is initialized to 0.001 and multiplied by 0.1 every 5 epochs. Training runs for 20 epochs in total.
II. Result analysis
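The learning-rate schedule described above amounts to a simple step decay. A minimal sketch, assuming epochs are counted from 0 and that the reduction is a multiplication by 0.1 (the function name is hypothetical):

```python
def learning_rate(epoch, base_lr=0.001, gamma=0.1, step=5):
    """Step decay: multiply the learning rate by gamma every `step` epochs."""
    return base_lr * gamma ** (epoch // step)

# Learning rate for each of the 20 training epochs.
schedule = [learning_rate(e) for e in range(20)]
```

Under these assumptions the rate stays at 0.001 for epochs 0 to 4, then drops by one order of magnitude at epochs 5, 10, and 15.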
Stereo Recall and Stereo Detection
The goal of Stereo R-CNN is to detect and associate objects in the left and right images simultaneously. In addition to the 2D average recall (AR) and 2D average precision (AP) on the left and right images, the paper defines stereo AR and stereo AP metrics. A queried stereo box counts as a true positive (TP) only if all of the following conditions hold:
1. The maximum IoU between the left box and the left GT boxes is greater than the given threshold;
2. The maximum IoU between the right box and the right GT boxes is greater than the given threshold;
3. The selected left and right GT boxes belong to the same object.
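The three conditions above can be checked with a few lines of code. This is a sketch under my own assumptions: boxes are (x1, y1, x2, y2) tuples, `left_gts[i]` and `right_gts[i]` describe the same object, and the default threshold of 0.5 is only a placeholder.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def best_match(box, gt_boxes):
    """Return (index, IoU) of the ground-truth box that overlaps `box` most."""
    scores = [iou(box, g) for g in gt_boxes]
    idx = max(range(len(scores)), key=scores.__getitem__)
    return idx, scores[idx]

def is_stereo_true_positive(left_box, right_box, left_gts, right_gts, thresh=0.5):
    """Apply the three stereo true-positive conditions to one stereo box.

    Assumption: left_gts[i] and right_gts[i] belong to the same object.
    """
    if not left_gts or not right_gts:
        return False
    li, left_iou = best_match(left_box, left_gts)
    ri, right_iou = best_match(right_box, right_gts)
    return left_iou > thresh and right_iou > thresh and li == ri

# Two ground-truth objects, each with a left and a right box.
left_gts = [(10, 0, 20, 10), (40, 0, 50, 10)]
right_gts = [(8, 0, 18, 10), (35, 0, 45, 10)]
```

For example, `is_stereo_true_positive((10, 0, 20, 10), (8, 0, 18, 10), left_gts, right_gts)` satisfies all three conditions, while pairing the same left box with `(35, 0, 45, 10)` fails the third one because the two matched GT boxes belong to different objects.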
As Table 1 shows, Stereo R-CNN has proposal recall and detection precision on a single image similar to Faster R-CNN, while also producing high-quality data association between the left and right images without extra computation.
Although the stereo AR in the RPN is slightly lower than the left AR, the left, right, and stereo APs after R-CNN are almost identical. This indicates that detection performance is consistent across the left and right images, and that almost every true positive box in the left image has a corresponding true positive box in the right image.
Two strategies for fusing the left and right features are also tested: element-wise mean and channel concatenation. As Table 1 reports, channel concatenation performs better because it retains all the information.
Together, these results show that accurate stereo detection and association provide sufficient box-level constraints.
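The difference between the two fusion strategies compared above can be illustrated with a toy sketch. Here a feature map is a plain list of channels, each channel a 2D list; this layout and the function names are assumptions for illustration only, not the paper's implementation.

```python
def fuse_mean(left, right):
    """Element-wise average: the output keeps the same number of channels."""
    return [
        [[(a + b) / 2 for a, b in zip(lrow, rrow)] for lrow, rrow in zip(lch, rch)]
        for lch, rch in zip(left, right)
    ]

def fuse_concat(left, right):
    """Channel concatenation: the output has twice as many channels."""
    return left + right

# Two feature maps with 2 channels of size 1 x 2 each.
left_feat = [[[1.0, 2.0]], [[3.0, 4.0]]]
right_feat = [[[5.0, 6.0]], [[7.0, 8.0]]]

mean_feat = fuse_mean(left_feat, right_feat)      # 2 channels
concat_feat = fuse_concat(left_feat, right_feat)  # 4 channels
```

Averaging mixes the left and right responses into a single value per position, whereas concatenation keeps both and leaves it to the next layer to combine them, at the cost of doubling the input channels of that layer.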
3D Detection and 3D Localization
3D detection and localization accuracy are evaluated with average precision for bird's eye view (APbv) and for 3D boxes (AP3d). The results are listed in Table 2. I will not repeat the detailed comparison here; it can be read directly in the paper.
It is worth noting that the KITTI 3D detection benchmark is difficult for image-based methods, whose 3D performance tends to drop as the distance to the object increases. Figure 7 shows this intuitively: although the method achieves sub-pixel disparity estimation (error below 0.5 pixels), disparity is inversely proportional to depth, so the depth error grows with object distance. For objects with clear disparity, the strict geometric constraints give highly accurate depth estimates. That explains why the higher the IoU threshold and the easier the regime an object belongs to, the more this method improves over other methods.
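The growth of the depth error with distance follows from the stereo relation depth = focal length × baseline / disparity. The sketch below computes the depth error caused by a fixed 0.5-pixel disparity error at two distances. The focal length (720 pixels) and baseline (0.54 m) are illustrative values I picked, not numbers from the paper, and the function names are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Stereo depth in metres: z = f * b / d."""
    return focal_px * baseline_m / disparity_px

def depth_error(depth_m, disparity_error_px, focal_px, baseline_m):
    """Depth error when the disparity is under-estimated by disparity_error_px."""
    true_disparity = focal_px * baseline_m / depth_m
    estimated_disparity = true_disparity - disparity_error_px
    if estimated_disparity <= 0:
        raise ValueError("disparity error is larger than the true disparity")
    estimated_depth = depth_from_disparity(estimated_disparity, focal_px, baseline_m)
    return estimated_depth - depth_m

FOCAL_PX = 720.0    # assumed focal length in pixels
BASELINE_M = 0.54   # assumed stereo baseline in metres

near_error = depth_error(10.0, 0.5, FOCAL_PX, BASELINE_M)
far_error = depth_error(40.0, 0.5, FOCAL_PX, BASELINE_M)
```

To first order the error scales with the square of the depth (Δz ≈ z² Δd / (f b)), so the same sub-pixel disparity error that is negligible for a nearby car becomes significant for a distant one.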