Polar Parametrization for Vision-based Surround-View 3D Detection Paper Notes
2022-08-02 06:32:00 【byzy】
Paper link: https://arxiv.org/abs/2206.10965
1 Introduction
There are currently two main ways to parameterize object location: image-based parameterization and Cartesian parameterization.

Image-based parameterization (left figure): estimates the object's pixel index and depth in the image, then uses the camera intrinsics and extrinsics to transform these into coordinates in 3D space. It is usually applied to monocular images. For surround-view images, the method regresses bounding box positions independently in each perspective view and then projects them into a common 3D space. Finally, cross-view post-processing such as NMS filters out duplicate detections.
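The lifting step described above (pixel index plus depth, then intrinsics and extrinsics) can be sketched as follows. This is a minimal illustration that assumes a pinhole camera; the function name `backproject`, the `(fx, fy, cx, cy)` intrinsics tuple, and the `(R, t)` camera-to-shared-frame extrinsics are hypothetical and are not taken from the paper.

```python
def backproject(u, v, depth, intrinsics, cam_to_world):
    """Lift pixel (u, v) with a depth estimate to a 3D point in the shared frame.

    intrinsics: (fx, fy, cx, cy) of an assumed pinhole camera.
    cam_to_world: (R, t) with R a 3x3 rotation (list of rows) and t a translation.
    """
    fx, fy, cx, cy = intrinsics
    # Point in the camera frame; depth is measured along the optical axis (z).
    xc = (u - cx) / fx * depth
    yc = (v - cy) / fy * depth
    zc = depth
    R, t = cam_to_world
    # Rotate and translate into the common 3D frame.
    return tuple(
        R[i][0] * xc + R[i][1] * yc + R[i][2] * zc + t[i] for i in range(3)
    )


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
point = backproject(900, 450, 10.0, (1000.0, 1000.0, 800.0, 450.0),
                    (identity, (0.0, 0.0, 0.0)))
```

Any error in `depth` scales the whole point along the viewing ray, which is why depth estimation dominates the localization error of this parameterization.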
The disadvantages are that depth estimation is error-prone and that the extra information available where adjacent views overlap goes unused; cross-view post-processing is also difficult and unstable.
Cartesian parameterization (middle figure): the detection range is usually rectangular. The method exploits the correlation between multiple views to jointly predict the 3D coordinates of objects.
This approach has a problem, as shown in the figure below. Suppose two objects appear at the same position in different images and have the same image pattern.

(1) Because the detection range is rectangular (only objects inside the range are annotated), training considers only one of the two objects while the other is discarded. The two views are therefore treated differently, which hurts network convergence.
(2) The approach ignores view symmetry. For the two images above, a model using image-based parameterization only needs to learn to predict the same location for both objects; a model using Cartesian parameterization must predict different 3D coordinates, which increases model complexity and makes optimization harder.
This paper proposes PolarDETR, a surround-view 3D detection transformer. It parameterizes object position in cylindrical coordinates (radial distance, azimuth, and height), which the paper calls Polar Parametrization (right figure), and decomposes object velocity into radial and tangential components. The detection range and the loss function are also defined in polar coordinates.
PolarDETR performs center-context feature aggregation to strengthen the interaction between object queries and image features, and uses pixel rays as positional encoding, which provides a 3D spatial prior and helps predict the azimuth. PolarDETR achieves a good performance-speed trade-off.
3 PolarDETR
3.1 Overview
As shown in the figure below.
Images from the different views are first fed into a shared CNN to extract features, and object queries are then used to detect objects. Each object query encodes the semantic features and position information of its corresponding object. A series of decoder layers aggregates features from the surround-view feature maps and iteratively updates the object queries. Based on these queries, a feed-forward network (FFN) predicts the class and the polar codes of the bounding box and velocity.

3.2 Polar Parametrization
Each bounding box is encoded as a polar code, a 9-tuple, from which the polar parameters of the box can be estimated. The decoding uses the lower and upper bounds of the height detection range, the maximum (radial) detection range, and the sigmoid function. The azimuth and the heading angle are regressed through their sine and cosine, which keeps the regression space continuous.
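A minimal sketch of how such a decoding could look for the position part of the code. The exact formulas are not reproduced in these notes, so everything here is an assumption that merely matches the description: a sigmoid scaled by the maximum range for the radial distance, a sigmoid scaled into the height range for the height, and `atan2` over the regressed sine and cosine for the azimuth. The names `decode_position`, `r_max`, `z_min`, and `z_max` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_position(raw_r, raw_sin, raw_cos, raw_z, r_max, z_min, z_max):
    """Decode raw network outputs into (radial distance, azimuth, height).

    Assumed scheme, not the paper's verbatim formula.
    """
    # Sigmoid keeps the radial distance inside (0, r_max).
    r = sigmoid(raw_r) * r_max
    # atan2 recovers the angle from the (sin, cos) pair; the pair does not
    # need to be normalized, because atan2 only uses the ratio and the signs.
    azimuth = math.atan2(raw_sin, raw_cos)
    # Sigmoid keeps the height inside (z_min, z_max).
    z = z_min + sigmoid(raw_z) * (z_max - z_min)
    return r, azimuth, z


r, azimuth, z = decode_position(0.0, 1.0, 0.0, 0.0, r_max=60.0, z_min=-5.0, z_max=3.0)
```

Regressing the sine and cosine instead of the raw angle avoids the jump between -π and π: two nearly identical directions on either side of that boundary have nearly identical (sin, cos) targets.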
Polar decomposition of location estimation: Polar Parametrization decouples the object position into radial distance and azimuth. The distance is tied to the object's size in the image and can be learned from the image pattern; the azimuth is tied to the pixel index and can be learned from the positional encoding.
Polar decomposition of velocity estimation: the radial velocity is tied to the rate of change of the object's size, and the tangential velocity is tied to the object's motion in the image plane.
Polar Parametrization explicitly associates image patterns with prediction targets, and this explicit association gives the detector better convergence and performance.
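The polar decomposition of position and velocity described above can be illustrated with a plain coordinate conversion. This sketch assumes the azimuth is measured counter-clockwise from the x-axis of the ego frame; the paper's axis convention may differ, and the function names are hypothetical.

```python
import math


def to_polar(x, y, z, vx, vy):
    """Cartesian position/velocity -> (r, azimuth, z, radial v, tangential v)."""
    r = math.hypot(x, y)
    azimuth = math.atan2(y, x)
    c, s = math.cos(azimuth), math.sin(azimuth)
    # Project the velocity onto the radial unit vector (c, s)
    # and the tangential unit vector (-s, c).
    v_r = vx * c + vy * s
    v_t = -vx * s + vy * c
    return r, azimuth, z, v_r, v_t


def to_cartesian(r, azimuth, z, v_r, v_t):
    """Inverse of to_polar under the same azimuth convention."""
    c, s = math.cos(azimuth), math.sin(azimuth)
    return r * c, r * s, z, v_r * c - v_t * s, v_r * s + v_t * c


# An object moving straight away from the ego vehicle has zero tangential velocity.
r, azimuth, z, v_r, v_t = to_polar(3.0, 4.0, 1.0, 3.0, 4.0)
```

In this example the velocity is parallel to the line of sight, so all of it ends up in the radial component; a velocity perpendicular to the line of sight would end up entirely in the tangential component.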
3.3 Decoder layer
The decoder layers iteratively aggregate features and update the queries. A multi-head self-attention (MHSA) module first exchanges information among the queries; a linear layer then extracts the object position from each query, which is converted to 3D coordinates.
Center-context feature aggregation: aggregates features from the surround-view feature maps. The 3D center is first projected onto the image plane to obtain a 2D center point, using projection matrices derived from the intrinsics and extrinsics of the camera in question. The center feature is then obtained from the image features by bilinear interpolation (if the 2D center falls outside the image, the feature is set to 0).
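The projection and sampling step can be sketched in a few lines. This is a simplified illustration under stated assumptions: a single 3x4 projection matrix per camera, a single-channel feature map stored as nested lists, and hypothetical function names. A real implementation would operate on multi-channel tensors and on all cameras at once.

```python
import math


def project(point, P):
    """Project a 3D point with a 3x4 projection matrix P (list of rows)."""
    x, y, z = point
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    if h[2] <= 0:  # the point lies behind the camera
        return None
    return h[0] / h[2], h[1] / h[2]


def bilinear_sample(feat, u, v):
    """Sample feat[row][col] at continuous (u, v); returns 0.0 outside the map."""
    height, width = len(feat), len(feat[0])
    if not (0 <= u <= width - 1 and 0 <= v <= height - 1):
        return 0.0
    u0, v0 = int(math.floor(u)), int(math.floor(v))
    u1, v1 = min(u0 + 1, width - 1), min(v0 + 1, height - 1)
    du, dv = u - u0, v - v0
    top = feat[v0][u0] * (1 - du) + feat[v0][u1] * du
    bottom = feat[v1][u0] * (1 - du) + feat[v1][u1] * du
    return top * (1 - dv) + bottom * dv


def center_feature(center, P, feat):
    """Feature at the projected 3D center, or 0.0 if it is not visible."""
    uv = project(center, P)
    return 0.0 if uv is None else bilinear_sample(feat, *uv)


P = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
feat = [[0.0, 1.0], [2.0, 3.0]]
value = center_feature((0.5, 0.5, 1.0), P, feat)
```

Returning zero for centers that project outside a view lets every query be sampled in every camera without special-casing visibility.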
Context features are introduced to strengthen the interaction between the queries and the surround views, which improves localization. From the center feature and the query embedding, offsets relative to the center are predicted, producing a set of context points. The context point features are then obtained by bilinear interpolation.
Pixel ray: as shown in the figure below, a pixel ray starts at the optical center, passes through the pixel, and reaches the 3D point. It directly relates pixels to 3D points and carries explicit azimuth information.
The paper uses the pixel ray as an additional positional encoding: for each center or context point, the direction vector of its pixel ray is concatenated to the original feature as extra feature dimensions.
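A pixel ray direction can be computed from the camera parameters alone. The sketch below assumes a pinhole camera and a camera-to-ego rotation matrix; `pixel_ray`, the `(fx, fy, cx, cy)` tuple, and `R_cam_to_ego` are hypothetical names used only for illustration.

```python
import math


def pixel_ray(u, v, intrinsics, R_cam_to_ego):
    """Unit direction of the ray from the optical center through pixel (u, v)."""
    fx, fy, cx, cy = intrinsics
    # Direction in the camera frame (z along the optical axis).
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    # Rotate into the ego frame; translation does not affect a direction.
    d = [sum(R_cam_to_ego[i][j] * d_cam[j] for j in range(3)) for i in range(3)]
    norm = math.sqrt(sum(c * c for c in d))
    return tuple(c / norm for c in d)


# Camera looking along the ego x-axis: cam z -> ego x, cam x -> -ego y, cam y -> -ego z.
R = [[0.0, 0.0, 1.0], [-1.0, 0.0, 0.0], [0.0, -1.0, 0.0]]
ray = pixel_ray(800.0, 450.0, (1000.0, 1000.0, 800.0, 450.0), R)
feature = [0.2, 0.7]                 # sampled feature at this point (made-up values)
feature_with_ray = feature + list(ray)  # ray appended as extra feature dimensions
```

The horizontal angle of the ray in the ego frame is `atan2(ray[1], ray[0])`, which is why this encoding gives the network a direct cue for the azimuth.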
Query update: the query embedding is updated with the aggregated features. The updated embedding encodes more accurate position information, which leads to better feature aggregation in the next decoder layer.
3.4 Perception range, label assignment, and loss function
Perception range: a circular region centered on the ego vehicle.
Label assignment: the ground-truth labels are first converted to polar codes, and bipartite matching then assigns a unique prediction to each ground-truth bounding box. The matching cost includes the classification cost as defined in DETR.
After the matching cost is computed for every pair of prediction and ground-truth box, the resulting cost matrix is passed to the Hungarian algorithm to find the optimal assignment.
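To make the assignment step concrete, the sketch below finds the minimum-cost one-to-one assignment on a small cost matrix. Note that it swaps in a brute-force search over permutations instead of the Hungarian algorithm: both return an optimal assignment, but brute force is only practical for a handful of boxes. The cost values are made up, and the sketch assumes there are at least as many predictions as ground-truth boxes.

```python
from itertools import permutations


def match(cost):
    """cost[p][g] is the matching cost between prediction p and ground-truth box g.

    Returns (assignment, total), where assignment[g] is the index of the
    prediction matched to ground-truth box g. Brute force, for illustration only.
    """
    n_pred, n_gt = len(cost), len(cost[0])
    best, best_total = None, float("inf")
    # Try every way of giving each ground-truth box a distinct prediction.
    for preds in permutations(range(n_pred), n_gt):
        total = sum(cost[p][g] for g, p in enumerate(preds))
        if total < best_total:
            best, best_total = list(preds), total
    return best, best_total


cost = [
    [0.9, 0.1],  # prediction 0
    [0.2, 0.8],  # prediction 1
    [0.5, 0.5],  # prediction 2 stays unmatched (treated as background)
]
assignment, total = match(cost)
```

For realistic numbers of queries, a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` is the usual choice.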
Loss function: the bipartite matching loss consists of a classification loss (focal loss) and a polar bounding box/velocity loss (L1 loss).
3.5 Temporal information
PolarDETR is extended to PolarDETR-T, which accepts sequential images as input. The object center in the current frame is projected onto the images of previous frames using a pose transformation matrix that reflects the change in ego-vehicle pose between the two frames. Center and context features are then sampled from the previous frames in the same way as for the current frame. All sampled features are finally aggregated and used to update the query embedding.
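The pose transformation can be sketched as a homogeneous transform applied to the object center before it is projected into a previous image. The sketch assumes a planar ego motion (yaw plus translation) and a matrix that maps current-frame coordinates to previous-frame coordinates; `pose_matrix` and `transform_point` are hypothetical helpers, and the paper's exact formulation is not reproduced here.

```python
import math


def pose_matrix(yaw, tx, ty, tz=0.0):
    """4x4 homogeneous transform: rotation about z by yaw, then translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(T, point):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    x, y, z = point
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in T[:3])


# The ego vehicle drove 2 m straight ahead between the previous and the current
# frame, so a point 10 m ahead now was 12 m ahead in the previous ego frame.
T_curr_to_prev = pose_matrix(yaw=0.0, tx=2.0, ty=0.0)
center_prev = transform_point(T_curr_to_prev, (10.0, 0.0, 0.0))
```

The transformed center can then be projected into the previous frame's cameras and sampled exactly like a current-frame center.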
For efficient inference, the feature maps of past images can be cached, so only the current frame needs to be processed. As a result, the inference speed of PolarDETR-T is close to that of PolarDETR.
4 Experiments
4.2 Experimental setup
A tracking-by-detection algorithm extends PolarDETR to 3D object tracking: each object is projected back to the previous frame using the velocity estimated in the current frame, and closest-distance matching then associates it with an existing target.
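A minimal sketch of such an association step, assuming Cartesian velocities in the ground plane (radial and tangential velocities would be converted first), a greedy closest-first matching, and a hypothetical distance gate `max_dist`. The notes do not specify these details, so treat them as illustrative choices.

```python
import math


def associate(curr, prev, dt, max_dist=2.0):
    """Match current detections to previous tracks.

    curr: list of (x, y, vx, vy) detections in the current frame.
    prev: list of (track_id, x, y) from the previous frame.
    Returns {index into curr: track_id}; unmatched detections are left out.
    """
    candidates = []
    for i, (x, y, vx, vy) in enumerate(curr):
        # Move the detection back by one frame interval using its velocity.
        px, py = x - vx * dt, y - vy * dt
        for track_id, qx, qy in prev:
            d = math.hypot(px - qx, py - qy)
            if d <= max_dist:
                candidates.append((d, i, track_id))
    matches, used = {}, set()
    # Greedy: accept the closest pairs first, each side used at most once.
    for d, i, track_id in sorted(candidates):
        if i not in matches and track_id not in used:
            matches[i] = track_id
            used.add(track_id)
    return matches


curr = [(10.0, 0.0, 5.0, 0.0), (0.0, 5.0, 0.0, -1.0)]
prev = [(7, 0.0, 5.5), (3, 7.5, 0.0)]
matches = associate(curr, prev, dt=0.5)
```

Detections left unmatched would typically start new tracks, which is also where accurate velocity estimates pay off for tracking quality.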
4.4 Main results
PolarDETR-T outperforms PolarDETR, especially on velocity estimation.
4.5 Ablation study
Key components: Polar Parametrization, context points, and pixel rays all improve performance, at negligible computational cost.
Polar decomposition of velocity: compared with Cartesian decomposition, polar decomposition improves the accuracy of velocity estimation.
Context points: performance improves as the number of context points grows, but beyond a certain number further increases hurt. Both the query embedding and the center feature used to generate the context points contribute to the improvement.
Decoder layers: more decoder layers give better performance, but the gain saturates.