Point Density-Aware Voxels for LiDAR 3D Object Detection Paper Notes
2022-08-02 06:32:00 【byzy】
Original link: https://arxiv.org/abs/2203.05662
1 Introduction
LiDAR point clouds become sparser as the distance from the sensor increases.
Voxel-based methods ignore point density and place voxel features at the voxel centers (left figure). For large input ranges, memory constraints limit the voxel resolution, and the misalignment between voxel centers and the actual points causes voxels to lose fine object detail, which degrades performance.
Point-based methods sample points with farthest point sampling (middle figure), but their computational cost grows with the number of sampled points, which limits how many points can be sampled in the refinement stage.
In addition, objects with a small surface area, such as pedestrians and cyclists, are hard to localize from LiDAR, and existing methods concentrate mainly on the vehicle class.
The paper proposes the Point Density-Aware Voxel network (PDV), which localizes voxel features at voxel point centroids and takes point density into account during feature encoding, to address these problems.

Main contributions:
(1) Voxel point centroid localization: a point centroid is computed for every non-empty voxel (right figure). By localizing voxel features at these centroids during refinement, PDV uses the point density distribution to retain fine-grained position information in the feature encoding.
(2) Density-aware RoI grid pooling: point density is encoded as an additional feature during RoI grid pooling. Kernel density estimation (KDE) first encodes the local voxel feature density within a ball neighborhood of each grid point, and self-attention with a point density positional encoding is then applied. This captures local point density information within the region proposal for the second-stage refinement.
(3) Density confidence prediction: the centroid position of the final bounding box and the number of points inside it are used as additional features to refine the box confidence prediction. This exploits the inherent relationship between LiDAR point density and distance to make better-informed confidence predictions.
3 Method
As shown in the figure below, PDV is a two-stage detection network. The first stage uses a 3D sparse convolution backbone to generate proposals; the second stage refines them using the voxel features from each voxel layer together with the raw point cloud data.
3.1 3D Voxel Backbone
Similar to SECOND: after voxelization, 3D sparse convolutions are applied and an RPN generates proposals. Each successive voxel layer is downsampled by a larger factor, and all of the layers are available to the second-stage refinement.
3.2 Voxel Point Centroid Localization
This module localizes the sparse voxel features in space so that density-aware RoI grid pooling can aggregate them.
For each voxel, the mean of the coordinates of all points it contains gives the voxel's point centroid. A hash map links each voxel point centroid to its feature vector: centroids and sparse voxel features are associated through a shared voxel index.
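The centroid computation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function name `voxel_point_centroids`, the `voxel_size` and `origin` parameters, and the use of a dictionary keyed by the integer voxel index as the "hash map" are all assumptions made for the example.

```python
from collections import defaultdict

def voxel_point_centroids(points, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Return {voxel_index: (centroid, point_count)} for every non-empty voxel."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    counts = defaultdict(int)
    for p in points:
        # The integer voxel index is the shared key (hypothetical hash-map layout).
        idx = tuple(int((p[d] - origin[d]) // voxel_size[d]) for d in range(3))
        for d in range(3):
            sums[idx][d] += p[d]
        counts[idx] += 1
    # Centroid = mean of the coordinates of the points that fell into the voxel.
    return {idx: (tuple(s / counts[idx] for s in sums[idx]), counts[idx])
            for idx in sums}

points = [(0.1, 0.1, 0.1), (0.3, 0.1, 0.1), (1.2, 0.2, 0.2)]
centroids = voxel_point_centroids(points, voxel_size=(0.5, 0.5, 0.5))
```

Because the sparse voxel features would be stored under the same index, looking up a centroid and its feature vector is a single dictionary access per voxel.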
Using the kernel size, stride, and padding of each convolution, the voxel point centroids of the next layer can be computed from the results of the previous layer (as a weighted sum). This avoids recomputing centroids from the raw points, so the method scales efficiently to larger point clouds.
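A sketch of this layer-to-layer update follows. It makes two simplifying assumptions that are not stated in these notes: the weights are the per-voxel point counts (so the result stays equal to the mean of the underlying points), and the convolution has kernel size equal to its stride with no padding, so each next-layer voxel covers a `stride`-sized block of previous-layer voxels. The name `downsample_centroids` is hypothetical.

```python
from collections import defaultdict

def downsample_centroids(centroids, stride=2):
    """centroids: {voxel_index: (centroid, count)} at layer l.
    Returns the same structure at layer l+1 without touching the raw points."""
    weighted = defaultdict(lambda: [0.0, 0.0, 0.0])
    totals = defaultdict(int)
    for idx, (c, n) in centroids.items():
        # Assumed mapping: kernel size == stride, no padding.
        parent = tuple(i // stride for i in idx)
        for d in range(3):
            weighted[parent][d] += n * c[d]  # count-weighted sum of child centroids
        totals[parent] += n
    return {p: (tuple(w / totals[p] for w in weighted[p]), totals[p])
            for p in weighted}

layer1 = {(0, 0, 0): ((0.2, 0.1, 0.1), 2), (1, 0, 0): ((0.8, 0.1, 0.1), 1)}
layer2 = downsample_centroids(layer1)
```

With a general kernel size and padding the set of contributing child voxels changes, but the weighted update itself stays the same.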
3.3 Density-Aware RoI Grid Pooling
KDE and self-attention are combined to encode point density features for each proposal. First, $U \times U \times U$ grid points are sampled within each proposal.
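Local feature density
KDE is used to estimate the local feature density within the ball neighborhood of each grid point. Density-aware RoI grid pooling encodes the estimated probability density as an additional feature.
First, for each grid point $g_j$, the voxel point centroid features within its ball neighborhood of radius $r$ are gathered:

$$\Psi^{l}_{g_j} = \left\{ \left[ f^{l}_{k},\; c^{l}_{k} - g_j,\; p(c^{l}_{k} \mid g_j) \right] \;:\; \lVert c^{l}_{k} - g_j \rVert < r \right\}$$

where $f^{l}_{k}$ is the feature of the $k$-th non-empty voxel in layer $l$ and $c^{l}_{k}$ is its point centroid coordinate. $p(c^{l}_{k} \mid g_j)$ is the probability density (likelihood) estimated by KDE:

$$p(c^{l}_{k} \mid g_j) = \frac{1}{|\mathcal{N}(g_j)|\,\sigma^{3}} \sum_{c_i \in \mathcal{N}(g_j)} \prod_{d=1}^{3} W\!\left( \frac{c^{l}_{k,d} - c_{i,d}}{\sigma} \right)$$

where $\mathcal{N}(g_j)$ is the set of centroids in the neighborhood of $g_j$, $\sigma$ is the bandwidth, and $W$ is an independent kernel applied to each coordinate $d$ (the paper uses a Gaussian kernel).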
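To make the density estimate concrete, here is a minimal sketch of a product-kernel KDE with a Gaussian kernel per coordinate, evaluated over the centroids found by a ball query. The function names, the example radius, and the example bandwidth are assumptions for illustration; the notes do not give the values used in the paper.

```python
import math

def gaussian_kernel(u):
    """Standard 1D Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def ball_query(grid_point, centroids, radius):
    """Centroids whose Euclidean distance to the grid point is below the radius."""
    return [c for c in centroids if math.dist(c, grid_point) < radius]

def kde_likelihood(query, neighbours, bandwidth):
    """Density of `query` under a product of independent per-coordinate kernels."""
    if not neighbours:
        return 0.0
    total = 0.0
    for c in neighbours:
        prod = 1.0
        for d in range(3):
            prod *= gaussian_kernel((query[d] - c[d]) / bandwidth)
        total += prod
    return total / (len(neighbours) * bandwidth ** 3)

grid_point = (0.0, 0.0, 0.0)
all_centroids = [(0.1, 0.0, 0.0), (0.2, 0.0, 0.0), (2.0, 0.0, 0.0)]
neighbours = ball_query(grid_point, all_centroids, radius=0.5)
densities = [kde_likelihood(c, neighbours, bandwidth=0.2) for c in neighbours]
```

Each density value would then be appended to the corresponding centroid's feature vector before grouping.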
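Once these features are gathered, a PointNet multi-scale grouping (MSG) module produces a feature vector for each grid point $g_j$:

$$f^{l}_{g_j} = \mathrm{MSG}\left( \Psi^{l}_{g_j} \right)$$

where $\Psi^{l}_{g_j}$ is the set of neighborhood features gathered for grid point $g_j$ in layer $l$. MSG uses multiple radii $r$ (the radius of the ball neighborhood) to capture the feature density around each grid point at several scales, and concatenates the outputs.
The final feature is the concatenation of the features from all layers:

$$f_{g_j} = \left[ f^{1}_{g_j}, \dots, f^{L}_{g_j} \right]$$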
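Grid point self-attention
The features of different grid points are otherwise independent of one another, so self-attention is used to capture long-range dependencies between grid points. As shown in the figure below, let $F$ denote the features of the non-empty grid points; a standard transformer encoder $\mathcal{T}$ is applied with a residual connection:

$$\tilde{F} = \mathcal{T}(F) + F$$

Empty grid points are not passed to the attention module, and their features are left unchanged.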
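The masking and residual behavior can be illustrated with a heavily simplified stand-in for the transformer encoder: single-head scaled dot-product attention with identity projections, no feed-forward sublayer, and no layer normalization. Those simplifications, and the name `masked_self_attention`, are assumptions for the sketch and are not how the paper's encoder is built.

```python
import math

def masked_self_attention(features, non_empty):
    """features: list of equal-length vectors; non_empty: list of bools.
    Attends only among non-empty grid points and adds a residual connection."""
    dim = len(features[0])
    active = [i for i, keep in enumerate(non_empty) if keep]
    out = [list(f) for f in features]  # empty grid points keep their features
    for i in active:
        scores = [sum(a * b for a, b in zip(features[i], features[j])) / math.sqrt(dim)
                  for j in active]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
        norm = sum(weights)
        attended = [sum(w * features[j][d] for w, j in zip(weights, active)) / norm
                    for d in range(dim)]
        out[i] = [f + a for f, a in zip(features[i], attended)]  # residual
    return out

feats = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
mask = [True, True, False]
updated = masked_self_attention(feats, mask)
```

The third grid point is masked out, so it neither contributes to the attention weights nor has its feature changed.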

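Point density positional encoding
Attention alone lacks the geometric information of the LiDAR point cloud, so a positional encoding based on point density is added. The encoding uses the local grid point positions and the raw points inside the proposal. The proposal is divided into $U \times U \times U$ voxels (one per grid point), and the positional feature of each grid point is encoded as:

$$f^{pos}_{g_j} = \mathrm{FFN}\left( \left[ \delta_{g_j},\; \log\left( |\mathcal{N}(\mathcal{V}_{g_j})| + \epsilon \right) \right] \right)$$

where $\delta_{g_j} = g_j - c_b$ is the position of grid point $g_j$ relative to the bounding box center $c_b$, $|\mathcal{N}(\mathcal{V}_{g_j})|$ is the number of points in the voxel centered on $g_j$, and $\epsilon$ is a constant offset. In this way, RoI grid pooling captures the point density within the region proposal.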
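The input to this encoding can be sketched as follows. The example only builds the vector that would be fed to the FFN (relative offset plus log point count); the FFN itself is omitted. It assumes an axis-aligned proposal for simplicity, and the grid size of 2 and offset `eps=0.5` are placeholder values, not figures taken from the paper.

```python
import math

def density_position_inputs(points, box_min, box_max, grid=2, eps=0.5):
    """Return {voxel_index: [dx, dy, dz, log(count + eps)]} for a grid^3 split
    of an axis-aligned proposal (rotation is ignored in this sketch)."""
    size = [(box_max[d] - box_min[d]) / grid for d in range(3)]
    center = [(box_max[d] + box_min[d]) / 2.0 for d in range(3)]
    counts = {}
    for p in points:
        if all(box_min[d] <= p[d] < box_max[d] for d in range(3)):
            idx = tuple(min(int((p[d] - box_min[d]) // size[d]), grid - 1)
                        for d in range(3))
            counts[idx] = counts.get(idx, 0) + 1
    feats = {}
    for i in range(grid):
        for j in range(grid):
            for k in range(grid):
                idx = (i, j, k)
                # Grid point = center of the voxel; delta is relative to the box center.
                g = [box_min[d] + (idx[d] + 0.5) * size[d] for d in range(3)]
                delta = [g[d] - center[d] for d in range(3)]
                feats[idx] = delta + [math.log(counts.get(idx, 0) + eps)]
    return feats

pts = [(0.5, 0.5, 0.5), (0.6, 0.4, 0.5), (1.5, 1.5, 1.5), (5.0, 5.0, 5.0)]
inputs = density_position_inputs(pts, (0.0, 0.0, 0.0), (2.0, 2.0, 2.0))
```

The constant offset keeps the logarithm finite for voxels that contain no points.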
3.4 Density Confidence Prediction
The number of LiDAR points on an object and its distance are used to predict the bounding box confidence.
The output features of the density-aware RoI grid pooling module are first flattened and encoded by a shared FFN to obtain $f^{b}$; two FFN branches then encode features for bounding box refinement and confidence estimation respectively.
For confidence estimation, the final bounding box center $c_b$ and the number of points inside the final bounding box $|\mathcal{N}(b)|$ are appended to $f^{b}$:

$$f^{conf} = \left[ f^{b},\; c_b,\; \log |\mathcal{N}(b)| \right]$$
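A small sketch of how such a confidence input could be assembled is given below. It counts the raw points inside a yaw-rotated box and appends the box center and the log point count to a feature vector. The box parameterization (center, length/width/height, yaw), the function names, and the `eps` guard against `log(0)` are assumptions for illustration.

```python
import math

def points_in_box(points, center, dims, yaw):
    """Count points inside a box with center (x, y, z), dims (l, w, h) and yaw."""
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    count = 0
    for x, y, z in points:
        dx, dy, dz = x - center[0], y - center[1], z - center[2]
        # Rotate the offset into the box frame (inverse yaw rotation).
        local_x = cos_y * dx + sin_y * dy
        local_y = -sin_y * dx + cos_y * dy
        if (abs(local_x) <= dims[0] / 2 and abs(local_y) <= dims[1] / 2
                and abs(dz) <= dims[2] / 2):
            count += 1
    return count

def confidence_input(box_features, center, n_points, eps=1.0):
    """Append the box center and log point count (eps is a hypothetical guard)."""
    return list(box_features) + list(center) + [math.log(n_points + eps)]

pts = [(0.0, 1.5, 0.0), (1.5, 0.0, 0.0)]
center, dims = (0.0, 0.0, 0.0), (4.0, 2.0, 2.0)
n_rotated = points_in_box(pts, center, dims, math.pi / 2)
conf_vec = confidence_input([0.3, 0.7], center, n_rotated)
```

Since the box center carries the distance to the sensor and the count carries the density, the confidence branch sees both sides of the distance-density relationship.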
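3.5 Training Losses
The network is trained jointly with a region proposal loss $L_{RPN}$ and a proposal refinement loss $L_{RCNN}$.
In $L_{RPN}$, the classification term $L_{cls}$ is the focal loss between the predicted class probability vector $c^{a}$ and the ground-truth class $c^{*}$; the regression term $L_{reg}$ is the smooth L1 loss between the predicted RoI anchor box residual $\delta^{a}$ and the ground-truth anchor box residual $t^{*}$.
In $L_{RCNN}$, the confidence target $s^{*}$ is the training target scaled by the IoU between a 3D RoI and its associated ground-truth bounding box (see PV-RCNN); $L_{reg}$ is again the smooth L1 loss, and $\delta^{b}$ and $t^{b*}$ are the predicted and ground-truth bounding box residuals.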
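The two loss functions named above can be sketched as follows. These are generic textbook forms, not the paper's exact formulation: the focal loss is written for a single softmax-style target rather than per-class sigmoid outputs, and `alpha=0.25`, `gamma=2.0`, and `beta=1.0` are common defaults that are not taken from these notes.

```python
import math

def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss summed over residual components."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        if diff < beta:
            total += 0.5 * diff * diff / beta  # quadratic near zero
        else:
            total += diff - 0.5 * beta         # linear for large errors
    return total

def focal_loss(probs, target_index, alpha=0.25, gamma=2.0):
    """Focal loss for one sample: down-weights well-classified examples."""
    p_t = probs[target_index]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

reg = smooth_l1([0.5, 2.0], [0.0, 0.0])
easy = focal_loss([0.9, 0.1], 0)
hard = focal_loss([0.6, 0.4], 0)
```

The `(1 - p_t) ** gamma` factor is what makes the confidently classified sample contribute far less than the harder one.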
4 Experiments
Data augmentation uses flipping along the X and Y axes, global scaling, global rotation, and copy-and-paste augmentation.
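The geometric augmentations listed above can be sketched on a bare list of points. The flip convention (flipping along X negates y), the scale factor, and the rotation angle are assumptions for the example; in practice the parameters are drawn at random, the ground-truth boxes must be transformed with the points, and copy-and-paste augmentation is not shown.

```python
import math

def flip_along_x(points):
    """Mirror across the X axis by negating y (assumed convention)."""
    return [(x, -y, z) for x, y, z in points]

def flip_along_y(points):
    """Mirror across the Y axis by negating x (assumed convention)."""
    return [(-x, y, z) for x, y, z in points]

def global_scale(points, factor):
    """Scale the whole scene about the origin."""
    return [(x * factor, y * factor, z * factor) for x, y, z in points]

def global_rotate(points, angle):
    """Rotate the whole scene about the vertical (z) axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

cloud = [(1.0, 0.0, 0.5), (0.0, 2.0, 1.0)]
augmented = global_rotate(global_scale(flip_along_x(cloud), 1.05), math.pi / 2)
```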
In post-processing, non-maximum suppression (NMS) removes redundant bounding boxes.
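A minimal greedy NMS sketch is shown below. It uses axis-aligned bird's-eye-view boxes `(x1, y1, x2, y2)` to keep the IoU computation short, which is a simplification: 3D detectors normally use rotated boxes. The threshold of 0.5 is a placeholder, not a value from these notes.

```python
def bev_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep boxes in descending score order, dropping any box that
    overlaps an already kept box by more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(bev_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

boxes = [(0.0, 0.0, 2.0, 2.0), (0.1, 0.0, 2.1, 2.0), (5.0, 5.0, 7.0, 7.0)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box overlaps the first almost entirely and is suppressed, while the distant third box survives.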
Results: PDV captures the detailed information that voxels miss, enabling precise second-stage refinement.
If the voxel grid already has a high resolution, the improvement from PDV may be limited.
4.3 Ablation Study
Components
Localizing features at voxel point centroids performs better than localizing them at voxel centers, especially for small objects, because voxel centers may not be aligned with the point cloud. Centroid localization places features closer to the object surface and gives proposals more meaningful fine-grained geometric information.
Using KDE to capture the relationships between feature densities also helps, especially for deformable objects such as pedestrians.
Using attention to model long-range dependencies between RoI grid points improves performance on pedestrians and cyclists.
Density confidence prediction further improves detection accuracy, including for cyclists.
Point density positional encoding
Sinusoidal encoding improves detection accuracy for pedestrians and cyclists (but lowers it for cars). Using only the grid point coordinates as the FFN input gives a slight improvement; using only the density feature gives a similar improvement; combining the two performs better.
4.4 Runtime Analysis
PDV runs slightly faster than PV-RCNN (and performs better), but it still does not meet real-time requirements.
4.5 Number of False Positives at Different Distances
The number of false positives (FP) grows with distance, but PDV produces fewer FPs than PV-RCNN overall, and the gap widens as distance increases. Using point density in bounding box refinement and confidence prediction likely helps detect distant objects.
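An analysis of this kind amounts to binning false positives by range. The sketch below shows one way to do that; the detection tuple layout `(x, y, is_true_positive)`, the bin edges, and the sample detections are made up for illustration and are not results from the paper.

```python
import math

def false_positives_by_range(detections, bin_edges):
    """detections: list of (x, y, is_true_positive) in sensor coordinates.
    Returns the number of false positives in each [edge_i, edge_i+1) range bin."""
    counts = [0] * (len(bin_edges) - 1)
    for x, y, is_tp in detections:
        if is_tp:
            continue
        r = math.hypot(x, y)  # horizontal distance from the sensor
        for b in range(len(counts)):
            if bin_edges[b] <= r < bin_edges[b + 1]:
                counts[b] += 1
                break
    return counts

dets = [(5.0, 0.0, True), (10.0, 0.0, False), (30.0, 30.0, False),
        (0.0, 45.0, False), (60.0, 0.0, True)]
fp_counts = false_positives_by_range(dets, [0.0, 25.0, 50.0, 75.0])
```

Running the same binning on two detectors' outputs gives the per-range comparison described above.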
5 Conclusion
Current methods are limited on large input ranges by expensive point cloud sampling and low voxel resolution; PDV handles both problems effectively.
Appendix
B. Limitations
Voxel resolution: the higher the resolution, the smaller the performance gain. The distance between the voxel center and the point centroid shrinks, and the number of points in each non-empty voxel approaches 1, so point density is approximated by the density of non-empty voxels.
Generalization: the second stage depends on point density. If the point distribution at test time differs greatly from that seen in training (for example, in extreme weather far fewer points are returned from objects), performance may drop severely.
E. Point Density: Visualizations
The figures show that, by exploiting the relationship between distance and point density, PDV effectively reduces, at different distances, the number of FPs that fall outside the training sample distribution.