
Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point

2022-07-07 03:22:00 【Master Ma】

1 Core Idea
To reduce memory and computational cost, existing point-based pipelines progressively downsample the input point cloud with random sampling or farthest point sampling (FPS), even though not all points are equally important to the object detection task. In particular, foreground points are inherently more important to a detector than background points. Based on this observation, the paper proposes an efficient single-stage point-based 3D detector, IA-SSD.

The key to the method is a pair of learnable, task-oriented, instance-aware downsampling strategies that hierarchically select the foreground points belonging to objects. In addition, a contextual centroid perception module is introduced to further estimate precise object centers. Finally, for efficiency, IA-SSD uses an encoder-only architecture.

The method builds on the 3DSSD framework. Its main contribution is a modified sampling strategy at each layer, so that foreground points make up the majority of the points kept after every downsampling step.

The overall framework is shown in the figure below:
[Figure: overall framework of IA-SSD]

2 Core Steps
Existing point-based detectors usually adopt task-agnostic sampling methods in their frameworks, such as random sampling or farthest point sampling. Although these are effective at reducing memory and computational cost, the most important foreground points are also thinned out during progressive downsampling. In addition, because objects differ greatly in size and geometry, existing detectors usually train separate models with carefully tuned hyperparameters for different object classes, which inevitably complicates deploying these models in practice. The goal of the paper is therefore: can we train a single point-based model that efficiently detects multiple object classes in one pass?
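
To make "task-agnostic" concrete, here is a minimal pure-Python sketch of farthest point sampling. It is the generic textbook algorithm, not code from IA-SSD or 3DSSD; the function name and the toy points are hypothetical. Note that the selection depends only on geometry: no label or learned score enters the choice, so foreground points get no preference.

```python
import math

def farthest_point_sampling(points, k, start=0):
    """Return the indices of k points chosen by farthest point sampling (FPS)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be in 1..len(points)")
    selected = [start]
    # min_dist[i] = distance from point i to its nearest already-selected point
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        # Pick the point that is farthest from everything selected so far.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected

# Toy cloud (hypothetical): three points near the origin, two points near x = 5.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0),
          (5.0, 0.0, 0.0), (5.1, 0.0, 0.0)]
idx = farthest_point_sampling(points, 3)
print(idx)
```

FPS spreads the kept points evenly over space, which is why small objects can lose most of their points after a few downsampling layers.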

Based on this, the paper proposes an efficient single-stage detector built around instance-aware downsampling and a contextual centroid perception module. As shown in the figure above, IA-SSD adopts the feature extraction architecture of 3DSSD. The input LiDAR point cloud is first fed into the network to extract point features. Instance-aware downsampling is then applied to progressively reduce the computational cost while preserving the informative foreground points. The learned latent features are further fed into the contextual centroid perception module, which generates proposals and regresses the final bounding boxes.

The paper therefore has two main components: instance-aware downsampling and the contextual centroid perception module.

2.1 Instance-aware Downsampling Strategy
[Figure: instance-aware downsampling]
As shown in the figure above, to preserve as many foreground points as possible, we turn to the latent semantics of each point: because hierarchical aggregation is performed at every layer, the learned point features are likely to carry richer semantic information. Following this idea, the paper proposes two task-oriented sampling methods, class-aware sampling and centroid-aware sampling, which integrate foreground semantic priors into the network training pipeline.

Class-aware sampling: a foreground-point prediction head is added at the sampling stage. The formula is as follows:

[Formula image: class-aware sampling loss]
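
The idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the prediction head is reduced to one foreground/background logit per point, the training loss is assumed to be a plain binary cross-entropy (the formula image is not preserved here), and all names and numbers are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce_loss(logits, labels):
    """Mean binary cross-entropy between per-point logits and 0/1 foreground labels."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for z, y in zip(logits, labels):
        p = sigmoid(z)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(logits)

def class_aware_sample(logits, k):
    """Indices of the k points with the highest predicted foreground score."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return order[:k]

# Hypothetical head outputs for five points and their ground-truth labels.
logits = [-2.0, 3.0, 0.5, -1.0, 2.0]
labels = [0, 1, 1, 0, 1]
loss = bce_loss(logits, labels)        # used only during training
kept = class_aware_sample(logits, 3)   # points passed on to the next layer
print(kept)
```

Unlike random sampling or FPS, the kept set here is driven by a learned score, so the sampler improves as the head learns to separate foreground from background.
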
Centroid-aware sampling (used only during training): building on class-aware sampling, the distance of a point to the object center should also be taken into account, so a head that predicts a distance-to-center weight is used. The formula is as follows:

[Formula image: soft point mask]
This is also one way to predict the distance to the center.
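
A minimal sketch of such a center-based weight follows. The exact expression is an assumption, since the formula image is not preserved: it uses a centerness-style form, the cube root of the product over the three axes of min/max ratios between a point's distances to the two opposing box faces. This should be checked against the paper. The sketch also assumes an axis-aligned box (no yaw), and the function name is hypothetical.

```python
def centroid_mask(point, box_min, box_max):
    """Soft mask in [0, 1]: 1 at the box center, 0 on a face or outside the box."""
    mask = 1.0
    for x, lo, hi in zip(point, box_min, box_max):
        d_lo, d_hi = x - lo, hi - x      # distances to the two opposing faces
        if d_lo <= 0 or d_hi <= 0:       # on a face or outside the box
            return 0.0
        mask *= min(d_lo, d_hi) / max(d_lo, d_hi)
    return mask ** (1.0 / 3.0)

# Hypothetical box spanning [-1, 1] on every axis.
box_min, box_max = (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0)
print(centroid_mask((0.0, 0.0, 0.0), box_min, box_max))  # box center
print(centroid_mask((0.5, 0.0, 0.0), box_min, box_max))  # off-center, inside
print(centroid_mask((2.0, 0.0, 0.0), box_min, box_max))  # outside the box
```

The weight decays smoothly from the center toward the faces, which is what lets it act as a soft preference rather than a hard inside/outside label.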

With this, the loss on the score becomes:
[Formula image: centroid-aware sampling loss]
The soft point mask is multiplied with the loss term of the foreground points, so that points close to the center receive a higher probability. Note that bounding boxes are no longer needed at inference: if the model is well trained, we only need to keep the k highest-scoring points after sampling.
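
The weighting described above can be sketched like this. It is an illustration under assumptions, not the paper's exact loss: scores are treated as binary foreground probabilities, the loss is averaged rather than summed, and the function name and numbers are hypothetical. Only the foreground term is scaled by the soft mask, and the background term is left untouched.

```python
import math

def masked_foreground_loss(probs, labels, masks):
    """Binary cross-entropy whose foreground term is scaled by a soft point mask."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for p, y, m in zip(probs, labels, masks):
        total -= m * y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(probs)

# Two foreground points with the same predicted probability (hypothetical values):
# one near the object center (mask 1.0) and one near the box surface (mask 0.2).
near_center = masked_foreground_loss([0.5], [1], [1.0])
near_surface = masked_foreground_loss([0.5], [1], [0.2])
print(near_center, near_surface)
```

A low score on a central point costs more than the same low score on a boundary point, so training pushes scores up fastest near object centers. At inference no mask is computed, and the points are simply ranked by score.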

With the strategies above explained, the comparison between our sampling strategy and other strategies is as follows:
[Figure: comparison of sampling strategies]
2.2 Contextual Instance Centroid Perception

[Figure: contextual instance centroid perception]
We try to exploit contextual cues around the bounding box for centroid prediction. Specifically, we follow VoteNet and explicitly predict the offset to the object center. (This is the aggregation operation shown in the VoteNet figure below: FPS picks k points, and the points around each of them are then aggregated.)
[Figure: VoteNet]

Notably, the paper does not use only the points inside the bounding box, or only the shifted points, to predict the center. The ground-truth bounding box is manually extended, or enlarged proportionally, to cover more relevant context near the object. Offsets are estimated for the sampled points that fall inside the extended box, and those points are then shifted by the estimated offsets.
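
The following sketch shows how offset targets could be built from an enlarged box. It rests on several assumptions that are not from the source: the box is axis-aligned (no yaw), the enlargement `ratio` of 1.5 is an arbitrary illustrative value, and all function names are hypothetical.

```python
def enlarge_size(size, ratio):
    """Scale each box dimension by the same ratio, keeping the center fixed."""
    return tuple(s * ratio for s in size)

def in_box(point, center, size):
    """True if the point lies inside an axis-aligned box given by center and size."""
    return all(abs(p - c) <= s / 2 for p, c, s in zip(point, center, size))

def offset_targets(points, center, size, ratio=1.5):
    """Map point index -> offset to the box center, for points in the enlarged box."""
    big = enlarge_size(size, ratio)
    return {i: tuple(c - p for p, c in zip(pt, center))
            for i, pt in enumerate(points) if in_box(pt, center, big)}

def shift(point, offset):
    """Move a point by an offset (predicted at inference, ground truth here)."""
    return tuple(p + o for p, o in zip(point, offset))

center, size = (0.0, 0.0, 0.0), (2.0, 2.0, 2.0)
points = [(0.5, 0.0, 0.0),   # inside the original box
          (1.2, 0.0, 0.0),   # outside the original box, inside the enlarged one
          (3.0, 0.0, 0.0)]   # outside both
targets = offset_targets(points, center, size)
print(sorted(targets))
print(shift(points[1], targets[1]))
```

The second point shows the effect of the enlargement: it lies just outside the ground-truth box, yet it still receives an offset target and so contributes context to the center estimate.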

2.3 Centroid-based Instance Aggregation
[Figure: centroid-based instance aggregation]
2.4 End-to-End Learning

The proposed method has an advantage in speed, but its detection accuracy still falls short of SOTA.

Copyright notice
This article was written by [Master Ma]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/188/202207062006532710.html
