Deep Hough voting for 3D object detection in point clouds

2022-06-12 21:26:00 【Wield the sword to break the clouds-】

Deep Hough Voting for 3D Object Detection in Point Clouds

PS:

PointNet: its max-pooling operation aggregates the per-point features into a single global feature, which makes it effective for classification.

For segmentation, the global feature is concatenated with the local feature learned earlier for each point, and an MLP then produces a classification result for every point.
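
As a rough illustration of the two ideas above, the following sketch uses NumPy with random, untrained weights. The function names (`shared_mlp`, `pointnet_features`, `segmentation_input`) and the single linear layer standing in for the MLP are assumptions made for this example, not the actual PointNet implementation.

```python
import numpy as np

def shared_mlp(points, weights, bias):
    """Apply the same linear layer + ReLU to every point independently."""
    return np.maximum(points @ weights + bias, 0.0)

def pointnet_features(points, weights, bias):
    """Return (per-point local features, max-pooled global feature)."""
    local = shared_mlp(points, weights, bias)   # (N, C)
    global_feat = local.max(axis=0)             # (C,), independent of point order
    return local, global_feat

def segmentation_input(local, global_feat):
    """Concatenate the global feature onto every point's local feature."""
    tiled = np.broadcast_to(global_feat, local.shape)   # (N, C)
    return np.concatenate([local, tiled], axis=1)       # (N, 2C)

rng = np.random.default_rng(0)
points = rng.normal(size=(128, 3))    # toy point cloud with N = 128
weights = rng.normal(size=(3, 16))    # random weights, purely illustrative
bias = np.zeros(16)

local, global_feat = pointnet_features(points, weights, bias)
# Feed the same points in a different order.
_, shuffled_global = pointnet_features(rng.permutation(points), weights, bias)
per_point = segmentation_input(local, global_feat)
```

Because the maximum is taken over the point axis, reordering the input points should leave `global_feat` unchanged, which is why it works as a descriptor of the whole cloud.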

PointNet++: a supplement and upgrade to PointNet. PointNet is weak at extracting local features, which makes it hard to analyze complex scenes.

PointNet++ borrows the idea of multi-layer receptive fields from CNNs: it first samples the point cloud and partitions it into regions, applies a PointNet to extract features within each small region, and repeats this process over successive layers.

The network structure is as follows:

1. Sampling layer: samples the input points, selecting a number of center points from them. It uses farthest point sampling (FPS) so that the sampled points are spread evenly over the whole point cloud.

2. Grouping layer: uses the center points from the previous layer to divide the point set into several regions.

3. PointNet layer: applies an MLP to the points in each region to extract features, then max-pools them so that each region's features are aggregated onto its sampled center point.
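
The three layers above can be sketched roughly as follows. This is a minimal NumPy illustration, not the PointNet++ implementation: the helper names, the radius of 0.5, and the single random linear layer standing in for the learned MLP are all assumptions.

```python
import numpy as np

def farthest_point_sampling(points, num_centers):
    """Greedily pick indices of points that are far from those already chosen."""
    chosen = [0]                                    # arbitrary starting point
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(num_centers - 1):
        nxt = int(dist.argmax())                    # farthest from everything chosen so far
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(chosen)

def ball_group(points, centers, radius):
    """For each center, return the indices of the points within `radius` of it."""
    d = np.linalg.norm(points[None, :, :] - centers[:, None, :], axis=2)  # (S, N)
    return [np.flatnonzero(row <= radius) for row in d]

def pointnet_layer(points, centers, groups, weights):
    """Linear layer + ReLU on center-relative coordinates, then max pool per region."""
    feats = []
    for center, idx in zip(centers, groups):
        local = points[idx] - center                # coordinates relative to the center
        feats.append(np.maximum(local @ weights, 0.0).max(axis=0))
    return np.stack(feats)

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(512, 3))
weights = rng.normal(size=(3, 32))                  # random weights, purely illustrative

center_idx = farthest_point_sampling(points, num_centers=16)
centers = points[center_idx]
groups = ball_group(points, centers, radius=0.5)
region_features = pointnet_layer(points, centers, groups, weights)  # (16, 32)
```

Each sampled center ends up carrying one feature vector that summarizes its neighborhood, so stacking such layers shrinks the point set while widening the receptive field.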

Within set abstraction, multi-scale feature extraction is used as an optimization: features extracted at small and large scales (different radii) are combined, which improves generalization.
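
A rough sketch of the multi-scale idea for a single center point is shown below. The radii, the feature widths, and the function names are hypothetical, and a random linear layer again stands in for the learned per-scale MLP.

```python
import numpy as np

def region_feature(points, center, radius, weights):
    """Max-pooled feature of the points within `radius` of `center`."""
    local = points - center
    inside = local[np.linalg.norm(local, axis=1) <= radius]
    return np.maximum(inside @ weights, 0.0).max(axis=0)

def multi_scale_feature(points, center, radii, weights_per_scale):
    """Concatenate one pooled feature per radius for a single center."""
    parts = [region_feature(points, center, r, w)
             for r, w in zip(radii, weights_per_scale)]
    return np.concatenate(parts)

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(512, 3))
center = points[0]                 # the center is itself a point, so no ball is empty
radii = [0.2, 0.4, 0.8]            # hypothetical small, medium, and large scales
weights_per_scale = [rng.normal(size=(3, c)) for c in (16, 32, 64)]

feature = multi_scale_feature(points, center, radii, weights_per_scale)  # (112,)
```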

For segmentation, every point needs a semantic label, so the network first has to upsample. This is done by interpolation, using a hierarchical propagation strategy that combines distance-based interpolation with cross-level skip links. Among the many interpolation options, PointNet++ uses an inverse-distance-weighted average over the k nearest neighbors (Equation 2 in the original paper; by default p = 2 and k = 3). In other words, the features of the k nearest points are averaged with weights based on their distance. After interpolation restores the features on the denser point set, they are concatenated with the local features from the corresponding earlier level and passed through further feature propagation. This process repeats until the features have been propagated back to the original point set, at which point semantic segmentation is performed, with better results.
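
The interpolation step can be sketched as below, with k = 3 and p = 2 as in the text. The function name, the `eps` term that guards against division by zero, and the toy shapes are assumptions made for illustration.

```python
import numpy as np

def interpolate_features(dense_xyz, sparse_xyz, sparse_feats, k=3, p=2, eps=1e-8):
    """Inverse-distance-weighted average of the k nearest sparse features."""
    d = np.linalg.norm(dense_xyz[:, None, :] - sparse_xyz[None, :, :], axis=2)  # (N, M)
    nn_idx = np.argsort(d, axis=1)[:, :k]                 # (N, k) nearest sparse points
    nn_dist = np.take_along_axis(d, nn_idx, axis=1)       # (N, k) their distances
    w = 1.0 / (nn_dist ** p + eps)                        # closer points weigh more
    w = w / w.sum(axis=1, keepdims=True)                  # weights sum to 1 per point
    return (w[:, :, None] * sparse_feats[nn_idx]).sum(axis=1)  # (N, C)

rng = np.random.default_rng(0)
sparse_xyz = rng.uniform(size=(8, 3))        # coarse level: 8 points
sparse_feats = rng.normal(size=(8, 4))       # their 4-dimensional features
# Dense level: the 8 coarse points plus 24 additional points.
dense_xyz = np.vstack([sparse_xyz, rng.uniform(size=(24, 3))])
skip_feats = rng.normal(size=(32, 6))        # features saved from the earlier level

upsampled = interpolate_features(dense_xyz, sparse_xyz, sparse_feats)
fused = np.concatenate([upsampled, skip_feats], axis=1)   # (32, 10), input to the next MLP
```

A dense point that coincides with a coarse point gets back essentially that point's feature, since its weight dominates the other two neighbors.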

VoteNet:

What it aims to do:

It builds a 3D detection architecture for point cloud data that is as general as possible.

Background:

The goal of 3D object detection is to localize and recognize objects in a 3D scene. More specifically, the goal of this work is to estimate oriented 3D bounding boxes and semantic classes of objects from point clouds.

However, current 3D object detection methods are heavily influenced by 2D detectors. Some extend 2D detection frameworks such as Faster R-CNN or Mask R-CNN to 3D: they voxelize the irregular point cloud into a regular 3D grid and apply a 3D CNN detector. This fails to exploit the sparsity of the data and incurs a high computational cost because 3D convolutions are expensive.

Others project the point cloud onto regular 2D bird's-eye-view images and then apply a 2D detector to localize objects. However, this sacrifices geometric detail that may be critical in cluttered indoor environments, and the conversion to an image adds computational overhead.

This paper introduces a point-cloud-focused 3D detection framework that processes the raw data directly and does not depend on any 2D detector, either in the architecture or in the object proposals. The detection network, VoteNet, builds on recent advances in 3D deep learning models for point clouds and is inspired by the generalized Hough voting process for object detection.

Problem encountered:

However, because the data is sparse, predicting bounding box parameters directly from the scene points faces a major challenge: the centroid of a 3D object can be far from any surface point, so it is hard to regress accurately in one step.
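
A toy example makes the gap concrete. Assume a hypothetical spherical object of radius 0.5 whose surface is all the sensor sees; the center coordinates and sample count below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
directions = rng.normal(size=(1000, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)  # unit vectors

radius = 0.5
true_center = np.array([1.0, 2.0, 0.5])
surface_points = true_center + radius * directions   # every point lies on the surface

dist_to_center = np.linalg.norm(surface_points - true_center, axis=1)
nearest_gap = dist_to_center.min()   # distance from the center to the closest scanned point
```

Here `nearest_gap` equals the radius up to rounding: no scanned point lies anywhere near the center, so a network that predicts a box from a single point's neighborhood has no data at the location it is trying to describe.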

Solution:

Use Hough voting. First, a number of seed points are sampled from the input point cloud, and each seed votes for the center of the object it belongs to. This yields many vote points, and bounding box proposals are then generated from the vote points. This addresses the inaccuracy that arises when the object center is far from the surface points.
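
The voting step can be sketched as follows. In VoteNet the offsets come from a learned network; here they are replaced by the ideal offset plus small Gaussian noise, which is purely a stand-in. The function name `cast_votes` and all numeric values are assumptions.

```python
import numpy as np

def cast_votes(seed_xyz, predicted_offsets):
    """Each seed votes for an object center: vote = seed position + offset."""
    return seed_xyz + predicted_offsets

rng = np.random.default_rng(0)
true_center = np.array([1.0, 2.0, 0.5])
directions = rng.normal(size=(256, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
seed_xyz = true_center + 0.5 * directions     # seeds on the surface of a toy sphere

# Stand-in for the learned voting module: ideal offset plus small noise.
predicted_offsets = (true_center - seed_xyz) + rng.normal(scale=0.05, size=(256, 3))

votes = cast_votes(seed_xyz, predicted_offsets)
seed_spread = np.linalg.norm(seed_xyz - true_center, axis=1).mean()
vote_spread = np.linalg.norm(votes - true_center, axis=1).mean()
```

The seeds all sit 0.5 away from the center, while the votes gather in a tight cloud around it, which gives the later proposal stage points that actually lie near the object center.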

Network architecture:

First, PointNet++ extracts features from the point cloud of the original scene. To find an object's bounding box, we need to determine the object's center. Because the point cloud only represents object surfaces, the center has to be estimated separately. The Hough voting mechanism produces these candidates: center points that did not exist in the original point cloud data (the votes, from which proposals are generated). With these points, the sampling and grouping layers from PointNet++ are reused: farthest point sampling selects K cluster centers, the space is partitioned into balls around them, and an MLP extracts a feature vector representing each cluster. Finally, a class label is predicted from each vector, along with where the bounding box should be.
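
The clustering of votes into proposals might look roughly like this. The learned MLP head that predicts class and box parameters is replaced here by a plain mean of the votes in each ball, which is only a stand-in; the function names, the radius, and the two toy objects are assumptions.

```python
import numpy as np

def farthest_point_sampling(points, k):
    """Greedily pick k points that are far apart; returns their coordinates."""
    chosen = [0]                                   # arbitrary starting point
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(k - 1):
        nxt = int(dist.argmax())
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return points[chosen]

def propose_centers(votes, k, radius):
    """Group votes into k balls and summarize each ball by its mean vote."""
    cluster_centers = farthest_point_sampling(votes, k)
    proposals = []
    for c in cluster_centers:
        members = votes[np.linalg.norm(votes - c, axis=1) <= radius]
        proposals.append(members.mean(axis=0))     # stand-in for the learned head
    return np.stack(proposals)

rng = np.random.default_rng(0)
object_centers = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]])   # two toy objects
votes = np.concatenate(
    [c + rng.normal(scale=0.05, size=(100, 3)) for c in object_centers]
)

proposals = propose_centers(votes, k=2, radius=0.5)   # one estimated center per object
```

In the real network each cluster's feature vector would also yield an objectness score, box size, orientation, and class, none of which this sketch attempts.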

To be completed...

Copyright notice
This article was written by [Wield the sword to break the clouds-]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/163/202206122124141701.html
