[CVPR 2021] Cylinder3D: Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Point Cloud Segmentation

2022-07-28 22:24:00 【Binary artificial intelligence】

Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation

https://arxiv.org/abs/2011.10033

What the paper does

For LiDAR point cloud segmentation in large-scale outdoor driving scenes, the common approach is to project the point cloud into 2D space and process it with 2D convolutions. Although this performs well, it discards the 3D topology and geometric relationships of the point cloud. One way to reduce this information loss is to voxelize the point cloud into cubic cells and process it with 3D convolutions, but the improvement is quite limited because outdoor point clouds are also sparse and vary in density. This paper proposes Cylinder3D:

Cylindrical partition voxelization of the point cloud, to handle its sparsity and varying density
An asymmetrical 3D convolution network that generates voxel-level outputs
A point-wise refinement module that reduces the label interference caused by voxelization

Cylinder3D ranked first on the SemanticKITTI and nuScenes point cloud segmentation benchmarks, and it extends well to LiDAR panoptic segmentation and 3D detection.

Cylinder3D

Overall framework and components

The overall framework of Cylinder3D is shown in the figure:

There are four main components: the asymmetrical residual block (A), the asymmetrical downsample block (AD), the asymmetrical upsample block (AU), and dimension-decomposition based context modeling (DDCM).

The LiDAR point cloud is first fed into an MLP to obtain point-wise features. The point features are then reassigned according to the cylindrical partition to obtain cylindrical features. Finally, the asymmetrical 3D convolution network (Asym-CNN) and dimension-decomposition based context modeling (DDCM) generate voxel-wise outputs, and a point-wise refinement module (PR) refines these outputs.

Cylindrical Partition

The cylindrical partition pipeline is shown in the figure:

Image source: https://arxiv.org/pdf/2109.05441.pdf

First, the points are converted from Cartesian coordinates to cylindrical coordinates. This step converts a point $(x, y, z)$ to $(\rho, \theta, z)$, which requires computing the radius $\rho$ (the distance from the origin to the point's projection on the x-y plane) and the azimuth $\theta$ (the angle measured from the $x$ axis toward the $y$ axis). The cylindrical partition is then performed along the three dimensions $\rho$, $\theta$, and $z$.
On this basis, the point-wise features produced by the MLP are reassigned to cells according to their coordinates, and max-pooling within each cell yields the cylindrical features.
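
The partition steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names, the grid size, and the range parameters `rho_max`, `z_min`, `z_max` are all assumptions made for the example, and the per-point features stand in for the MLP output.

```python
import math

def cart_to_cyl(x, y, z):
    """Convert a Cartesian point to cylindrical coordinates (rho, theta, z)."""
    rho = math.hypot(x, y)      # distance from the origin in the x-y plane
    theta = math.atan2(y, x)    # azimuth in [-pi, pi], measured from the x axis
    return rho, theta, z

def cell_index(rho, theta, z, grid, rho_max, z_min, z_max):
    """Map cylindrical coordinates to an integer (i_rho, i_theta, i_z) cell index."""
    n_rho, n_theta, n_z = grid

    def to_bin(value, lo, hi, n):
        # Clamp so that points on or beyond the range fall into the edge cells.
        i = int((value - lo) / (hi - lo) * n)
        return min(max(i, 0), n - 1)

    return (to_bin(rho, 0.0, rho_max, n_rho),
            to_bin(theta, -math.pi, math.pi, n_theta),
            to_bin(z, z_min, z_max, n_z))

def cylinder_features(points, feats, grid, rho_max, z_min, z_max):
    """Max-pool per-point feature vectors over the points that share a cell."""
    cells = {}
    for (x, y, z), f in zip(points, feats):
        idx = cell_index(*cart_to_cyl(x, y, z), grid, rho_max, z_min, z_max)
        if idx in cells:
            cells[idx] = [max(a, b) for a, b in zip(cells[idx], f)]
        else:
            cells[idx] = list(f)
    return cells

# Hypothetical toy input: three points with 2-dimensional features.
points = [(1.2, 0.1, 0.1), (1.3, 0.2, 0.2), (-3.0, 0.5, -0.5)]
feats = [[0.2, 0.9], [0.7, 0.1], [0.4, 0.4]]
cells = cylinder_features(points, feats, grid=(4, 8, 2),
                          rho_max=4.0, z_min=-1.0, z_max=1.0)
print(cells)  # the first two points share a cell; their features are max-pooled
```

Only non-empty cells are stored here, in a dictionary keyed by cell index, which is a convenient way to hold a sparse grid in a toy example.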

After these steps, unrolling the cylinder from 0 degrees gives the 3D cylindrical representation $\mathbb{R} \in C \times H \times W \times L$, where $C$ is the feature dimension and $H$, $W$, $L$ are the radius, azimuth, and height dimensions. The subsequent asymmetrical 3D convolution network operates on this representation.

Points become sparser in regions farther from the origin. In cylindrical coordinates, the farther a region is, the larger its cells. This makes the point distribution across cells more even under the cylindrical partition than under the cubic partition:

The following figure shows how the proportion of non-empty cells varies with distance for the two partition methods. The non-empty proportion of the cylindrical partition is higher than that of the cubic partition, and the gap widens as the distance increases:

In addition, unlike methods that project points to a 2D view, the cylindrical partition preserves, to some extent, the 3D topology and geometric relationships of the point cloud.
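
The growth of cell size with distance noted above follows directly from the geometry. As a worked check, take a cell spanning radii $\rho_1$ to $\rho_2$ with angular width $\Delta\theta$ and height $\Delta z$ (symbols introduced here for illustration, with $\Delta\rho = \rho_2 - \rho_1$ and $\bar\rho = (\rho_1 + \rho_2)/2$). Its volume is

$V = \int_{\rho_1}^{\rho_2} \rho \, d\rho \cdot \Delta\theta \cdot \Delta z = \tfrac{1}{2}(\rho_2^2 - \rho_1^2)\,\Delta\theta\,\Delta z = \bar\rho\,\Delta\rho\,\Delta\theta\,\Delta z$

For fixed bin widths the volume is proportional to the mean radius $\bar\rho$, so a cell ten times farther out covers ten times the volume, whereas a cubic cell has the same volume everywhere.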

Asymmetrical 3D Convolution Network (Asym-CNN)

Asymmetrical Residual Block

Inspired by ACNet, Cylinder3D builds its residual blocks from asymmetrical 3D convolutions. Taking a car and a motorcycle as examples, the figure below shows the asymmetrical 3D convolutions of the residual block acting on cylindrical cells:

The asymmetrical residual block matches the point distribution of cuboid-like objects (cars, trucks, buses, motorcycles) well, and it also reduces computation and memory overhead.
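
The saving in computation and memory can be illustrated by counting weights. The sketch below assumes the asymmetrical pair is a 3×1×3 kernel followed by a 1×3×3 kernel, compared with a single 3×3×3 kernel; these shapes and the channel width are assumptions for the example and are not stated in this post.

```python
def conv3d_weights(kernel, c_in, c_out):
    """Number of weights in a 3D convolution layer (bias ignored)."""
    kh, kw, kl = kernel
    return kh * kw * kl * c_in * c_out

c = 32  # hypothetical channel width
full = conv3d_weights((3, 3, 3), c, c)
asym = conv3d_weights((3, 1, 3), c, c) + conv3d_weights((1, 3, 3), c, c)
print(full, asym)  # 27648 18432
```

Under these assumptions the asymmetrical pair uses 18 weights per input-output channel pair instead of 27, one third fewer, while still covering a 3×3×3 neighbourhood once the two layers are stacked.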

Cylinder3D builds its downsample and upsample blocks from asymmetrical residual blocks:

It then stacks multiple downsample and upsample blocks to build the 3D convolution network:

Dimension-Decomposition Based Context Modeling (DDCM)

DDCM uses three rank-1 convolution kernels to extract low-rank features and then aggregates them into a combined feature of size $C_{class} \times H \times W \times L$.
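
One way to read the "rank-1" wording is through standard tensor (CP) decomposition, in which a 3D tensor is approximated by a sum of outer products of vectors. The symbols $A$, $R$, $\lambda_r$, $u_r$, $v_r$, $w_r$ below are notation assumed for this sketch and are not taken from this post:

$A \approx \sum_{r=1}^{R} \lambda_r \, u_r \otimes v_r \otimes w_r, \quad u_r \in \mathbb{R}^{H},\ v_r \in \mathbb{R}^{W},\ w_r \in \mathbb{R}^{L}$

Each term needs only $H + W + L$ numbers instead of $H \times W \times L$, which suggests why modeling context one dimension at a time is cheap compared with a dense 3D kernel.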

Point-wise Refinement Module (PR)

Voxel-based methods (both cubic and cylindrical partitions) predict one label per cell. Although this lets the network cover a large range of the point cloud efficiently, points of different categories inevitably end up in the same cell, which causes information loss.

The following figure shows the effect of different label encoding methods:

Majority encoding uses the most frequent category among the points in a cell as the cell label, while minority encoding uses the least frequent category. Ideally, every point keeps its label after the partition, so the mIoU against the original point labels would be 100%. In practice, neither majority encoding nor minority encoding reaches 100% mIoU.
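
The label loss described above is easy to reproduce on a toy example. The sketch below uses hypothetical cells and measures plain point accuracy rather than mIoU, purely to keep the example short; the function names are made up for illustration.

```python
from collections import Counter

def encode_cell(labels, mode="majority"):
    """Pick one label for a cell from the labels of the points inside it."""
    counts = Counter(labels).most_common()  # sorted by count, descending
    return counts[0][0] if mode == "majority" else counts[-1][0]

def point_accuracy(cells, mode):
    """Fraction of points whose own label equals the label given to their cell."""
    total = sum(len(labels) for labels in cells)
    kept = sum(labels.count(encode_cell(labels, mode)) for labels in cells)
    return kept / total

# Hypothetical cells, each listing the ground-truth labels of its points.
cells = [["car", "car", "road"],
         ["road", "road"],
         ["person", "car", "car", "car"]]

print(point_accuracy(cells, "majority"))  # 7 of 9 points keep their label
print(point_accuracy(cells, "minority"))  # 4 of 9 points keep their label
```

Any cell that mixes categories loses some point labels under either encoding, so neither can reach a perfect score.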

Cylinder3D therefore introduces a point-wise refinement module to mitigate the label interference caused by this lossy encoding.

First, the cylindrical features are projected back to each point using the inverse point-voxel mapping table (points in the same cell are assigned the same feature). Then the point features from before and after the 3D convolution network are fused and fed into the point-wise refinement module, which refines the output.
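
The back-projection and fusion steps can be sketched as follows. This is a minimal illustration under assumptions: the mapping table is represented as a list holding each point's cell index, fusion is taken to be concatenation, and the refinement network itself is left out. All names are hypothetical.

```python
def gather_voxel_features(point_to_cell, cell_feats):
    """Copy each cell's feature vector back to every point mapped to that cell."""
    return [cell_feats[idx] for idx in point_to_cell]

def fuse(feats_before, feats_after):
    """Concatenate per-point features from before and after the 3D network."""
    return [b + a for b, a in zip(feats_before, feats_after)]

# Hypothetical data: three points, two of which fall in the same cell.
point_to_cell = [(0, 0, 0), (0, 0, 0), (1, 2, 0)]
cell_feats = {(0, 0, 0): [1.0, 2.0], (1, 2, 0): [3.0, 4.0]}
mlp_feats = [[0.1], [0.2], [0.3]]  # stand-in for the point-wise MLP output

fused = fuse(mlp_feats, gather_voxel_features(point_to_cell, cell_feats))
print(fused)  # [[0.1, 1.0, 2.0], [0.2, 1.0, 2.0], [0.3, 3.0, 4.0]]
```

The first two points receive identical voxel features, so only the point-wise part of the fused vector can tell them apart, which is what allows a per-point classifier to correct labels inside a mixed cell.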

Objective function

The final objective function combines a voxel-wise loss $L_{voxel}$ and a point-wise loss $L_{point}$:

$L=L_{voxel}+L_{point}$

$L_{voxel}$ is the weighted cross-entropy loss plus the Lovász-softmax loss, while $L_{point}$ uses only the weighted cross-entropy loss.
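
For reference, the weighted cross-entropy term can be sketched in plain Python. The normalisation by the sum of the weights is one common convention and is an assumption here, since the post does not specify it. The Lovász-softmax term is omitted.

```python
import math

def weighted_cross_entropy(logits, targets, class_weights):
    """Class-weighted cross-entropy, normalised by the sum of the applied weights."""
    total, weight_sum = 0.0, 0.0
    for z, t in zip(logits, targets):
        m = max(z)
        # Numerically stable log-sum-exp of the logits.
        log_norm = m + math.log(sum(math.exp(v - m) for v in z))
        nll = log_norm - z[t]  # negative log-probability of the true class
        total += class_weights[t] * nll
        weight_sum += class_weights[t]
    return total / weight_sum

# Hypothetical two-class example: class 1 is rare, so it gets a larger weight.
logits = [[2.0, 0.0], [0.0, 0.0]]
targets = [0, 1]
uniform = weighted_cross_entropy(logits, targets, [1.0, 1.0])
weighted = weighted_cross_entropy(logits, targets, [1.0, 3.0])
print(uniform, weighted)
```

Raising the weight of class 1 makes the poorly predicted second sample dominate, so the weighted loss is larger than the uniform one. In the objective above, a term of this kind would be evaluated on voxel predictions for $L_{voxel}$ and on point predictions for $L_{point}$.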