[Point Cloud Paper Crazy-Reading, Classic Edition 9] — Pointwise Convolutional Neural Networks
2022-07-03 09:08:00 【LingbinBu】
Pointwise-CNN: Pointwise Convolutional Neural Networks
Abstract
- Problem: deep learning on 3D data has attracted wide attention, but CNN-based learning directly on point clouds has received little notice.
- Method: this paper presents a convolutional neural network for point cloud semantic segmentation and object recognition.
- Technical details: a point-wise convolution operator that can be applied at every point, yielding a fully convolutional network.
- Code: https://github.com/hkust-vgd/pointwise (TensorFlow)

Related work
What is the difference between equivariance and invariance? Roughly, a function is equivariant if transforming its input transforms its output in the same way, and invariant if transforming its input leaves the output unchanged.

Pointwise Convolution
Convolution
A convolution kernel is centered at each point of the point cloud. The neighboring points inside the kernel contribute to the center point. Each kernel has a size (radius), and the number of neighboring points adapts at each convolutional layer. Pointwise convolution can be expressed as:
$$x_{i}^{\ell}=\sum_{k} w_{k} \frac{1}{\left|\Omega_{i}(k)\right|} \sum_{p_{j} \in \Omega_{i}(k)} x_{j}^{\ell-1}$$
where $k$ runs over all sub-domains of the kernel support, $\Omega_{i}(k)$ is the $k$-th sub-domain of the kernel centered at point $i$, $p_{i}$ is the coordinate of point $i$, $|\cdot|$ denotes the number of points in a sub-domain, $w_{k}$ is the kernel weight of the $k$-th sub-domain, $x_{i}$ and $x_{j}$ are the values at points $i$ and $j$, and $\ell-1$ and $\ell$ index the input and output layers.
Figure 1 makes this easier to understand: the region around the center point is divided into a grid; the features within each grid cell are summed and then normalized by density (the number of points in the cell), then multiplied by that cell's convolution weight to give the cell's contribution; summing the contributions of all cells yields the new feature.
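The per-point computation above can be sketched in a few lines of NumPy. This is an illustrative re-implementation, not the authors' TensorFlow code; all names are our own, and it assumes a single scalar feature per point and a 3×3×3 kernel grid.

```python
import numpy as np

def pointwise_conv(points, feats, weights, radius):
    """points: (N, 3) coordinates; feats: (N,) scalar features;
    weights: (27,) one weight per 3x3x3 kernel cell; radius: kernel half-size."""
    out = np.zeros(len(points))
    cell = 2.0 * radius / 3.0          # edge length of one sub-domain
    for i in range(len(points)):
        d = points - points[i]         # offsets relative to the kernel center
        inside = np.all(np.abs(d) < radius, axis=1)
        # map each in-kernel point to one of the 27 sub-domains
        idx = np.clip(np.floor((d[inside] + radius) / cell).astype(int), 0, 2)
        k = idx[:, 0] * 9 + idx[:, 1] * 3 + idx[:, 2]
        for cid in np.unique(k):
            members = feats[inside][k == cid]
            # density normalization: average within the cell, then weight
            out[i] += weights[cid] * members.mean()
    return out
```

Note the density normalization (dividing by the cell's point count) before applying the weight, matching the equation above.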
Gradient backpropagation
To make pointwise convolution trainable, the gradients with respect to the input data and the kernel weights must be computed. Let $L$ be the loss function; the gradient with respect to the input is defined as:
$$\frac{\partial L}{\partial x_{j}^{\ell-1}}=\sum_{i \in \Omega_{j}} \frac{\partial L}{\partial x_{i}^{\ell}} \frac{\partial x_{i}^{\ell}}{\partial x_{j}^{\ell-1}}$$
where, for a given point $j$, we traverse all of its neighboring points $i$. Following the chain rule, $\partial L / \partial x_{i}^{\ell}$ is the gradient back-propagated from layer $\ell$, and $\partial x_{i}^{\ell} / \partial x_{j}^{\ell-1}$ can be written as:
$$\frac{\partial x_{i}^{\ell}}{\partial x_{j}^{\ell-1}}=\sum_{k} w_{k} \frac{1}{\left|\Omega_{i}(k)\right|} \sum_{p_{j} \in \Omega_{i}(k)} 1$$
Similarly, the gradient with respect to the kernel weights is defined by traversing all points $i$:
$$\frac{\partial L}{\partial w_{k}}=\sum_{i} \frac{\partial L}{\partial x_{i}^{\ell}} \frac{\partial x_{i}^{\ell}}{\partial w_{k}}$$
where:
$$\frac{\partial x_{i}^{\ell}}{\partial w_{k}}=\frac{1}{\left|\Omega_{i}(k)\right|} \sum_{p_{j} \in \Omega_{i}(k)} x_{j}^{\ell-1}$$
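As a sanity check on the weight-gradient formula, the sketch below compares the analytic gradient $\partial x_{i}^{\ell} / \partial w_{k}$ (the density-normalized mean of the features in sub-domain $k$) against a central finite-difference estimate on a toy two-cell kernel. All names are illustrative.

```python
import numpy as np

def forward(w, cell_feats):
    """x_i = sum_k w_k * (density-normalized mean of features in cell k)."""
    return sum(w[k] * f.mean() for k, f in enumerate(cell_feats))

# toy kernel with two occupied sub-domains
cell_feats = [np.array([1.0, 3.0]), np.array([2.0])]
w = np.array([0.5, -1.0])

# analytic gradient (taking L = x_i, so dL/dx_i = 1): the mean of each cell
analytic = np.array([f.mean() for f in cell_feats])

# central finite-difference estimate of the same gradient
eps = 1e-6
numeric = np.array([
    (forward(w + eps * np.eye(2)[k], cell_feats)
     - forward(w - eps * np.eye(2)[k], cell_feats)) / (2 * eps)
    for k in range(2)
])
```

Because the forward map is linear in $w$, the two estimates agree to numerical precision.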
This paper uses convolution kernels of size $3 \times 3 \times 3$, and all points falling in the same kernel cell share the same weight.
Unlike convolution on voxel volumes, the network here uses no pooling, for two reasons:
- There is no need to down-sample and up-sample the point cloud; once the point cloud is mapped to a high-dimensional space, down-sampling and up-sampling become troublesome.
- The neighbor search only needs to be built once.
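The second point can be sketched as follows: since the point coordinates never change between layers, the neighbor lists can be computed once up front and reused by every convolution layer. This is a brute-force NumPy illustration with our own names; a real implementation would use a spatial index such as a grid or k-d tree.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.random((200, 3))   # coordinates are fixed for the whole network
radius = 0.2

# squared pairwise distances, computed once
d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
# neighbor lists reused by every layer (each point is its own neighbor)
neighbors = [np.flatnonzero(row <= radius ** 2) for row in d2]
```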
Point order
In PointNet, the input point cloud is unordered, and the subsequent processing learns symmetric functions to handle it.
In this approach, the input point cloud is given a specific order, either XYZ sorting or a Morton curve. In object recognition, the order of the points affects the final global feature vector; in semantic segmentation, the order does not matter.
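A minimal sketch of Morton (Z-order) ordering, assuming coordinates quantized to 10 bits per axis; the bit-interleaving helper is a standard trick, and all names here are our own.

```python
import numpy as np

def part1by2(x):
    """Spread the low 10 bits of x, leaving two zero bits between each."""
    x &= 0x3FF
    x = (x | (x << 16)) & 0x30000FF
    x = (x | (x << 8)) & 0x300F00F
    x = (x | (x << 4)) & 0x30C30C3
    x = (x | (x << 2)) & 0x9249249
    return x

def morton_code(ix, iy, iz):
    """Interleave the bits of three 10-bit integer coordinates."""
    return (part1by2(iz) << 2) | (part1by2(iy) << 1) | part1by2(ix)

# quantize each axis to 10 bits, then sort points by Morton code so that
# spatially close points end up adjacent in memory
pts = np.array([[0.9, 0.1, 0.1], [0.1, 0.1, 0.1], [0.12, 0.1, 0.1]])
q = (pts * 1023).astype(int)
order = np.argsort([morton_code(*p) for p in q])
```

Sorting by `order` places the two nearby points (indices 1 and 2) next to each other, which is exactly the memory-locality benefit noted in the ablation below.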
Atrous convolution
Introducing a stride parameter expands the kernel size, and hence the receptive field, without requiring the convolution to process too many points. The experiments show this significantly improves speed without sacrificing accuracy.
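The effect of the stride (dilation) parameter is easiest to see in 1-D: a 3-tap kernel with dilation $d$ covers a span of $2d+1$ samples while still reading only 3 of them. A toy sketch, illustrative rather than the paper's code:

```python
import numpy as np

def dilated_conv1d(x, w, d):
    """Valid 1-D convolution of signal x with a 3-tap kernel w at dilation d.
    The receptive field grows with d, but each output still reads 3 samples."""
    n = len(x) - 2 * d
    return np.array([w[0] * x[i] + w[1] * x[i + d] + w[2] * x[i + 2 * d]
                     for i in range(n)])
```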
Point attributes
To ease the implementation of the convolution operator, the point coordinates are stored separately from the other attributes (color, normal vectors, or high-dimensional features output by a previous layer). No matter how deep the network gets, the point coordinates remain available for neighbor search.
Experiments
Semantic segmentation
S3DIS dataset:



SceneNN dataset:


Object recognition




Convergence

Ablation experiments

Point order & Neighborhood radius

The Morton curve ordering keeps spatially close points close together in memory.
Deeper networks

Running time
On an Intel Core i7 6900K with 16 threads:
- forward convolution: 1.272 seconds
- backward propagation: 2.423 seconds

An NVIDIA TITAN X is about 10% faster.
Layer visualization
