[Point Cloud Processing Paper Crazy Reading, Cutting-Edge Edition 12] - Adaptive Graph Convolution for Point Cloud Analysis

2022-07-03 09:14:00 【LingbinBu】

Adaptive Graph Convolution for Point Cloud Analysis

Abstract
Introduction
Method
Adaptive graph convolution
- Feature decisions
- Network architecture
Experiment
Visualization and learned features

Abstract

Problem: Standard convolution cannot distinguish the feature correspondences between points in 3D.
Method: This paper proposes Adaptive Graph Convolution (AdaptConv), which generates adaptive kernels for points according to their dynamically learned features.
1. Compared with fixed/isotropic kernels, AdaptConv improves the flexibility of point cloud convolution and captures the diverse relations between points from different semantic parts effectively and precisely.
2. Unlike methods that use attention weights, AdaptConv makes the convolution operation itself adaptive, rather than simply assigning different weights to neighboring points.
Code: PyTorch version

Introduction

Graph CNNs represent a point cloud as graph data according to the spatial/feature similarity between points, and generalize 2D convolution on images to 3D data.

Standard graph CNNs usually use a shared weight function to extract the edge feature of each pair of points. This yields a fixed/isotropic convolution kernel that is applied identically to all point pairs, so the different feature correspondences between them are ignored.

The key contribution of this work is that AdaptConv is employed inside the graph convolution itself, rather than as a weight function based on the resulting features.

In addition, several choices for the feature convolution design are explored, which gives the adaptive convolution more flexibility.

Method

Adaptive graph convolution

Let $\mathcal{X}=\left\{x_{i} \mid i=1,2, \ldots, N\right\} \in \mathbb{R}^{N \times 3}$ denote the input point cloud and $\mathcal{F}=\left\{f_{i} \mid i=1,2, \ldots, N\right\} \in \mathbb{R}^{N \times D}$ the corresponding features, where $x_{i}$ is the $(\mathbf{x}, \mathbf{y}, \mathbf{z})$ coordinate of the $i$-th point. In other cases, the coordinates can also be combined with additional point attributes.

A directed graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$ is then computed from the given point cloud, where $\mathcal{V}=\{1, \ldots, N\}$ and $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ denote the sets of vertices and edges. The graph is constructed with $k$-nearest neighbors (KNN), including self-loops.
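The graph construction above can be sketched in a few lines. This is a minimal brute-force illustration using only the Python standard library, not the authors' PyTorch implementation; the function name `knn_graph` and the sample points are hypothetical.

```python
import math

def knn_graph(points, k):
    """Return neighbors[i]: indices of the k nearest points to point i.

    The point itself is always listed first, which gives the self-loop (i, i).
    Brute force, O(N^2 log N); fine for illustration only.
    """
    neighbors = []
    for i, p in enumerate(points):
        # Sort by distance; the second key puts i ahead of any duplicate point.
        order = sorted(range(len(points)),
                       key=lambda j: (math.dist(p, points[j]), j != i))
        neighbors.append(order[:k])
    return neighbors

# Hypothetical toy point cloud with N = 4 points.
points = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5)]
graph = knn_graph(points, k=2)
print(graph)  # [[0, 1], [1, 0], [2, 0], [3, 2]]
```

Each entry `graph[i]` plays the role of $\mathcal{N}(i)$, and every pair `(i, j)` with `j in graph[i]` is a directed edge in $\mathcal{E}$.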

Given the $D$-dimensional input features, an AdaptConv layer produces a new set of $M$-dimensional features with the same number of points as the input. Compared with previous graph convolution layers, it reflects local structural characteristics more accurately.

Let $x_{i}$ be the central point of the graph convolution and $\mathcal{N}(i)=\{j:(i, j) \in \mathcal{E}\}$ the index set of its neighboring points. Because point clouds are irregular, previous methods usually apply a fixed kernel function to all neighbors of $x_{i}$ to capture the geometric information of the patch. However, different neighbors may have different feature correspondences with $x_{i}$, especially when $x_{i}$ lies in a salient region such as a corner or an edge. In that case, a fixed kernel may prevent the graph convolution from obtaining the geometric representation needed for classification or segmentation.

This paper instead designs an adaptive kernel that captures the distinctive relationship between each pair of points. For each channel of the $M$-dimensional output feature, AdaptConv dynamically generates a kernel by applying a function to the point features $\left(f_{i}, f_{j}\right)$:
$\hat{e}_{i j m}=g_{m}\left(\Delta f_{i j}\right), j \in \mathcal{N}(i) .$
where $m=1,2, \ldots, M$ indexes one of the $M$ output dimensions, each corresponding to a separate filter. $\Delta f_{i j}=\left[f_{i}, f_{j}-f_{i}\right]$ combines the global structure with local neighborhood features, $[\cdot, \cdot]$ is the concatenation operation, and $g(\cdot)$ is the feature mapping function, namely an MLP.

As in 2D convolution, where the $D$-dimensional input is convolved with the corresponding filter weights to obtain one of the $M$ output dimensions, the adaptive kernel is convolved with the corresponding point pair $\left(x_{i}, x_{j}\right)$:
$h_{i j m}=\sigma\left\langle\hat{e}_{i j m}, \Delta x_{i j}\right\rangle,$
where $\Delta x_{i j}$ is similarly defined as $\left[x_{i}, x_{j}-x_{i}\right]$, $\langle\cdot, \cdot\rangle$ is the inner product of two vectors, the output is $h_{i j m} \in \mathbb{R}$, and $\sigma$ is a nonlinear activation function.

As shown in Fig. 2, the $m$-th adaptive kernel $\hat{e}_{i j m}$ is combined with the spatial relation $\Delta x_{i j}$ of the corresponding point $x_{j} \in \mathbb{R}^{3}$. The kernel size must therefore match $\Delta x_{i j}$ in the inner product, so the feature mapping is $g_{m}: \mathbb{R}^{2 D} \rightarrow \mathbb{R}^{6}$. Stacking $h_{i j m}$ over all channels gives the edge feature between the connected points $\left(x_{i}, x_{j}\right)$: $h_{i j}=\left[h_{i j 1}, h_{i j 2}, \ldots, h_{i j M}\right] \in \mathbb{R}^{M}$.
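To make the shapes concrete, the edge feature of a single point pair could be computed roughly as follows. This is a simplified sketch under stated assumptions: each $g_{m}$ is reduced to a single linear layer (the paper uses an MLP), $\sigma$ is taken to be a leaky ReLU with slope 0.2, and all names (`concat_relation`, `edge_feature`, `kernels`) and sizes are hypothetical.

```python
import random

def concat_relation(center, neighbor):
    """Build [a_i, a_j - a_i]; used for both features and coordinates."""
    return list(center) + [n - c for c, n in zip(center, neighbor)]

def linear(weights, vec):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def leaky_relu(value, slope=0.2):
    # Assumed activation; the text only says "nonlinear activation function".
    return value if value > 0 else slope * value

def edge_feature(x_i, x_j, f_i, f_j, kernels):
    """Compute h_ij in R^M for one edge.

    kernels[m] is the 6 x 2D weight matrix of a hypothetical one-layer g_m.
    """
    delta_f = concat_relation(f_i, f_j)  # in R^{2D}
    delta_x = concat_relation(x_i, x_j)  # in R^6
    h_ij = []
    for g_m in kernels:
        e_hat = linear(g_m, delta_f)     # adaptive kernel for channel m, in R^6
        dot = sum(e * d for e, d in zip(e_hat, delta_x))
        h_ij.append(leaky_relu(dot))     # scalar h_ijm
    return h_ij

# Hypothetical sizes: D = 4 input feature channels, M = 3 output channels.
rng = random.Random(0)
D, M = 4, 3
kernels = [[[rng.uniform(-1, 1) for _ in range(2 * D)] for _ in range(6)]
           for _ in range(M)]
f_i = [rng.uniform(-1, 1) for _ in range(D)]
f_j = [rng.uniform(-1, 1) for _ in range(D)]
h_ij = edge_feature((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), f_i, f_j, kernels)
print(len(h_ij))  # 3
```

The kernel `e_hat` depends on the features of the specific pair, so two neighbors at the same relative position can still produce different edge features. That is the difference from a fixed kernel.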

Finally, the output feature of the central point $x_{i}$ is obtained by applying an aggregation function over all edge features in its neighborhood:
$f_{i}^{\prime}=\max _{j \in \mathcal{N}(i)} h_{i j},$
where $\max$ is a channel-wise max-pooling function. Overall, the convolution weights of AdaptConv are defined as $\Theta=\left(g_{1}, g_{2}, \ldots, g_{M}\right)$.
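The channel-wise max aggregation is simple enough to show on a toy neighborhood. The sketch below takes the edge features as given; the function name `aggregate_max` and the numbers are hypothetical.

```python
def aggregate_max(edge_features):
    """Channel-wise max over the edge features h_ij, j in N(i).

    edge_features: list of k vectors, each with M channels.
    Returns the M-dimensional output feature of the central point.
    """
    return [max(channel) for channel in zip(*edge_features)]

# Three neighbors (k = 3), three output channels (M = 3).
h_neighbors = [
    [0.1, -0.5, 2.0],
    [0.7, -0.2, 1.0],
    [0.3, -0.9, 1.5],
]
f_i_new = aggregate_max(h_neighbors)
print(f_i_new)  # [0.7, -0.2, 2.0]
```

Because the maximum is taken per channel, the result does not depend on the order of the neighbors, which suits unordered point sets.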

Feature decisions

The default design uses point features to generate the kernel and then applies that kernel to the spatial relations $\Delta x_{i j}$.

If the input $x_i \in \mathbb{R}^E$ carries more information than the coordinates alone, convolving with it is another option, which is considered in the experiments.

Replacing the spatial information $\Delta x_{i j}$ with the feature information $\Delta f_{i j}$ in the convolution yields a different kernel $\hat{e}_{i j m}$; this is also an option to consider.

This paper chooses $\Delta x_{i j}$ as the domain to convolve for the following reasons:

1. Point features have already been used to generate the adaptive kernel, so convolving with features again would make the feature information redundant.
2. Features are high-dimensional, and it is difficult for an MLP to learn in a high-dimensional space.
3. Memory consumption and computational complexity would be high.
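A rough count illustrates the memory argument. Since the kernel must match the vector it is multiplied with, each edge and output channel needs a kernel of size 6 when convolving with $\Delta x_{i j}$, and of size $2D$ when convolving with $\Delta f_{i j}$. The sketch below only counts kernel entries; the sizes are hypothetical and not taken from the paper.

```python
def kernel_elements(num_points, k, out_channels, relation_dim):
    """Count adaptive-kernel entries: one kernel of size relation_dim
    per edge (num_points * k edges) and per output channel."""
    return num_points * k * out_channels * relation_dim

# Hypothetical layer sizes, chosen only for illustration.
N, K, M, D = 1024, 20, 64, 64

spatial = kernel_elements(N, K, M, 6)      # convolve with delta x_ij in R^6
feature = kernel_elements(N, K, M, 2 * D)  # convolve with delta f_ij in R^{2D}

print(spatial)  # 7864320
print(feature)  # 167772160
```

With these sizes the feature-domain variant needs $2D/6 \approx 21$ times as many kernel entries per layer, and the gap widens as $D$ grows.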