2022-07-03 09:09:00 【LingbinBu】
KCNet: Mining Point Cloud Local Structures by Kernel Correlation and Graph Pooling
Abstract
- Problem: the unordered nature of point clouds remains challenging for semantic learning. Among existing methods, PointNet achieves strong results by learning directly on the point set, but it does not fully exploit the local neighborhood information of the point cloud. That neighborhood information carries fine-grained structural cues that can benefit semantic learning.
- Method: two new operations are proposed to improve how efficiently PointNet explores local structures.
- Technical details:
  ① The first operation focuses on local 3D geometric structure: a point-set kernel is defined as a set of learnable 3D points that jointly respond to a group of neighboring points, and kernel correlation measures the geometric affinity between the kernel and those neighbors.
  ② The second operation exploits local high-dimensional feature structure: a nearest-neighbor graph is computed from the 3D positions of the points, and feature aggregation is then applied recursively on that graph.
- Code:
  ① http://www.merl.com/research/license#KCNet
  ② https://github.com/ftdlyc/KCNet_Pytorch (PyTorch version; it may not work and contains only part of the code)

Introduction
- PointNet directly aggregates the features of all points into a global feature, which indicates that it does not fully exploit the fine-grained patterns of local point structures: each point's MLP output is only a coarse encoding of whether a 3D point falls in a particular nonlinear partition of 3D space.
- If the MLP could encode not only "whether" a point exists in a nonlinear partition of 3D space, but also "what type" of point it is (e.g., corner vs. plane, convex vs. concave), we would obtain the more discriminative representation we are after.
- Such "type" information must be learned from the local neighborhood of each point on the object surface.
- Main contributions:
  - kernel correlation layer ——> mines local geometric structure
  - graph-based pooling layer ——> mines local feature structure

Method
Learning on Local Geometric Structure
In this paper, a local neighborhood of the point cloud serves as the source and a learnable kernel (a set of points) serves as the reference, so that the kernel represents a local structure/shape type. In the point cloud registration task, the points in both source and reference are fixed; here, the reference points are allowed to change and, through learning (back-propagation), move to "fit" the local shape better. Unlike registration, we want the reference to be learned in a transformation-free way rather than through an optimal transformation. Kernel points learned this way behave like a convolution kernel: they respond to each point in the neighborhood and capture local geometry within a receptive field determined by the kernel function and the kernel width. Under this configuration, learning can be viewed as finding the set of reference points that most effectively encodes useful local geometric structures, jointly with the other network parameters, to achieve the best learning performance.
Specifically, the kernel correlation (KC) between a learnable point-set kernel $\boldsymbol{\kappa}$ with $M$ points and the current anchor point $\mathbf{x}_{i}$ of a point cloud with $N$ points is defined as:

$$\mathrm{KC}\left(\boldsymbol{\kappa}, \mathbf{x}_{i}\right)=\frac{1}{|\mathcal{N}(i)|} \sum_{m=1}^{M} \sum_{n \in \mathcal{N}(i)} \mathrm{K}_{\sigma}\left(\boldsymbol{\kappa}_{m}, \mathbf{x}_{n}-\mathbf{x}_{i}\right)$$

where $\boldsymbol{\kappa}_{m}$ is the $m$-th learnable point of the kernel, $\mathcal{N}(i)$ is the neighborhood index set of the anchor point $\mathbf{x}_{i}$, $\mathbf{x}_{n}$ is one of the neighbors of $\mathbf{x}_{i}$, and $\mathrm{K}_{\sigma}(\cdot, \cdot): \Re^{D} \times \Re^{D} \rightarrow \Re$ is any valid kernel function. To store each point's local neighborhood efficiently, we treat each point as a vertex and connect only adjacent vertices with edges, thereby constructing a K-nearest-neighbor graph (KNNG).
Without loss of generality, this paper chooses the Gaussian kernel:

$$\mathrm{K}_{\sigma}(\mathbf{k}, \boldsymbol{\delta})=\exp \left(-\frac{\|\mathbf{k}-\boldsymbol{\delta}\|^{2}}{2 \sigma^{2}}\right)$$

where $\|\cdot\|$ denotes the Euclidean distance between two points and $\sigma$ is the kernel width, which controls how strongly the distance between points influences the response. A nice property of the Gaussian kernel is that its response decays exponentially with the distance between two points, and it yields a differentiable soft assignment between each kernel point and the anchor point's neighbors. KC encodes the distances between kernel points and neighboring points, rewarding shape similarity between the two point sets. Note that a kernel width that is either too large or too small leads to unsatisfactory results; it can be chosen by experiment.
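As an illustration, the KC formula with the Gaussian kernel can be sketched in NumPy. This is a minimal, hypothetical implementation (not the authors' released code); the neighborhood index lists are assumed to have been precomputed from the KNNG:

```python
import numpy as np

def kernel_correlation(kappa, points, neighbors_idx, i, sigma=0.005):
    """KC(kappa, x_i) with a Gaussian kernel.

    kappa:         (M, D) learnable kernel points
    points:        (N, D) point cloud
    neighbors_idx: list of index arrays; neighbors_idx[i] = N(i)
    i:             anchor point index
    """
    x_i = points[i]
    nbrs = points[neighbors_idx[i]]        # (|N(i)|, D) neighbor coordinates
    delta = nbrs - x_i                     # neighbor positions relative to the anchor
    # pairwise squared distances between kernel points and deltas: (M, |N(i)|)
    d2 = ((kappa[:, None, :] - delta[None, :, :]) ** 2).sum(-1)
    # sum Gaussian responses over all (kernel point, neighbor) pairs, normalize
    return np.exp(-d2 / (2 * sigma ** 2)).sum() / len(nbrs)
```

When a kernel point coincides exactly with a relative neighbor position, its Gaussian response is 1, so KC is maximized when the kernel "fits" the local shape.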
Given:
- the network loss function $\mathcal{L}$
- the derivative of the loss with respect to each anchor point $\mathbf{x}_{i}$'s KC, back-propagated from the layers above: $d_{i}=\frac{\partial \mathcal{L}}{\partial \mathrm{KC}\left(\boldsymbol{\kappa}, \mathbf{x}_{i}\right)}$

the paper gives the back-propagation formula for each kernel point $\boldsymbol{\kappa}_{m}$:

$$\frac{\partial \mathcal{L}}{\partial \boldsymbol{\kappa}_{m}}=\sum_{i=1}^{N} \alpha_{i} d_{i}\left[\sum_{n \in \mathcal{N}(i)} \mathbf{v}_{m, i, n} \exp \left(-\frac{\left\|\mathbf{v}_{m, i, n}\right\|^{2}}{2 \sigma^{2}}\right)\right]$$

where the normalization constant of point $\mathbf{x}_{i}$ is $\alpha_{i}=\frac{-1}{|\mathcal{N}(i)| \sigma^{2}}$ and the local difference vector is $\mathbf{v}_{m, i, n}=\boldsymbol{\kappa}_{m}+\mathbf{x}_{i}-\mathbf{x}_{n}$.
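The gradient formula can be checked numerically. The sketch below uses toy data (a fully connected toy graph and a larger $\sigma$ than the paper's 0.005, chosen purely for numerical stability) and compares the analytic gradient against central finite differences:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.2
M, N, D = 4, 6, 3
kappa = rng.normal(scale=0.1, size=(M, D))   # kernel points
X = rng.normal(scale=0.1, size=(N, D))       # toy point cloud
nbrs = {i: [j for j in range(N) if j != i] for i in range(N)}  # toy: fully connected

def KC(kappa):
    out = np.zeros(N)
    for i in range(N):
        delta = X[nbrs[i]] - X[i]
        d2 = ((kappa[:, None, :] - delta[None, :, :]) ** 2).sum(-1)
        out[i] = np.exp(-d2 / (2 * sigma ** 2)).sum() / len(nbrs[i])
    return out

d = rng.normal(size=N)                        # stand-in for dL/dKC_i from upper layers
def loss(kappa):
    return (d * KC(kappa)).sum()

# analytic gradient, term by term from the paper's formula
grad = np.zeros_like(kappa)
for i in range(N):
    alpha = -1.0 / (len(nbrs[i]) * sigma ** 2)
    for m in range(M):
        v = kappa[m] + X[i] - X[nbrs[i]]                      # v_{m,i,n}, (|N(i)|, D)
        w = np.exp(-(v ** 2).sum(-1) / (2 * sigma ** 2))
        grad[m] += alpha * d[i] * (w[:, None] * v).sum(0)

# central finite differences
eps = 1e-6
num = np.zeros_like(kappa)
for m in range(M):
    for k in range(D):
        e = np.zeros_like(kappa)
        e[m, k] = eps
        num[m, k] = (loss(kappa + e) - loss(kappa - e)) / (2 * eps)
assert np.allclose(grad, num, atol=1e-5)
```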

Learning on Local Feature Structure
KC acts only at the front end of the network to extract local geometric structure. As mentioned earlier, neighborhood information is obtained by constructing a KNNG; this graph can also be used to mine local feature structure in deeper layers. Inspired by how convolutional networks aggregate features locally and gradually enlarge the receptive field through multiple pooling layers, we recursively propagate and aggregate features along the edges of the 3D neighborhood graph to extract local features in upper layers.

One of our key ideas is that neighboring points tend to share similar geometric structure, so propagating features over the neighborhood graph helps learn more robust local patterns. Notably, we deliberately keep the neighborhood graph structure unchanged across these layers.

Let $\mathbf{X} \in \Re^{N \times K}$ be the input of graph pooling; the KNNG then has adjacency matrix $\mathbf{W} \in \Re^{N \times N}$, where $\mathbf{W}(i, j)=1$ if there is an edge between vertex $i$ and vertex $j$, and $\mathbf{W}(i, j)=0$ otherwise. Intuitively, neighboring points forming a local surface usually share similar feature patterns. Therefore, the graph pooling operation aggregates features in the neighborhood of each point:
$$\mathbf{Y}=\mathbf{P X}$$
The pooling operation can be either average pooling or max pooling.
Graph average pooling uses $\mathbf{P} \in \Re^{N \times N}$ as a normalized adjacency matrix to average all features in a point's neighborhood:

$$\mathbf{P}=\mathbf{D}^{-1} \mathbf{W},$$

where $\mathbf{D} \in \Re^{N \times N}$ is the degree matrix, whose $(i, j)$-th entry $d_{i, j}$ is defined as:

$$d_{i, j}= \begin{cases}\operatorname{deg}(i), & \text{if } i=j \\ 0, & \text{otherwise}\end{cases}$$

where $\operatorname{deg}(i)$ is the degree of vertex $i$, i.e., the number of vertices connected to vertex $i$.
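On a toy graph, graph average pooling reduces to two matrix products. The 4-vertex adjacency matrix and features below are hypothetical, not data from the paper:

```python
import numpy as np

# Hypothetical symmetric 4-vertex adjacency (stand-in for a KNNG)
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.array([[1.0, 0.0],
              [3.0, 2.0],
              [5.0, 4.0],
              [7.0, 6.0]])          # (N, K) per-vertex features

D = np.diag(W.sum(axis=1))          # degree matrix
P = np.linalg.inv(D) @ W            # normalized adjacency P = D^{-1} W
Y = P @ X                           # each row of Y averages the neighbors' features
```

For example, vertex 0's neighbors are {1, 2}, so its pooled feature is the mean of rows 1 and 2 of `X`, i.e. `[4.0, 3.0]`.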
Graph max pooling (GM) takes the maximum feature within each vertex's neighborhood, operating independently on each of the $K$ dimensions, so the $(i, k)$-th entry of the output $\mathbf{Y}$ is:

$$\mathbf{Y}(i, k)=\max _{n \in \mathcal{N}(i)} \mathbf{X}(n, k)$$

where $\mathcal{N}(i)$ is the index set of the neighbors of point $\mathbf{x}_{i}$, computed from $\mathbf{W}$.
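Graph max pooling can be sketched on the same kind of hypothetical toy graph; each output row is the per-dimension maximum over that vertex's neighbors as read off the adjacency matrix:

```python
import numpy as np

W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.array([[1.0, 0.0],
              [3.0, 2.0],
              [5.0, 4.0],
              [7.0, 6.0]])

N, K = X.shape
Y = np.empty_like(X)
for i in range(N):
    nbr = np.flatnonzero(W[i])      # N(i), read off row i of the adjacency matrix
    Y[i] = X[nbr].max(axis=0)       # per-dimension max over the neighborhood
```

Vertex 2's neighbors are {0, 1, 3}, so its pooled feature is the element-wise max of those rows, `[7.0, 6.0]`.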
Experiments
Shape Classification
Network Configuration

As shown in the figure above, KCNet has 9 layers in total. Notably, the KNN-Group module does not participate in training and can even be precomputed; its role is structural, supplying the neighborhoods for kernel correlation and feature aggregation.

- The first layer, kernel correlation, takes point coordinates as input and outputs local structural features, which are then concatenated with the point coordinates.
- The features then pass through the first two MLPs for point-wise feature learning.
- Next, the graph pooling layer aggregates the point-wise features into more robust local structural features, which are concatenated with the output of the second pair of MLPs.

The rest of the structure closely follows PointNet, except that 1) Batchnorm is not used, and ReLU follows each fully connected layer; 2) Dropout with rate 0.5 is applied to the last fully connected layer.

A 16-NN graph is used for kernel computation and graph max pooling; the number of kernels $L$ is 32, each kernel has 16 points with initial values in $[-0.2, 0.2]$, and the kernel width is $\sigma=0.005$.
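Putting the pieces together, the front end of this configuration can be sketched in NumPy. This is only a structural sketch under the hyper-parameters above (random points stand in for a real shape; the per-point MLPs and the PointNet-style tail are omitted, and this is not the released implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
N, L, M = 128, 32, 16                # points; kernels; points per kernel
sigma, k = 0.005, 16                 # kernel width; 16-NN graph

pts = rng.uniform(-1, 1, size=(N, 3))

# 16-NN graph, precomputed (KNN-Group does not participate in training)
d2 = ((pts[:, None] - pts[None]) ** 2).sum(-1)
knn = np.argsort(d2, axis=1)[:, 1:k + 1]       # drop self-index

# L kernels of M learnable 3D points, initialized in [-0.2, 0.2]
kernels = rng.uniform(-0.2, 0.2, size=(L, M, 3))

# kernel correlation: one response per (anchor point, kernel)
kc = np.zeros((N, L))
for i in range(N):
    delta = pts[knn[i]] - pts[i]                              # (k, 3)
    for l in range(L):
        dd = ((kernels[l][:, None] - delta[None]) ** 2).sum(-1)
        kc[i, l] = np.exp(-dd / (2 * sigma ** 2)).sum() / k

# concatenate coordinates with KC responses: (N, 3 + L)
feat = np.concatenate([pts, kc], axis=1)

# graph max pooling over the same 16-NN graph (per-point MLPs omitted here)
pooled = np.stack([feat[knn[i]].max(axis=0) for i in range(N)])
```

The pooled features would then be concatenated with further MLP outputs before the global aggregation, as described in the bullets above.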
Results



Part Segmentation

Network Configuration
The segmentation network has 10 layers; local features captured at different layers are concatenated with the global feature and shape information. Batchnorm is not used, and Dropout with rate 0.3 is applied to the fully connected layers. An 18-NN graph is used for kernel computation and graph max pooling; the number of kernels is 16, each kernel has 18 points with initial values in $[-0.2, 0.2]$, and the kernel width is $\sigma=0.005$.
Results


Ablation Study
Effectiveness of Kernel Correlation

Symmetric Functions

Effectiveness of Local Structures

Choosing Hyper-parameters

Robustness Test
