2022-07-03 09:08:00 【LingbinBu】
PointStack: Advanced Feature Learning on Point Clouds using Multi-resolution Features and Learnable Pooling
Abstract
- Problem: (1) Existing point cloud feature learning networks typically learn the global context of a point cloud through repeated sampling, neighborhood grouping, neighborhood-wise feature learning, and feature aggregation, but the sampling in this pipeline discards a large amount of granular information; (2) max-pooling feature aggregation completely discards non-maximum point features, which makes the information loss even more severe; (3) because granular information and non-maximum point features are lost, the final high-semantic point features cannot effectively represent the local context
- Method: A new point cloud feature learning network, PointStack, is proposed. It addresses these problems with multi-resolution feature learning and learnable pooling (LP)
- Technical details:
① Multi-resolution feature learning aggregates point features of different resolutions from multiple layers, so the final point features contain both high-semantic and high-resolution information
② Learnable pooling can be regarded as a generalized pooling function: an attention mechanism with learnable queries computes a weighted sum of the multi-resolution point features
- Effect:
① High-semantic point features are extracted while minimizing the loss of granular information and non-maximum point features
② The final aggregated point features can represent both the global and the local context
③ The network head of PointStack can better understand the global structure and the local shape details of the point cloud
- Code: https://github.com/kaist-avelab/PointStack (PyTorch version)
Introduction
- Point features from different resolutions are very helpful for task-specific heads.
- A generalized, permutation-invariant pooling function that combines information from all point features can improve the aggregation of point features.
PointStack
Multi-resolution Feature Learning
Compared with 2D images, 3D shapes are more complex: some important textures and curves of a 3D shape can only be observed at the highest granularity level. Existing methods sacrifice fine granularity to construct high-semantic features, whereas multi-resolution point features can gather sufficient semantic information while still retaining fine granularity to some extent.

Point features at $m$ different resolutions are obtained with an existing MLP-based method (PointMLP):
- The underlying point representations are learned by $m$ repeated residual blocks. Compared with its input, the output of each residual block has a lower resolution but higher semantic information. Residual blocks are chosen over transformer blocks because they are more favorable in memory consumption and computational complexity
- After a suitable representation has been learned, pooling is performed. At the $i$-th layer, features of size $N_i \times C_i$ are pooled into $\mathbf{PF}_i^{pooled}$, pooled features of size $N_m \times C_m$ that contain the important features at that layer's resolution
- The $\mathbf{PF}_i^{pooled}$ of all layers are concatenated into stacked features of size $N_m \times (m \cdot C_m)$, and pooling these once more yields $\mathbf{PF}^{pooled}$, the global feature vector
Because the global feature vector is obtained from $m$ resolutions, it contains both high-semantic and high-resolution information.
In principle the size of the pooled features could differ from layer to layer, but experiments show that fixing the pooled feature size helps classification performance. A possible reason is that point features at the $m$ different resolutions have different numbers of entries: higher-resolution point features have more feature vectors than lower-resolution ones, and this imbalance in the number of feature vectors may affect the final multi-resolution $\mathrm{LP}$.
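The shapes in the steps above can be collected in one place. This is a sketch under assumptions: the symbols $\mathbf{PF}_i$, $\mathrm{LP}_i$ and $\mathbf{PF}^{stacked}$ are notation introduced here for illustration, the concatenation is assumed to run along the channel dimension, and the $1 \times (m \cdot C_m)$ size of the global vector is inferred from the multi-resolution query size quoted in the experiment section.

$$
\begin{aligned}
\mathbf{PF}_i &\in \mathbb{R}^{N_i \times C_i}, \quad i = 1, \dots, m \\
\mathbf{PF}_i^{pooled} &= \mathrm{LP}_i(\mathbf{PF}_i) \in \mathbb{R}^{N_m \times C_m} \\
\mathbf{PF}^{stacked} &= \left[ \mathbf{PF}_1^{pooled}, \dots, \mathbf{PF}_m^{pooled} \right] \in \mathbb{R}^{N_m \times (m \cdot C_m)} \\
\mathbf{PF}^{pooled} &= \mathrm{LP}(\mathbf{PF}^{stacked}) \in \mathbb{R}^{1 \times (m \cdot C_m)}
\end{aligned}
$$

If the pooled features take the size of the learnable queries, then with $m = 4$, $N_m = 64$ and $C_m = 1024$ the stacked features would be $64 \times 4096$ and the global feature vector $1 \times 4096$.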
Learnable Pooling

Structurally, LP uses multi-head attention (MHA). MHA can be viewed as an information retrieval process: a set of queries retrieves information from the values, based on the correlation between the queries and the keys. The keys and values are set to the same point feature tensor, while the queries are learnable parameters. Once the network has learned suitable queries, the retrieved point features (values) are highly relevant to the learning objective. Because the queries are supervised directly by the learning objective and the output is a weighted sum over all point features, LP can aggregate point features with minimal information loss.
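The retrieval view described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the PointStack implementation: it uses a single attention head instead of MHA, a single linear layer `w_shared` as a stand-in for the point-wise shared MLP, and random arrays in place of trained queries. The names `learnable_pooling`, `w_shared` and all sizes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def learnable_pooling(points, queries, w_shared):
    """Pool (N, C_in) point features into (N_q, C) with attention.

    points   : (N, C_in) point features
    queries  : (N_q, C) learnable parameters (random here, trained in practice)
    w_shared : (C_in, C) weights of a point-wise shared linear layer (assumption)
    """
    kv = points @ w_shared                          # keys and values are the same tensor
    scores = queries @ kv.T / np.sqrt(kv.shape[1])  # (N_q, N) query-key correlation
    weights = softmax(scores, axis=-1)              # each row sums to 1 over the points
    return weights @ kv                             # weighted sum of all point features

rng = np.random.default_rng(0)
points = rng.normal(size=(128, 32))
queries = rng.normal(size=(4, 16))
w_shared = rng.normal(size=(32, 16))

pooled = learnable_pooling(points, queries, w_shared)
shuffled = learnable_pooling(points[rng.permutation(128)], queries, w_shared)
print(pooled.shape)                   # (4, 16)
print(np.allclose(pooled, shuffled))  # True: point order does not matter
```

Unlike max-pooling, every point contributes to the output through its attention weight, and the output size is fixed by the number of queries rather than by the number of input points.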
Property 1. The proposed LP is a symmetric function and is therefore permutation-invariant with respect to the point cloud.
The key to LP's permutation invariance is that it uses a point-wise shared MLP and that the keys and values are the same row-permuted feature matrix. Since a permutation matrix is orthogonal, the scaled dot-product attention mechanism is permutation-invariant.
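The argument can be written out for a single attention head. The notation is assumed here for illustration: $X \in \mathbb{R}^{N \times C}$ is the feature matrix used as both keys and values (after the shared MLP), $Q \in \mathbb{R}^{N_q \times C}$ holds the learnable queries, and $P$ is an $N \times N$ permutation matrix. Because the shared MLP acts on each point independently, permuting the input points only permutes the rows of $X$, giving $PX$.

$$
\begin{aligned}
\mathrm{LP}(PX) &= \mathrm{softmax}\!\left(\frac{Q (PX)^\top}{\sqrt{C}}\right) PX \\
&= \mathrm{softmax}\!\left(\frac{Q X^\top}{\sqrt{C}} P^\top\right) PX \\
&= \mathrm{softmax}\!\left(\frac{Q X^\top}{\sqrt{C}}\right) P^\top P X \\
&= \mathrm{softmax}\!\left(\frac{Q X^\top}{\sqrt{C}}\right) X = \mathrm{LP}(X)
\end{aligned}
$$

The third line holds because the softmax is applied row-wise, so permuting the columns of its input permutes the columns of its output in the same way. The last line uses the orthogonality $P^\top P = I$.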
Experiments
The residual blocks follow PointMLP. The learnable queries for single-resolution pooling and multi-resolution pooling have sizes $64 \times 1024$ and $1 \times 4096$, respectively. The number of residual blocks is set to 4, and each residual block has its own learnable queries.
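To see how these sizes fit together, the following NumPy sketch runs random data through the two pooling stages and checks the resulting shapes. It is a sketch under assumptions, not the released code: the per-block resolutions in `block_shapes` are hypothetical, a random linear map stands in for the point-wise shared MLP, the queries are random rather than trained, and a single attention head is used.

```python
import numpy as np

def attention_pool(features, queries):
    """Scaled dot-product attention with keys = values = features."""
    scores = queries @ features.T / np.sqrt(features.shape[1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ features

rng = np.random.default_rng(0)
m, n_m, c_m = 4, 64, 1024  # block count and single-resolution query size from the text

# Hypothetical per-block feature sizes (N_i, C_i); not taken from the source.
block_shapes = [(512, 128), (256, 256), (128, 512), (64, 1024)]

pooled_per_block = []
for n_i, c_i in block_shapes:
    pf_i = rng.normal(size=(n_i, c_i))                # point features of block i
    w_i = rng.normal(size=(c_i, c_m)) / np.sqrt(c_i)  # stand-in for the shared MLP
    q_i = rng.normal(size=(n_m, c_m))                 # separate queries for every block
    pooled_per_block.append(attention_pool(pf_i @ w_i, q_i))

# Fixed-size pooled features can be concatenated along the channel dimension.
stacked = np.concatenate(pooled_per_block, axis=1)
q_multi = rng.normal(size=(1, m * c_m))               # multi-resolution query, 1 x 4096
global_feature = attention_pool(stacked, q_multi)
print(stacked.shape, global_feature.shape)            # (64, 4096) (1, 4096)
```

The concatenation only works because every block is pooled to the same $64 \times 1024$ size, which is one practical consequence of fixing the pooled feature size.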
Shape Classification

Part Segmentation

Ablation Study

Permutation Invariant Property of the Learnable Pooling

Limitations on the Number of Training Samples
- PointStack's weaker performance on ModelNet40 may be due to the limited number of training samples
- When the number of ScanObjectNN training samples is reduced, performance also degrades, as shown in Table 4