[Advanced Feature Learning on Point Clouds using Multi-resolution Features and Learnable Pooling]
2022-07-03 09:08:00 【LingbinBu】
PointStack: Advanced Feature Learning on Point Clouds using Multi-resolution Features and Learnable Pooling
Abstract
- Problem: (1) Existing point cloud feature learning networks typically learn the global context of a point cloud through repeated sampling, neighborhood grouping, neighborhood-wise feature learning, and feature aggregation, but the sampling steps discard a large amount of granular information; (2) max-pooling feature aggregation completely discards non-maximum point features, which makes the information loss worse; (3) because granular information and non-maximum point features are lost, the final high-semantic point features cannot represent the local context effectively.
- Method: A new point cloud feature learning network, PointStack, is proposed. It addresses these problems with multi-resolution feature learning and learnable pooling (LP).
- Technical details:
① Multi-resolution feature learning aggregates point features of different resolutions from multiple layers, so the final point features contain both high-semantic and high-resolution information.
② Learnable pooling can be regarded as a generalized pooling function: an attention mechanism with learnable queries computes a weighted sum of the multi-resolution point features.
- Effect:
① High-semantic point features are extracted with minimal loss of granular information and non-maximum point features.
② The final aggregated point features represent both the global and the local context.
③ The network head of PointStack can better understand both the global structure and the local shape details of the point cloud.
- Code: https://github.com/kaist-avelab/PointStack (PyTorch version)
Introduction
- Point features from different resolutions are very helpful to the task-specific head.
- A generalized pooling function (permutation invariant) that combines information from all point features can improve the aggregation of point features.
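The second point can be illustrated with a toy comparison. The sketch below is not taken from the PointStack code; the function names `max_pool` and `weighted_sum_pool`, the fixed `query`, and the three example points are all hypothetical. It only shows that max pooling ignores a change to a point that holds no channel-wise maximum, while a softmax-weighted sum over all points reflects it.

```python
import numpy as np

def max_pool(features):
    """Channel-wise max over the points: (N, C) -> (C,)."""
    return features.max(axis=0)

def weighted_sum_pool(features, query):
    """Softmax-weighted sum over all points: (N, C) and (C,) -> (C,)."""
    scores = features @ query / np.sqrt(features.shape[1])
    scores = scores - scores.max()                 # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ features

# Three toy points with two channels each (hypothetical values).
features = np.array([[1.0, 4.0], [3.0, 2.0], [0.5, 0.5]])
changed = features.copy()
changed[2] = [0.9, 1.5]   # edit a point that holds no channel-wise maximum

query = np.array([1.0, 1.0])  # fixed here; it would be learned in a network

print(np.array_equal(max_pool(features), max_pool(changed)))        # True
print(np.allclose(weighted_sum_pool(features, query),
                  weighted_sum_pool(changed, query)))               # False
```

Max pooling returns `[3.0, 4.0]` for both inputs, so the edited point has no influence on the aggregated feature, whereas the weighted sum changes with it.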
PointStack
Multi-resolution Feature Learning
Compared with 2D images, 3D shapes are more complex: some important textures and curves of a 3D shape can only be observed at the highest granularity level. Existing methods sacrifice fine granularity to build high-semantic features, whereas multi-resolution point features can gather sufficient semantic information while still retaining fine-grained detail to a certain extent.

Point features at $m$ different resolutions are obtained with an existing MLP-based method (PointMLP):
- The underlying representation of the points is learned with $m$ repeated residual blocks. Compared with its input, the output of each residual block has a lower resolution but higher semantic information. Residual blocks are chosen over transformer blocks because they are cheaper in memory consumption and computational complexity.
- After a suitable representation has been learned, pooling is applied. At the $i$-th layer, the features of size $N_i \times C_i$ are pooled into $\mathbf{PF}_i^{pooled}$ of size $N_m \times C_m$, which holds the important features at that layer's resolution.
- The pooled features $\mathbf{PF}_i^{pooled}$ of all layers are concatenated into stacked features of size $N_m \times m \cdot C_m$, and pooling these once more gives the global feature vector $\mathbf{PF}^{pooled}$.
Because the global feature vector is derived from $m$ resolutions, it contains both high-semantic and high-resolution information.
In principle, the pooled features of each layer could have different sizes, but the authors found experimentally that fixing the size of the pooled features helps classification performance. A possible reason is that the point features at the $m$ resolutions have different numbers of entries: higher-resolution point features have more feature vectors than lower-resolution ones, and this imbalance in the number of feature vectors may affect the final multi-resolution $\mathrm{LP}$.
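The shape bookkeeping of this pipeline can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the PointStack implementation: `attention_pool` and `stack_multi_resolution` are hypothetical names, the random `projections` stand in for whatever maps $C_i$ to $C_m$ (this post does not say how that is done), the queries are random instead of learned, and the sizes are toy values.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(features, queries):
    """Pool (N, C) features into (Q, C) using queries of shape (Q, C)."""
    scores = queries @ features.T / np.sqrt(features.shape[1])  # (Q, N)
    return softmax(scores, axis=1) @ features                   # (Q, C)

def stack_multi_resolution(layer_features, projections, layer_queries):
    """Pool every layer to a fixed size and concatenate along channels."""
    pooled = []
    for feats, proj, queries in zip(layer_features, projections, layer_queries):
        projected = feats @ proj                          # (N_i, C_i) -> (N_i, C_m)
        pooled.append(attention_pool(projected, queries)) # (N_m, C_m)
    return np.concatenate(pooled, axis=1)                 # (N_m, m * C_m)

# Toy sizes (assumptions): m = 4 layers, N_m = 8, C_m = 64.
point_counts = [64, 32, 16, 8]
channels = [8, 16, 32, 64]
n_m, c_m = 8, 64

layer_features = [rng.standard_normal((n, c)) for n, c in zip(point_counts, channels)]
projections = [rng.standard_normal((c, c_m)) / np.sqrt(c) for c in channels]
layer_queries = [rng.standard_normal((n_m, c_m)) for _ in channels]

stacked = stack_multi_resolution(layer_features, projections, layer_queries)
global_query = rng.standard_normal((1, stacked.shape[1]))
global_feature = attention_pool(stacked, global_query)

print(stacked.shape, global_feature.shape)  # (8, 256) (1, 256)
```

Fixing every pooled output to $N_m \times C_m$ is what makes the channel-wise concatenation well defined: each layer contributes the same number of rows regardless of its resolution.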
Learnable Pooling

Structurally, LP uses multi-head attention (MHA). MHA can be viewed as an information retrieval process: a set of queries is used to retrieve information from the values, based on the correlation between the queries and the keys. Here the keys and values are set to the same point feature tensor, while the queries are learnable parameters. Once the network has learned suitable queries, the retrieved point features (values) are highly relevant to the learning objective. Because the queries are supervised directly by the learning objective and the retrieved values are obtained as a weighted sum of all point features, LP can aggregate point features with minimal information loss.
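A minimal NumPy sketch of this idea follows, assuming standard multi-head scaled dot-product attention. The class name `LearnablePooling`, the projection matrices `w_q`, `w_k`, `w_v`, and all sizes are assumptions for illustration; the exact layer layout of PointStack is not given in this post, and here the "learnable" parameters are simply random values that a real network would update by backpropagation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class LearnablePooling:
    """Illustrative multi-head attention pooling with learnable queries."""

    def __init__(self, num_queries, dim, num_heads, seed=0):
        assert dim % num_heads == 0, "dim must be divisible by num_heads"
        rng = np.random.default_rng(seed)
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        # Queries are parameters of the layer, not derived from the input.
        self.queries = rng.standard_normal((num_queries, dim))
        scale = 1.0 / np.sqrt(dim)
        self.w_q = rng.standard_normal((dim, dim)) * scale
        self.w_k = rng.standard_normal((dim, dim)) * scale
        self.w_v = rng.standard_normal((dim, dim)) * scale

    def _split_heads(self, x):
        # (L, dim) -> (heads, L, head_dim)
        return x.reshape(x.shape[0], self.num_heads, self.head_dim).transpose(1, 0, 2)

    def __call__(self, point_features):
        # Keys and values both come from the same point feature tensor.
        q = self._split_heads(self.queries @ self.w_q)
        k = self._split_heads(point_features @ self.w_k)
        v = self._split_heads(point_features @ self.w_v)
        scores = q @ k.transpose(0, 2, 1) / np.sqrt(self.head_dim)  # (H, Q, N)
        weights = softmax(scores, axis=-1)       # each query attends over all points
        out = weights @ v                        # (H, Q, head_dim)
        return out.transpose(1, 0, 2).reshape(self.queries.shape[0], -1)  # (Q, dim)

points = np.random.default_rng(1).standard_normal((128, 32))
pool = LearnablePooling(num_queries=4, dim=32, num_heads=4)
pooled = pool(points)
print(pooled.shape)  # (4, 32)
```

The output size depends only on the number of queries, not on the number of input points, which is why such a layer can act as a pooling function.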
Property 1. The proposed LP is a symmetric function, i.e. it is permutation-invariant with respect to the points of the point cloud.
The permutation invariance of LP rests on two points: the MLP is point-wise and shared, and the keys and values are both taken from the same row-permuted feature matrix. Since a permutation matrix is orthogonal, the scaled dot-product attention mechanism is permutation-invariant.
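The argument can be written out as a short derivation. The notation is an assumption introduced here for illustration: $F \in \mathbb{R}^{N \times C}$ is the point feature matrix after the point-wise shared MLP, $P$ is an $N \times N$ permutation matrix, $Q$ holds the learnable queries, and $d$ is the key dimension. A point-wise shared MLP commutes with row permutations, so permuting the input points simply replaces $F$ by $PF$.

```latex
\begin{align}
\mathrm{Attn}(Q, PF, PF)
  &= \mathrm{softmax}\!\left(\frac{Q (PF)^{\top}}{\sqrt{d}}\right) PF \\
  &= \mathrm{softmax}\!\left(\frac{Q F^{\top} P^{\top}}{\sqrt{d}}\right) PF \\
  &= \mathrm{softmax}\!\left(\frac{Q F^{\top}}{\sqrt{d}}\right) P^{\top} P F \\
  &= \mathrm{softmax}\!\left(\frac{Q F^{\top}}{\sqrt{d}}\right) F
   = \mathrm{Attn}(Q, F, F)
\end{align}
```

The third line uses the fact that a row-wise softmax commutes with a permutation of the columns, and the last line uses $P^{\top} P = I$, which holds because a permutation matrix is orthogonal.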
Experiments
PointMLP is used for the residual blocks. The learnable queries for single-resolution pooling and multi-resolution pooling have sizes $64 \times 1024$ and $1 \times 4096$, respectively. The number of residual blocks is set to 4, and the learnable queries differ between residual blocks.
Shape Classification

Part Segmentation

Ablation Study

Permutation Invariant Property of the Learnable Pooling

Limitations on the Number of Training Samples
- The weaker performance of PointStack on ModelNet40 may be due to the limited number of training samples.
- When the number of ScanObjectNN training samples is reduced, performance is also poor, as shown in Table 4.