[Point Cloud Processing Paper Reading, Frontier Edition 8] Pointview-GCN: 3D Shape Classification With Multi-View Point Clouds
2022-07-03 09:07:00 【LingbinBu】
Pointview-GCN: 3D Shape Classification With Multi-View Point Clouds
Abstract
- Performs 3D shape classification on partial point clouds captured from multiple viewpoints around an object.
- Pointview-GCN uses multi-level Graph Convolutional Networks (GCNs) to aggregate the shape features of single-view point clouds in a fine-to-coarse manner, thereby encoding both the geometric information of the object and the relationships among views.
- PyTorch code is available at https://github.com/SMohammadi89/PointView-GCN
1. Introduction
- Point clouds captured in the real world are partial point clouds obtained from different viewpoints.
- Graph Convolutional Networks (GCNs) have proven effective at encoding semantic relationships for feature aggregation.
- Pointview-GCN is a network with multi-level GCNs that aggregates shape features from the partial point clouds of multiple views and mines the semantic relationships between neighboring views in a fine-to-coarse manner.
- Skip connections are added between the GCNs at different levels.
- A new dataset is proposed that contains single-view point cloud data.
2. Related work
MVCNN uses max-pooling to aggregate features from different views and obtain a global shape descriptor. Its drawback is that it does not consider the semantic relationships among the multi-view data.
View-GCN is a view-based graph convolutional network that captures structural relationships in the data. However, both of these methods aggregate features on images.
3. Method
- First, capture multiple partial point clouds from different viewpoints around the object.
- Use a backbone to extract the features of each partial point cloud.
- Build a graph $G=\left\{v_i\right\}_{i \in N}$ with $N$ nodes, where node $v_i$ is represented by the shape feature $F_i$ of the $i$-th single-view point cloud. $\mathbf{F}=\left\{F_i\right\}_{i \in N}$ denotes the features of all nodes of $G$, $v_p$ denotes the neighbors of $v_i$ (found by kNN), and $\mathbf{A}$ is the adjacency matrix of $G$.
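As a rough illustration of the graph construction step, the sketch below builds a kNN adjacency matrix. It assumes that neighbors are found by Euclidean distance between viewpoint positions, which the notes do not state; `knn_adjacency`, `viewpoints` and `k` are illustrative names, not taken from the released code.

```python
import math


def knn_adjacency(viewpoints, k):
    """Return an N x N 0/1 adjacency matrix linking each view to its k nearest views.

    Assumption: nearness is measured between viewpoint coordinates.
    """
    n = len(viewpoints)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        # Rank every other view by its Euclidean distance to view i.
        others = sorted(
            (p for p in range(n) if p != i),
            key=lambda p: math.dist(viewpoints[i], viewpoints[p]),
        )
        for p in others[:k]:
            adj[i][p] = 1
    return adj


# Four viewpoints on a line, one neighbor per view.
views = [(0, 0, 0), (1, 0, 0), (3, 0, 0), (7, 0, 0)]
A = knn_adjacency(views, k=1)
```

Note that a kNN graph built this way is not necessarily symmetric: view 2 points to view 1 here, but view 1 points to view 0.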
The proposed feature aggregation network consists of multiple levels of GCNs, as shown in Figure 2. The optimal number of levels $M$ is determined experimentally.
At the $j$-th level, a graph convolution is applied to the input graph $G^j$ to update the node features $F_i$, followed by an optional view sampling step that yields a smaller graph $G^{j+1}$ containing the most important view information of $G^{j}$.
$G^{j+1}$ is then fed as input to the $(j+1)$-th level.
3.1. Graph convolution and Selective View Sampling
At the $j$-th level, the following three operations are performed:
- local graph convolution
- non-local message passing
- selective view sampling (SVS)
Local graph convolution
Given a node $v_i^j$ and its neighboring nodes, local graph convolution updates the features of $v_i^j$ as follows:
$$\tilde{\mathbf{F}}^{j}=\mathcal{L}\left(\mathbf{A}^{j} \mathbf{F}^{j} \mathbf{W}^{j} ; \alpha^{j}\right)$$
where $\mathcal{L}(\cdot)$ denotes the LeakyReLU operation, and $\alpha^{j}$ and $\mathbf{W}^{j}$ are weight matrices.
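A minimal sketch of the local graph convolution formula, written with plain Python lists so that it runs without dependencies. It computes LeakyReLU applied to the product of the adjacency matrix, the node features and the weight matrix. The role of $\alpha^{j}$ is not modeled here; the fixed `slope` argument and all function names are assumptions made for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def leaky_relu(matrix, slope=0.2):
    """Element-wise LeakyReLU; the slope value is an arbitrary choice."""
    return [[x if x > 0 else slope * x for x in row] for row in matrix]


def local_graph_conv(adj, feats, weight, slope=0.2):
    """Compute LeakyReLU(A F W) for one level."""
    return leaky_relu(matmul(matmul(adj, feats), weight), slope)


adj = [[1, 1], [0, 1]]             # node 0 aggregates itself and node 1
feats = [[1.0, -2.0], [3.0, 0.0]]  # one 2-d feature per view
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights keep the arithmetic readable
updated = local_graph_conv(adj, feats, weight)
```

With identity weights, node 0 ends up with the sum of both features passed through the activation, which makes the neighbor aggregation easy to check by hand.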
Non-local message passing
Next, $\tilde{\mathbf{F}}^{j}$ is updated through non-local message passing, which takes into account the long-range relationships among all nodes in $G^{j}$. Each node $v_i$ first propagates its state to the edges between it and its adjacent vertices:
$$m_{i, p}^{j}=\mathcal{R}\left(\tilde{F}_{i}^{j}, \tilde{F}_{p}^{j} ; \beta^{j}\right)_{i, p \in N^{j}}$$
where $\mathcal{R}(\cdot)$ denotes the relation function between a pair of views and $\beta^{j}$ are its parameters.
The vertex features are then updated by:
$$\tilde{F}_{i}^{j}=\mathcal{C}\left(\tilde{F}_{i}^{j}, \sum_{p=1, p \neq i}^{N_{j}} m_{i, p}^{j} ; \gamma^{j}\right)$$
where $\mathcal{C}(\cdot)$ is the combination function and $\gamma^{j}$ are its parameters.
After non-local message passing, each node feature has been updated by taking the relationships across the whole graph into account.
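The two message passing formulas can be sketched as follows. The notes do not say what form $\mathcal{R}$ and $\mathcal{C}$ take, so this example assumes a single linear map with LeakyReLU on the concatenated pair for the relation function and a single linear map on the concatenated feature and message sum for the combination function. `beta`, `gamma` and the function names are hypothetical.

```python
def linear(vec, weight):
    """Multiply a row vector by a weight matrix given as a list of rows."""
    return [sum(v * w for v, w in zip(vec, col)) for col in zip(*weight)]


def relation(f_i, f_p, beta):
    # Assumed form of R: LeakyReLU(linear([F_i, F_p])).
    return [x if x > 0 else 0.2 * x for x in linear(f_i + f_p, beta)]


def combine(f_i, msg_sum, gamma):
    # Assumed form of C: linear([F_i, summed messages]).
    return linear(f_i + msg_sum, gamma)


def non_local_message_passing(feats, beta, gamma):
    """Update every node with messages from all other nodes."""
    n = len(feats)
    updated = []
    for i in range(n):
        msg_sum = [0.0] * len(beta[0])
        for p in range(n):
            if p == i:
                continue  # the sum runs over p != i
            message = relation(feats[i], feats[p], beta)
            msg_sum = [a + b for a, b in zip(msg_sum, message)]
        updated.append(combine(feats[i], msg_sum, gamma))
    return updated


feats = [[1.0], [2.0], [3.0]]  # three views with 1-d features
beta = [[1.0], [1.0]]          # message m_ip = F_i + F_p for positive inputs
gamma = [[1.0], [1.0]]         # output = F_i + sum of messages
out = non_local_message_passing(feats, beta, gamma)
```

Because every node exchanges messages with every other node, the cost grows quadratically with the number of views, which is manageable for the 12 or 20 views used in the experiments.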
Selective view sampling (SVS)
- Use Farthest Point Sampling (FPS) to downsample $G^{j}$.
- For each downsampled node $v_i$, a view selector picks, from its nearest neighbors $\mathbf{V}_{i}^{j}$, the node with the largest softmax response.
- The coarsened graph $G^{j+1}$ and the updated features $\mathbf{F}^{j+1}$ are passed to the next level for further processing.
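The SVS steps above can be sketched as below. This is an illustration under assumptions: FPS is run on viewpoint coordinates, the view selector is represented only by its precomputed class logits, and "largest softmax response" is read as the largest class probability. `farthest_point_sampling`, `select_views`, `neighbors` and `logits` are names chosen here.

```python
import math


def farthest_point_sampling(points, m, start=0):
    """Pick m indices so that each new point is farthest from those already picked."""
    selected = [start]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < m:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        min_dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(min_dist, points)]
    return selected


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_views(sampled, neighbors, logits):
    """For each sampled node, keep the neighbor with the highest softmax response."""
    return [
        max(neighbors[i], key=lambda s: max(softmax(logits[s])))
        for i in sampled
    ]


points = [(0, 0), (1, 0), (10, 0), (5, 0)]
sampled = farthest_point_sampling(points, m=3)
# Hypothetical neighborhoods and view-selector logits for two sampled nodes.
neighbors = {0: [0, 1], 2: [2, 3]}
logits = [[0.0, 0.0], [3.0, 0.0], [1.0, 0.0], [0.0, 0.5]]
chosen = select_views([0, 2], neighbors, logits)
```

Here view 1 replaces view 0 because its selector output is more confident, while view 2 is kept over view 3.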
3.2. Multi-level feature aggregation and training loss
After the graph convolution at each level, a max-pooling layer is applied to $\mathbf{F}^{j}$ in order to obtain a global shape feature for that level.
The final global shape feature $F_{\text{global}}$ is the concatenation of the pooled features from all levels.
A residual connection is added from the first convolution level to the last convolution level to avoid vanishing gradients as the number of GCN levels increases.
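A small sketch of the pooling and concatenation described above, assuming each level provides its node features as a list of equal-length vectors. The residual connection is not shown, and `max_pool` and `aggregate_levels` are illustrative names.

```python
def max_pool(feats):
    """Column-wise maximum over all nodes of one level."""
    return [max(column) for column in zip(*feats)]


def aggregate_levels(level_feats):
    """Concatenate the max-pooled feature of every level into one vector."""
    global_feat = []
    for feats in level_feats:
        global_feat.extend(max_pool(feats))
    return global_feat


# Level 1 has three views, level 2 has two views after sampling.
level_feats = [
    [[1, 5], [3, 2], [0, 4]],
    [[2, 2], [6, 1]],
]
f_global = aggregate_levels(level_feats)
```

The length of the final descriptor is the sum of the per-level feature widths, so it grows with the number of levels $M$.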
The training loss consists of two terms, the global shape loss $L_{\text{global}}$ and the selective-view shape loss $L_{\text{selective}}$:
$$\begin{aligned} L=& L_{\text{global}}\left(\mathcal{S}\left(F_{\text{global}}\right), y\right)+\\ & \sum_{j=1}^{M} \sum_{i=1}^{N^{j+1}} \sum_{v_{s} \in \mathbf{V}_{i}^{j}} L_{\text{selective}}\left(\mathcal{V}\left(F_{s}^{j} ; \theta^{j}\right), y\right) \end{aligned}$$
where $L_{\text{global}}$ is the cross-entropy loss, $\mathcal{S}$ is a classifier consisting of fully connected layers and a softmax function, and $y$ is the shape class label. $L_{\text{selective}}$ is the cross-entropy loss for the view selector, which ensures that the selected views can recognize the shape class. $\mathcal{V}(\cdot)$ is the view-selector function with parameters $\theta^{j}$, and $F_{s}^{j}$ corresponds to a node after downsampling.
During training, only $L_{\text{global}}$ takes part.
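To make the structure of the combined loss formula concrete, the sketch below evaluates it for given logits. It assumes both classifiers output unnormalized logits and that the three nested sums have been flattened into one list of selector outputs per level; `cross_entropy`, `total_loss` and the argument names are hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably via log-sum-exp."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[label]


def total_loss(global_logits, selector_logits, label):
    """Global shape loss plus the selective-view loss summed over levels and nodes.

    selector_logits[j] holds one logit vector per node considered at level j.
    """
    loss = cross_entropy(global_logits, label)
    for level in selector_logits:
        for logits in level:
            loss += cross_entropy(logits, label)
    return loss


# Two classes, uniform logits everywhere: each term equals log(2).
loss = total_loss(
    global_logits=[0.0, 0.0],
    selector_logits=[[[0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]],
    label=1,
)
```

With one global term and three selector terms, the example evaluates to four times $\log 2$.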
4. Experiments
Dataset generation
ModelNet40 contains 12311 models in 40 categories.
ScanObjectNN contains 2909 models in 15 categories.
Four datasets are built from these: Model-D, Model-H, Scan-D and Scan-H.
D denotes the icosahedral setting (20 viewpoints) and H denotes the hemispherical setting (12 viewpoints).
Implementation details
Backbone: PointNet++ / DGCNN
4.1. Comparison against state-of-the-art methods
4.2. Ablation studies
Effects of levels of GCN and skip connection
Effects of number of input views
The hemispherical viewpoint design is less accurate than the icosahedral one, possibly because not enough information is collected from the bottom of the object.
Effects of PointNet++ models of varying classification accuracy