Understanding PointNet (Step 4 of the PointNet Implementation)
2022-07-02 07:38:00 【xiaobai_ Ry】
PointNet Step 4: Understanding PointNet
Earlier we discussed the challenges that point clouds pose. The PointNet paper proposes the following solutions to those challenges.
1. Point cloud solutions
1.1 Permutation invariance
The network must be permutation invariant: a set of N points can be ordered in N! ways, and the output must be the same for every ordering. Symmetric functions satisfy this requirement, as shown below:
Figure source: Qi Ruizhongtai (Charles R. Qi), PhD student at Stanford University, "Deep Learning on Point Clouds and Its Application in 3D Scene Understanding"
Applying a symmetric operation directly to the raw data does satisfy permutation invariance, but it easily loses a lot of geometric and otherwise meaningful information. For example, taking the maximum keeps only the farthest (extreme) values, and taking the average keeps only the centroid.
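The point about symmetric functions can be checked with a small sketch. The point values and the helper names `max_pool` and `mean_pool` are made up for illustration, and the code uses plain Python only to stay self-contained; it is not taken from the paper.

```python
import random

# A toy point cloud: four (x, y, z) points with made-up values.
points = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.5), (-1.0, 0.5, 3.0), (2.0, -1.0, 1.0)]

def max_pool(pts):
    """Per-coordinate maximum over all points (a symmetric function)."""
    return tuple(max(p[i] for p in pts) for i in range(3))

def mean_pool(pts):
    """Per-coordinate mean over all points, i.e. the centroid."""
    return tuple(sum(p[i] for p in pts) / len(pts) for i in range(3))

# Reorder the points with a fixed seed so the run is repeatable.
shuffled = points[:]
random.Random(0).shuffle(shuffled)

print(max_pool(points), max_pool(shuffled))    # same result for both orderings
print(mean_pool(points), mean_pool(shuffled))  # same result for both orderings

# Information loss: a very different cloud has exactly the same maximum.
other_cloud = [(2.0, 2.0, 3.0), (0.0, 0.0, 0.0)]
print(max_pool(other_cloud) == max_pool(points))  # True
```

The last line shows the weakness described above: the pooled value alone cannot tell the two clouds apart.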
How to avoid losing information
Map every point to a high-dimensional space, and apply the symmetric operation to the data in that space. The representation of a three-dimensional point in a high-dimensional space is necessarily redundant, but because of that redundancy, aggregating with a symmetric operation loses less information and keeps enough of the point cloud. This leads to the prototype of PointNet, called PointNet (vanilla):
Figure source: Qi Ruizhongtai (Charles R. Qi), PhD student at Stanford University, "Deep Learning on Point Clouds and Its Application in 3D Scene Understanding"
An MLP projects each point into a high-dimensional space, and max is used as the symmetric operation.
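A minimal sketch of this "shared MLP, then max" idea follows. It assumes a single random, untrained layer and an 8-dimensional feature space instead of the much larger one used in practice; `point_feature` and `global_feature` are illustrative names.

```python
import random

rng = random.Random(0)
IN_DIM, FEAT_DIM = 3, 8  # 8 is an arbitrary small stand-in for the real feature size

# Hypothetical random weights for one shared layer h(x) = ReLU(W x + b).
W = [[rng.uniform(-1, 1) for _ in range(IN_DIM)] for _ in range(FEAT_DIM)]
b = [rng.uniform(-1, 1) for _ in range(FEAT_DIM)]

def point_feature(p):
    """Apply the same layer to a single point: 3 -> FEAT_DIM."""
    return [max(0.0, sum(w * x for w, x in zip(row, p)) + bias)
            for row, bias in zip(W, b)]

def global_feature(pts):
    """Channel-wise max over all per-point features (the symmetric step)."""
    feats = [point_feature(p) for p in pts]
    return [max(channel) for channel in zip(*feats)]

cloud = [tuple(rng.uniform(-1, 1) for _ in range(IN_DIM)) for _ in range(16)]
shuffled = cloud[:]
rng.shuffle(shuffled)

g1 = global_feature(cloud)
g2 = global_feature(shuffled)
print(len(g1), g1 == g2)  # the feature does not depend on the point order
```

Because every point goes through the same weights and max ignores order, the global feature is identical for any ordering of the input.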
Why an MLP can project points into a high-dimensional space is explained separately, in a write-up aimed at beginners.
PointNet can approximate symmetric functions arbitrarily well (by increasing the depth and width of the neural network):
Figure source: Qi Ruizhongtai (Charles R. Qi), PhD student at Stanford University, "Deep Learning on Point Clouds and Its Application in 3D Scene Understanding"
1.2 Rotation invariance ( Geometric invariance )
Rotation invariance means that when an object is rotated, the (x, y, z) coordinates of all its points change, but it is still the same object, as shown below:
So plain PointNet (vanilla) may not recognize the same object well when it is fed in at different rotation angles. The paper's approach is to add a new network, T-Net, that learns the rotation of the point cloud and aligns the object to a canonical pose; the rest of PointNet (vanilla) then only needs to classify or segment the aligned object.
Geometric transformations are very easy to apply to point clouds: a matrix multiplication is enough. As shown in the figure below, multiplying an N×3 point cloud matrix by a 3×3 rotation matrix gives the rotated point cloud, so learning a 3×3 matrix from the input point cloud is enough to correct its pose.
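The matrix multiplication can be sketched as follows. Here the "correcting" matrix is simply the known inverse of a hand-picked rotation about the z axis; in PointNet that matrix would be predicted by T-Net, which this sketch does not implement. The helper names are illustrative.

```python
import math

def rotation_z(theta):
    """3x3 rotation matrix about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform(pts, A):
    """Row-vector convention: each point p becomes p @ A (N x 3 times 3 x 3)."""
    return [tuple(sum(p[k] * A[k][j] for k in range(3)) for j in range(3))
            for p in pts]

def transpose(A):
    return [list(col) for col in zip(*A)]

cloud = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.5), (1.0, 1.0, 1.0)]
R = rotation_z(math.pi / 2)

rotated = transform(cloud, R)
# For a rotation matrix the inverse is the transpose, so this undoes the rotation.
restored = transform(rotated, transpose(R))

print(rotated[0])   # coordinates changed
print(restored[0])  # back to (1, 0, 0) up to rounding
```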
Similarly, after the point cloud has been mapped to a redundant K-dimensional space, the K-dimensional point features are aligned again. This alignment needs a regularization penalty term, because we want the learned matrix to be as close as possible to an orthogonal matrix. 【Regularization is used because optimization in a high-dimensional space is difficult, and it reduces that difficulty.】
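One common way to write such a penalty, and the form given in the PointNet paper, is the squared Frobenius norm of I − A·Aᵀ, which is zero exactly when a square matrix A is orthogonal. The sketch below computes it in plain Python; the function name and the example matrices are illustrative.

```python
import math

def orthogonality_penalty(A):
    """Squared Frobenius norm of (I - A A^T) for a square matrix A."""
    n = len(A)
    total = 0.0
    for i in range(n):
        for j in range(n):
            aat_ij = sum(A[i][k] * A[j][k] for k in range(n))  # entry (i, j) of A A^T
            target = 1.0 if i == j else 0.0                    # entry (i, j) of I
            total += (target - aat_ij) ** 2
    return total

identity = [[1.0, 0.0], [0.0, 1.0]]
c, s = math.cos(0.3), math.sin(0.3)
rotation = [[c, -s], [s, c]]       # orthogonal
scaled = [[2.0, 0.0], [0.0, 1.0]]  # stretches one axis, not orthogonal

print(orthogonality_penalty(identity))  # 0.0
print(orthogonality_penalty(rotation))  # close to 0
print(orthogonality_penalty(scaled))    # 9.0
```

Adding this value (times a weight) to the training loss pushes the learned feature transform towards an orthogonal matrix, which does not lose information in the input.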
Summary: max pooling solves the permutation invariance (unordered input) problem, and the spatial transformer network solves the rotation invariance problem.
2. PointNet
Point cloud classification network:
Figure source: Qi Ruizhongtai (Charles R. Qi), PhD student at Stanford University, "Deep Learning on Point Clouds and Its Application in 3D Scene Understanding"
Specifically, for each N×3 point cloud input, the network first passes it through a T-Net to align it in space (rotate it to a front-facing pose), then uses an MLP to map it to a 64-dimensional space, aligns it again in that 64-dimensional space, and finally maps it to a 1024-dimensional space. Every point now has a 1024-dimensional vector representation, which is clearly redundant for a 3-dimensional point. Max pooling is therefore applied at this stage: in each of the 1024 channels only the largest value over all points is kept, which gives a 1×1024 global feature. The global feature is then passed through a cascade of fully connected layers (the final MLP), which outputs the scores for the K classes.
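The shape flow of the classification branch can be traced with a rough sketch. It follows the simplified 3 → 64 → 1024 → K description above, leaves out both T-Net alignment steps, collapses each MLP to a single layer, and uses random untrained weights; `dense` and the sizes N and K are assumptions for illustration.

```python
import random

rng = random.Random(0)

def dense(in_dim, out_dim):
    """Hypothetical randomly initialised layer: x -> ReLU(W x + b)."""
    W = [[rng.gauss(0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    b = [0.0] * out_dim

    def layer(x, relu=True):
        out = [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]
        return [max(0.0, o) for o in out] if relu else out

    return layer

N, K = 8, 4  # 8 points and 4 classes, chosen only to keep the run fast
to_64 = dense(3, 64)
to_1024 = dense(64, 1024)
classifier = dense(1024, K)

cloud = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(N)]

local_64 = [to_64(p) for p in cloud]                # N x 64, same weights for every point
local_1024 = [to_1024(f) for f in local_64]         # N x 1024
global_1024 = [max(ch) for ch in zip(*local_1024)]  # 1 x 1024 after max pooling
scores = classifier(global_1024, relu=False)        # K class scores
predicted = scores.index(max(scores))

print(len(local_64), len(local_64[0]))  # 8 64
print(len(global_1024), len(scores))    # 1024 4
```

With trained weights, `predicted` would be the class label; here it only demonstrates that a cloud of any size N collapses to one fixed-length vector before classification.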
Point cloud segmentation network:
Figure source: Qi Ruizhongtai (Charles R. Qi), PhD student at Stanford University, "Deep Learning on Point Clouds and Its Application in 3D Scene Understanding"
Point cloud segmentation can be defined as classifying every point: once the class of each point is known, the points can be grouped into fixed categories. Of course, each point cannot be segmented directly from the global feature alone. A simple and effective approach is to combine the local feature of each single point with the global feature. The simplest way to do this is to repeat the global feature N times and concatenate each copy with the feature of the corresponding point. 【Inserted note: this combination of local and global features is where the number 1088 comes from (64 + 1024 = 1088); each point now has 1088 dimensions.】 In effect, each point looks itself up in the global feature (that is, each point asks where "I" sit within this global feature and which category "I" should belong to). Each concatenated feature is then passed through another MLP, which finally classifies each point into one of M classes, i.e. outputs M scores per point.
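The concatenation step can be sketched as below. The per-point features are random stand-ins rather than the output of a trained network, and the final MLP is reduced to one hypothetical linear layer; N and M are arbitrary small values.

```python
import random

rng = random.Random(0)
N, M = 5, 3  # 5 points and 3 part classes, chosen for illustration

# Stand-ins for the per-point features produced earlier in the network.
local_64 = [[rng.random() for _ in range(64)] for _ in range(N)]
local_1024 = [[rng.random() for _ in range(1024)] for _ in range(N)]
global_1024 = [max(ch) for ch in zip(*local_1024)]  # max pooling over points

# Repeat the global feature N times and append it to each point's 64-d feature.
combined = [point_feat + global_1024 for point_feat in local_64]  # N x 1088

# Hypothetical linear layer 1088 -> M, applied to every point with the same weights.
W = [[rng.gauss(0, 0.01) for _ in range(64 + 1024)] for _ in range(M)]
scores = [[sum(w * v for w, v in zip(row, feat)) for row in W]
          for feat in combined]                                   # N x M
labels = [s.index(max(s)) for s in scores]                        # one class per point

print(len(combined), len(combined[0]))  # 5 1088
print(len(scores), len(scores[0]))      # 5 3
```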
3. PointNet experimental results
The following experimental results are from Qi Ruizhongtai (Charles R. Qi), PhD student at Stanford University, "Deep Learning on Point Clouds and Its Application in 3D Scene Understanding"
The results above show that, at the time, PointNet outperformed the voxel-based networks on both segmentation and classification. It also has few parameters and trains quickly, which makes it a lightweight network.
PointNet is suitable for mobile devices such as mobile phones.
PointNet is robust and insensitive to missing points:
The paper found that the points that activate the network most strongly are the skeleton points of the object (second row of the figure below), and the original structure is easy to recover by sampling from them. This is where PointNet's robustness comes from.
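This behaviour follows directly from max pooling and can be reproduced in a small sketch: only the points that win the maximum in at least one channel influence the global feature, so dropping all other points changes nothing. The single random layer, the 16 channels and the 200-point cloud are assumptions for illustration.

```python
import random

rng = random.Random(0)
FEAT_DIM = 16  # small stand-in for the real feature size

# Hypothetical random shared layer h(x) = ReLU(W x).
W = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(FEAT_DIM)]

def point_feature(p):
    return [max(0.0, sum(w * x for w, x in zip(row, p))) for row in W]

def global_feature(pts):
    return [max(ch) for ch in zip(*(point_feature(p) for p in pts))]

cloud = [tuple(rng.uniform(-1, 1) for _ in range(3)) for _ in range(200)]
feats = [point_feature(p) for p in cloud]

# For each channel, find the index of the point that provides the maximum.
critical_idx = sorted({max(range(len(cloud)), key=lambda i: feats[i][c])
                       for c in range(FEAT_DIM)})
critical = [cloud[i] for i in critical_idx]

# Dropping every non-critical point leaves the global feature unchanged.
same = global_feature(critical) == global_feature(cloud)
print(len(cloud), len(critical), same)
```

At most `FEAT_DIM` points can be critical, so the large majority of the 200 points can go missing without affecting the result.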