【Wing Loss】《Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks》
2022-07-02 07:45:00 【bryant_ meng】
CVPR-2018
1 Background and Motivation
Facial landmarks, such as the nose tip and the eye centres, provide rich geometric information for other face analysis tasks such as face recognition and 3D face reconstruction
The development of deep learning has also greatly improved facial landmark localisation on unconstrained faces
One crucial aspect of deep learning is to define a loss function leading to better-learnt representation from underlying data.
The author analyses the effect of different loss functions on facial landmark detection and proposes a new loss, wing loss
2 Related Work
- Network Architectures
  - Direct coordinate regression
  - Heatmap prediction
- Dealing with Pose Variations
  - Multi-view models
  - 3D face models
  - Multi-task learning (the tasks benefit each other)
- Cascaded Networks
3 Advantages / Contributions
- Analyses the advantages and disadvantages of different regression losses and proposes wing loss for facial landmark localisation
- Proposes pose-based data balancing
- Proposes a new network structure for facial landmark detection
4 Method
1) Loss function
The loss functions commonly used in regression tasks are as follows
L1 / L2 / smooth L1
L2 loss is sensitive to outliers
smooth L1 loss is a combination of L1 and L2, and is also a special case of the Huber loss
(Figure from "[Facial landmark detection] Wing loss paper interpretation")
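The three baseline losses above can be sketched in a few lines of NumPy. This is only an illustration: the function names are my own, L2 is written with the common 1/2 factor, and smooth L1 switches at |x| = 1, which is the usual convention rather than something stated in these notes.

```python
import numpy as np

def l1_loss(x):
    """L1 loss: absolute error."""
    return np.abs(np.asarray(x, dtype=float))

def l2_loss(x):
    """L2 loss with the conventional 1/2 factor (assumption)."""
    return 0.5 * np.square(np.asarray(x, dtype=float))

def smooth_l1_loss(x):
    """Quadratic for |x| < 1, linear elsewhere (Huber loss with threshold 1)."""
    ax = np.abs(np.asarray(x, dtype=float))
    return np.where(ax < 1.0, 0.5 * ax ** 2, ax - 0.5)
```

For a large error the L2 value grows quadratically while L1 and smooth L1 grow linearly, which is why L2 is the most sensitive to outliers.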
By analysing the cumulative distribution of the NME (Normalised Mean Error) on the AFLW dataset, the author observes:
loss functions analysed in the last section (L1 / L2 / Smooth L1) perform well for large errors (roughly the red-boxed region: the errors are large, but relatively few samples fall there)
more attention should be paid to small and medium range errors (roughly the green-boxed region: the NME is small to medium and there are many samples, so the curve is steep)
The author analyses this further
For an error of $x$:
L1: the gradient magnitude is 1 and the optimal step size is $|x|$, so the larger the error, the larger the step
L2: the gradient magnitude is $|x|$ and the optimal step size is 1, so the larger the error, the larger the gradient
In both cases the update towards the solution will be dominated by larger errors
That is to say, it is hard to correct relatively small displacements
The author introduces a $\ln x$ loss to alleviate this weakness of the L1 / L2 losses, which are affected too much by large errors
$\ln x$ has gradient magnitude $\frac{1}{|x|}$ and optimal step size $x^2$: for small errors the gradient is large and the step is small; for large errors the gradient is small and the step is large
So gradient and step size balance each other for both large and small errors
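A small numerical sketch of the gradient and step size argument above. I am assuming that "optimal step size" means the value eta for which eta times the gradient magnitude equals the error |x|, and that L2 is written as x^2 / 2; the helper name is hypothetical.

```python
import numpy as np

def grad_and_step(x):
    """Return {loss name: (gradient magnitude, step size eta)} for an error x.

    eta is chosen so that eta * |gradient| == |x| (assumed definition).
    """
    ax = np.abs(np.asarray(x, dtype=float))
    grads = {
        "L1": np.ones_like(ax),  # d|x|/dx has magnitude 1
        "L2": ax,                # d(x^2/2)/dx has magnitude |x|
        "ln": 1.0 / ax,          # d ln|x|/dx has magnitude 1/|x|
    }
    return {name: (g, ax / g) for name, g in grads.items()}

if __name__ == "__main__":
    for err in (0.1, 10.0):
        for name, (g, eta) in grad_and_step(err).items():
            print(f"x={err:5.1f}  {name}: grad={float(g):.3f}  step={float(eta):.3f}")
```

Under these assumptions the ln row is the only one whose gradient grows as the error shrinks, which is the behaviour the notes describe.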
Because the initial error in facial landmark localisation is large, the author switches to L1 loss for large errors to speed up convergence (L2 is sensitive to outliers). The overall design is as follows

$$\mathrm{wing}(x) = \begin{cases} w \ln(1 + |x|/\varepsilon) & \text{if } |x| < w \\ |x| - C & \text{otherwise} \end{cases}$$

$C = w - w\ln(1 + w/\varepsilon)$ is a constant that links the linear and nonlinear pieces so that the function is continuous at $|x| = w$
$\varepsilon$ should not be set too small. The gradient magnitude of the nonlinear part is $\frac{w}{\varepsilon + |x|}$, so when $\varepsilon$ is very small the gradient can explode for small errors
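The piecewise definition, wing(x) = w ln(1 + |x|/eps) for |x| < w and |x| - C otherwise, can be written as a minimal NumPy sketch. The default values w=10 and eps=2 are illustrative assumptions on my part, and the function names are hypothetical.

```python
import numpy as np

def wing_loss(x, w=10.0, eps=2.0):
    """Element-wise wing loss; w and eps defaults are illustrative only."""
    ax = np.abs(np.asarray(x, dtype=float))
    c = w - w * np.log(1.0 + w / eps)  # constant joining the two pieces at |x| = w
    return np.where(ax < w, w * np.log(1.0 + ax / eps), ax - c)

def wing_grad_magnitude(x, w=10.0, eps=2.0):
    """Gradient magnitude: w / (eps + |x|) inside the wing, 1 outside (L1 part)."""
    ax = np.abs(np.asarray(x, dtype=float))
    return np.where(ax < w, w / (eps + ax), 1.0)

if __name__ == "__main__":
    print(wing_loss([0.0, 1.0, 10.0, 20.0]))
    # A tiny eps makes the gradient near zero error very large
    print(wing_grad_magnitude(0.0, eps=2.0), wing_grad_magnitude(0.0, eps=0.01))
```

The second print shows why eps should not be too small: at zero error the gradient magnitude is w / eps, so it grows without bound as eps shrinks.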
2)Pose-based data balancing
This is aimed at handling extreme pose variations
Procrustes Analysis + PCA are used to analyse the shape distribution of the faces in the dataset
For the principle, see 《Master Opencv… Reading notes》, Non-rigid face tracking II
The distribution of the projection coefficients of the training samples is represented by a histogram with K bins; the result is as follows
The x-axis is the face angle and the y-axis is the number of samples
The pose-based data balancing strategy duplicates the face samples that fall into bins with relatively few samples, so that the distribution becomes more balanced
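The balancing step described above might look roughly like the following. This is a sketch under several assumptions, not the author's code: the shapes are taken to be already Procrustes-aligned, the projection uses only the first principal axis, every non-empty bin is topped up to the size of the largest bin, and the function name and parameters are hypothetical.

```python
import numpy as np

def pose_balanced_indices(shapes, num_bins=9, seed=0):
    """Return (resampled indices, bin id per sample).

    shapes: (N, L, 2) landmark arrays, assumed already Procrustes-aligned.
    """
    rng = np.random.default_rng(seed)
    flat = shapes.reshape(len(shapes), -1)
    centred = flat - flat.mean(axis=0)
    # First principal axis via SVD; the projection coefficient is a 1-D pose proxy
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    coeff = centred @ vt[0]
    # Histogram with K = num_bins equal-width bins
    edges = np.histogram_bin_edges(coeff, bins=num_bins)
    bin_ids = np.digitize(coeff, edges[1:-1])  # values in 0 .. num_bins - 1
    counts = np.bincount(bin_ids, minlength=num_bins)
    target = counts.max()
    picked = []
    for b in range(num_bins):
        members = np.flatnonzero(bin_ids == b)
        if members.size == 0:
            continue  # nothing to duplicate in an empty bin
        picked.append(members)
        if members.size < target:
            # Duplicate samples of the under-represented bin
            picked.append(rng.choice(members, size=target - members.size, replace=True))
    return np.concatenate(picked), bin_ids
```

The returned indices can be used to index the training set; every original sample is kept and only the sparse bins gain duplicates.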
3) Network structure
The paper focuses on the loss and the sampling strategy, so the network design is relatively simple
Pose-based data balancing alleviates the out-of-plane head rotation problem
Other factors also affect the final facial landmark localisation:
in-plane head rotations and inaccurate bounding boxes output from a poor face detector etc.
So the author uses a two-stage cascade
The first stage, CNN6: input 64x64x3, outputs landmarks + refined offsets for the face box? (refine the input image for the second network by removing the in-plane head rotation and correcting the bounding box; I could not find much more detail)
The second stage, CNN7: input 128x128x3, outputs landmarks
5 Experiments
5.1 Datasets and Metrics
AFLW: 19 landmarks, evaluated with the AFLW protocol: NME normalised by the face box width
300W: 68 landmarks, NME normalised by the inter-pupil distance
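As a reminder of how the metric is computed, here is a minimal NME sketch. The array shapes and the function name are my own assumptions; the normaliser is passed in per image, so the same function covers both the face box width (AFLW) and the inter-pupil distance (300W).

```python
import numpy as np

def nme(pred, gt, norm):
    """Normalised Mean Error.

    pred, gt: (N, L, 2) predicted and ground-truth landmarks.
    norm: (N,) per-image normaliser, e.g. face box width or inter-pupil distance.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    per_point = np.linalg.norm(pred - gt, axis=-1)  # (N, L) Euclidean errors
    per_image = per_point.mean(axis=1) / np.asarray(norm, dtype=float)
    return float(per_image.mean())
```

For example, if every predicted point is off by 5 pixels and the normaliser is 100 pixels, the NME is 0.05, often reported as 5%.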
5.2 Experiments
1)loss
Comparison of different losses on the AFLW dataset
Exploring the settings of the hyperparameters $w$ and $\varepsilon$ in wing loss
2) Pose-based data balancing strategy
Even a CNN trained with L1 or L2 loss performs very well, which shows that the pose-based data balancing strategy is very effective
300W
3)Run time and network architectures
6 Conclusion(own) / Future work
What exactly does CNN6 output? Ha ha! I will have to check the code
wing loss: Figure 4 is the core source of inspiration
Pose-based data balancing
Angle definition:
The picture is from "How to rotate the face in the image? | CVPR 2018"
in-plane / out-of-plane: quoting the following two answers
If the stress or deformation occurs within the xoy plane, it is called in-plane; a force component or displacement along the z direction is called out-of-plane
Deformation is defined relative to a plane: some deformations lie within this plane and some are out of it, which correspond to in-plane and out-of-plane respectively
See also others' interpretations