【TCDCN】《Facial landmark detection by deep multi-task learning》
2022-07-02 07:45:00 【bryant_ meng】
ECCV-2014
1 Background and Motivation
Facial landmark detection is a fundamental component of many face analysis tasks, such as facial attribute recognition, face verification, and face recognition.
Current landmark detectors still struggle with partial occlusion and large head pose variations.
The authors observe that the locations of facial landmarks are correlated with facial attributes, for example:
- when a kid is smiling, his mouth is widely opened (second column)
- the inter-ocular distance is smaller in faces with large yaw rotation (last column)
So can jointly optimizing facial attributes (head pose estimation, gender classification, age estimation, facial expression recognition, etc.) together with facial landmarks improve landmark detection?
With that question, the authors begin their story.
2 Related Work
- Facial landmark detection
- Landmark detection by CNN
- Multi-task learning
3 Advantages / Contributions
Proposes the TCDCN network: multi-task learning that uses facial attributes as auxiliary tasks to help optimize facial landmark detection (together with task-wise early stopping during training).
4 Method
1) Network structure
The tasks are: facial landmark detection (the main task) + head pose + gender + wearing glasses + smiling
The network structure is as follows: 4 conv layers + 1 fully connected layer (a rough code sketch is given below)
The activation function is the "absolute tangent function" (it probably means the hyperbolic tangent, haha)
For different samples with the same attribute, the learned features are similar, which suggests that the features shared through multi-task learning generalize to some extent.
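For intuition, here is a minimal PyTorch sketch of this kind of multi-task architecture: a shared 4-conv + 1-fc trunk feeding one landmark-regression head and one classification head per attribute. The channel counts, input size and class counts are illustrative assumptions, not the exact configuration reported in the paper, and plain tanh stands in for the paper's absolute-tangent activation.

```python
import torch
import torch.nn as nn

class TCDCNSketch(nn.Module):
    """Illustrative multi-task CNN: a shared 4-conv + 1-fc trunk with a
    landmark-regression head and one classification head per attribute.
    Layer sizes are assumptions, not the paper's exact configuration."""

    def __init__(self, num_landmarks=5, attr_classes=None):
        super().__init__()
        if attr_classes is None:
            # assumed auxiliary tasks and class counts (illustrative)
            attr_classes = {"pose": 5, "gender": 2, "glasses": 2, "smile": 2}
        act = nn.Tanh()  # stand-in for the paper's absolute tangent activation
        self.trunk = nn.Sequential(
            nn.Conv2d(1, 16, 5), act, nn.MaxPool2d(2),   # 40x40 grayscale input assumed
            nn.Conv2d(16, 48, 3), act, nn.MaxPool2d(2),
            nn.Conv2d(48, 64, 3), act, nn.MaxPool2d(2),
            nn.Conv2d(64, 64, 2), act,
            nn.Flatten(),
            nn.LazyLinear(100), act,                     # the shared fully connected layer
        )
        # main task: (x, y) coordinates for each landmark
        self.landmark_head = nn.Linear(100, 2 * num_landmarks)
        # auxiliary tasks: one linear classifier per attribute
        self.attr_heads = nn.ModuleDict(
            {name: nn.Linear(100, n) for name, n in attr_classes.items()})

    def forward(self, x):
        feat = self.trunk(x)
        landmarks = self.landmark_head(feat)             # regression output
        attrs = {name: head(feat) for name, head in self.attr_heads.items()}
        return landmarks, attrs

# quick shape check on a dummy batch
if __name__ == "__main__":
    model = TCDCNSketch()
    lm, attrs = model(torch.randn(8, 1, 40, 40))
    print(lm.shape, {k: v.shape for k, v in attrs.items()})
```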
2)Problem Formulation
Let's first look at the general multi-task learning formulation (the regularization term is omitted):
- $r$: the main task, i.e., facial landmark detection
- $\alpha$: an auxiliary task, i.e., a facial attribute classification task
- $N$: the number of samples
- $\lambda$: the weighting coefficient
The complete objective function is as follows: landmark detection uses a least-squares loss, while attribute classification uses a cross-entropy loss.
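Putting these pieces together, the combined objective should look roughly like the following (a hedged reconstruction from the definitions above, with the regularization term omitted as noted; the paper's exact notation may differ):

$$
\min_{W^{r},\,\{W^{\alpha}\}} \;
\underbrace{\frac{1}{2N}\sum_{i=1}^{N}\bigl\|\,y_i^{r}-f(x_i;W^{r})\bigr\|_2^{2}}_{\text{least-squares loss, landmark detection}}
\;+\;
\sum_{\alpha}\lambda^{\alpha}\,
\underbrace{\frac{1}{N}\sum_{i=1}^{N}\ell_{\mathrm{ce}}\!\bigl(y_i^{\alpha},\,f(x_i;W^{\alpha})\bigr)}_{\text{cross-entropy loss, attribute }\alpha}
$$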
3)Task-wise early stopping
different tasks have different loss functions and learning difficulties, and thus have different convergence rates (like passengers with different destinations getting off the bus at different stops)
During multi-task training, the authors therefore adopt a task-wise early stopping mechanism: the auxiliary tasks "get off the bus" in batches.
The criterion for stopping an auxiliary task is built from the following quantities:
- $t$: the current iteration number
- $k$: the length (in iterations) of the window over which the early-stopping statistics are computed
- $E_{val}$: the loss on the validation set
- $E_{tr}$: the loss on the training set
- $med$: the median value
- $\lambda_{\alpha}$: the weighting coefficient of auxiliary task $\alpha$ (learnable)
- The first term captures the tendency of the training error: if the loss is still dropping quickly over the last $k$ iterations, this term is small.
- The second term captures the generalization error (a rough code sketch of such a criterion follows this list).
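Below is a minimal Python sketch of a stopping rule in this spirit, combining a training-progress term with a generalization term built from the quantities above. It is an illustration (a Prechelt-style criterion), not the paper's exact formula, and the threshold `eps` is an assumed hyperparameter.

```python
import statistics

def should_stop_task(E_tr, E_val, lam_a, k=10, eps=0.5):
    """Illustrative task-wise early stopping test for one auxiliary task.

    E_tr, E_val : per-iteration training / validation losses of this task
    lam_a       : the task's weighting coefficient lambda_alpha
    k           : window length used to measure recent training progress
    eps         : stopping threshold (assumed hyperparameter)
    """
    t = len(E_tr)
    if t <= k:
        return False

    window = E_tr[t - k:]
    # Training-tendency term: small while the training loss is still dropping
    # fast inside the last k iterations, large once it plateaus.
    progress = sum(window) / (k * min(window) + 1e-12) - 1.0
    tendency = 1.0 / (progress + 1e-8)

    # Generalization term: how far the current validation loss has drifted
    # above the best value so far, normalized by lambda_alpha and the median.
    generalization = (E_val[-1] - min(E_val)) / (lam_a * statistics.median(E_val))

    # Stop the auxiliary task once the product exceeds the threshold.
    return tendency * generalization > eps
```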
5 Experiments
5.1 Datasets and Metrics
Datasets
- AFLW
- AFW
Evaluation metrics
- mean error: the point-to-point error normalized by the inter-ocular distance
- failure rate: a mean error larger than 10% is reported as a failure
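For concreteness, here is a small NumPy sketch of how these two metrics can be computed; the landmark array layout and the indices of the two eye centers are assumptions for illustration, not fixed by the paper.

```python
import numpy as np

def mean_error_and_failure_rate(pred, gt, left_eye=0, right_eye=1, thresh=0.10):
    """pred, gt: arrays of shape (num_images, num_landmarks, 2).

    Per-image error = mean point-to-point distance between predicted and
    ground-truth landmarks, normalized by the ground-truth inter-ocular
    distance; images whose error exceeds `thresh` (10%) count as failures.
    """
    dists = np.linalg.norm(pred - gt, axis=-1)                         # (N, L)
    iod = np.linalg.norm(gt[:, left_eye] - gt[:, right_eye], axis=-1)  # (N,)
    per_image = dists.mean(axis=1) / iod

    return per_image.mean(), (per_image > thresh).mean()
```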
5.2 Experiments
1)Evaluating the Effectiveness of Learning with Related Task
FLD+pose performs the best
Let's take a closer look at FLD+smile.
Figure 5(a) shows that when FLD (Facial Landmark Detection) is trained jointly with the smile attribute, the mean error on the nose and mouth drops noticeably, which is easy to understand (smiling involves the zygomaticus and levator labii superioris muscles).
(Muscle illustration from the Internet; it will be taken down upon request!!!)
Figure 5(b) is a bit surprising, haha: the correlation between smile and the right eye is quite high, yet the improvement in (a) is not particularly obvious.
Perhaps this correlation coefficient is only a rough estimate, haha ("We use a crude method to investigate the relationship between tasks.")
The authors use Pearson's correlation to measure the relatedness between tasks:
Pearson's correlation of the learned weight vectors of the last fully-connected layer, between the tasks of facial landmark detection and 'smiling' prediction
How to understand the Pearson Correlation Coefficient? — answer by TimXP on Zhihu:
https://www.zhihu.com/question/19734616/answer/117730676
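As a quick reference, a minimal sketch of that computation is shown below; exactly which weight vectors are compared (here, flattened last-FC weights associated with each task) is my assumption for illustration.

```python
import numpy as np

def pearson_correlation(w_task_a, w_task_b):
    """Pearson correlation between two learned weight vectors, e.g. the
    flattened last fully-connected-layer weights of two different tasks."""
    a = np.ravel(w_task_a).astype(float)
    b = np.ravel(w_task_b).astype(float)
    a -= a.mean()
    b -= b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# example: correlation between the FLD head weights and the 'smiling' head weights
# rho = pearson_correlation(W_fld, W_smile)
```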
Now let's look at the improvement from jointly training with pose.
Sure enough, pose has the largest impact on FLD, which is reasonable.
2) The Benefits of Task-wise Early Stopping
The pose task is the last one to be stopped.
With task-wise early stopping, FLD converges faster and more stably.
3)Comparison with the Cascaded CNN
Now TCDCN goes head-to-head with 【Cascade FPD】《Deep Convolutional Network Cascade for Facial Point Detection》, haha.
Accuracy
On the mouth the two are roughly on par: for the cascaded CNN, a considerable portion of the left-mouth-corner mean errors also fall within 10%.
Speed
120 ms (cascaded CNN) vs 17 ms (TCDCN) on an Intel Core i5 CPU
about 7x faster
4)Comparison with other State-of-the-art Methods
Impressive results.
5)TCDCN for Robust Initialization
Instead of drawing training samples randomly as the initialization (as cascaded methods usually do), the landmarks predicted by TCDCN are used to initialize the other method, so it starts from a much better point (a toy sketch of the idea follows).
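A toy sketch of the idea, where `tcdcn_predict` and `train_shapes` are hypothetical placeholders rather than real APIs:

```python
import random

def initial_shape(image, mode, tcdcn_predict=None, train_shapes=None):
    """Return the initial landmark shape fed to a cascaded regressor.

    mode = "random": conventional scheme, draw a training-set shape at random
    mode = "tcdcn" : robust initialization, start from TCDCN's own prediction
    """
    if mode == "random":
        return random.choice(train_shapes)
    if mode == "tcdcn":
        return tcdcn_predict(image)
    raise ValueError(f"unknown mode: {mode}")
```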
6 Conclusion(own) / Future work
Task-wise early stopping mechanism:
different tasks have different loss functions and learning difficulties, and thus have different convergence rates