【TCDCN】《Facial landmark detection by deep multi-task learning》
2022-07-02 07:45:00 【bryant_meng】
ECCV-2014
1 Background and Motivation
Facial landmark detection is a fundamental component of many face analysis tasks, such as facial attribute inference, face verification, and face recognition.
Current methods still struggle with partial occlusion and large head pose variations.
The authors observe that facial landmark locations are correlated with facial attributes, for example:
- when a kid is smiling, his mouth is widely opened (second column)
- the inter-ocular distance is smaller in faces with large yaw rotation (last column)
Can jointly optimizing facial attributes (head pose estimation, gender classification, age estimation, facial expression recognition, etc.) with facial landmarks improve landmark detection?
With this question, the authors begin.
2 Related Work
- Facial landmark detection
- Landmark detection by CNN
- Multi-task learning
3 Advantages / Contributions
Proposes the TCDCN network: multi-task learning that uses facial attributes to assist and optimize facial landmark detection (combined with task-wise early stopping during training).
4 Method
1) Network structure
The tasks are: facial landmark detection (the main task) + pose + gender + wearing glasses + smiling.
The network consists of 4 conv layers + 1 fc layer.
The activation function is called the "absolute tangent function" (presumably the hyperbolic tangent is meant).
Across different samples sharing the same attribute, the learned features are similar, which suggests that the shared features learned by multi-task learning generalize to some extent.
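As a rough sketch of the idea (all layer sizes and head dimensions below are made up for illustration, not the paper's configuration), a single shared feature feeds separate per-task output heads:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for TCDCN's shared trunk. In the paper this is 4 conv layers
# + 1 fc layer; here a single dense layer with a tanh activation suffices to
# illustrate the sharing.
W_shared = rng.normal(scale=0.1, size=(100, 64))   # 100-d "image" -> 64-d shared feature

# One linear output head per task, all reading the SAME shared feature.
heads = {
    "landmarks": rng.normal(scale=0.1, size=(64, 10)),  # 5 points -> 10 (x, y) values
    "pose":      rng.normal(scale=0.1, size=(64, 5)),   # e.g. 5 head-pose bins
    "gender":    rng.normal(scale=0.1, size=(64, 2)),
    "glasses":   rng.normal(scale=0.1, size=(64, 2)),
    "smile":     rng.normal(scale=0.1, size=(64, 2)),
}

x = rng.normal(size=100)                    # one flattened input
feature = np.tanh(x @ W_shared)             # shared representation (tanh activation)
outputs = {task: feature @ W for task, W in heads.items()}
# outputs["landmarks"] holds the 10 regressed coordinates;
# each classification head holds its own class scores.
```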
2) Problem Formulation
First, the general multi-task learning objective (the regularization term is omitted):
- $r$: the main task, i.e., facial landmark detection
- $\alpha$: an auxiliary task, i.e., a facial attribute classification task
- $N$: the number of samples
- $\lambda$: the weighting coefficient
The complete objective function is

$$\arg\min_{W} \sum_{i=1}^{N} \frac{1}{2}\left\| y_i^{r} - f(x_i; W^{r}) \right\|^{2} + \sum_{\alpha} \lambda^{\alpha} \sum_{i=1}^{N} \ell\left( y_i^{\alpha}, f(x_i; W^{\alpha}) \right)$$

Landmark detection uses a least-squares loss; attribute classification uses a cross-entropy loss $\ell$.
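A minimal sketch of this combined objective on one toy sample (function name and signature are my own, not the paper's):

```python
import numpy as np

def multitask_loss(y_r, pred_r, attr_labels, attr_probs, lambdas):
    """Least-squares loss on landmarks plus weighted cross-entropy on the
    auxiliary attribute tasks (a sketch, not the paper's exact code)."""
    loss = 0.5 * np.sum((np.asarray(y_r) - np.asarray(pred_r)) ** 2)  # main task
    for a, label in attr_labels.items():            # auxiliary classification tasks
        loss += lambdas[a] * -np.log(attr_probs[a][label])
    return loss

# One toy sample: a single landmark (x, y) and one 'smile' attribute.
loss = multitask_loss(
    y_r=[0.2, 0.4], pred_r=[0.1, 0.5],
    attr_labels={"smile": 1},                       # ground-truth class index
    attr_probs={"smile": np.array([0.3, 0.7])},     # predicted class probabilities
    lambdas={"smile": 0.5},                         # weighting coefficient
)
```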
3) Task-wise early stopping
Different tasks have different loss functions and learning difficulties, and thus different convergence rates (like passengers with different destinations getting off at different stops).
During multi-task training, the authors adopt an early-stopping mechanism: the auxiliary tasks are stopped one by one ("getting off in batches").
An auxiliary task $\alpha$ is stopped once the following criterion (roughly, following the paper) exceeds a threshold $\epsilon$:

$$\frac{k \cdot \operatorname{med}_{j=t-k}^{t} E_{tr}^{\alpha}(j)}{\sum_{j=t-k}^{t} E_{tr}^{\alpha}(j) - k \cdot \min_{j=t-k}^{t} E_{tr}^{\alpha}(j)} \times \frac{E_{val}^{\alpha}(t) - \min_{j \le t} E_{val}^{\alpha}(j)}{\lambda_{\alpha} \cdot \min_{j \le t} E_{val}^{\alpha}(j)} > \epsilon$$

- $t$: the current iteration number
- $k$: the iteration window over which the early-stopping statistics are computed
- $E_{val}$: the validation loss
- $E_{tr}$: the training loss
- $med$: the median value
- $\lambda_{\alpha}$: the (learnable) weighting coefficient of auxiliary task $\alpha$
- The first term measures the tendency of the training error: if the loss drops quickly within the last $k$ iterations, this term is small
- The second term measures the generalization error
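The stopping criterion described above can be sketched as follows (a hypothetical helper; the exact window conventions are assumptions of this sketch):

```python
import numpy as np

def should_stop(E_tr, E_val, lam, k, eps):
    """Task-wise early-stopping check for ONE auxiliary task.

    E_tr, E_val: lists of per-iteration training / validation losses.
    lam:         the task's weighting coefficient lambda_alpha.
    """
    t = len(E_tr)
    if t <= k:
        return False                     # not enough history yet
    win = np.asarray(E_tr[t - k:])       # last k training losses
    denom = win.sum() - k * win.min()
    if denom == 0:                       # training fully plateaued
        return bool(E_val[-1] > min(E_val))
    # Term 1: tendency of the training error. If the loss is still dropping
    # fast within the window, the denominator is large and the term is small.
    term1 = k * np.median(win) / denom
    # Term 2: generalization error -- gap between the current validation loss
    # and the best one seen so far.
    best_val = min(E_val)
    term2 = (E_val[-1] - best_val) / (lam * best_val)
    return bool(term1 * term2 > eps)
```

For example, a task whose training loss is still falling keeps running, while a task whose training loss has plateaued and whose validation loss is rising gets stopped.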
5 Experiments
5.1 Datasets and Metrics
Datasets
- AFLW
- AFW
Metrics
- mean error: normalized by the inter-ocular distance
- failure rate: a mean error larger than 10% is reported as a failure
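Both metrics are easy to sketch in code (the landmark ordering, with the two eye centres in rows 0 and 1, is an assumption of this sketch):

```python
import numpy as np

def mean_error(pred, gt):
    """Mean landmark error for one face, normalized by inter-ocular distance.

    pred, gt: (N, 2) arrays of landmark coordinates; rows 0 and 1 are assumed
    to be the two eye centres.
    """
    inter_ocular = np.linalg.norm(gt[0] - gt[1])
    per_point = np.linalg.norm(np.asarray(pred) - np.asarray(gt), axis=1)
    return per_point.mean() / inter_ocular

def failure_rate(errors, thresh=0.10):
    """Fraction of faces whose mean error exceeds 10% of inter-ocular distance."""
    return float(np.mean(np.asarray(errors) > thresh))
```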
5.2 Experiments
1) Evaluating the Effectiveness of Learning with Related Task
FLD + pose performs the best.
Let's look at a more detailed analysis of FLD + smile.
Figure 5(a) shows that when FLD (Facial Landmark Detection) is jointly trained with the smile attribute, the mean error for nose and mouth drops noticeably, which is easy to understand (smiling involves the zygomaticus and levator labii superioris muscles).
Figure 5(b) is a bit surprising: the correlation between smile and right eye is quite high, yet the improvement in (a) is not particularly obvious.
Perhaps this correlation coefficient is only a rough estimate ("We use a crude method to investigate the relationship between tasks.")
The authors use Pearson's correlation to quantify task relatedness: the Pearson's correlation of the learned weight vectors of the last fully-connected layer, between the tasks of facial landmark detection and 'smiling' prediction.
How to understand the Pearson Correlation Coefficient? - TimXP's answer on Zhihu
https://www.zhihu.com/question/19734616/answer/117730676
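For reference, Pearson's correlation between two flattened weight vectors can be computed as (a minimal sketch):

```python
import numpy as np

def pearson(u, v):
    """Pearson correlation coefficient between two (flattened) weight vectors."""
    u = np.asarray(u, float) - np.mean(u)   # centre both vectors
    v = np.asarray(v, float) - np.mean(v)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```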
Now let's look at the improvement from jointly training with pose.
Indeed, pose has the largest impact on FLD, which is reasonable.
2) The Benefits of Task-wise Early Stopping
pose is stopped last.
With early stopping, FLD converges faster and more stably.
3) Comparison with the Cascaded CNN
Now for the head-to-head comparison with 【Cascade FPD】《Deep Convolutional Network Cascade for Facial Point Detection》.
Accuracy
The mouth results are comparable: for the cascaded CNN, a considerable portion of the left-mouth mean errors is also concentrated within 10%.
Speed
120 ms vs 17 ms on an Intel Core i5 CPU
TCDCN is about 7× faster.
4) Comparison with other State-of-the-art Methods
The results are shown in the paper.
5) TCDCN for Robust Initialization
Instead of drawing training samples randomly as initialization, TCDCN's output is used as the initialization for other detectors, making them more robust from the start.
6 Conclusion (own) / Future work
Task-wise early stopping: different tasks have different loss functions and learning difficulties, and thus different convergence rates.