【TCDCN】《Facial landmark detection by deep multi-task learning》
2022-07-02 06:26:00 【bryant_meng】
ECCV-2014
1 Background and Motivation
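Facial landmark detection is a basic building block of many face analysis tasks, such as facial attribute inference, face verification, and face recognition.
Current facial landmark detectors still leave room for improvement under partial occlusion and large head pose variations.
The authors observe that landmark positions are correlated with facial attributes, for example:
when a kid is smiling, his mouth is widely opened (second column)
the inter-ocular distance is smaller in faces with large yaw rotation (last column)
Can jointly optimizing facial attributes (head pose estimation, gender classification, age estimation, facial expression recognition, etc.) together with facial landmarks improve landmark accuracy?
That question is where the authors' story begins.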
2 Related Work
- Facial landmark detection
- Landmark detection by CNN
- Multi-task learning
3 Advantages / Contributions
Proposes TCDCN, a multi-task network that uses facial attributes to help optimize facial landmark detection (combined with early stopping during training).
4 Method
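1)Network architecture
The tasks are: facial landmark detection (main) + pose + gender + wear glasses + smiling
The network structure is 4 Conv + 1 FC
The activation is the absolute tangent function (presumably built on the hyperbolic tangent, haha)
Different samples with the same attribute produce similar features, which suggests that the shared features learned through multi-task training generalize to some extent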
2)Problem Formulation
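First, the multi-task learning formulation (regularization term omitted); its symbols are:
$r$: the main task, i.e. facial landmark detection
$\alpha$: an auxiliary task, i.e. a facial attribute classification task
$N$: the number of samples
$\lambda$: the weighting coefficient
The full objective function is: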
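The equation image did not survive on this page. Below is a reconstruction from the symbols defined above, with a least-squares term for the landmarks and a cross-entropy term for the attributes. It is my reading of the paper, so check it against the original. Assumptions: $f(x_i; W)$ is the network output, $A$ is the set of auxiliary tasks, the weight is written $\lambda_{\alpha}$ per task, and the last term is an L2 weight penalty summed over all $T$ tasks.

$$
\arg\min_{W^{r},\,\{W^{\alpha}\}_{\alpha \in A}} \; \frac{1}{2}\sum_{i=1}^{N} \left\| y_i^{r} - f(x_i; W^{r}) \right\|^{2} \;-\; \sum_{i=1}^{N} \sum_{\alpha \in A} \lambda_{\alpha}\, y_i^{\alpha} \log p\left(y_i^{\alpha} \mid x_i; W^{\alpha}\right) \;+\; \sum_{t=1}^{T} \left\| W^{t} \right\|_2^{2}
$$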
Landmark detection uses a least-squares loss, and attribute classification uses a cross-entropy loss
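To make the combination concrete, here is a minimal pure-Python sketch of the per-sample loss, assuming each attribute is a softmax classifier. The function names, the dict layout, and the weight values are illustrative choices of mine, not taken from the paper's code.

```python
import math


def landmark_loss(pred, target):
    """Least-squares loss over flattened landmark coordinates."""
    return 0.5 * sum((p - t) ** 2 for p, t in zip(pred, target))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of a softmax classifier for one integer class label."""
    return -math.log(softmax(logits)[label])


def multitask_loss(landmark_pred, landmark_target, aux_logits, aux_labels, aux_weights):
    """Main-task loss plus weighted auxiliary losses (regularization omitted).

    aux_logits, aux_labels and aux_weights are dicts keyed by a task name
    such as "smile" or "pose" (hypothetical keys).
    """
    loss = landmark_loss(landmark_pred, landmark_target)
    for task, logits in aux_logits.items():
        loss += aux_weights[task] * cross_entropy(logits, aux_labels[task])
    return loss
```

Setting an auxiliary weight to zero removes that task from the objective, which is effectively what stopping a task early does.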
3)Task-wise early stopping
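different tasks have different loss functions and learning difficulties, and thus with different convergence rates (passengers heading to different destinations get off at different stops)
During multi-task training the authors apply early stopping per task, so the auxiliary tasks get off the bus in batches
The stopping criterion is: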
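The criterion was an image that did not survive on this page. Here is a reconstruction as I recall it from the paper, using the symbols explained in the list that follows. Treat it as an assumption and verify it against the original. $\epsilon$ is a stopping threshold, and the superscript $\alpha$ marks the auxiliary task being checked.

$$
\frac{k \cdot \operatorname{med}_{j=t-k}^{t} E_{tr}^{\alpha}(j)}{\sum_{j=t-k}^{t} E_{tr}^{\alpha}(j) \;-\; k \cdot \operatorname{med}_{j=t-k}^{t} E_{tr}^{\alpha}(j)} \;\cdot\; \frac{E_{val}^{\alpha}(t) \;-\; \min_{j=1..t} E_{tr}^{\alpha}(j)}{\lambda_{\alpha} \cdot \min_{j=1..t} E_{tr}^{\alpha}(j)} \;>\; \epsilon
$$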
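- $t$: the current iteration
- $k$: the length of the iteration window over which the criterion is computed
- $E_{val}$: the loss on the validation set
- $E_{tr}$: the loss on the training set
- $med$: the median value
- $\lambda_{\alpha}$: the weighting coefficient of auxiliary task $\alpha$, learnable
- The first term captures the tendency of the training error: if the loss drops quickly over the $k$ iterations, this term is small
- The second term captures the generalization error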
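The check can be sketched in a few lines of Python. This is a sketch under assumptions, not the authors' implementation: it takes the criterion to be the product of a training-trend term and a generalization term compared against a threshold `eps`, uses the last `k` recorded losses as the window, and handles the flat-loss case (zero denominator) in a way I chose myself. The function name and arguments are hypothetical.

```python
from statistics import median


def should_stop_task(train_errors, val_error, lam, k, eps):
    """Decide whether an auxiliary task should be stopped.

    train_errors: training losses of this task, one per iteration so far
    val_error:    current validation loss of this task
    lam:          weighting coefficient of the task (assumed > 0)
    k:            window length for the trend term
    eps:          stopping threshold (assumed hyperparameter)
    """
    if len(train_errors) < k:
        return False  # not enough history yet

    best_train = min(train_errors)  # assumed > 0
    # Generalization term: gap between validation loss and best training loss.
    generalization = (val_error - best_train) / (lam * best_train)
    if generalization <= 0:
        return False  # validation loss is not worse than training loss

    window = train_errors[-k:]
    med = median(window)
    denom = sum(window) - k * med
    if denom <= 0:
        # Training loss is flat within the window, so the trend term is
        # unbounded; with a positive generalization gap we stop.
        return True

    trend = k * med / denom  # small when the loss is still dropping fast
    return trend * generalization > eps
```

For example, a loss history that keeps halving yields a small trend term and the task keeps training, while a flat history with a validation gap triggers the stop.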
5 Experiments
5.1 Datasets and Metrics
Datasets
- AFLW
- AFW
Evaluation metrics
- mean error: normalized by the inter-ocular distance
- failure rate:Mean error larger than 10% is reported as a failure
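Both metrics are easy to compute. Here is a minimal sketch, assuming landmarks are given as `(x, y)` tuples and that the first two ground-truth points are the eye centers; the index defaults and function names are my own assumptions.

```python
import math


def normalized_mean_error(pred, gt, left_eye_idx=0, right_eye_idx=1):
    """Mean landmark distance for one face, divided by inter-ocular distance.

    pred, gt: lists of (x, y) tuples in the same landmark order.
    The eye indices are assumptions about that order.
    """
    inter_ocular = math.dist(gt[left_eye_idx], gt[right_eye_idx])
    errors = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(errors) / len(errors) / inter_ocular


def failure_rate(per_face_errors, threshold=0.10):
    """Fraction of faces whose normalized mean error exceeds the threshold."""
    failures = sum(1 for e in per_face_errors if e > threshold)
    return failures / len(per_face_errors)
```

The same functions can be applied per landmark (eyes, nose, mouth corners) by passing single-point lists for `pred` and computing the inter-ocular distance from the full ground truth.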
5.2 Experiments
1)Evaluating the Effectiveness of Learning with Related Task
FLD+pose performs the best
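Now a more detailed look at FLD+smile
Figure 5(a) shows that jointly training FLD (Face Landmark Detection) with the smile attribute clearly reduces the mean error of nose and mouth. This is easy to understand (smiling involves the Zygomaticus and levator labii superioris muscles)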
Figure 5(b) is a bit of a flop, haha: why is the correlation between smile and right eye so high when (a) shows no particularly clear improvement there?
Perhaps the correlation is only a rough estimate, haha (We use a crude method to investigate the relationship between tasks.)
The authors use Pearson's correlation to measure the relationship:
Pearson's correlation of the learned weight vectors of the last fully-connected layer, between the tasks of facial landmark detection and 'smiling' prediction
How should one understand the Pearson Correlation Coefficient? - answer by TimXP - Zhihu
https://www.zhihu.com/question/19734616/answer/117730676
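For reference, Pearson's correlation between two weight vectors takes only a few lines. This is a sketch with made-up inputs; how the authors extract and pair the weight vectors of the last fully-connected layer is not shown in this post, so only the coefficient itself is illustrated.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    std_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    std_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (std_u * std_v)  # undefined if either vector is constant
```

A value near +1 or -1 means the two weight vectors vary together linearly, and a value near 0 means no linear relationship, which is why it is only a crude proxy for task relatedness.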
Next, the improvement from joint training with pose
As you would expect, pose has the largest impact on FLD, which makes sense
2)The Benefits of Task-wise Early Stopping
pose is stopped last
early stopping makes FLD converge faster and more stably
3)Comparison with the Cascaded CNN
Now it goes head-to-head with 【Cascade FPD】《Deep Convolutional Network Cascade for Facial Point Detection》, haha
Accuracy
The mouth is a bit underwhelming: in the cascaded CNN, a fair share of the left mouth mean errors also fall within 10%
Speed
120ms vs 17ms on an Intel Core i5 CPU
7x faster
4)Comparison with other State-of-the-art Methods
Impressive
5)TCDCN for Robust Initialization
Instead of drawing training samples randomly as initialization, the other method gets TCDCN's output as its starting point. That is quite a head start to hand out...
6 Conclusion(own) / Future work
Task-wise early stopping
different tasks have different loss functions and learning difficulties, and thus with different convergence rates