
【TCDCN】《Facial landmark detection by deep multi-task learning》

2022-07-02 06:26:00 【bryant_meng】


ECCV-2014

Table of Contents

1 Background and Motivation
2 Related Work
3 Advantages / Contributions
4 Method
5 Experiments
- 5.1 Datasets and Metrics
- 5.2 Experiments
6 Conclusion (own) / Future work

1 Background and Motivation

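Facial landmark detection is a basic building block of many face analysis tasks, such as facial attribute inference, face verification, and face recognition.

Current facial landmark detectors still leave room for improvement under partial occlusion and large head pose variations.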

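The authors observe that landmark positions are correlated with facial attributes. For example:

when a kid is smiling, his mouth is widely opened (second column of the figure)

the inter-ocular distance is smaller in faces with large yaw rotation (last column of the figure)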


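Can optimizing facial attributes (head pose estimation, gender classification, age estimation, facial expression recognition, etc.) jointly with facial landmarks improve landmark detection performance?

This question is where the authors start.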

2 Related Work

Facial landmark detection
Landmark detection by CNN
Multi-task learning

3 Advantages / Contributions

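The paper proposes the TCDCN network: multi-task learning in which facial attributes assist the optimization of facial landmark detection (combined with early stopping during training).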

4 Method

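1) Network structure

The tasks are: facial landmark detection (main) + pose + gender + wear glasses + smiling

The network consists of 4 Conv + 1 FC layers.
The activation function is the absolute tangent function (presumably this is built on the hyperbolic tangent, haha).

Different samples with the same attribute yield similar features, which suggests that the shared features learned through multi-task training generalize to some degree.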

2) Problem Formulation

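First, the multi-task learning objective (regularization term omitted). Its symbols are:

$r$: the main task, i.e. facial landmark detection
$\alpha$: an auxiliary task, i.e. facial attribute classification
$N$: the number of samples
$\lambda$: the weighting coefficient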
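Read together, the symbols above suggest an objective of roughly the following shape. This is a sketch reconstructed from the symbol list, not copied from the paper. The per-task losses $l^{r}$ and $l^{\alpha}$, the network output $f$, the weights $W^{r}$ and $W^{\alpha}$, the inputs $x_i$, the labels $y_i$, and the set $A$ of auxiliary tasks are notational assumptions introduced here.

$$
\arg\min_{W^{r},\,\{W^{\alpha}\}_{\alpha \in A}} \; \sum_{i=1}^{N} l^{r}\left(y_i^{r}, f(x_i; W^{r})\right) + \sum_{i=1}^{N} \sum_{\alpha \in A} \lambda_{\alpha}\, l^{\alpha}\left(y_i^{\alpha}, f(x_i; W^{\alpha})\right)
$$

The first sum is the main-task loss over all $N$ samples; the second adds each auxiliary-task loss, scaled by its weight $\lambda_{\alpha}$.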

In the complete objective function, landmark detection uses a least-squares loss and attribute classification uses a cross-entropy loss.
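To make the combination of the two losses concrete, here is a minimal pure-Python sketch. It is not the authors' code: the function names, the dictionary layout of a sample, and the toy numbers are all assumptions made for illustration, and any regularization term is left out.

```python
import math

def landmark_loss(pred, target):
    """Least-squares loss: half the squared Euclidean distance."""
    return 0.5 * sum((p - t) ** 2 for p, t in zip(pred, target))

def cross_entropy(probs, label):
    """Cross-entropy for one sample: -log of the probability of the true class."""
    return -math.log(probs[label])

def multitask_loss(samples, lambdas):
    """Sum the main-task loss and the weighted auxiliary losses over all samples.

    `samples` is a list of dicts (hypothetical layout); `lambdas` maps each
    auxiliary task name to its weighting coefficient.
    """
    total = 0.0
    for s in samples:
        total += landmark_loss(s["pred_landmarks"], s["landmarks"])
        for task, lam in lambdas.items():
            total += lam * cross_entropy(s["attr_probs"][task], s["attrs"][task])
    return total

# Toy example with made-up numbers: one sample, one auxiliary task.
sample = {
    "pred_landmarks": [0.0, 0.0],
    "landmarks": [1.0, 1.0],
    "attr_probs": {"smiling": [0.5, 0.5]},
    "attrs": {"smiling": 1},
}
loss = multitask_loss([sample], {"smiling": 2.0})
```

Setting a task's coefficient to zero removes its contribution entirely, which is one simple way to think about a task being switched off during training.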

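3) Task-wise early stopping

different tasks have different loss functions and learning difficulties, and thus with different convergence rates (passengers with different destinations get off at different stops)

The authors apply early stopping during multi-task training, so the tasks "get off" at different times.

The stopping criterion is expressed in terms of the following quantities: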

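$t$: the current iteration
$k$: the length of the iteration window over which the criterion is computed
$E_{val}$: the loss on the validation set
$E_{tr}$: the loss on the training set
$\mathrm{med}$: the median value
$\lambda_{\alpha}$: the weighting coefficient of auxiliary task $\alpha$, learnable
The first term of the criterion represents the tendency of the training error: if the loss drops quickly over the $k$ iterations, this term is small.
The second term represents the generalization error.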
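The two terms described above can be sketched as a small check. This is a reconstruction from the symbol definitions and from memory of the paper, so the exact form should be verified against the original. It assumes the first term is $k \cdot \mathrm{med}(E_{tr})$ divided by $\sum E_{tr} - k \cdot \mathrm{med}(E_{tr})$ over the last $k$ iterations. It assumes the second term is $(E_{val}(t) - \min E_{tr})$ divided by $\lambda_{\alpha} \cdot \min E_{tr}$, and that a task stops when their product exceeds a threshold. The name `should_stop`, the threshold `eps`, the window indexing, and the handling of a non-positive denominator are assumptions of this sketch.

```python
from statistics import median

def should_stop(train_err, val_err_t, k, lam, eps):
    """Return True if the auxiliary task should be stopped (hypothetical helper).

    train_err : training losses E_tr(1..t) of the task, oldest first
    val_err_t : validation loss E_val(t) at the current iteration
    k         : window length for the trend term
    lam       : weighting coefficient of the task
    eps       : stopping threshold (assumed hyperparameter)
    """
    if len(train_err) < k:
        return False  # not enough history yet

    best = min(train_err)  # assumed to be positive
    generalization = (val_err_t - best) / (lam * best)
    if generalization <= 0:
        return False  # validation loss is not worse than the best training loss

    window = train_err[-k:]
    med = median(window)
    denom = sum(window) - k * med
    # Assumption: a non-positive denominator means no measurable downward
    # trend in the window, so the trend term is treated as unbounded.
    trend = float("inf") if denom <= 0 else k * med / denom

    return trend * generalization > eps
```

With a rapidly falling training loss such as `[10, 5, 2, 1, 0.5]` the trend term stays small and the task keeps training, while a plateaued training loss combined with a much higher validation loss triggers the stop.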

5 Experiments

5.1 Datasets and Metrics

Datasets

AFLW
AFW

Evaluation metrics

mean error: normalized by the inter-ocular distance
failure rate：Mean error larger than 10% is reported as a failure
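As a rough illustration of the two metrics, the sketch below computes a per-image normalized mean error and a failure rate over a set of images. It assumes the error is the average Euclidean distance between predicted and ground-truth landmarks divided by the ground-truth inter-ocular distance. The function names and the eye indices (`left_eye_idx`, `right_eye_idx`) are hypothetical and depend on the landmark ordering of the dataset.

```python
import math

def normalized_mean_error(pred, truth, left_eye_idx=0, right_eye_idx=1):
    """Mean landmark distance for one image, in units of inter-ocular distance.

    pred, truth : lists of (x, y) points in the same order
    The eye indices are assumptions about the landmark layout.
    """
    inter_ocular = math.dist(truth[left_eye_idx], truth[right_eye_idx])
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    return sum(dists) / len(dists) / inter_ocular

def failure_rate(errors, threshold=0.10):
    """Fraction of images whose normalized mean error exceeds the threshold."""
    return sum(e > threshold for e in errors) / len(errors)

# Toy example with made-up coordinates: every landmark is off by one pixel
# and the eyes are ten pixels apart.
truth = [(0, 0), (10, 0), (5, 5)]
pred = [(0, 1), (10, 1), (5, 6)]
err = normalized_mean_error(pred, truth)
```

Normalizing by the inter-ocular distance makes errors comparable across faces of different sizes, which is why the 10% failure threshold can be applied uniformly.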

5.2 Experiments

1) Evaluating the Effectiveness of Learning with Related Task
FLD+pose performs the best

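Next, a more detailed analysis of FLD+smile.
Figure 5(a) shows that jointly training FLD (Face Landmark Detection) with the smile attribute clearly lowers the mean error for the nose and mouth. This is easy to understand, since smiling involves the Zygomaticus and levator labii superioris muscles.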


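Figure 5(b) looks like a bit of a misfire, haha: the correlation between smile and the right eye is high, yet the improvement for the right eye in (a) is not especially obvious.

Perhaps the correlation coefficient is only a rough estimate, haha (We use a crude method to investigate the relationship between tasks.)

The authors use Pearson's correlation to measure the relationship:

Pearson's correlation of the learned weight vectors of the last fully-connected layer, between the tasks of facial landmark detection and 'smiling' prediction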
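For reference, Pearson's correlation between two weight vectors can be computed as in the minimal sketch below. The function name and the toy vectors are made up for illustration; how the paper extracts and pairs the weight vectors of the last fully-connected layer is not covered here.

```python
import math

def pearson(u, v):
    """Pearson's correlation coefficient between two equal-length vectors."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    std_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    std_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (std_u * std_v)

# Hypothetical weight vectors of two output units (made-up values).
w_landmark = [0.2, -0.1, 0.4, 0.0]
w_smiling = [0.3, -0.2, 0.5, 0.1]
rho = pearson(w_landmark, w_smiling)
```

The coefficient lies in [-1, 1] and only captures linear similarity between the two weight vectors, which fits the authors' own description of the method as crude.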