3D face reconstruction and dense alignment with Position Map Regression Network
2022-07-27 09:56:00 【yfy2022yfy】
2019/10/31: I didn't take notes when I first read this paper, so this is a recap.
Paper: http://openaccess.thecvf.com/content_ECCV_2018/papers/Yao_Feng_Joint_3D_Face_ECCV_2018_paper.pdf
GitHub: https://github.com/YadiraF/PRNet
Abstract
This paper presents a direct method for joint 3D face reconstruction and dense alignment. It designs a 2D representation, called the UV position map, which stores the complete 3D shape of a face in UV space. A CNN can then be trained to regress the 3D shape from a single 2D image.
1. Introduction
Early work on 3D reconstruction and dense alignment relied on 2D keypoints, but 2D keypoints cannot handle large poses and occlusion;
3DMM-based methods are limited by the computational cost of perspective projection and 3D Thin Plate Spline (TPS) transformations;
End-to-end solutions remove these limitations, but they need an additional network to estimate depth and do not provide dense alignment;
VRN proposes a voxel representation, but it is computationally expensive, low in resolution, and, because the point cloud is sparse, spends much of its computation on empty voxels.
This paper proposes PRN, an end-to-end multi-task method that performs dense face alignment and 3D face shape reconstruction together. Main contributions:
- High-resolution 3D face reconstruction and dense alignment in an end-to-end manner;
- A UV position map that records the 3D position information of the face;
- A weight mask for the loss computation, so that each point contributes to the loss with a different weight, which significantly improves network performance;
- A lightweight CNN that reaches 100 FPS on a single-face task;
- A 25% relative performance improvement on the AFLW2000-3D and Florence datasets.
2. Method
The figures below come from slides I made earlier.

Figure 1: UV space
Open-source datasets such as 300W-LP do not provide ground truth in UV form, so the UV training data has to be generated first. The point cloud coordinates in (a) are converted into the representation in (d), as shown in Figure 2:

Once the training data has been generated, a lightweight CNN can be trained on it:

The network structure code is as follows :
```python
# Excerpt from the network body: x is the input image batch (256 x 256 x 3).
size = 16  # base channel count, matching the shape comments below

# Encoder
se = tcl.conv2d(x, num_outputs=size, kernel_size=4, stride=1)  # 256 x 256 x 16
se = resBlock(se, num_outputs=size * 2, kernel_size=4, stride=2)  # 128 x 128 x 32
se = resBlock(se, num_outputs=size * 2, kernel_size=4, stride=1)  # 128 x 128 x 32
se = resBlock(se, num_outputs=size * 4, kernel_size=4, stride=2)  # 64 x 64 x 64
se = resBlock(se, num_outputs=size * 4, kernel_size=4, stride=1)  # 64 x 64 x 64
se = resBlock(se, num_outputs=size * 8, kernel_size=4, stride=2)  # 32 x 32 x 128
se = resBlock(se, num_outputs=size * 8, kernel_size=4, stride=1)  # 32 x 32 x 128
se = resBlock(se, num_outputs=size * 16, kernel_size=4, stride=2)  # 16 x 16 x 256
se = resBlock(se, num_outputs=size * 16, kernel_size=4, stride=1)  # 16 x 16 x 256
se = resBlock(se, num_outputs=size * 32, kernel_size=4, stride=2)  # 8 x 8 x 512
se = resBlock(se, num_outputs=size * 32, kernel_size=4, stride=1)  # 8 x 8 x 512

# Decoder
pd = tcl.conv2d_transpose(se, size * 32, 4, stride=1)  # 8 x 8 x 512
pd = tcl.conv2d_transpose(pd, size * 16, 4, stride=2)  # 16 x 16 x 256
pd = tcl.conv2d_transpose(pd, size * 16, 4, stride=1)  # 16 x 16 x 256
pd = tcl.conv2d_transpose(pd, size * 16, 4, stride=1)  # 16 x 16 x 256
pd = tcl.conv2d_transpose(pd, size * 8, 4, stride=2)  # 32 x 32 x 128
pd = tcl.conv2d_transpose(pd, size * 8, 4, stride=1)  # 32 x 32 x 128
pd = tcl.conv2d_transpose(pd, size * 8, 4, stride=1)  # 32 x 32 x 128
pd = tcl.conv2d_transpose(pd, size * 4, 4, stride=2)  # 64 x 64 x 64
pd = tcl.conv2d_transpose(pd, size * 4, 4, stride=1)  # 64 x 64 x 64
pd = tcl.conv2d_transpose(pd, size * 4, 4, stride=1)  # 64 x 64 x 64
pd = tcl.conv2d_transpose(pd, size * 2, 4, stride=2)  # 128 x 128 x 32
pd = tcl.conv2d_transpose(pd, size * 2, 4, stride=1)  # 128 x 128 x 32
pd = tcl.conv2d_transpose(pd, size, 4, stride=2)  # 256 x 256 x 16
pd = tcl.conv2d_transpose(pd, size, 4, stride=1)  # 256 x 256 x 16
pd = tcl.conv2d_transpose(pd, 3, 4, stride=1)  # 256 x 256 x 3
pd = tcl.conv2d_transpose(pd, 3, 4, stride=1)  # 256 x 256 x 3
# The sigmoid keeps the predicted position map in [0, 1].
pos = tcl.conv2d_transpose(pd, 3, 4, stride=1, activation_fn=tf.nn.sigmoid)  # 256 x 256 x 3
```

The residual block code is as follows. The activation is ReLU and the normalization is batch normalization. The main branch consists of three convolutions; its output is added element-wise to the shortcut, and the sum is then normalized and activated:
```python
import tensorflow as tf
import tensorflow.contrib.layers as tcl  # TensorFlow 1.x only


def resBlock(x, num_outputs, kernel_size=4, stride=1, activation_fn=tf.nn.relu,
             normalizer_fn=tcl.batch_norm, scope=None):
    assert num_outputs % 2 == 0  # num_outputs must be divisible by channel_factor (2 here)
    with tf.variable_scope(scope, 'resBlock'):
        shortcut = x
        # Project the shortcut with a 1x1 convolution when the shape changes.
        if stride != 1 or x.get_shape()[3] != num_outputs:
            shortcut = tcl.conv2d(shortcut, num_outputs, kernel_size=1, stride=stride,
                                  activation_fn=None, normalizer_fn=None, scope='shortcut')
        # Main branch: 1x1 reduce, kxk convolution, 1x1 expand.
        # Integer division keeps the channel count an int under Python 3.
        x = tcl.conv2d(x, num_outputs // 2, kernel_size=1, stride=1, padding='SAME')
        x = tcl.conv2d(x, num_outputs // 2, kernel_size=kernel_size, stride=stride, padding='SAME')
        x = tcl.conv2d(x, num_outputs, kernel_size=1, stride=1, activation_fn=None,
                       padding='SAME', normalizer_fn=None)
        x += shortcut  # element-wise sum of main branch and shortcut
        x = normalizer_fn(x)
        x = activation_fn(x)
    return x
```

Loss

P(x,y) is the predicted UV position map; each pixel stores an xyz position.
W(x,y) is the weight mask over UV space, which controls how much each region contributes to the loss. The weight ratio is 2D keypoints : eyes, nose and mouth : rest of the face : everything else = 16:4:3:0.
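A minimal sketch of such a weighted loss, assuming the per-pixel error is the Euclidean distance between the predicted and ground-truth positions, multiplied by W(x,y) and summed over the map (switching to a squared error would be a one-line change). The function name and the toy 2 x 2 maps are illustrative only.

```python
import numpy as np


def weighted_position_loss(pred, target, weight):
    """Sum over UV pixels of ||pred - target|| * weight.

    pred, target: (H, W, 3) position maps; weight: (H, W) mask.
    """
    dist = np.linalg.norm(pred - target, axis=-1)  # per-pixel Euclidean distance
    return float(np.sum(dist * weight))


# Toy 2 x 2 map; the weights follow the 16:4:3:0 ratio from the text.
weight = np.array([[16.0, 4.0],
                   [3.0, 0.0]])
target = np.zeros((2, 2, 3))
pred = np.zeros((2, 2, 3))
pred[..., 0] = 1.0  # every pixel is off by 1 along x
loss = weighted_position_loss(pred, target, weight)
```

Because the last region has weight 0, errors there do not affect the loss at all, so the network is free to predict anything outside the face.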
Training
Training data: the 300W-LP dataset
- Contains face images across a wide range of poses, resized to 256x256
- Annotated with 3DMM coefficients
- The 3DMM is used to generate a 3D point cloud, which is then converted to UV space
Although the ground truth is generated from the annotated 3DMM coefficients, the model itself does not contain any linear constraint of the 3DMM.
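For context, the linear constraint mentioned here is that a 3DMM expresses every face as a mean shape plus a linear combination of basis vectors. A minimal sketch of that point cloud generation step, assuming separate identity and expression bases and leaving out the pose (rotation, scale, translation); all names and numbers are made up for illustration:

```python
import numpy as np


def generate_3dmm_vertices(mean_shape, id_basis, exp_basis, id_coeffs, exp_coeffs):
    """Linear 3DMM: mean shape plus identity and expression offsets.

    mean_shape: (3N,), id_basis: (3N, K_id), exp_basis: (3N, K_exp).
    Returns an (N, 3) array of vertex coordinates.
    """
    shape = mean_shape + id_basis @ id_coeffs + exp_basis @ exp_coeffs
    return shape.reshape(-1, 3)


# Toy model with 2 vertices, 2 identity components and 1 expression component.
mean_shape = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
id_basis = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0],
                     [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
exp_basis = np.array([[0.0], [0.0], [1.0], [0.0], [0.0], [-1.0]])
vertices = generate_3dmm_vertices(mean_shape, id_basis, exp_basis,
                                  np.array([2.0, 3.0]), np.array([0.5]))
```

PRN only uses shapes like this as regression targets; its output is a free-form position map, so predictions are not restricted to the span of the bases.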
Data augmentation, to cover a variety of scenes:
- Rotation: -45 ~ 45 degrees
- Scaling: factor 0.9 ~ 1.2 (relative to the original image size)
- Color channel scaling: factor 0.6 ~ 1.4
- Added noise and texture occlusion, to simulate real-world occlusion
Optimization: Adam optimizer, initial learning rate 0.0001, halved every 5 epochs, batch size 16.
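The training settings above can be sketched in a few lines. This is an illustration under assumptions, not the authors' training code: the 0.9 ~ 1.2 factor is applied as a scale factor, one color factor is drawn per RGB channel, and the function and key names are hypothetical.

```python
import random


def learning_rate(epoch, base_lr=1e-4, decay_every=5, decay_factor=0.5):
    """Step decay: halve the learning rate every `decay_every` epochs."""
    return base_lr * decay_factor ** (epoch // decay_every)


def sample_augmentation(rng):
    """Draw one set of augmentation parameters from the ranges in the text."""
    return {
        "angle_deg": rng.uniform(-45.0, 45.0),
        "scale": rng.uniform(0.9, 1.2),
        # One independent factor per RGB channel (an assumption).
        "color_scale": [rng.uniform(0.6, 1.4) for _ in range(3)],
    }


rng = random.Random(0)  # fixed seed so the draw is reproducible
params = sample_augmentation(rng)
lrs = [learning_rate(epoch) for epoch in (0, 4, 5, 10)]
```

With these defaults the learning rate stays at 1e-4 for epochs 0 to 4, drops to 5e-5 at epoch 5 and to 2.5e-5 at epoch 10.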
3. Test results
First, what this method can do. Because the network learns the mapping between 2D and 3D, it supports the following applications:

Some evaluation metrics from the paper:


