Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network
2022-07-27 09:56:00 【yfy2022yfy】
2019/10/31: I did not take notes when I first read this paper, so here is a recap.
Paper: http://openaccess.thecvf.com/content_ECCV_2018/papers/Yao_Feng_Joint_3D_Face_ECCV_2018_paper.pdf
GitHub: https://github.com/YadiraF/PRNet
Abstract
This paper presents a direct method for 3D face reconstruction and dense alignment. It designs a 2D representation, called the UV position map, that stores the complete 3D face shape in UV space. A CNN can then be trained to regress the 3D shape from a single 2D image.
1. Introduction
Early work on 3D reconstruction and dense alignment relied on 2D keypoints, but 2D keypoints cannot handle large poses and occlusion;
Methods based on 3DMM are limited by the perspective projection and 3D Thin Plate Spline (TPS) transformation they rely on, which are computationally expensive;
Some end-to-end solutions remove these restrictions, but they need additional networks to estimate depth and do not provide dense alignment;
VRN proposes a voxel representation, but it is computationally expensive, has low resolution, and, because point clouds are sparse, spends much of its computation on empty voxels.
This paper presents PRN, an end-to-end multi-task approach that performs dense face alignment and 3D face shape reconstruction together. Main contributions:
- End-to-end, high-resolution 3D face reconstruction and dense alignment;
- A UV position map that records the 3D position information of the face;
- A weight mask for the loss computation: each point contributes to the loss with a different weight, which significantly improves network performance;
- A lightweight CNN that can reach 100 FPS on a single face;
- A 25% performance improvement on the AFLW2000-3D and Florence datasets.
2. Method
I am reusing the slides I made earlier.

Figure 1: UV space
Open-source datasets such as 300W-LP do not provide ground truth in UV form, so the UV training data has to be generated first. The point cloud coordinates in (a) are converted into the representation shown in (d); the method is shown in Figure 2:

Once the required training data has been generated, a lightweight CNN can process it:

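The network structure code is as follows (TensorFlow 1.x, where tcl is tensorflow.contrib.layers, x is the input image tensor, and size is the base channel count, 16 according to the shape comments):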
se = tcl.conv2d(x, num_outputs=size, kernel_size=4, stride=1) # 256 x 256 x 16
se = resBlock(se, num_outputs=size * 2, kernel_size=4, stride=2) # 128 x 128 x 32
se = resBlock(se, num_outputs=size * 2, kernel_size=4, stride=1) # 128 x 128 x 32
se = resBlock(se, num_outputs=size * 4, kernel_size=4, stride=2) # 64 x 64 x 64
se = resBlock(se, num_outputs=size * 4, kernel_size=4, stride=1) # 64 x 64 x 64
se = resBlock(se, num_outputs=size * 8, kernel_size=4, stride=2) # 32 x 32 x 128
se = resBlock(se, num_outputs=size * 8, kernel_size=4, stride=1) # 32 x 32 x 128
se = resBlock(se, num_outputs=size * 16, kernel_size=4, stride=2) # 16 x 16 x 256
se = resBlock(se, num_outputs=size * 16, kernel_size=4, stride=1) # 16 x 16 x 256
se = resBlock(se, num_outputs=size * 32, kernel_size=4, stride=2) # 8 x 8 x 512
se = resBlock(se, num_outputs=size * 32, kernel_size=4, stride=1) # 8 x 8 x 512
pd = tcl.conv2d_transpose(se, size * 32, 4, stride=1) # 8 x 8 x 512
pd = tcl.conv2d_transpose(pd, size * 16, 4, stride=2) # 16 x 16 x 256
pd = tcl.conv2d_transpose(pd, size * 16, 4, stride=1) # 16 x 16 x 256
pd = tcl.conv2d_transpose(pd, size * 16, 4, stride=1) # 16 x 16 x 256
pd = tcl.conv2d_transpose(pd, size * 8, 4, stride=2) # 32 x 32 x 128
pd = tcl.conv2d_transpose(pd, size * 8, 4, stride=1) # 32 x 32 x 128
pd = tcl.conv2d_transpose(pd, size * 8, 4, stride=1) # 32 x 32 x 128
pd = tcl.conv2d_transpose(pd, size * 4, 4, stride=2) # 64 x 64 x 64
pd = tcl.conv2d_transpose(pd, size * 4, 4, stride=1) # 64 x 64 x 64
pd = tcl.conv2d_transpose(pd, size * 4, 4, stride=1) # 64 x 64 x 64
pd = tcl.conv2d_transpose(pd, size * 2, 4, stride=2) # 128 x 128 x 32
pd = tcl.conv2d_transpose(pd, size * 2, 4, stride=1) # 128 x 128 x 32
pd = tcl.conv2d_transpose(pd, size, 4, stride=2) # 256 x 256 x 16
pd = tcl.conv2d_transpose(pd, size, 4, stride=1) # 256 x 256 x 16
pd = tcl.conv2d_transpose(pd, 3, 4, stride=1) # 256 x 256 x 3
pd = tcl.conv2d_transpose(pd, 3, 4, stride=1) # 256 x 256 x 3
pos = tcl.conv2d_transpose(pd, 3, 4, stride=1, activation_fn=tf.nn.sigmoid)  # 256 x 256 x 3
The residual block code is as follows. The activation is ReLU and the normalization is batch normalization (BN). The main branch applies three convolutions, its output is added to the shortcut, and the sum is finally normalized and activated:
def resBlock(x, num_outputs, kernel_size=4, stride=1, activation_fn=tf.nn.relu, normalizer_fn=tcl.batch_norm, scope=None):
    assert num_outputs % 2 == 0  # num_outputs must be divisible by channel_factor (2 here)
    with tf.variable_scope(scope, 'resBlock'):
        shortcut = x
        # Project the shortcut with a 1x1 convolution when the shape changes.
        if stride != 1 or x.get_shape()[3] != num_outputs:
            shortcut = tcl.conv2d(shortcut, num_outputs, kernel_size=1, stride=stride,
                                  activation_fn=None, normalizer_fn=None, scope='shortcut')
        # Main branch: 1x1 reduce, k x k (optionally strided), 1x1 expand.
        # Integer division keeps the channel count an int under Python 3.
        x = tcl.conv2d(x, num_outputs // 2, kernel_size=1, stride=1, padding='SAME')
        x = tcl.conv2d(x, num_outputs // 2, kernel_size=kernel_size, stride=stride, padding='SAME')
        x = tcl.conv2d(x, num_outputs, kernel_size=1, stride=1, activation_fn=None, padding='SAME', normalizer_fn=None)
        x += shortcut  # element-wise sum of the two branches
        x = normalizer_fn(x)
        x = activation_fn(x)
    return x
Loss

P(x, y) is the predicted UV position map; each pixel stores an (x, y, z) position.
W(x, y) is the weight mask over UV space, which controls the weight of each region. The ratio is 2D keypoints : eyes, nose and mouth : rest of the face : everything else = 16 : 4 : 3 : 0.
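A minimal NumPy sketch of such a weighted loss follows. It assumes the per-pixel error is the squared distance between predicted and ground-truth positions, multiplied by W(x, y) and averaged over all pixels; the exact norm and reduction used in the released code may differ, and `weighted_position_loss` is a hypothetical name.

```python
import numpy as np

def weighted_position_loss(pred, target, weight):
    """Mean over pixels of W(x, y) * ||P(x, y) - P_gt(x, y)||^2."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    weight = np.asarray(weight, dtype=np.float64)
    squared_dist = np.sum((pred - target) ** 2, axis=-1)  # shape (H, W)
    return float(np.mean(weight * squared_dist))

# Toy 2 x 2 map: one pixel per region, weighted 16 : 4 : 3 : 0.
weight = np.array([[16.0, 4.0], [3.0, 0.0]])
target = np.zeros((2, 2, 3))
pred = np.ones((2, 2, 3))  # every pixel is off by (1, 1, 1)
loss = weighted_position_loss(pred, target, weight)
print(loss)
```

Because the last region has weight 0, errors there contribute nothing, so the network is never penalized for what it predicts outside the face.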
Training
Training data source: the 300W-LP dataset
- Contains face images at all angles, resized to 256x256
- Annotated with 3DMM coefficients
- The 3DMM is used to generate a 3D point cloud, which is then converted to UV space
Although the ground truth is generated from the 3DMM annotation coefficients, the model itself is not bound by any linear constraint of the 3DMM.
Data augmentation, to cover a variety of scenes:
- Rotation: -45 to 45 degrees
- Scaling: factor 0.9 to 1.2 (relative to the original image size)
- Color channel scaling: factor 0.6 to 1.4
- Added noise and texture occlusion, to simulate real-world occlusion
Optimization: Adam optimizer, initial learning rate 0.0001, halved every 5 epochs, batch size 16.
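The training settings above can be sketched as follows. This only illustrates the step decay and the sampling of augmentation parameters from the listed ranges; the function names are hypothetical, uniform sampling is an assumption, and drawing one color factor per RGB channel is also an assumption.

```python
import random

def learning_rate(epoch, initial_lr=1e-4, halve_every=5):
    """Step decay: halve the learning rate every `halve_every` epochs."""
    return initial_lr * 0.5 ** (epoch // halve_every)

def sample_augmentation(rng):
    """Draw one set of augmentation parameters from the ranges listed above."""
    return {
        "rotation_deg": rng.uniform(-45.0, 45.0),
        # The 0.9 to 1.2 coefficient, relative to the original image size.
        "size_factor": rng.uniform(0.9, 1.2),
        # Assumed: an independent factor for each of the three color channels.
        "color_scale": [rng.uniform(0.6, 1.4) for _ in range(3)],
    }

rng = random.Random(0)  # fixed seed so the draw is repeatable
params = sample_augmentation(rng)
schedule = [learning_rate(e) for e in (0, 4, 5, 10)]
print(schedule)
```

With these settings the learning rate stays at 0.0001 for epochs 0 to 4, drops to 0.00005 at epoch 5, and to 0.000025 at epoch 10.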
3. Test results
First, what this method can do. Because the network learns the mapping between 2D and 3D, it supports the following functions:

Some of the evaluation metrics reported in the paper:


