Overview of head pose estimation
2022-07-28 08:57:00 【@BangBang】
Head pose estimation (HPE) recovers the orientation of the head from a facial image. In 3D space, the rotation of an object can be expressed by three Euler angles: pitch (rotation about the X axis), yaw (rotation about the Y axis), and roll (rotation about the Z axis). In everyday terms these correspond to raising or lowering the head, shaking the head, and tilting the head. The original post illustrates the three angles with a schematic diagram.
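To make the three angles concrete, here is a minimal sketch that composes them into a single rotation matrix. The axis assignment (pitch about X, yaw about Y, roll about Z) comes from the text above; the composition order `R = Rz(roll) * Ry(yaw) * Rx(pitch)` and the function names are assumptions for illustration, since other conventions are also in common use.

```python
import math


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_to_rotation(pitch_deg, yaw_deg, roll_deg):
    """Build a rotation matrix from pitch (X), yaw (Y) and roll (Z) in degrees.

    Assumed composition order: R = Rz(roll) * Ry(yaw) * Rx(pitch).
    """
    p, y, r = (math.radians(a) for a in (pitch_deg, yaw_deg, roll_deg))
    rx = [[1.0, 0.0, 0.0],
          [0.0, math.cos(p), -math.sin(p)],
          [0.0, math.sin(p), math.cos(p)]]
    ry = [[math.cos(y), 0.0, math.sin(y)],
          [0.0, 1.0, 0.0],
          [-math.sin(y), 0.0, math.cos(y)]]
    rz = [[math.cos(r), -math.sin(r), 0.0],
          [math.sin(r), math.cos(r), 0.0],
          [0.0, 0.0, 1.0]]
    return matmul3(rz, matmul3(ry, rx))
```

With this convention, `euler_to_rotation(0, 90, 0)` is a pure yaw that maps the Z axis onto the X axis.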

Head pose estimation from 3D-to-2D mapping (conventional method)
If you are familiar with camera calibration, this approach is easy to follow, because the hard parts of head pose estimation have already been solved by experts. A classic head pose estimation algorithm generally has four steps:
(1) detect 2D facial keypoints; (2) match them to a 3D face model; (3) solve for the transformation between the 3D points and the corresponding 2D points; (4) compute the Euler angles from the rotation matrix.
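Step (4) can be sketched as follows. In OpenCV-based implementations, step (3) is commonly handled by `cv2.solvePnP`, whose rotation vector can be converted to a matrix with `cv2.Rodrigues`; the sketch below starts from that matrix. It assumes the composition order `R = Rz(roll) * Ry(yaw) * Rx(pitch)`, which is an illustrative choice rather than something the post specifies, and the function name is hypothetical.

```python
import math


def rotation_to_euler(R):
    """Return (pitch, yaw, roll) in degrees from a 3x3 rotation matrix.

    Assumes R = Rz(roll) * Ry(yaw) * Rx(pitch), with pitch about X,
    yaw about Y and roll about Z.
    """
    # Under the assumed convention, R[0][0] = cos(roll)cos(yaw) and
    # R[1][0] = sin(roll)cos(yaw), so their norm equals |cos(yaw)|.
    cos_yaw = math.hypot(R[0][0], R[1][0])
    yaw = math.atan2(-R[2][0], cos_yaw)
    if cos_yaw > 1e-6:
        pitch = math.atan2(R[2][1], R[2][2])
        roll = math.atan2(R[1][0], R[0][0])
    else:
        # Gimbal lock (yaw near +/-90 degrees): pitch and roll are coupled,
        # so roll is fixed at zero and the remaining rotation goes to pitch.
        pitch = math.atan2(-R[1][2], R[1][1])
        roll = 0.0
    return tuple(math.degrees(a) for a in (pitch, yaw, roll))
```

Near a yaw of 90 degrees the decomposition is not unique, which is one reason keypoint-based angle estimates become unstable for extreme poses.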
The pose of an object relative to the camera can be expressed by a rotation matrix and a translation matrix:
(1) Translation matrix (in practice a translation vector): the spatial position of the object relative to the camera, denoted T;
(2) Rotation matrix: the spatial orientation of the object relative to the camera, denoted R.
This necessarily involves transformations between coordinate systems, namely the world coordinate system (UVW), the camera coordinate system (XYZ), the image-center coordinate system (uv), and the pixel coordinate system (xy). The original post illustrates these with a figure.
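The chain of coordinate systems can be sketched with an ideal pinhole camera. This is a simplified illustration under stated assumptions: no lens distortion, a single focal length expressed in pixels, and hypothetical parameter names (`focal_px`, `principal_point`). The variable names follow the post's labels, with (u, v) for image-center coordinates and (x, y) for pixel coordinates.

```python
def project_point(point_world, R, T, focal_px, principal_point):
    """Project a 3D world point to pixel coordinates with a pinhole model.

    R is a 3x3 rotation matrix (nested lists), T a 3-element translation,
    focal_px the focal length in pixels, principal_point the (x, y) pixel
    position of the image center.
    """
    # World -> camera coordinates: X_cam = R * X_world + T
    xc, yc, zc = (
        sum(R[i][k] * point_world[k] for k in range(3)) + T[i]
        for i in range(3)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    # Camera -> image-center coordinates (perspective division)
    u = focal_px * xc / zc
    v = focal_px * yc / zc
    # Image-center -> pixel coordinates (shift by the principal point)
    x = u + principal_point[0]
    y = v + principal_point[1]
    return x, y
```

For example, with an identity rotation, `T = [0, 0, 1000]`, a focal length of 500 pixels and a principal point of (320, 240), the world point (100, -50, 0) lands at pixel (370, 215). Solving step (3) amounts to finding the R and T that make such projections of the 3D model points agree with the detected 2D keypoints.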


See the blog post "Head pose estimation principle and visualization". This approach works better on frontal or profile faces; its accuracy is insufficient when the yaw angle exceeds 150 degrees, and the result is affected by the quality of the detected keypoints.
Research progress of head pose estimation
The traditional way to compute head pose is to estimate a set of facial keypoints and then solve the 2D-to-3D correspondence problem with a mean head model. The authors of the paper below argue that this is a fragile approach, because it depends entirely on the performance of the keypoint detector, the head model, and the 3D-to-2D projection. They propose a more robust way to determine pose: train a multi-loss convolutional neural network on 300W-LP (a large synthetically expanded dataset) through joint pose classification and regression, and predict the intrinsic Euler angles (yaw, pitch, and roll) directly from the image.
Fine-Grained Head Pose Estimation Without Keypoints
The paper shows that estimating 3D head pose directly from the image with a convolutional neural network is more accurate than keypoint-based methods. Thanks to deep learning, keypoint detectors have improved significantly in recent years, but recovering head pose from keypoints is inherently a two-step process with many opportunities for error. There are two main problems:
- First, too few keypoints may be detected.
- Second, the accuracy of pose estimation depends on the quality of the 3D head model.
A generic head model may deviate from any particular individual, and adapting the head model to each person requires a lot of data and a large amount of computation.
In applications that need high-precision head pose estimation, a common solution is to use an RGBD (depth) camera. These can be very accurate, but they also come with several constraints:
- They are vulnerable to ambient light and to unconstrained pose changes.
- Depth cameras draw more power than RGB cameras, which can cause battery life problems.
- RGBD cameras produce more data, which increases data storage and conversion time.

WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose
Although many head pose estimation methods perform well, their networks are relatively large, which makes them unsuitable for mobile and embedded platforms.
Model



Training uses multiple losses.

A softmax predicts the probability of each bin, which gives the classification loss. The bin probabilities are then used to compute the expected value of yaw, pitch, and roll, and an MSE loss is computed between each expected angle and the ground-truth angle.
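The combined loss for a single angle can be sketched in plain Python. The bin layout (66 bins of 3 degrees covering -99 to +99 degrees), the use of bin centers for the expectation, and the weighting factor `alpha` are all assumptions for illustration; the post does not state the values the model uses.

```python
import math

NUM_BINS = 66       # assumed number of angle bins
BIN_WIDTH = 3.0     # assumed bin width in degrees
ANGLE_MIN = -99.0   # assumed lower edge of the first bin


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_angle(logits):
    """Expected angle in degrees: probability-weighted sum of bin centers."""
    probs = softmax(logits)
    return sum(p * (ANGLE_MIN + (i + 0.5) * BIN_WIDTH)
               for i, p in enumerate(probs))


def angle_loss(logits, true_angle_deg, alpha=1.0):
    """Cross-entropy on the true bin plus alpha times the squared error."""
    probs = softmax(logits)
    true_bin = int((true_angle_deg - ANGLE_MIN) // BIN_WIDTH)
    true_bin = min(max(true_bin, 0), NUM_BINS - 1)  # clamp out-of-range angles
    cross_entropy = -math.log(max(probs[true_bin], 1e-12))
    squared_error = (expected_angle(logits) - true_angle_deg) ** 2
    return cross_entropy + alpha * squared_error
```

The same loss would be computed separately for yaw, pitch, and roll. The classification term pushes probability mass toward the correct bin, while the regression term refines the estimate within and across bins.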

To make the network lighter, EfficientNet-B0 is used as the backbone instead of AlexNet, ResNet50, or similar networks. EfficientNet-B0 is the baseline model of the EfficientNet family and uses inverted residual blocks, which originate from MobileNetV2, so the number of network parameters is reduced. It runs at up to 60 FPS on CPU (or possibly on the 8155 platform).
Datasets


The model can also be pruned so that the number of parameters is reduced further.
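As a rough illustration of what pruning means, the sketch below applies unstructured magnitude pruning to a flat list of weights. The post does not say which pruning method is intended, so the technique, the function name, and the `sparsity` parameter are assumptions.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned
```

Zeroing individual weights only reduces the number of non-zero parameters. Shrinking the dense model itself usually requires structured pruning, which removes whole channels or layers.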