2D human pose estimation with residual log likelihood estimation (RLE) [link only]
2022-06-10 15:49:00 【light169】
【References】 Focus on Chapter 4
- [ICCV 2021 Oral] Learning the underlying error distribution: Human Pose Regression with Residual Log-likelihood Estimation (RLE), paper notes - Zhihu; RLE recasts the glory of regression methods: where do regression and heatmap approaches agree and differ? | Pose estimation, ICCV 2021 reading notes - Zhihu
- Understanding RLE (Residual Log-likelihood Estimation) from scratch | Pose estimation, ICCV 2021 Oral - Zhihu
- Flow-based generative models - Zhihu (notes on Li Hongyi's lecture)
Treat the errors made during training as samples, and use maximum likelihood estimation (MLE) together with a flow-based generative model to learn the underlying error distribution.
This is an ICCV 2021 Oral paper that I ran into while following recent work on pose estimation. Its core idea is the sentence quoted above: although the paper targets human pose estimation, the idea of learning the error distribution can be extended to almost any regression task.
1. Starting from the Gaussian heatmap
As is well known, pose estimation is split into two camps, coordinate regression and heatmap regression, and I started out on the heatmap side. The heatmap used in heatmap regression has always been a hand-designed two-dimensional isotropic Gaussian. For example, suppose the output heatmap has a resolution of 64×64 and $\sigma = 2$; then the value at location $(x, y)$ on the heatmap is

$$H(x, y) = \exp\left(-\frac{(x - \mu_x)^2 + (y - \mu_y)^2}{2\sigma^2}\right), \tag{1}$$

where $(\mu_x, \mu_y)$ is the ground-truth coordinate of the keypoint in the input image.
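To make the formula concrete, here is a minimal NumPy sketch of rendering such a heatmap. The function name and the example keypoint location (30, 20) are my own choices for illustration, not from the paper:

```python
import numpy as np

def gaussian_heatmap(size=64, center=(30.0, 20.0), sigma=2.0):
    """Render H[y, x] = exp(-((x - mu_x)^2 + (y - mu_y)^2) / (2 * sigma^2))."""
    coords = np.arange(size, dtype=np.float64)
    x, y = np.meshgrid(coords, coords)  # x varies along columns, y along rows
    mu_x, mu_y = center
    return np.exp(-((x - mu_x) ** 2 + (y - mu_y) ** 2) / (2.0 * sigma ** 2))

H = gaussian_heatmap()
# The peak (value 1.0) sits exactly at the ground-truth pixel.
assert H[20, 30] == 1.0
```

The target a heatmap network regresses against is just this rendered array, one per keypoint.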
But why a Gaussian distribution? I didn't understand the rationale behind the Gaussian heatmap until I read this paper.
2. The distribution of the coordinate error
It is easier to start from the loss function of coordinate regression. Coordinate regression directly predicts the keypoint coordinates $\hat{\mu}$ (some methods also predict $\hat{\sigma}$, the standard deviation of the error distribution), and the MSE (mean squared error) loss is:

$$\mathcal{L}_{\text{MSE}} = \|\hat{\mu} - \mu_g\|_2^2. \tag{2}$$

Here we focus on a single keypoint, say the left shoulder, with ground truth $\mu_g$. This loss is very intuitive: it pushes the predicted coordinate toward the ground-truth coordinate, and it can be derived from maximum likelihood estimation.
2.1 Maximum likelihood estimation
[Maximum likelihood estimation (MLE)] Let the samples follow a normal distribution; the likelihood is the product of the densities of the observed samples. Here we assume the error $\varepsilon = \mu_g - \hat{\mu}$ follows a Gaussian distribution with mean 0 and variance $\hat{\sigma}^2$:

$$\varepsilon \sim \mathcal{N}(0, \hat{\sigma}^2). \tag{3}$$

As for why this assumption is reasonable, see the last part of the answer below, on the central limit theorem.
What is the essence of the least squares method? - Ma's answer - Zhihu
The difference from plain least squares is that here we also predict $\hat{\sigma}$: for different keypoints / human bodies, the variance of the error distribution is different. Maximizing the likelihood of $\varepsilon$ is then equivalent to minimizing the negative log-likelihood

$$\mathcal{L} = \log\hat{\sigma} + \frac{\|\mu_g - \hat{\mu}\|_2^2}{2\hat{\sigma}^2} + \text{const}. \tag{4}$$
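A tiny numerical sketch of this negative log-likelihood (per coordinate, constant dropped); the function name and test values are mine, for illustration only:

```python
import numpy as np

def gaussian_nll(mu_hat, sigma_hat, mu_gt):
    """Negative log-likelihood of mu_gt under N(mu_hat, sigma_hat^2),
    dropping the constant 0.5 * log(2 * pi)."""
    return np.log(sigma_hat) + (mu_gt - mu_hat) ** 2 / (2.0 * sigma_hat ** 2)

# With sigma fixed at 1, the NLL is just half the squared error (plus a constant),
# i.e. it reduces to the MSE loss of equation (2):
assert np.isclose(gaussian_nll(0.3, 1.0, 0.5), 0.5 * (0.5 - 0.3) ** 2)
```

Note the trade-off the log term creates: predicting a large $\hat{\sigma}$ discounts the squared error but pays a $\log\hat{\sigma}$ penalty, so the network is pushed to report an honest uncertainty.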

Output density
Viewed from another angle (the perspective of the original paper), we can say that coordinate regression predicts a Gaussian distribution with mean $\hat{\mu}$ and variance $\hat{\sigma}^2$; that is, the probability that the true keypoint lies at some location $x$ follows

$$P_\Theta(x \mid \mathcal{I}) = \frac{1}{2\pi\hat{\sigma}^2}\exp\left(-\frac{\|x - \hat{\mu}\|_2^2}{2\hat{\sigma}^2}\right). \tag{5}$$

This is where we can explain why heatmap regression adopts a Gaussian heatmap. The sample we actually observe is $\mu_g$, so by maximum likelihood estimation we need to maximize $P_\Theta(\mu_g \mid \mathcal{I})$. If we further assume that $\hat{\sigma}$ is constant, then (5) is, up to normalization, exactly the heatmap formula (1).
2.2 Is it really a Gaussian distribution? (The motivation of the paper)
Other works cited in the paper point out that assuming a Laplacian distribution, which corresponds to an L1 loss, performs better. A distributional assumption closer to the true error distribution should yield better performance. The authors therefore propose using a flow-based generative model to learn the underlying error distribution.
3. The core idea
3.1 Flow-based generative model

(Figure: flow-based generative model; source: Li Hongyi's lecture slides)

First, a brief introduction to the flow-based generative model (hereinafter "flow"). The goal is to train a generator $G$ that transforms samples $z$ from a simple distribution $\pi(z)$ into samples of a complex distribution $p_G(x)$:

$$x = G(z), \qquad z \sim \pi(z).$$

For the complex distribution, what we observe are samples $\{x^1, \dots, x^m\}$ (for example, real anime faces in an anime-avatar generation task), so by maximum likelihood estimation we need to maximize the probability of these samples:

$$G^* = \arg\max_G \sum_{i=1}^{m} \log p_G(x^i).$$

By the change-of-variables formula, the density of $x$ under the generator is

$$p_G(x) = \pi\big(G^{-1}(x)\big)\,\left|\det \frac{\partial G^{-1}(x)}{\partial x}\right|.$$

There is no need to dig into why just yet; it is enough to know that $G$ is designed so that its inverse (and the Jacobian determinant) is easy to compute. The whole training process maps each observed sample of the complex distribution back through $G^{-1}$ and maximizes its probability under the formula above.
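The change-of-variables formula can be sanity-checked with the simplest possible "flow", a one-dimensional affine map. This toy example is mine, not from the paper:

```python
import numpy as np

# A one-dimensional "flow": x = G(z) = a * z + b, with z ~ N(0, 1).
a, b = 2.0, 1.0

def standard_normal(z):
    return np.exp(-z ** 2 / 2.0) / np.sqrt(2.0 * np.pi)

def density_via_flow(x):
    """p_G(x) = pi(G^{-1}(x)) * |det dG^{-1}/dx|.
    Here G^{-1}(x) = (x - b) / a, so the Jacobian is 1/a."""
    z = (x - b) / a
    return standard_normal(z) * abs(1.0 / a)

# An affine push-forward of N(0, 1) is N(b, a^2); verify against the closed form.
x = 2.5
expected = np.exp(-(x - b) ** 2 / (2.0 * a ** 2)) / (a * np.sqrt(2.0 * np.pi))
assert np.isclose(density_via_flow(x), expected)
```

A real flow stacks many such invertible layers (with nonlinear couplings), but the bookkeeping of inverses and Jacobian determinants is exactly this.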
The next question is how to use a flow to convert a simple distribution into a complex one in our setting: what plays the role of the variable (the coordinate $\mu_g$ itself, the error, or something else)? This leads to the three designs in the original paper.
3.2 Basic Design
Directly learn a flow that transforms the Gaussian distribution with predicted mean $\hat{\mu}$ and variance $\hat{\sigma}^2$ into the underlying distribution of the keypoint coordinate, taking each ground truth $\mu_g$ as the observed sample during training.

(Figure: basic design)

What's wrong with this design? The original paper says:
"Therefore, φ will learn to fit the distribution of $\mu_g$ across all images. Nevertheless, the distribution that we want to learn is about how the output deviates from the ground truth conditioning on the input image, not the distribution of the ground truth itself across all images."
My take on why this design is poor: the simple (base) distribution is different for almost every training sample, and the complex distribution of each keypoint of each human body is also different. A given complex distribution gets exactly one sample to train the mapping from its base distribution to itself, and the next iteration switches to a different base distribution, so the training samples are scattered.
3.3 Reparameterization
Learn a flow that transforms the standard normal distribution into the underlying distribution of the normalized error $\bar{x} = (x - \hat{\mu})/\hat{\sigma}$. Via the reparameterization trick $x = \hat{\mu} + \hat{\sigma}\cdot\bar{x}$, this achieves the goal of the basic design while avoiding its scattered-samples problem.

(Figure: direct likelihood estimation with reparameterization)

This design effectively takes all the errors produced during training as samples. But at the beginning, the coordinate prediction error is large and the flow has only just started training, so the flow may learn a wrong error distribution, while the regressor in turn uses that wrong error distribution (as its loss function) for guidance; the two cannot bootstrap each other.
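A numerical sketch of this design's loss (my own simplification: a fixed stand-in density takes the place of the flow-learned one):

```python
import numpy as np

def reparam_nll(mu_hat, sigma_hat, mu_gt, log_p_phi):
    """Negative log-likelihood under x = mu_hat + sigma_hat * z, z ~ p_phi.
    Change of variables gives P(x) = p_phi((x - mu_hat) / sigma_hat) / sigma_hat."""
    z_bar = (mu_gt - mu_hat) / sigma_hat
    return -(log_p_phi(z_bar) - np.log(sigma_hat))

# Stand-in for the flow-learned density: if p_phi were the standard normal,
# the loss reduces to the usual Gaussian NLL of section 2.1.
def log_std_normal(z):
    return -0.5 * z ** 2 - 0.5 * np.log(2.0 * np.pi)

loss = reparam_nll(0.3, 0.5, 0.7, log_std_normal)
expected = 0.5 * ((0.7 - 0.3) / 0.5) ** 2 + 0.5 * np.log(2.0 * np.pi) + np.log(0.5)
assert np.isclose(loss, expected)
```

In the real design, `log_p_phi` is the flow's tractable log-density, trained jointly with the regressor; that joint training is precisely where the cold-start problem described above comes from.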
3.4 Residual Log-likelihood Estimation
Learn a flow that transforms the standard normal distribution into the quotient (the "residual") of the underlying error distribution over a simple prior distribution. This mainly solves the cold-start problem of the previous design (I think that is a fair way to put it?).

(Figure: residual log-likelihood estimation with reparameterization)

From the loss-function point of view, the learned density is factored as $P_{\Theta,\phi}(\bar{x}) = \frac{1}{s}\,Q(\bar{x})\,G_\phi(\bar{x})$, which effectively adds a hand-designed loss term $-\log Q$:

$$\mathcal{L}_{\text{rle}} = -\log Q(\bar{\mu}_g) - \log G_\phi(\bar{\mu}_g) + \log s + \log\hat{\sigma}, \qquad \bar{\mu}_g = \frac{\mu_g - \hat{\mu}}{\hat{\sigma}},$$

where $Q$ is the standard normal distribution (a Laplacian can also be used) and $s$ is a normalizing constant.
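A sketch of the residual loss above, with $Q$ taken as the standard normal and `log_G_phi` a placeholder for the flow-learned residual term; this is an illustration, not the paper's implementation:

```python
import numpy as np

def rle_loss(mu_hat, sigma_hat, mu_gt, log_G_phi, log_s=0.0):
    """L = -log Q(z) - log G_phi(z) + log s + log sigma_hat,
    with z = (mu_gt - mu_hat) / sigma_hat and Q the standard normal prior."""
    z = (mu_gt - mu_hat) / sigma_hat
    log_Q = -0.5 * z ** 2 - 0.5 * np.log(2.0 * np.pi)
    return -log_Q - log_G_phi(z) + log_s + np.log(sigma_hat)

# With G_phi == 1 everywhere (log G_phi == 0) the loss degenerates to the plain
# Gaussian NLL of section 2.1: even an untrained flow leaves a sensible loss,
# which is how the residual design avoids the cold-start problem.
loss = rle_loss(0.2, 0.5, 0.6, log_G_phi=lambda z: 0.0)
expected = 0.5 * 0.8 ** 2 + 0.5 * np.log(2.0 * np.pi) + np.log(0.5)
assert np.isclose(loss, expected)
```

The $-\log Q$ term behaves like the familiar hand-designed Gaussian (or Laplacian) loss, while $-\log G_\phi$ learns only the correction on top of it.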
4. References:
- Li Hongyi's course: Flow-based Generative Model
- Flow-based generative models - Zhihu (text version)
- Li Hongyi - Flow-based Generative Model - Bilibili
- Paper: https://arxiv.org/pdf/2107.11291.pdf
- Code: https://github.com/Jeff-sjtu/res-loglikelihood-regression