
Maximum Likelihood Estimation

2022-06-09 20:52:00 Eager to learn

Concept

Maximum likelihood estimation (MLE) is a commonly used parameter estimation method in statistics. Its key idea is to use the known sample results to find the parameter values that would most probably have produced those samples.

That is, first assume the data follows some probability distribution whose parameters are unknown, then estimate those parameters from the training samples.

The classical approach in maximum likelihood estimation is to estimate the distribution's parameters from sampled data, where the sampling must satisfy an important assumption: all samples are independent and identically distributed (i.i.d.).

The likelihood function $p(x \mid \theta)$ describes the probability of observing the sample point $x$ under different settings of the probability model's parameters, where $x$ denotes the sample data and $\theta$ denotes the parameters of the assumed distribution.

What maximum likelihood estimation does is: based on the observed samples, find the model parameters $\theta$ that maximize the value of the likelihood function.
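This idea can be sketched numerically. The following is a minimal illustration, not from the original article: it assumes a handful of hypothetical Bernoulli (coin-flip) observations and finds the parameter $\theta$ that maximizes the log-likelihood by a simple grid search.

```python
import math

# Hypothetical sample data: 1 = heads, 0 = tails (7 heads out of 10 flips)
samples = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]

def log_likelihood(theta, data):
    # log p(data | theta) for i.i.d. Bernoulli samples
    return sum(math.log(theta if x == 1 else 1 - theta) for x in data)

# Grid search over candidate parameter values in (0, 1)
candidates = [i / 1000 for i in range(1, 1000)]
theta_hat = max(candidates, key=lambda t: log_likelihood(t, samples))
print(theta_hat)  # 0.7, the sample mean, as MLE theory predicts
```

For the Bernoulli distribution the maximum of the likelihood happens to sit exactly at the sample mean, which the grid search recovers.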

Example

For example: given a set of samples $x_1, x_2, \ldots, x_n$, suppose they follow a Gaussian distribution $N(\mu, \sigma^2)$; estimate the parameters $\mu, \sigma$ by maximum likelihood.

Explanation:

First, the Gaussian probability density function is used as the probability model for each sample $x_i$:

$$p(x_i \mid \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)$$

Since the samples are i.i.d., the likelihood function $L(x)$ is the product

$$L(x) = \prod_{i=1}^{n} p(x_i \mid \mu, \sigma)$$

Making a maximum likelihood estimate of $\mu, \sigma$ means, intuitively, searching over all possible values of $\mu, \sigma$ for the pair that makes this set of data most likely to appear.

Further, to avoid numerical underflow caused by multiplying many small probabilities, take the logarithm of the likelihood and simplify:

$$\ell(x) = \ln L(x) = -\frac{n}{2}\ln(2\pi) - n\ln\sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$$

The objective is now to find the values of $\mu, \sigma$ that maximize $\ell(x)$; these are the values at which the partial derivatives of $\ell(x)$ with respect to $\mu$ and $\sigma$ equal 0. Solving gives:

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{\mu})^2$$
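The closed-form estimates can be checked on synthetic data. This sketch (with assumed parameters $\mu = 2.0$, $\sigma = 1.5$, not from the original article) draws samples from a known Gaussian and recovers the parameters with the MLE formulas:

```python
import math
import random

random.seed(0)
# Hypothetical data drawn from N(mu=2.0, sigma=1.5)
xs = [random.gauss(2.0, 1.5) for _ in range(10000)]
n = len(xs)

# Closed-form maximum likelihood estimates:
mu_hat = sum(xs) / n                                 # sample mean
sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n  # biased variance (denominator n)

print(round(mu_hat, 2), round(math.sqrt(sigma2_hat), 2))
```

With 10,000 samples the estimates land close to the true $\mu = 2.0$ and $\sigma = 1.5$, as expected from the consistency of the MLE.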
When finding the extrema of a multivariate function, a point where every partial derivative equals zero is a critical point; the second derivatives then determine whether it is a maximum.

For example, suppose $(x_0, y_0)$ is a critical point of $f(x, y)$, i.e. $f_x(x_0, y_0) = 0$ and $f_y(x_0, y_0) = 0$, and the second derivatives are $f_{xx}, f_{xy}, f_{yy}$. Then $(x_0, y_0)$ is a local maximum when they satisfy

$$f_{xx}f_{yy} - f_{xy}^2 > 0 \quad \text{and} \quad f_{xx} < 0$$
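The second-derivative test can be verified numerically. This sketch uses a hypothetical function $f(x, y) = -(x-1)^2 - (y+2)^2$ (chosen for illustration; it has an obvious maximum at $(1, -2)$) and approximates the second partials by central differences:

```python
def f(x, y):
    # Hypothetical function with a maximum at (1, -2)
    return -(x - 1) ** 2 - (y + 2) ** 2

h = 1e-4
x0, y0 = 1.0, -2.0

# Central-difference approximations of the second partial derivatives
fxx = (f(x0 + h, y0) - 2 * f(x0, y0) + f(x0 - h, y0)) / h**2
fyy = (f(x0, y0 + h) - 2 * f(x0, y0) + f(x0, y0 - h)) / h**2
fxy = (f(x0 + h, y0 + h) - f(x0 + h, y0 - h)
       - f(x0 - h, y0 + h) + f(x0 - h, y0 - h)) / (4 * h**2)

D = fxx * fyy - fxy ** 2
print(D > 0 and fxx < 0)  # True -> (1, -2) is a local maximum
```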

According to the calculated results, the mean of the normal distribution obtained by maximum likelihood estimation is the sample mean, and the variance is the biased sample variance (with denominator $n$ rather than $n-1$). This is an intuitive result.
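The distinction between the two denominators is easy to see on a small example. This sketch (with made-up data) contrasts the MLE variance (denominator $n$) with the unbiased sample variance (denominator $n-1$):

```python
# Hypothetical small sample
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
n = len(xs)
mean = sum(xs) / n  # 3.0

ss = sum((x - mean) ** 2 for x in xs)  # sum of squared deviations = 10.0
var_mle = ss / n             # denominator n   -> 2.0 (what MLE yields)
var_unbiased = ss / (n - 1)  # denominator n-1 -> 2.5

print(var_mle, var_unbiased)  # 2.0 2.5
```

The MLE variance is slightly smaller; the difference vanishes as $n$ grows.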

It should be noted that although this parametric method makes probability estimation relatively simple, the accuracy of the result depends heavily on whether the assumed distribution matches the underlying true data distribution. In practice, making an assumption that closely approximates the true distribution often requires drawing on experience and knowledge of the application task itself; assuming a distributional form by guesswork alone is likely to produce misleading results.

References:
https://zhuanlan.zhihu.com/p/26614750
https://blog.csdn.net/u011508640/article/details/72815981
Zhou Zhihua, *Machine Learning* (the "watermelon book"), p. 149
