
1.17 Winter Vacation Daily Progress: the Frequentist and Bayesian Schools, and Maximum Likelihood Estimation

2022-06-26 08:47:00 Thick Cub with thorns

1.17 Winter vacation: a little progress every day

Fundamentals of machine learning

Frequentist school and Bayesian school

For a dataset $X = (x_1, x_2, \dots, x_N)^T$ with parameter $\theta$:

$x \sim p(x|\theta)$

Frequentist school (statistical machine learning)

$\theta$: an unknown constant; the data $x$ follow a probability distribution.

$$\theta_{MLE} = \arg\max_\theta \log P(X|\theta)$$

Learning then becomes an optimization problem with three ingredients (a code sketch follows the list):

  1. Model
  2. Loss function
  3. Algorithm that minimizes the loss
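
To make the three ingredients concrete, here is a minimal sketch in Python (NumPy/SciPy assumed; the synthetic dataset and the log-$\sigma$ parameterization are illustrative choices, not from the original notes): the model is a Gaussian, the loss is the negative log-likelihood, and a generic numerical optimizer plays the role of the algorithm.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=1000)  # toy dataset

# Loss: negative log-likelihood of N(mu, sigma^2); params = (mu, log_sigma)
def neg_log_likelihood(params, x):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)  # parameterize by log(sigma) so sigma stays positive
    return np.sum(0.5 * np.log(2 * np.pi * sigma**2)
                  + (x - mu)**2 / (2 * sigma**2))

# Algorithm: a generic optimizer minimizes the loss
result = minimize(neg_log_likelihood, x0=[0.0, 0.0], args=(data,))
mu_mle, sigma_mle = result.x[0], np.exp(result.x[1])
print(mu_mle, sigma_mle)  # close to the closed-form MLE: data.mean(), data.std()
```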

Bayesian school (probabilistic graphical models)

$\theta$ itself follows a probability distribution: $\theta \sim p(\theta)$ (the prior).

Derivation from Bayes' formula:
$$p(\theta|x) = \frac{p(x|\theta)\, p(\theta)}{p(x)}$$
$p(\theta)$: the prior probability; $p(\theta|x)$: the posterior probability.
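
A toy numeric illustration of the formula (the coin hypotheses and prior weights below are invented for demonstration, not from the original notes):

```python
# Two hypotheses for a coin's heads probability, updated after observing one head
prior = {0.5: 0.5, 0.8: 0.5}                 # p(theta): prior over theta
likelihood = {0.5: 0.5, 0.8: 0.8}            # p(x = heads | theta)
evidence = sum(likelihood[t] * prior[t] for t in prior)             # p(x)
posterior = {t: likelihood[t] * prior[t] / evidence for t in prior}
print(posterior)  # {0.5: ~0.385, 0.8: ~0.615} -- belief shifts toward theta = 0.8
```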

Bayesian estimation

$$p(\theta|x) = \frac{p(x|\theta)\, p(\theta)}{\int_\theta p(x|\theta)\, p(\theta)\, d\theta}$$
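
A minimal numeric sketch of this normalization (NumPy/SciPy assumed; the Gaussian prior, the known $\sigma = 1$, and the single observation are all illustrative assumptions):

```python
import numpy as np
from scipy.stats import norm

x_obs = 1.2                                # one observed data point
theta = np.linspace(-5, 5, 2001)           # grid over the parameter
dtheta = theta[1] - theta[0]
prior = norm.pdf(theta, loc=0, scale=2)    # assumed prior: theta ~ N(0, 2^2)
lik = norm.pdf(x_obs, loc=theta, scale=1)  # likelihood p(x | theta), sigma = 1 known

# Denominator: int p(x|theta) p(theta) dtheta, via a Riemann sum on the grid
evidence = (lik * prior).sum() * dtheta
posterior = lik * prior / evidence
print((posterior * dtheta).sum())          # ~1.0: the posterior integrates to one
```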

Bayesian prediction

$$p(\tilde{x}|X) = \int_\theta p(\tilde{x}|\theta)\, p(\theta|X)\, d\theta$$

The difficulty lies in computing this integral, which often has no closed form.
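
A common workaround is Monte Carlo: draw samples from the posterior and average the likelihood of the new point over them. A sketch under assumed posterior draws (in practice these would come from conjugacy or MCMC; the stand-in normal used here is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in posterior draws theta_s ~ p(theta | X); assumed N(1.0, 0.3^2) for illustration
theta_samples = rng.normal(loc=1.0, scale=0.3, size=10_000)

def gauss_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

# p(x_new | X) ~= (1/S) * sum_s p(x_new | theta_s)
x_new = 0.5
print(gauss_pdf(x_new, theta_samples).mean())
```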

Gaussian (normal) distribution

For a dataset $X = (x_1, x_2, \dots, x_N)^T$ with $x_i \sim \mathcal{N}(\mu, \sigma^2)$ i.i.d., find the maximum likelihood estimates of the parameters.

$$\theta_{MLE} = \arg\max_\theta \log P(X|\theta)$$

Solving for $\mu$ (an unbiased estimate)

$$\begin{aligned} \log P(X|\theta) &= \log \prod_{i=1}^N p(x_i|\theta) = \sum_{i=1}^N \log p(x_i|\theta) \\ \mu_{MLE} &= \arg\max_{\mu} \log P(X|\theta) = \arg\max_{\mu} \sum_{i=1}^N -\frac{(x_i-\mu)^2}{2\sigma^2} = \arg\min_{\mu} \sum_{i=1}^N (x_i-\mu)^2 \end{aligned}$$

Setting the partial derivative with respect to $\mu$ to zero:

$$\mu_{MLE} = \frac{1}{N}\sum_{i=1}^N x_i, \qquad E[\mu_{MLE}] = \frac{1}{N}\sum_{i=1}^N E[x_i] = \mu$$

so $\mu_{MLE}$ is unbiased.
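
A quick simulation to check the unbiasedness claim (NumPy assumed; the true parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true, sigma_true, N, trials = 3.0, 2.0, 10, 100_000
samples = rng.normal(mu_true, sigma_true, size=(trials, N))
mu_mle = samples.mean(axis=1)  # mu_MLE = (1/N) sum x_i, one estimate per trial
print(mu_mle.mean())           # ~3.0: E[mu_MLE] = mu, so the estimator is unbiased
```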

Solving for $\sigma^2$ (a biased estimate)

$$\sigma^2_{MLE} = \arg\max_{\sigma} \log P(X|\theta) = \arg\max_{\sigma} \sum_{i=1}^N \left( -\log \sigma - \frac{(x_i-\mu)^2}{2\sigma^2} \right)$$

Setting the partial derivative with respect to $\sigma$ to zero:

$$\sum_{i=1}^N \left( -\frac{1}{\sigma} + \frac{(x_i-\mu)^2}{\sigma^3} \right) = 0 \quad\Rightarrow\quad \sigma^2_{MLE} = \frac{1}{N}\sum_{i=1}^N (x_i - \mu_{MLE})^2$$

Taking expectations gives $E[\sigma^2_{MLE}] = \frac{N-1}{N}\sigma^2 \ne \sigma^2$, so the estimate is biased; multiplying by $\frac{N}{N-1}$ yields the unbiased version.
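
The bias can likewise be checked by simulation (same assumed setup as above; with $\sigma^2 = 4$ and $N = 10$, the average MLE should land near $\frac{N-1}{N} \cdot 4 = 3.6$):

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true, sigma_true, N, trials = 0.0, 2.0, 10, 100_000
samples = rng.normal(mu_true, sigma_true, size=(trials, N))
mu_mle = samples.mean(axis=1, keepdims=True)
var_mle = ((samples - mu_mle)**2).mean(axis=1)  # sigma^2_MLE uses the 1/N factor
print(var_mle.mean())            # ~3.6 = (N-1)/N * sigma^2: biased low
print(var_mle.mean() * N/(N-1))  # rescaling recovers ~4.0 (the unbiased estimate)
```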


Copyright notice
This article was written by [Thick Cub with thorns]; please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202170554155314.html