
Bayes' law

2022-07-07 08:09:00 Steven Devin

1. Probability theory

First, let's review some probability theory.

Joint probability: the probability that event A and event B occur simultaneously. The identity below is also called the product rule.

$$P(A,B) = P(A \cap B) = P(A|B)P(B) = P(B|A)P(A)$$
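As a quick sanity check (a toy example of my own, not from the original article), here is a short Python sketch that verifies the product rule on a single fair die:

```python
from fractions import Fraction

# Toy sample space: one fair six-sided die.
# A = "the roll is even", B = "the roll is greater than 3".
omega = range(1, 7)
A = {2, 4, 6}
B = {4, 5, 6}

def p(event):
    """Probability of an event under the uniform distribution on omega."""
    return Fraction(len(event), len(omega))

# Product rule: P(A ∩ B) = P(A|B)P(B) = P(B|A)P(A).
p_joint = p(A & B)                 # 1/3
p_a_given_b = p(A & B) / p(B)      # 2/3
p_b_given_a = p(A & B) / p(A)      # 2/3
assert p_joint == p_a_given_b * p(B) == p_b_given_a * p(A)
```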

Sum rule: the probability that event A or event B (or both) occurs.

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

If A and B are mutually exclusive:

$$P(A \cup B) = P(A) + P(B)$$
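A minimal sketch of the sum rule with the same kind of toy dice events (all made up for illustration):

```python
from fractions import Fraction

# Fair die: A = "even", B = "greater than 3", C = "rolled a 1".
omega = range(1, 7)
A, B, C = {2, 4, 6}, {4, 5, 6}, {1}

def p(event):
    return Fraction(len(event), len(omega))

# Sum rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B); both sides equal 2/3.
assert p(A | B) == p(A) + p(B) - p(A & B)

# A and C are mutually exclusive, so the intersection term vanishes.
assert p(A & C) == 0
assert p(A | C) == p(A) + p(C)
```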

Total probability: if event A can be brought about by any one of several mutually exclusive events $B_i$:
$$P(A) = \sum_{i=1}^{n} P(A|B_i)P(B_i)$$
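A small numeric illustration of the law of total probability (the three "factories" and all rates below are hypothetical): an item comes from one of three sources $B_1, B_2, B_3$, and A is the event that it is defective.

```python
# Law of total probability: P(A) = sum_i P(A|B_i) P(B_i).
# The B_i form a partition: every item comes from exactly one factory.
priors = {"B1": 0.5, "B2": 0.3, "B3": 0.2}          # P(B_i)
defect_rate = {"B1": 0.01, "B2": 0.02, "B3": 0.05}  # P(A|B_i)

p_defective = sum(defect_rate[b] * priors[b] for b in priors)
print(p_defective)  # ≈ 0.021
```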

Conditional probability: the probability that event A occurs given that event B has occurred.

$$P(A|B) = \frac{P(A,B)}{P(B)}$$
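Under a uniform distribution, this ratio reduces to counting outcomes inside B; a one-line check with the same hypothetical dice events as above:

```python
from fractions import Fraction

# P(A|B) = P(A,B) / P(B); for a uniform space this is |A ∩ B| / |B|.
A, B = {2, 4, 6}, {4, 5, 6}          # same events on a fair die
print(Fraction(len(A & B), len(B)))  # 2/3
```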

2. Bayes' law

In machine learning, given the observed training data B, we are often interested in finding the best hypothesis A in the hypothesis space.

The best hypothesis is the most probable hypothesis. That is, given training data B, we add up, over all the possible data $B_i$, the prior probability $P(B_i)$ weighted by $P(A|B_i)$.

By the definition above, the probability of hypothesis A is found as follows:
$$P(A) = \sum_{i=1}^{n} P(A|B_i)P(B_i)$$
Does that look familiar?

This is just the law of total probability: the occurrence of event A may be brought about by any of the data $B_1, B_2, \ldots, B_n$.

For given training data B, Bayes' theorem provides a more direct way to find the probability of hypothesis A.

Bayes' law uses:

  • the prior probability $P(A)$ of hypothesis A;
  • the prior probability $P(B)$ of the observed data B;
  • the probability $P(B|A)$ of observing data B given hypothesis A.

Together these give the probability $P(A|B)$ of hypothesis A given the observed data B, also known as the posterior probability, because it reflects the influence of the data B on the probability of hypothesis A.

In contrast to the posterior, the prior probability $P(A)$ is independent of B.

Bayes' formula:
$$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$$

The derivation of Bayes' formula is also very simple: it follows by combining the conditional probability and joint probability from the first section.

Conditional probability:
$$P(A|B) = \frac{P(A,B)}{P(B)}$$
Joint probability:
$$P(A,B) = P(B|A)P(A)$$
Substituting the joint probability into the numerator of the conditional probability yields Bayes' formula.
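To make the formula concrete, here is a sketch that inverts the hypothetical factory example from Section 1: given that an item is defective (the observed data), which factory (the hypothesis) most plausibly produced it? Note the role reversal relative to the article's notation: here the factories $B_i$ play the part of the hypotheses.

```python
# Posterior P(B_i | defective) = P(defective | B_i) P(B_i) / P(defective).
priors = {"B1": 0.5, "B2": 0.3, "B3": 0.2}
defect_rate = {"B1": 0.01, "B2": 0.02, "B3": 0.05}

p_defective = sum(defect_rate[b] * priors[b] for b in priors)  # ≈ 0.021
posterior = {b: defect_rate[b] * priors[b] / p_defective for b in priors}
print(posterior)  # B3 is the most likely source: ≈ 0.476
```

Even though B3 has the smallest prior share, its high defect rate makes it the most probable source once a defect is observed.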

3. Maximum a posteriori probability (MAP)

Sometimes, given data B, we want the most probable hypothesis A; this is called the maximum a posteriori (MAP) hypothesis.

$$A_{MAP} = \arg\max_{A} P(A|B)$$

That is:

$$= \arg\max_{A} \frac{P(B|A)P(A)}{P(B)}$$

We can drop $P(B)$ because it is independent of the hypothesis A:
$$= \arg\max_{A} P(B|A)P(A)$$
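A minimal MAP sketch (all numbers hypothetical): because $P(B)$ is the same for every hypothesis, ranking hypotheses by $P(B|A)P(A)$ picks the same winner as ranking by the full posterior.

```python
# MAP: pick the hypothesis maximizing P(B|A) P(A); P(B) is a constant
# factor across hypotheses and can be dropped from the argmax.
priors = {"A1": 0.7, "A2": 0.3}        # P(A), hypothetical
likelihood = {"A1": 0.1, "A2": 0.4}    # P(B|A), hypothetical

scores = {a: likelihood[a] * priors[a] for a in priors}
a_map = max(scores, key=scores.get)
print(a_map, scores)  # A2 {'A1': 0.07, 'A2': 0.12}
```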
