Naive Bayes: Notes on Li Hang's "Statistical Learning Methods"

2022-08-02 09:33:00 【timerring】

Chapter 4: The Naive Bayes Method

"Naive" refers to the strong assumption underlying the whole algorithm: the features are assumed to be conditionally independent of one another given the class.

Example

Take three beans: one red bean and two green (mung) beans. A passer-by and I each draw one. The passer-by finds that his bean is green and offers to let me trade mine for the remaining bean. Should I swap? Is the probability of ending up with the red bean the same whether or not I swap?

$P(A \mid B)$ denotes the probability that $A$ occurs given that $B$ has occurred.

$P(A \mid B)=\frac{P(A B)}{P(B)}=\frac{P(B \mid A) P(A)}{P(B)}$

Let $A$ denote the event that I drew the red bean, and $B$ the event that the passer-by drew a green bean.

$P(A \mid B)=\frac{P(B \mid A) P(A)}{P(B)}=\frac{1 \cdot \frac{1}{3}}{1}=\frac{1}{3}$

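Note a common misunderstanding here: the calculation takes $B$ as given, that is, the passer-by is certain to show a green bean, so $P(B) = 1$ rather than $\frac{2}{3}$. If the passer-by had instead drawn blindly, $P(B)$ would be $\frac{2}{3}$, and with $P(B \mid A)=1$ and $P(A)=\frac{1}{3}$ the result would be $\frac{1}{2}$.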

Conclusion: if you want the red bean, it is better to swap; if you want a green bean, it is better not to swap.
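The conclusion can be checked by exact enumeration. The sketch below assumes one red and two green beans and that the passer-by always ends up showing a green bean (the assumption behind $P(B)=1$); the function name `red_bean_odds` is made up for illustration.

```python
from fractions import Fraction
from itertools import permutations


def red_bean_odds():
    """Enumerate the equally likely orders of one red and two green beans.

    Position 0 is my bean. ASSUMPTION: the passer-by always shows a green
    bean taken from the other two positions, which is what P(B) = 1 encodes.
    """
    arrangements = set(permutations(["red", "green", "green"]))  # 3 distinct orders
    weight = Fraction(1, len(arrangements))
    stay_wins = Fraction(0)
    swap_wins = Fraction(0)
    for beans in arrangements:
        mine = beans[0]
        others = list(beans[1:])
        others.remove("green")   # the passer-by shows a green bean
        remaining = others[0]    # the bean I would hold after swapping
        if mine == "red":
            stay_wins += weight
        if remaining == "red":
            swap_wins += weight
    return stay_wins, swap_wins


stay, swap = red_bean_odds()
print(stay, swap)  # 1/3 2/3
```

Keeping my bean wins the red bean with probability $\frac{1}{3}$, matching $P(A \mid B)$ above, so swapping wins with probability $\frac{2}{3}$.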

Suppose there is a handwritten-digit dataset with 100 records: records 0-9 are ten handwritten 0s, records 10-19 are ten handwritten 1s, and so on up to records 90-99, which are ten handwritten 9s. Given a newly written digit $X$, how can we tell which digit it is?

How naive Bayes works:

$P(Y=0 \mid X)=?, P(Y=1 \mid X)=?, \cdots, P(Y=9 \mid X)=?$

Find the class with the highest probability; that class is the predicted digit.

Expressed mathematically:

For the handwritten-digit dataset above, let $C_{k}$ denote the digit classes, with $C_{0}$ denoting the digit $0$, and so on. The discriminant above can then be written as $P\left(Y=C_{k} \mid X=x\right)$.

$\begin{aligned} P\left(Y=C_{\mathrm{k}} \mid X=x\right)=& \frac{P\left(X=x \mid Y=C_{k}\right) P\left(Y=C_{k}\right)}{P(X=x)} \\ =& \frac{P\left(X=x \mid Y=C_{k}\right) P\left(Y=C_{k}\right)}{\sum_{k} P\left(X=x, Y=C_{k}\right)} \\ =& \frac{P\left(X=x \mid Y=C_{k}\right) P\left(Y=C_{k}\right)}{\sum_{k} P\left(X=x \mid Y=C_{k}\right) P\left(Y=C_{k}\right)} \end{aligned}$

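Since each image consists of 8x8 pixels, it can be viewed as a one-dimensional array of 64 values. Under the conditional independence assumption, the sample $X$ can be split into its components: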

$\begin{aligned} \mathrm{P}\left(X=x \mid Y=C_{k}\right) &=P\left(X^{(1)}=x^{(1)} \mid Y=C_{k}\right) P\left(X^{(2)}=x^{(2)} \mid Y=C_{k}\right) \cdots P\left(X^{(j)}=x^{(j)} \mid Y=C_{k}\right) \\ &=\prod_{j} P\left(X^{(j)}=x^{(j)} \mid Y=C_{k}\right) \end{aligned}$
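The factorization above can be sketched in a few lines of Python. This is only an illustration: it assumes binarized pixels (each $x^{(j)}$ is 0 or 1), uses a 4-pixel vector in place of the 64 pixels, and the function name and probability values are hypothetical.

```python
def class_conditional_likelihood(x, pixel_on_prob):
    """Return P(X = x | Y = C_k) as a product over features.

    x             : sequence of 0/1 pixel values (ASSUMED binarized)
    pixel_on_prob : pixel_on_prob[j] = P(X^(j) = 1 | Y = C_k); made-up values
    """
    likelihood = 1.0
    for value, p_on in zip(x, pixel_on_prob):
        # Each factor is P(X^(j) = x^(j) | Y = C_k) for a single pixel.
        likelihood *= p_on if value == 1 else (1.0 - p_on)
    return likelihood


# A tiny 4-pixel "image" stands in for the 64 pixels of an 8x8 digit.
x = [1, 0, 1, 1]
pixel_on_prob = [0.5, 0.25, 0.5, 0.75]
likelihood = class_conditional_likelihood(x, pixel_on_prob)
print(likelihood)
```

With 64 factors the product can become very small, so practical implementations usually sum logarithms of the factors instead of multiplying them.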

Therefore the posterior above simplifies to:

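$P\left(Y=C_{k} \mid X=x\right)=\frac{P\left(Y=C_{k}\right) \prod_{j} P\left(X^{(j)}=x^{(j)} \mid Y=C_{k}\right)}{\sum_{k} P\left(Y=C_{k}\right) \prod_{j} P\left(X^{(j)}=x^{(j)} \mid Y=C_{k}\right)}$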

$\begin{aligned} f(x) &=\underset{C_{k}}{\operatorname{argmax}}\, P\left(Y=C_{k} \mid X=x\right) \\ &=\underset{C_{k}}{\operatorname{argmax}}\, \frac{P\left(Y=C_{k}\right) \prod_{j} P\left(X^{(j)}=x^{(j)} \mid Y=C_{k}\right)}{\sum_{k} P\left(Y=C_{k}\right) \prod_{j} P\left(X^{(j)}=x^{(j)} \mid Y=C_{k}\right)} \\ &=\underset{C_{k}}{\operatorname{argmax}}\, P\left(Y=C_{k}\right) \prod_{j} P\left(X^{(j)}=x^{(j)} \mid Y=C_{k}\right) \end{aligned}$

Here $\underset{C_{k}}{\operatorname{argmax}}$ means finding the $C_{k}$ that maximizes the expression that follows it. The denominator is the same for every $C_{k}$ and can therefore be dropped, since:

$\begin{aligned} \sum_{k} P\left(Y=C_{k}\right) \prod_{j} P\left(X^{(j)}=x^{(j)} \mid Y=C_{k}\right) &=\sum_{k} P\left(X=x \mid Y=C_{k}\right) P\left(Y=C_{k}\right) \\ &=\sum_{k} P\left(X=x, Y=C_{k}\right) \\ &=P(X=x) \end{aligned}$
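A small numeric check may help show why dropping the denominator does not change the prediction. The sketch below takes the prior and the class-conditional likelihood of one fixed $x$ as given; the function name and all numbers are hypothetical.

```python
def posterior_and_prediction(prior, likelihood):
    """Compare the argmax of the numerators with the argmax of the posterior.

    prior[c]      = P(Y = c)
    likelihood[c] = P(X = x | Y = c) for one fixed x (made-up values below)
    """
    scores = {c: prior[c] * likelihood[c] for c in prior}  # numerators
    evidence = sum(scores.values())                        # P(X = x)
    posterior = {c: s / evidence for c, s in scores.items()}
    best_by_score = max(scores, key=scores.get)
    best_by_posterior = max(posterior, key=posterior.get)
    return posterior, best_by_score, best_by_posterior


prior = {0: 0.5, 1: 0.25, 2: 0.25}
likelihood = {0: 0.125, 1: 0.5, 2: 0.25}
posterior, best_by_score, best_by_posterior = posterior_and_prediction(prior, likelihood)
print(posterior, best_by_score, best_by_posterior)
```

Dividing every score by the same positive constant $P(X=x)$ rescales the scores without reordering them, so both argmax calls select the same class.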

Parameter Estimation for the Naive Bayes Method

Maximum Likelihood Estimation

In the naive Bayes method, learning means estimating $P\left(Y=c_{k}\right)$ and $P\left(X^{(j)}=x^{(j)} \mid Y=c_{k}\right)$. Maximum likelihood estimation can be applied to estimate these probabilities. The maximum likelihood estimate of the prior probability $P\left(Y=c_{k}\right)$ is

$P\left(Y=c_{k}\right)=\frac{\sum_{i=1}^{N} I\left(y_{i}=c_{k}\right)}{N}, \quad k=1,2, \cdots, K$
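The prior estimate is simply a relative class frequency. A minimal sketch, assuming the labels are given as a Python list (the name `estimate_prior` and the toy labels are made up for illustration):

```python
from collections import Counter
from fractions import Fraction


def estimate_prior(labels):
    """Maximum likelihood estimate P(Y = c_k) = (count of c_k) / N."""
    counts = Counter(labels)   # sum_i I(y_i = c_k) for every class c_k
    n = len(labels)            # N
    return {c: Fraction(count, n) for c, count in counts.items()}


labels = [1, 1, -1, 1, -1]     # toy labels, not taken from the notes
prior = estimate_prior(labels)
print(prior[1], prior[-1])     # 3/5 2/5
```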

Let the set of possible values of the $j$-th feature $x^{(j)}$ be $\left\{a_{j 1}, a_{j 2}, \cdots, a_{j S_{j}}\right\}$. The maximum likelihood estimate of the conditional probability $P\left(X^{(j)}=a_{j l} \mid Y=c_{k}\right)$ is

$\begin{array}{l} P\left(X^{(j)}=a_{j l} \mid Y=c_{k}\right)=\frac{\sum_{i=1}^{N} I\left(x_{i}^{(j)}=a_{j l}, y_{i}=c_{k}\right)}{\sum_{i=1}^{N} I\left(y_{i}=c_{k}\right)} \\ j=1,2, \cdots, n ; \quad l=1,2, \cdots, S_{j} ; \quad k=1,2, \cdots, K \end{array}$
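The conditional estimate is again a ratio of counts: the number of samples in class $c_k$ whose $j$-th feature equals $a_{jl}$, divided by the number of samples in class $c_k$. The sketch below is one way to compute it; the function name, the dictionary layout keyed by `(j, value, label)`, and the toy data are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction


def estimate_conditionals(samples, labels):
    """Return cond[(j, a, c)] = P(X^(j) = a | Y = c), estimated by counting."""
    class_counts = Counter(labels)   # sum_i I(y_i = c_k)
    joint_counts = Counter()         # sum_i I(x_i^(j) = a_jl, y_i = c_k)
    for x, y in zip(samples, labels):
        for j, value in enumerate(x):
            joint_counts[(j, value, y)] += 1
    return {
        key: Fraction(count, class_counts[key[2]])
        for key, count in joint_counts.items()
    }


# Toy data with two features per sample (not taken from the notes).
samples = [(1, "S"), (1, "M"), (2, "S"), (2, "M"), (2, "S")]
labels = [-1, 1, -1, 1, 1]
cond = estimate_conditionals(samples, labels)
print(cond[(0, 2, 1)])     # P(X^(1) = 2 | Y = 1)  -> 2/3
print(cond[(1, "S", -1)])  # P(X^(2) = S | Y = -1) -> 1
```

Combinations that never occur in the training data are absent from the dictionary, which corresponds to a maximum likelihood estimate of zero; `cond.get(key, 0)` returns that value.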