AI homework ch8

2022-06-12 06:20:00 【JamSlade】

1.

•	[Decision tree] Using information gain, build a decision tree for the following data set and describe the process.
 The data set supports the decision of which contact lenses to prescribe and contains 4 attributes:
    age, astigmatism, and tear-prod-rate are the input features;
    contact-lenses is the decision attribute.

[Figure: data set table]

First feature

Information gain is defined as
$G(D,a)=H(D)-\sum^V_{v=1}\frac{|D^v|}{|D|}H(D^v)$

$H(D)$ is fixed once the data set is given, so we only need to compare the second term, $\sum^V_{v=1}\frac{|D^v|}{|D|}H(D^v)$. The feature that minimizes this term maximizes the gain.
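The formula can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the helper names `entropy`, `conditional_entropy`, and `information_gain` are assumptions, and each subset $D^v$ is represented simply as a tuple of class counts.

```python
from math import log2

def entropy(counts):
    """Shannon entropy (base 2) of a class-count vector; empty classes contribute 0."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def conditional_entropy(partition):
    """Second term of the gain: sum over v of |D^v|/|D| * H(D^v)."""
    total = sum(sum(counts) for counts in partition)
    return sum(sum(counts) / total * entropy(counts) for counts in partition)

def information_gain(class_counts, partition):
    """G(D, a) = H(D) - conditional entropy of D given the split on a."""
    return entropy(class_counts) - conditional_entropy(partition)

# Toy example (hypothetical counts): 2 positives and 2 negatives.
perfect_split = [(2, 0), (0, 2)]   # each subset is pure
useless_split = [(1, 1), (1, 1)]   # each subset mirrors the whole set
print(information_gain((2, 2), perfect_split))
print(information_gain((2, 2), useless_split))
```

Because `entropy(class_counts)` is the same for every candidate feature, ranking features by gain is the same as ranking them by the smallest `conditional_entropy`.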

First, consider each of the three features in turn.

For age

Feature value	soft	hard	none	sum
young	1	1	1	3
pre-presbyopic	1	1	3	5
presbyopic	0	1	3	4

Substituting into the formula gives
$\begin{aligned}H(D\mid\text{age}) = &-[\frac{3}{12}(\frac{1}{3}\log_2\frac{1}{3}+\frac{1}{3}\log_2\frac{1}{3}+\frac{1}{3}\log_2\frac{1}{3})\\ &+\frac{5}{12}(\frac{1}{5}\log_2\frac{1}{5}+\frac{1}{5}\log_2\frac{1}{5}+\frac{3}{5}\log_2\frac{3}{5})\\&+\frac{4}{12}(\frac{1}{4}\log_2\frac{1}{4}+\frac{3}{4}\log_2\frac{3}{4})] = 1.238\end{aligned}$

For astigmatism

Feature value	soft	hard	none	sum
yes	0	3	4	7
no	2	0	3	5

Substituting into the formula:

$H(D\mid\text{astigmatism}) = 0.979$

For tear production rate

Feature value	soft	hard	none	sum
reduced	0	0	4	4
normal	2	3	3	8

Substituting into the formula:

$H(D\mid\text{tear-prod-rate}) = 1.041$

So we split on astigmatism first: it has the smallest weighted entropy (0.979) and therefore the largest information gain.
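The three values can be checked numerically from the count tables. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, and the astigmatism = no row is taken as (2 soft, 0 hard, 3 none), which is what the second-level table for that branch adds up to and what the reported 0.979 requires.

```python
from math import log2

def entropy(counts):
    """Shannon entropy (base 2) of a class-count vector; empty classes contribute 0."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def conditional_entropy(partition):
    """Weighted entropy: sum over v of |D^v|/|D| * H(D^v)."""
    total = sum(sum(counts) for counts in partition)
    return sum(sum(counts) / total * entropy(counts) for counts in partition)

# (soft, hard, none) counts for each value of each feature, from the tables.
splits = {
    "age": [(1, 1, 1), (1, 1, 3), (0, 1, 3)],
    "astigmatism": [(0, 3, 4), (2, 0, 3)],   # "no" row assumed to be (2, 0, 3)
    "tear-prod-rate": [(0, 0, 4), (2, 3, 3)],
}

scores = {name: conditional_entropy(p) for name, p in splits.items()}
for name, value in scores.items():
    print(f"{name}: {value:.3f}")

# Smallest weighted entropy means largest information gain.
best = min(scores, key=scores.get)
print("split on:", best)
```

With these counts the script should reproduce 1.238, 0.979, and 1.041 and select astigmatism.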

Second feature

Next, consider the remaining features within each branch.

First, the input features in the astigmatism = yes branch:

Feature value	hard	none	sum
young	1	1	2
pre-presbyopic	1	2	3
presbyopic	1	1	2
reduced	0	2	2
normal	3	2	5

$\begin{aligned}H(D^{\text{yes}}\mid\text{age}) = &-[\frac{2}{7}(\frac{1}{2}\log_2\frac{1}{2}+\frac{1}{2}\log_2\frac{1}{2}) \\ & +\frac{3}{7}(\frac{1}{3}\log_2\frac{1}{3}+\frac{2}{3}\log_2\frac{2}{3})\\ &+\frac{2}{7}(\frac{1}{2}\log_2\frac{1}{2}+\frac{1}{2}\log_2\frac{1}{2})] = 0.965\end{aligned}$
$H(D^{\text{yes}}\mid\text{tear-prod-rate}) = 0.694$
**In the yes branch, split on tear production rate.**

Next, the astigmatism = no branch:

Feature value	soft	none	sum
young	1	0	1
pre-presbyopic	1	1	2
presbyopic	0	2	2
reduced	0	2	2
normal	2	1	3

$H(D^{\text{no}}\mid\text{age}) = 0.4$
$H(D^{\text{no}}\mid\text{tear-prod-rate}) = 0.551$

In the no branch, split on age.
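The second-level choices can be checked the same way. This is a sketch with hypothetical helper names; the counts are read off the two branch tables, with (hard, none) pairs for the yes branch and (soft, none) pairs for the no branch.

```python
from math import log2

def entropy(counts):
    """Shannon entropy (base 2) of a class-count vector; empty classes contribute 0."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def conditional_entropy(partition):
    """Weighted entropy: sum over v of |D^v|/|D| * H(D^v)."""
    total = sum(sum(counts) for counts in partition)
    return sum(sum(counts) / total * entropy(counts) for counts in partition)

branches = {
    # astigmatism = yes: (hard, none) counts
    "yes": {
        "age": [(1, 1), (1, 2), (1, 1)],
        "tear-prod-rate": [(0, 2), (3, 2)],
    },
    # astigmatism = no: (soft, none) counts
    "no": {
        "age": [(1, 0), (1, 1), (0, 2)],
        "tear-prod-rate": [(0, 2), (2, 1)],
    },
}

choice = {}
for branch, splits in branches.items():
    scores = {name: conditional_entropy(p) for name, p in splits.items()}
    choice[branch] = min(scores, key=scores.get)
    summary = ", ".join(f"{n}={v:.3f}" for n, v in scores.items())
    print(f"astigmatism={branch}: {summary} -> split on {choice[branch]}")
```

For the yes branch the age partition contains a (1, 2) subset, so its entropy term has two classes with probabilities 1/3 and 2/3; that is the term that produces 0.965.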

This gives the following decision tree:
[Figure: resulting decision tree]

2.

[Linear classification] Show that the following logistic function and logit (odds) form are equivalent:
$p(X)=\frac{e^{\beta_0+\beta_1X}}{1+e^{\beta_0+\beta_1X}}\quad \frac{p(X)}{1-p(X)}=e^{\beta_0+\beta_1X}$
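Substitute $f(X)=\frac{p(X)}{1-p(X)}$. Solving for $p(X)$ gives $f(X)(1-p(X))=p(X)$, so $p(X)=\frac{f(X)}{1+f(X)}$.

Starting from the logistic function:

$\begin{aligned} f(X)=\frac{p(X)}{1-p(X)} & =\frac{\frac{e^{\beta_0+\beta_1X}}{1+e^{\beta_0+\beta_1X}}} {1- \frac{e^{\beta_0+\beta_1X}}{1+e^{\beta_0+\beta_1X}}}\\ \\ & =\frac{e^{\beta_0+\beta_1X}}{1+e^{\beta_0+\beta_1X} -e^{\beta_0+\beta_1X} }\\ &=e^{\beta_0+\beta_1X} \end{aligned}$

Conversely, starting from $f(X)=e^{\beta_0+\beta_1X}$:

$p(X)=\frac{f(X)}{1+f(X)}=\frac{e^{\beta_0+\beta_1X}}{1+e^{\beta_0+\beta_1X}}$

Each form follows from the other, so the two are equivalent.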
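A quick numeric sanity check of the equivalence, as a sketch only: the coefficients `beta0` and `beta1` and the sample inputs are arbitrary assumptions, and the function names are illustrative.

```python
from math import exp, log, isclose

def logistic(z):
    """p = e^z / (1 + e^z)."""
    return exp(z) / (1 + exp(z))

def odds(p):
    """f = p / (1 - p)."""
    return p / (1 - p)

beta0, beta1 = -1.5, 0.8   # hypothetical coefficients, for illustration only

for x in (-3.0, 0.0, 2.5):
    z = beta0 + beta1 * x
    p = logistic(z)
    f = exp(z)
    # Forward direction: the odds of the logistic probability equal e^z.
    assert isclose(odds(p), f)
    # Log-odds (logit) recover the linear predictor.
    assert isclose(log(odds(p)), z)
    # Reverse direction: p is recovered as f / (1 + f), not f / (1 - f).
    assert isclose(f / (1 + f), p)

print("logit and logistic forms agree on all sample points")
```

A numeric check like this does not replace the algebra, but it catches sign slips such as writing $1-f(X)$ in the denominator when inverting the odds.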