AI Homework ch8
2022-06-12 06:18:00 【JamSlade】
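1.
[Decision tree] Using information gain, build a decision tree for the dataset below and describe the process.
The dataset supports a classification decision about fitting contact lenses. It contains 4 attributes. The input features are:
- age
- astigmatism
- tear-prod-rate

The decision attribute is contact-lenses.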

First feature
Consider the formula for information gain:
$$G(D,a)=H(D)-\sum^V_{v=1}\frac{|D^v|}{|D|}H(D^v)$$
$H(D)$ is fixed once the dataset is given, so maximizing the gain is the same as minimizing the second term, the weighted conditional entropy $\sum^V_{v=1}\frac{|D^v|}{|D|}H(D^v)$. The per-feature values computed below are this second term.
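The entropy $H(\cdot)$ is used above without being written out. Assuming the usual Shannon entropy in bits over the class labels (the $\log_2$ form that the calculations below use), with $p_k$ the fraction of samples in class $k$ and the convention $0\log_2 0 = 0$, it can be sketched as:

$$
H(D) = -\sum_{k=1}^{K} p_k \log_2 p_k, \qquad p_k = \frac{|D_k|}{|D|}
$$

Taking the class totals from the age table below (2 soft, 3 hard, 7 none out of 12 samples), this gives:

$$
H(D) = -\left(\frac{2}{12}\log_2\frac{2}{12} + \frac{3}{12}\log_2\frac{3}{12} + \frac{7}{12}\log_2\frac{7}{12}\right) \approx 1.384
$$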
First consider the three input features.
- Age

| Feature value | soft | hard | none | sum |
|---|---|---|---|---|
| young | 1 | 1 | 1 | 3 |
| pre-presbyopic | 1 | 1 | 3 | 5 |
| presbyopic | 0 | 1 | 3 | 4 |
Applying the formula gives
$$
\begin{aligned}
H(D \mid \text{age}) = &-\Big[\frac{3}{12}\Big(\frac{1}{3}\log_2\frac{1}{3}+\frac{1}{3}\log_2\frac{1}{3}+\frac{1}{3}\log_2\frac{1}{3}\Big)\\
&+\frac{5}{12}\Big(\frac{1}{5}\log_2\frac{1}{5}+\frac{1}{5}\log_2\frac{1}{5}+\frac{3}{5}\log_2\frac{3}{5}\Big)\\
&+\frac{4}{12}\Big(\frac{1}{4}\log_2\frac{1}{4}+\frac{3}{4}\log_2\frac{3}{4}\Big)\Big] = 1.238
\end{aligned}
$$
- Astigmatism

| Feature value | soft | hard | none | sum |
|---|---|---|---|---|
| yes | 0 | 3 | 4 | 7 |
| no | 2 | 0 | 3 | 5 |

Substituting into the formula:
$$H(D \mid \text{astigmatism}) = 0.979$$
- Tear production rate

| Feature value | soft | hard | none | sum |
|---|---|---|---|---|
| reduced | 0 | 0 | 4 | 4 |
| normal | 2 | 3 | 3 | 8 |

Substituting into the formula:
$$H(D \mid \text{tear-prod-rate}) = 1.041$$
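Astigmatism has the smallest weighted conditional entropy (0.979) and therefore the largest information gain, so it is chosen as the first split.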
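The first-split calculation can be checked numerically. The sketch below is not part of the original homework. It assumes the class counts as read from the three tables (ordered soft, hard, none), with the astigmatism = no row taken as (2, 0, 3), the counts that agree with the no-branch table further down and with the reported value 0.979. The helper names `entropy` and `conditional_entropy` are illustrative.

```python
from math import log2

def entropy(counts):
    """Shannon entropy in bits of a list of class counts (zero counts contribute 0)."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def conditional_entropy(table):
    """Weighted entropy sum_v |D^v|/|D| * H(D^v) over the rows of a count table."""
    n = sum(sum(row) for row in table.values())
    return sum(sum(row) / n * entropy(row) for row in table.values())

# Class counts per feature value, ordered (soft, hard, none).
# Assumption: astigmatism "no" is (2, 0, 3), matching the no-branch table.
tables = {
    "age": {"young": [1, 1, 1], "pre-presbyopic": [1, 1, 3], "presbyopic": [0, 1, 3]},
    "astigmatism": {"yes": [0, 3, 4], "no": [2, 0, 3]},
    "tear-prod-rate": {"reduced": [0, 0, 4], "normal": [2, 3, 3]},
}

h_d = entropy([2, 3, 7])  # class totals over all 12 samples
cond = {name: conditional_entropy(t) for name, t in tables.items()}
gain = {name: h_d - c for name, c in cond.items()}
best = max(gain, key=gain.get)  # largest gain == smallest conditional entropy

for name in tables:
    print(f"{name}: H(D|a) = {cond[name]:.3f}, gain = {gain[name]:.3f}")
print("first split:", best)
```

Because $H(D)$ is the same for every feature, ranking by gain and ranking by smallest conditional entropy pick the same feature.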
Second feature
Next consider the remaining features.
First, the input features within the astigmatism = yes branch:

| Feature value | soft | hard | none | sum |
|---|---|---|---|---|
| young | 0 | 1 | 1 | 2 |
| pre-presbyopic | 0 | 1 | 2 | 3 |
| presbyopic | 0 | 1 | 1 | 2 |
| reduced | 0 | 0 | 2 | 2 |
| normal | 0 | 3 | 2 | 5 |
$$
\begin{aligned}
H(D^{\text{yes}} \mid \text{age}) = &-\Big[\frac{2}{7}\Big(\frac{1}{2}\log_2\frac{1}{2}+\frac{1}{2}\log_2\frac{1}{2}\Big)\\
&+\frac{3}{7}\Big(\frac{1}{3}\log_2\frac{1}{3}+\frac{2}{3}\log_2\frac{2}{3}\Big)\\
&+\frac{2}{7}\Big(\frac{1}{2}\log_2\frac{1}{2}+\frac{1}{2}\log_2\frac{1}{2}\Big)\Big] = 0.965
\end{aligned}
$$
$$H(D^{\text{yes}} \mid \text{tear-prod-rate}) = 0.694$$
**When astigmatism = yes, choose tear-prod-rate.**
For the astigmatism = no branch:

| Feature value | soft | hard | none | sum |
|---|---|---|---|---|
| young | 1 | 0 | 0 | 1 |
| pre-presbyopic | 1 | 0 | 1 | 2 |
| presbyopic | 0 | 0 | 2 | 2 |
| reduced | 0 | 0 | 2 | 2 |
| normal | 2 | 0 | 1 | 3 |
$$H(D^{\text{no}} \mid \text{age}) = 0.4$$
$$H(D^{\text{no}} \mid \text{tear-prod-rate}) = 0.551$$
When astigmatism = no, choose age.
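The second-level choices can be checked the same way. This is a minimal sketch under the assumption that the counts are exactly those in the two branch tables (ordered soft, hard, none); the helper names and the `branches` layout are illustrative, not from the original.

```python
from math import log2

def entropy(counts):
    """Shannon entropy in bits of a list of class counts (zero counts contribute 0)."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def conditional_entropy(rows):
    """Weighted entropy sum_v |D^v|/|D| * H(D^v) over a list of count rows."""
    n = sum(sum(row) for row in rows)
    return sum(sum(row) / n * entropy(row) for row in rows)

# Count rows (soft, hard, none) per remaining feature, within each astigmatism branch.
branches = {
    "yes": {
        "age": [[0, 1, 1], [0, 1, 2], [0, 1, 1]],
        "tear-prod-rate": [[0, 0, 2], [0, 3, 2]],
    },
    "no": {
        "age": [[1, 0, 0], [1, 0, 1], [0, 0, 2]],
        "tear-prod-rate": [[0, 0, 2], [2, 0, 1]],
    },
}

scores = {}
choice = {}
for branch, features in branches.items():
    scores[branch] = {f: conditional_entropy(rows) for f, rows in features.items()}
    # Smallest conditional entropy gives the largest gain within the branch.
    choice[branch] = min(scores[branch], key=scores[branch].get)
    rounded = {f: round(s, 3) for f, s in scores[branch].items()}
    print(f"astigmatism = {branch}: {rounded} -> split on {choice[branch]}")
```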
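This gives the following decision tree: the root splits on astigmatism; the yes branch then splits on tear-prod-rate, and the no branch splits on age.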
2.
[Linear classification] Show that the following logistic function and logit function are equivalent:
$$p(X)=\frac{e^{\beta_0+\beta_1X}}{1+e^{\beta_0+\beta_1X}}\qquad \frac{p(X)}{1-p(X)}=e^{\beta_0+\beta_1X}$$
Substitute $f(X)=\frac{p(X)}{1-p(X)}$; solving for $p(X)$ gives $p(X)=\frac{f(X)}{1+f(X)}$.

Forward direction, starting from the logistic function:
$$
\begin{aligned}
f(X)=\frac{p(X)}{1-p(X)} &=\frac{\frac{e^{\beta_0+\beta_1X}}{1+e^{\beta_0+\beta_1X}}}{1-\frac{e^{\beta_0+\beta_1X}}{1+e^{\beta_0+\beta_1X}}}\\
&=\frac{e^{\beta_0+\beta_1X}}{1+e^{\beta_0+\beta_1X}-e^{\beta_0+\beta_1X}}\\
&=e^{\beta_0+\beta_1X}
\end{aligned}
$$
Reverse direction, starting from $f(X)=e^{\beta_0+\beta_1X}$:
$$p(X)=\frac{f(X)}{1+f(X)}=\frac{e^{\beta_0+\beta_1X}}{1+e^{\beta_0+\beta_1X}}$$
Each form implies the other, so the two are equivalent.
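A quick numerical check of the equivalence, as a sketch only: the coefficients `beta0` and `beta1` and the sample inputs are hypothetical values chosen for illustration, and the function names `logistic` and `odds` are not from the original.

```python
from math import exp, log, isclose

def logistic(z):
    """p = e^z / (1 + e^z)."""
    return exp(z) / (1 + exp(z))

def odds(p):
    """f = p / (1 - p)."""
    return p / (1 - p)

beta0, beta1 = -1.5, 0.8  # hypothetical coefficients

for x in [-3.0, -1.0, 0.0, 0.5, 2.0, 4.0]:
    z = beta0 + beta1 * x
    p = logistic(z)
    f = odds(p)
    # Forward: the odds of the logistic probability equal e^(beta0 + beta1*x).
    assert isclose(f, exp(z), rel_tol=1e-9)
    # Reverse: p is recovered from the odds as f / (1 + f).
    assert isclose(f / (1 + f), p, rel_tol=1e-9)
    # Taking the log of the odds (the logit) recovers the linear term.
    assert isclose(log(f), z, rel_tol=1e-9, abs_tol=1e-9)

print("logistic and logit forms agree on all sample points")
```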