当前位置:网站首页>Chow-Liu Tree
Chow-Liu Tree
2022-07-02 22:59:00 【Ancient road】
0. introduction
- Basic concept reference
- The paper :Approximating discrete probability distributions with dependence trees
we consider the problem of best approximating an nth-order distribution by a product of n - 1 second-order distributions.
The Chow–Liu method describes a joint probability distribution P ( X 1 , X 2 , … , X n ) P(X_{ {1}},X_{ {2}},\ldots ,X_{ {n}}) P(X1,X2,…,Xn) as a product of second-order conditional and marginal distributions. For example, the six-dimensional distribution P ( X 1 , X 2 , X 3 , X 4 , X 5 , X 6 ) P(X_{ {1}},X_{ {2}},X_{ {3}},X_{ {4}},X_{ {5}},X_{ {6}}) P(X1,X2,X3,X4,X5,X6) might be approximated as
P ′ ( X 1 , X 2 , X 3 , X 4 , X 5 , X 6 ) = P ( X 6 ∣ X 5 ) P ( X 5 ∣ X 2 ) P ( X 4 ∣ X 2 ) P ( X 3 ∣ X 2 ) P ( X 2 ∣ X 1 ) P ( X 1 ) P^{ {\prime }}(X_{ {1}},X_{ {2}},X_{ {3}},X_{ {4}},X_{ {5}},X_{ {6}})=P(X_{ {6}}|X_{ {5}})P(X_{ {5}}|X_{ {2}})P(X_{ {4}}|X_{ {2}})P(X_{ {3}}|X_{ {2}})P(X_{ {2}}|X_{ {1}})P(X_{ {1}}) P′(X1,X2,X3,X4,X5,X6)=P(X6∣X5)P(X5∣X2)P(X4∣X2)P(X3∣X2)P(X2∣X1)P(X1)
1. Mutual information mutual information

take Mutual information Consider the weight of the edge .
2.Chow-Liu Tree Theoretical basis
Given a joint PDF P ( x ) P(x) P(x), the K L K L KL-divergence D ( P , P ′ ) D\left(P, P^{\prime}\right) D(P,P′) is minimized by projecting P ( x ) \mathrm{P}(\mathrm{x}) P(x) on a maximum-weight spanning tree (MSWT) over nodes in X \mathrm{X} X, where the weight on the edge ( X i , X j ) \left(X_{i}, X_{j}\right) (Xi,Xj) is defined by the mutual information measure
I ( X i ; X j ) = ∑ x i , x j P ( x i , x j ) log P ( x i , x j ) P ( x i ) P ( x j ) I\left(X_{i} ; X_{j}\right)=\sum_{x_{i}, x_{j}} P\left(x_{i}, x_{j}\right) \log \frac{P\left(x_{i}, x_{j}\right)}{P\left(x_{i}\right) P\left(x_{j}\right)} I(Xi;Xj)=xi,xj∑P(xi,xj)logP(xi)P(xj)P(xi,xj)
- The transformed tree has the smallest Kullback-Leible The divergence .

3.Chow-Liu Tree Algorithm flow

- For distribution P ( x ) P(x) P(x), For all i ≠ j i≠j i=j, Calculate the joint distribution P ( X i , Y j ) P(X_i,Y_j) P(Xi,Yj);
- Use the 1 The probability distribution obtained in step , Calculate the mutual information of any two nodes I ( X i , Y j ) I(X_i,Y_j) I(Xi,Yj) , And put I ( X i , Y j ) I(X_i,Y_j) I(Xi,Yj) As the weight of the connecting edge of these two nodes ;
- Calculate the maximum weight spanning tree (Maximum-weight spanning tree)
- a. The initial state :n A variable ( node ),0 side
- b. Insert the edge with the largest weight
- c. Find the next largest edge , And join the tree ; After the request is added , No ring generation . otherwise , Find the next largest edge ;
- d. Repeat the process c Process until inserted n-1 side ( Tree creation complete )
- Select any node as the root , Identify the direction of the edge from the root to the leaf ;
- Approximate joint probability of the spanning tree P ′ ( x ) P'(x) P′(x) The joint probability with the original Bayesian network P ( x ) P(x) P(x) The relative entropy of is the smallest .
In fact, the operation flow of the algorithm is the same as that of the minimum spanning tree , The representative algorithms are kruskal And prim Algorithm .
4. Minimum spanning tree
5. Semi naive Bayesian classifier
- Excerpt from teacher zhouzhihua 《 machine learning 》( Watermelon book )7.4. Semi naive Bayesian classifier .
The basic idea of semi naive Bayesian classifier is to properly consider the interdependent information of some attributes , Therefore, it is not necessary to calculate the complete joint probability , And it doesn't completely ignore the strong attribute dependency .“ Independent estimation ” (One-Dependent Estimator , abbreviation ODE) It is the most commonly used strategy of semi naive Bayesian classifier . On Gu Ming , So-called " Rely solely on " It is assumed that each attribute depends on at most one other attribute outside the category , namely
P ( c ∣ x ) ∝ P ( c ) ∏ i = 1 d P ( x i ∣ c , p a i ) P(c \mid \boldsymbol{x}) \propto P(c) \prod_{i=1}^{d} P\left(x_{i} \mid c, p a_{i}\right) P(c∣x)∝P(c)i=1∏dP(xi∣c,pai)
among p a i p a_{i} pai For attributes x i x_{i} xi Dependent properties , be called x i x_{i} xi Parent attribute . here , For each attribute x i x_{i} xi, if Its parent attribute p a i p a_{i} pai It is known that , A similar formula can be used ( 7.20 ) (7.20) (7.20) To estimate the probability value P ( x i ∣ c , p a i ) P\left(x_{i} \mid c, p a_{i}\right) P(xi∣c,pai). therefore , The key to the problem is how to determine the parent attribute of each attribute , Different approaches produce different independent classifiers . The most straightforward approach is to assume that all attributes depend on the same attribute , be called " Super parent "(super-parent) , Then, the super parent attribute is determined by model selection methods such as cross validation , And from that came SPODE (Super-Parent ODE) Method . for example , In the figure 7.1(b) in , x 1 x_1 x1 Is a superparent property .

TAN (Tree Augmented naïve Bayes) [Friedman et al., 1997] Is the maximum weighted spanning tree (maximum weighted spanning tree) Algorithm (Chow-Liu tree)[Chow and Liu, 1968] On the basis of , Through the following steps, the dependency relationship between attributes is reduced to the following figure 7.1 ( c ) 7.1(\mathrm{c}) 7.1(c) The tree structure shown in :
- (1) Calculate the conditional mutual information between any two attributes (conditional mutual information)
I ( x i , x j ∣ y ) = ∑ x i , x j ; c ∈ Y P ( x i , x j ∣ c ) log P ( x i , x j ∣ c ) P ( x i ∣ c ) P ( x j ∣ c ) I\left(x_{i}, x_{j} \mid y\right)=\sum_{x_{i}, x_{j} ; c \in \mathcal{Y}} P\left(x_{i}, x_{j} \mid c\right) \log \frac{P\left(x_{i}, x_{j} \mid c\right)}{P\left(x_{i} \mid c\right) P\left(x_{j} \mid c\right)} I(xi,xj∣y)=xi,xj;c∈Y∑P(xi,xj∣c)logP(xi∣c)P(xj∣c)P(xi,xj∣c) - (2) Build a complete graph with attributes as nodes , The weight of the edge between any two nodes is set to I ( x i , x j ∣ y ) I\left(x_{i}, x_{j} \mid y\right) I(xi,xj∣y)
- (3) Build the maximum weighted spanning tree of this complete graph , Pick root variable , Set edge to directed ;
- (4) Add category node y y y, Increase from y y y To the directed edge of each attribute .
Easy to see , Conditional mutual information I ( x i , x j ∣ y ) I\left(x_{i}, x_{j} \mid y\right) I(xi,xj∣y) Characterizing attributes x i x_{i} xi and x j x_{j} xj Relevance in the case of known categories , therefore , Through the maximum spanning tree algorithm , T A N \mathrm{TAN} TAN In fact, only strongly related attributes are preserved The dependence of .
边栏推荐
- Jerry's built-in shutdown current is 1.2ua, and then it can't be turned on by long pressing [chapter]
- 数组进阶提高
- Analyse des données dossiers d'apprentissage - - analyse simple de la variance à facteur unique avec Excel
- PMP项目整合管理
- 1px pixel compatibility of mobile terminal, 1px border
- Qt QProgressBar详解
- [chestnut sugar GIS] ArcScene - how to make elevation map with height
- Gas station [problem analysis - > problem conversion - > greed]
- Jerry's fast touch does not respond [chapter]
- LeetCode 968. 监控二叉树
猜你喜欢

Niuke network: maximum submatrix

数组进阶提高
![[Solved] Splunk: Cannot get username when all users are selected“](/img/13/1e824c8005701e21fc5b4e73308d53.png)
[Solved] Splunk: Cannot get username when all users are selected“

世界环境日 | 周大福用心服务推动减碳环保

MySQL查询附近的数据.并按距离进行排序.

电商系统微服务架构

位的高阶运算

Niuke: Dragon and dungeon games

数据标注典型案例,景联文科技如何助力企业搭建数据方案
![[chestnut sugar GIS] how does global mapper batch produce ground contour lines through DSM](/img/5d/c23ec16df6ce8d78207b635f59dc20.png)
[chestnut sugar GIS] how does global mapper batch produce ground contour lines through DSM
随机推荐
Share 10 JS closure interview questions (diagrams), come in and see how many you can answer correctly
创新实力再获认可!腾讯安全MSS获2022年度云原生安全守护先锋
Go multithreaded data search
钟薛高回应产品1小时不化:含固体成分 融化不能变成水
Jericho's thimble reaction when directly touching the prototype is abnormal [chapter]
Analyse des données dossiers d'apprentissage - - analyse simple de la variance à facteur unique avec Excel
Local dealers play the community group purchase mode and share millions of operations
Rails 3 activerecord: sort by association count - rails 3 activerecord: order by count on Association
【硬件】标准阻值的由来
杰理之充电拔出,无法触摸开机【篇】
杰理之内置关机电流 1.2uA,之后不能长按开机【篇】
杰理之如何测试按键的误触率【篇】
antd组件upload上传xlsx文件,并读取文件内容
Go 4 modes Singleton
[chestnut sugar GIS] ArcMap - how to batch modify the font, color, size, etc. of annotation elements
'when to use const char * and when to use const char []' - when to use const char * and when to use const char []
Jerry's prototype has no touch, and the reinstallation becomes normal after dismantling [chapter]
用sentinel熔断比例阈值改不了,设置慢调用比例没效果
加油站[问题分析->问题转换->贪心]
Stop slave is stuck -- the event of the transaction is not copied completely