Discriminative models: the log-linear model, a framework for building discriminative models
2022-07-06 10:30:00 【HadesZ~】
The log-linear model is a framework for building discriminative models. It does not refer to one particular model but to a family of models.
1. Definition
Suppose the model uses $J$ features, indexed by $j = 1, 2, \cdots, J$. $w_j$ is the model's parameter (weight) for the $j$-th feature; its value is estimated during training. $F_j(X, y)$ is the feature function of the $j$-th feature: it expresses some relationship between the input $X$ and the label $y$, and its value is the $j$-th feature used for prediction. $Z(X, W)$ is the normalization coefficient of the model's prediction, called the normalization term or partition function. Under these definitions, the model can be expressed as follows:
$$P(y \mid X; W) = \frac{\exp\left[\sum_{j=1}^{J} w_j F_j(X, y)\right]}{Z(X, W)} \tag{1}$$
In this model, the feature functions are chosen by the modeler. Different feature functions yield different kinds of models, so constructing feature functions is a feature engineering step. When the feature functions are specified by hand, this corresponds to classical machine learning; when they are produced by an automatic feature-learning mechanism, it corresponds to deep learning.
$Z(X, W)$ equals the sum of the numerator over all possible values of the label, namely $Z(X, W) = \sum_{i=1}^{C} \exp\left[\sum_{j=1}^{J} w_j F_j(X, y = c_i)\right]$, where $c_1, \cdots, c_C$ are the $C$ possible labels. Its role is to normalize the numerator so that the fraction is a valid conditional probability: non-negative and summing to 1 over the labels.
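To make Eq. (1) and the partition function concrete, here is a minimal Python sketch. The function name `log_linear_proba`, the two toy feature functions, and the weights are all hypothetical and chosen only for illustration; they do not come from the original article.

```python
import math
from typing import Callable, Dict, Sequence

# A feature function maps (input, label) to a real number: F_j(X, y).
FeatureFn = Callable[[Sequence[float], str], float]


def log_linear_proba(
    x: Sequence[float],
    labels: Sequence[str],
    feature_fns: Sequence[FeatureFn],
    weights: Sequence[float],
) -> Dict[str, float]:
    """Return {label: P(label | x; W)} following Eq. (1)."""
    # Unnormalized log-score sum_j w_j * F_j(x, y) for every candidate label y.
    scores = {
        y: sum(w * f(x, y) for w, f in zip(weights, feature_fns))
        for y in labels
    }
    # Subtracting the maximum score avoids overflow; it cancels in the ratio.
    m = max(scores.values())
    exp_scores = {y: math.exp(s - m) for y, s in scores.items()}
    # Partition function Z(X, W): the numerator summed over all labels.
    z = sum(exp_scores.values())
    return {y: e / z for y, e in exp_scores.items()}


# Hypothetical toy setup: two labels, two hand-written feature functions.
feature_fns = [
    lambda x, y: x[0] if y == "pos" else 0.0,
    lambda x, y: x[1] if y == "neg" else 0.0,
]
weights = [1.5, 0.5]
probs = log_linear_proba([2.0, 1.0], ["pos", "neg"], feature_fns, weights)
print(probs)
```

Because the scores are divided by their sum over all labels, the returned values always add up to 1, whatever feature functions are plugged in.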
2. Deriving the logistic regression model
Let the set of all possible labels be $\{c_1, c_2, \cdots, c_C\}$, and let the input $X$ be a vector of length $d$, $X = (x_1, x_2, \cdots, x_d)$. Define one feature function for each pair of an input component and a label: for $i = 1, \cdots, C$ and $j = 1 + d(i-1), \cdots, d + d(i-1)$, let $F_j(X, y) = x_{j - d(i-1)} \cdot I(y = c_i)$, so that there are $J = dC$ features in total. Here $I(y = c_i)$ is the indicator function, whose value is 1 when $y = c_i$ and 0 otherwise. The model then becomes:
$$P(y = c_i \mid X; W) = \frac{\exp\left[\sum_{j=1+d(i-1)}^{d+d(i-1)} w_j x_{j-d(i-1)}\right]}{\sum_{i'=1}^{C} \exp\left[\sum_{j=1+d(i'-1)}^{d+d(i'-1)} w_j x_{j-d(i'-1)}\right]} \tag{2}$$
Here each model parameter $w_j \in \mathbb{R}$, and the parameter vector $W = (w_1, w_2, \cdots, w_d, w_{d+1}, \cdots, w_{2d}, \cdots, w_{1+d(C-1)}, \cdots, w_{d+d(C-1)}) \in \mathbb{R}^{dC}$. Denote the sub-vector $(w_{1+d(i-1)}, \cdots, w_{d+d(i-1)})$ of the parameter vector by $W_i$, so that the parameter vector can be rewritten as $W = (W_1, W_2, \cdots, W_C)$. Substituting this into Eq. (2), the model can be abbreviated as:
$$P(y = c_i \mid X; W) = \frac{\exp\left[W_i^T \cdot X\right]}{\sum_{i'=1}^{C} \exp\left[W_{i'}^T \cdot X\right]} \tag{3}$$
Clearly, Eq. (3) is equivalent to $P(y \mid X; W) = \mathrm{Softmax}(W^T X)$, where $W$ is read as the $d \times C$ matrix whose columns are $W_1, \cdots, W_C$. Thus we have derived the multiclass logistic regression model (Multinomial Logistic Regression) from the log-linear model.
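The equivalence can be checked numerically. The sketch below assumes 0-based indices and integer labels `0..C-1` for convenience. It builds the $dC$ indicator feature functions, evaluates the general log-linear form, and compares the result with a softmax over the per-class dot products $W_i^T X$. All names (`make_indicator_features`, `log_linear_proba`, the example weights) are hypothetical.

```python
import math
from typing import Callable, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_indicator_features(d: int, num_classes: int) -> List[Callable]:
    """Build the d*C features F_j(x, y) = x_k * I(y == i), with j = k + d*i."""
    fns = []
    for i in range(num_classes):
        for k in range(d):
            # Default arguments freeze i and k for each lambda.
            fns.append(lambda x, y, i=i, k=k: x[k] if y == i else 0.0)
    return fns


def log_linear_proba(x, num_classes, fns, w) -> List[float]:
    """General log-linear form: score(y) = sum_j w_j * F_j(x, y)."""
    scores = [
        sum(wj * f(x, y) for wj, f in zip(w, fns))
        for y in range(num_classes)
    ]
    return softmax(scores)


d, num_classes = 2, 3
x = [1.0, -2.0]
# Flat parameter vector W = (W_1, W_2, W_3), each block of length d.
w = [0.5, 0.1, -0.3, 0.8, 0.2, 0.2]

p_loglinear = log_linear_proba(x, num_classes, make_indicator_features(d, num_classes), w)

# Direct softmax over the per-class dot products W_i^T x, as in Eq. (3).
blocks = [w[i * d:(i + 1) * d] for i in range(num_classes)]
p_softmax = softmax([sum(wi * xi for wi, xi in zip(block, x)) for block in blocks])

print(p_loglinear)
print(p_softmax)
```

The indicator zeroes out every weight block except the one belonging to the candidate label, which is why the double sum collapses to a single dot product per class.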
3. Deriving the CRF model
Similarly, let $\bar{X}$ be an observable feature sequence of length $T$ and $\bar{y}$ its corresponding label sequence. If we set $F_j(\bar{X}, \bar{y}) = \sum_{t=2}^{T} f_j(y_{t-1}, y_t, \bar{X}, t)$, where each $f_j$ is a local feature function of two adjacent labels, then substituting into Eq. (1) gives the linear-chain CRF model:
$$P(\bar{y} \mid \bar{X}; W) = \frac{1}{Z(\bar{X}, W)} \exp\left[\sum_{j=1}^{J} w_j \sum_{t=2}^{T} f_j(y_{t-1}, y_t, \bar{X}, t)\right] \tag{4}$$
Here $Z(\bar{X}, W)$ sums the numerator over all possible label sequences of length $T$.
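As an illustration only, the sketch below evaluates a linear-chain CRF probability by brute force: it enumerates every label sequence to compute the partition function. This is feasible only for tiny examples; practical implementations compute $Z$ with dynamic programming (the forward algorithm). The labels, the two local feature functions, and the weights are hypothetical, and positions are 0-based, so the sum over $t = 2, \cdots, T$ becomes `range(1, T)`.

```python
import itertools
import math
from typing import Callable, Sequence


def sequence_score(y_seq, x_seq, feature_fns, weights) -> float:
    """sum_j w_j * sum_t f_j(y_{t-1}, y_t, x_seq, t) over adjacent label pairs."""
    return sum(
        w * sum(f(y_seq[t - 1], y_seq[t], x_seq, t) for t in range(1, len(y_seq)))
        for w, f in zip(weights, feature_fns)
    )


def crf_proba(y_seq, x_seq, labels, feature_fns, weights) -> float:
    """P(y_seq | x_seq; W), with Z obtained by enumerating all label sequences."""
    z = sum(
        math.exp(sequence_score(cand, x_seq, feature_fns, weights))
        for cand in itertools.product(labels, repeat=len(x_seq))
    )
    return math.exp(sequence_score(y_seq, x_seq, feature_fns, weights)) / z


# Hypothetical toy tagging task with two labels and two local features.
labels = ("N", "V")
feature_fns = [
    # Transition feature: fires when a noun is followed by a verb.
    lambda y_prev, y_cur, x, t: 1.0 if (y_prev, y_cur) == ("N", "V") else 0.0,
    # Observation feature: fires when a word ending in "s" is tagged as a verb.
    lambda y_prev, y_cur, x, t: 1.0 if x[t].endswith("s") and y_cur == "V" else 0.0,
]
weights = [1.0, 2.0]
x_seq = ["dog", "barks"]

p = crf_proba(("N", "V"), x_seq, labels, feature_fns, weights)
total = sum(
    crf_proba(cand, x_seq, labels, feature_fns, weights)
    for cand in itertools.product(labels, repeat=len(x_seq))
)
print(p, total)
```

The only structural difference from the classification case is that the "label" is now a whole sequence, so the normalization runs over $|labels|^T$ candidates instead of $C$ classes.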