[Data Mining] Generative models vs. discriminative models: differences, advantages, and disadvantages
2022-07-26 01:16:00 【Better Bench】

1 Differences
(1) Discriminative model
A discriminative model learns P(y|x): a model or function is used to fit the conditional probability distribution P(y|x) directly, i.e. the probability that the label y occurs given that the input x is observed. Fitting P(y|x) is fitting the relationship from effect back to cause. In actual training this means training the model against the labels and then judging the category directly from the fitted conditional probability; a model fitted in this way is called a discriminative model.
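To make this concrete, here is a minimal sketch of the discriminative idea using logistic regression (one of the discriminative models listed below): the model fits P(y|x) directly from labeled data. It assumes scikit-learn and NumPy are available, and the two-class toy data is made up purely for illustration.

```python
# Minimal sketch: a discriminative model fits P(y|x) directly.
# Assumes scikit-learn and NumPy are installed; the data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Two synthetic classes in a 2-D feature space.
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
               rng.normal(3.0, 1.0, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

clf = LogisticRegression().fit(X, y)     # learns P(y|x) directly from labels
print(clf.predict_proba([[1.5, 1.5]]))   # estimated P(y|x) for a new point
print(clf.predict([[1.5, 1.5]]))         # class with the largest P(y|x)
```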
(2) Generative model
A generative model also targets P(y|x), but obtains it through Bayes' rule. The formula decomposes into three parts: $P(y|x)=\frac{P(x|y)P(y)}{P(x)}$.
- P(x|y)P(y) = P(x,y) is the joint probability distribution; this is what the generative model actually fits;
- P(x) is the marginal probability of x;
- P(y) is the class prior, which can be obtained directly from the labels.
The training process of a generative model can therefore be interpreted as fitting a probability distribution (in essence fitting P(x|y), because P(x,y)=P(x|y)P(y)); a new sample is then assigned to the class for which this probability is largest. Fitting P(x|y) is fitting the relationship from cause to effect, and a model fitted in this way is called a generative model. Put plainly, a generative model can also sample new data from the joint probability distribution it has learned.
Note: P(x|y) is the probability that X occurs given that Y has occurred; P(x,y) is the joint probability distribution, P(x,y)=P(X=x and Y=y), i.e. the probability distribution over X and Y together.
Summary: directly fitting P(y|x) gives a discriminative model; directly fitting the joint distribution P(x,y), and thereby obtaining P(y|x) indirectly, gives a generative model.
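To make the recipe above concrete, here is a hedged from-scratch sketch of a generative classifier in the spirit of Gaussian naive Bayes (naive Bayes appears in the list of generative models below): the class prior P(y) is estimated from label counts, P(x|y) is modeled as a per-class axis-aligned Gaussian, and a new point is assigned the class that maximizes P(x|y)P(y). The function names and toy data are illustrative assumptions, not part of the original article.

```python
# Minimal generative sketch: fit P(y) from labels, fit P(x|y) per class,
# classify with argmax_y P(x|y) P(y).  All names and data are illustrative.
import numpy as np

def fit_generative(X, y):
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = {
            "prior": len(Xc) / len(X),      # P(y) read off from the labels
            "mean": Xc.mean(axis=0),        # parameters of the Gaussian P(x|y)
            "var": Xc.var(axis=0) + 1e-9,   # small constant for numerical stability
        }
    return params

def log_joint(x, p):
    # log P(x|y) + log P(y) under the per-class Gaussian model
    log_px_given_y = -0.5 * np.sum(
        np.log(2 * np.pi * p["var"]) + (x - p["mean"]) ** 2 / p["var"])
    return log_px_given_y + np.log(p["prior"])

def predict(x, params):
    # choose the class that maximizes the joint probability P(x, y)
    return max(params, key=lambda c: log_joint(x, params[c]))

# Toy usage with the same synthetic two-class data as the earlier sketch.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(3.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
params = fit_generative(X, y)
print(predict(np.array([1.5, 1.5]), params))
```

Because this sketch fits the joint distribution, the same params could also be used to sample new (x, y) pairs or to evaluate the marginal P(x), which the discriminative sketch above cannot do.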
2 Examples
(1) Common discriminative models
- K-nearest neighbors (KNN)
- Linear regression (Linear Regression)
- Logistic regression (Logistic Regression)
- Neural networks (NN)
- Support vector machines (SVM)
- Gaussian processes (Gaussian Process)
- Conditional random fields (CRF)
- Classification and regression trees (CART, Classification and Regression Tree)
(2) Common generative models
- LDA topic model
- Naive Bayes
- Gaussian mixture model
- Hidden Markov model (HMM)
- Bayesian networks
- Sigmoid belief networks
- Markov random fields (Markov Random Fields)
- Deep belief networks (DBN)
3 Advantages and disadvantages
(1) Generative models
Advantages:
- A generative model gives the joint distribution, from which not only the conditional probability distribution but also other information, such as the marginal probability distribution P(x), can be computed. If the marginal probability of an input sample is very small, the learned model is probably not suitable for classifying that sample and the classification result may be poor; this amounts to outlier detection (a short sketch follows at the end of this subsection).
- Generative models converge relatively fast: when the number of samples is large, a generative model approaches the true model more quickly.
- Generative models can handle problems with hidden variables; for example, the Gaussian mixture model is a generative method with hidden variables.
Disadvantages:
- Although the joint distribution provides more information, it also requires more samples and more computation. Estimating the class-conditional distributions accurately requires a larger number of samples, and much of the information in the class-conditional probabilities is not needed for classification, so if the task is classification only, computing it wastes resources.
- In practice, generative models in most cases do not perform as well as discriminative models.
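As a hedged illustration of the marginal-probability point above, the sketch below evaluates P(x) = Σ_y P(x|y)P(y) under per-class Gaussian parameters of the same form as the earlier generative sketch (the concrete numbers here are made-up assumptions). An unusually small P(x) suggests the sample lies far from anything the model was fitted on, which is the outlier-detection use mentioned in the advantages.

```python
# Outlier detection via the marginal P(x) = sum_y P(x|y) P(y).
# The per-class parameters below are illustrative, in the same form as the
# params produced by the earlier generative sketch.
import numpy as np

params = {
    0: {"prior": 0.5, "mean": np.array([0.0, 0.0]), "var": np.array([1.0, 1.0])},
    1: {"prior": 0.5, "mean": np.array([3.0, 3.0]), "var": np.array([1.0, 1.0])},
}

def log_joint(x, p):
    # log P(x|y) + log P(y) under the per-class Gaussian model
    return (-0.5 * np.sum(np.log(2 * np.pi * p["var"])
                          + (x - p["mean"]) ** 2 / p["var"])
            + np.log(p["prior"]))

def marginal(x, params):
    # P(x) = sum over classes of P(x|y) P(y)
    return sum(np.exp(log_joint(x, p)) for p in params.values())

print(marginal(np.array([1.5, 1.5]), params))     # near the classes: noticeable P(x)
print(marginal(np.array([50.0, -50.0]), params))  # far away: vanishingly small P(x)
```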
(2) Discriminative model
Advantages:
- It saves computing resources, and the number of samples required is smaller than for a generative model.
- Its accuracy is often higher than that of a generative model.
- Because it learns P(y|x) directly and does not need to model the class-conditional probabilities, it allows us to abstract the input (for example through dimensionality reduction or feature construction), which can simplify the learning problem.
Disadvantages:
- It lacks the advantages of a generative model listed above: it gives no joint distribution, cannot sample new data, and has no natural way to handle hidden variables.