
Hands-On Deep Learning: A Brief Introduction to Softmax Regression

2022-06-12 08:13:00 Orange acridine 21

1. First, the difference between regression and classification:

Regression estimates a continuous value, for example predicting the price of a house.

Classification predicts a discrete category, for example predicting whether a picture shows a cat or a dog.

Example 1: MNIST (handwritten digit recognition), 10 classes

Example 2: ImageNet (natural object classification), 1000 classes

2. Typical classification problems on Kaggle

Example 1: classifying microscope images of human proteins into 28 classes

Example 2: classifying malware into 9 categories

Example 3: classifying malicious Wikipedia comments into 7 classes

3. From regression to multiclass classification

Regression: a single continuous output; its natural range is the real line R; the loss is the difference between the output and the true value.

Classification: usually multiple outputs; the i-th output element is the confidence that the input belongs to class i.

4. From regression to multiclass classification: mean squared loss

Suppose there are n categories. Encode the label with one-hot encoding: a vector of length n, (y1, ..., yn), where yi = 1 if class i is the true category and every other element is 0 (exactly one position is "hot").
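The one-hot encoding described above can be sketched in a few lines of NumPy (the function name `one_hot` here is illustrative, not from any particular library):

```python
import numpy as np

def one_hot(label, n_classes):
    """Encode an integer class label as a length-n_classes one-hot vector."""
    y = np.zeros(n_classes)
    y[label] = 1.0  # exactly one position is "hot"
    return y

print(one_hot(2, 5))  # [0. 0. 1. 0. 0.]
```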

 

The mean squared loss can still be used for training; at prediction time, choose the class whose output value is largest.
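As a minimal sketch (plain NumPy, with illustrative function names), training with squared error against the one-hot target and predicting by taking the largest output looks like:

```python
import numpy as np

def mse_loss(o, y):
    """Mean squared error between outputs o and a one-hot target y."""
    return np.mean((o - y) ** 2)

def predict(o):
    """Prediction rule: the class whose output value is largest."""
    return int(np.argmax(o))

o = np.array([0.1, 0.7, 0.2])   # model outputs for 3 classes
y = np.array([0.0, 1.0, 0.0])   # one-hot target: true class is 1
print(predict(o))               # prints 1
```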

5. From regression to multiclass classification: uncalibrated confidence scores

Encode the label with one-hot encoding, and take the class with the maximum output as the prediction.

For correct identification, we want the confidence of the true class, o_y, to be much larger than that of every incorrect class o_i (a large margin).

We would also like the outputs to form a probability distribution (non-negative, summing to 1). The raw output is a vector (o1, ..., on); we introduce an operator called softmax. Applying softmax to o yields y_hat, a vector of length n whose elements are non-negative and sum to 1.

The i-th element of y_hat is the exponential of the i-th element of o (exponentiating is useful because it makes any value non-negative, regardless of what the value was), divided by the sum of exp(ok) over all k: y_hat_i = exp(o_i) / Σ_k exp(o_k). y_hat is therefore a probability distribution.
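The softmax operation can be written directly in NumPy; subtracting the maximum before exponentiating is a standard numerical-stability trick that does not change the result (an illustrative sketch):

```python
import numpy as np

def softmax(o):
    """y_hat_i = exp(o_i) / sum_k exp(o_k); shift by max(o) to avoid overflow."""
    e = np.exp(o - np.max(o))
    return e / e.sum()

y_hat = softmax(np.array([2.0, 0.5, -1.0]))
print(y_hat.sum())          # the elements sum to 1
print((y_hat >= 0).all())   # and every element is non-negative
```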

In the end we have two probability distributions, the true one and the predicted one, and we use the discrepancy between them as the loss.

6. Softmax and cross-entropy loss

Cross-entropy is commonly used to measure the difference between two probability distributions.

Suppose p and q are discrete distributions; their cross-entropy is H(p, q) = -Σ_i p_i log q_i.

Taking the true distribution y and the prediction y_hat, the loss is l(y, y_hat) = -Σ_i y_i log y_hat_i = -log y_hat_y (with a one-hot y, only the true-class term survives).

The gradient with respect to each raw output is the difference between the predicted probability and the true probability: ∂l/∂o_i = softmax(o)_i - y_i.
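Putting the pieces together, a small NumPy sketch (illustrative, not a library API) computes the cross-entropy loss for a one-hot label; the gradient with respect to the raw outputs is simply y_hat - y:

```python
import numpy as np

def softmax(o):
    """Map raw outputs to a probability vector."""
    e = np.exp(o - np.max(o))
    return e / e.sum()

def cross_entropy(y_hat, y):
    """-sum_i y_i log y_hat_i; with one-hot y this is -log y_hat[true class]."""
    return -np.sum(y * np.log(y_hat))

o = np.array([1.0, 2.0, 0.5])
y = np.array([0.0, 1.0, 0.0])       # true class is 1
y_hat = softmax(o)
loss = cross_entropy(y_hat, y)      # equals -log(y_hat[1])
grad = y_hat - y                    # gradient of the loss w.r.t. o
```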

7. Summary

  • Softmax regression is a multiclass classification model.
  • The softmax operator turns the raw outputs into a prediction confidence (probability) for each class.
  • Cross-entropy measures the difference between the prediction and the label.

Original site

Copyright notice
This article was written by [Orange acridine 21]. When reposting, please include a link to the original. Thanks.
https://yzsam.com/2022/03/202203010550044122.html