Common loss functions of deep learning
2022-07-02 00:35:00 【Falling flowers and rain】

In deep learning, the loss function measures how good the model's parameters are by comparing the difference between the network's output and the true output. The same function goes by several names in the literature, mainly: loss function, cost function, objective function, and error function.

1. Classification tasks
The cross-entropy loss is the most widely used loss for classification tasks in deep learning, so this section focuses on it.
1.1 Multi-class classification tasks
In multi-class tasks we usually use softmax to turn the logits into probabilities, so the multi-class cross-entropy loss is also called the softmax loss. It is computed as:

$L = -\sum_{i} y_i \log\big(S(f(x))_i\big)$

where $y$ is the one-hot vector of true class probabilities for sample $x$, $f(x)$ is the vector of predicted class scores, $S$ is the softmax function, and $L$ measures the difference between the true distribution $p$ and the predicted distribution $q$.
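To make the formula concrete, here is a minimal NumPy sketch (an illustration of the formula only, not the actual tf.keras internals), evaluated on the same values as the tf.keras example further below:
# Import the corresponding package
import numpy as np
# Set true (one-hot) and predicted probability values
y_true = np.array([[0, 1, 0], [0, 0, 1]], dtype=np.float64)
y_pred = np.array([[0.05, 0.95, 0.0], [0.1, 0.8, 0.1]], dtype=np.float64)
# Clip to avoid log(0) on zero-probability entries
p = np.clip(y_pred, 1e-7, 1.0)
# L = -sum_i y_i * log(p_i), averaged over the batch
loss = -np.sum(y_true * np.log(p), axis=1).mean()
print(loss)  # ~1.176939, matching the tf.keras result below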
Example: take the first sample from the code below, whose true label is y = (0, 1, 0) and whose predicted probabilities are (0.05, 0.95, 0). Only the true class contributes to the sum, so its cross-entropy loss is:

$L = -\log(0.95) \approx 0.0513$

Likewise, the second sample contributes $-\log(0.1) \approx 2.3026$, and the batch loss is the average of the two.
From a probability perspective, the training objective is to minimize the negative logarithm of the predicted probability assigned to the correct class.
In tf.keras this is implemented by CategoricalCrossentropy, as shown below:
# Import the corresponding package
import tensorflow as tf
# Set true and predicted values
y_true = [[0, 1, 0], [0, 0, 1]]
y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]
# Instantiation cross entropy loss
cce = tf.keras.losses.CategoricalCrossentropy()
# Calculate the loss result
cce(y_true, y_pred).numpy()
The result is:
1.176939
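Note that CategoricalCrossentropy expects probabilities by default. If the model outputs raw logits (i.e., there is no softmax layer at the end), you can pass from_logits=True so the loss applies softmax internally, which is also more numerically stable. A short sketch with hypothetical logit values:
# Import the corresponding package
import tensorflow as tf
# Hypothetical raw scores straight from the network (no softmax applied)
y_true = [[0, 1, 0], [0, 0, 1]]
logits = [[1.0, 3.0, 1.0], [0.5, 2.0, 0.3]]
# from_logits=True tells the loss to apply softmax internally
cce = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
cce(y_true, logits).numpy()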
1.2 Binary classification tasks
When dealing with binary classification tasks we do not use the softmax activation function but the sigmoid activation function, and the loss is adjusted accordingly to the binary cross-entropy loss:

$L = -\big[y \log \hat{y} + (1 - y)\log(1 - \hat{y})\big]$

where $y$ is the true probability that sample $x$ belongs to the positive class, $\hat{y}$ is the predicted probability, and $L$ measures the difference between the true value and the predicted value.
In tf.keras this is implemented by BinaryCrossentropy(), as shown below:
# Import the corresponding package
import tensorflow as tf
# Set true and predicted values
y_true = [[0], [1]]
y_pred = [[0.4], [0.6]]
# Cross entropy loss of instantiated binary classification
bce = tf.keras.losses.BinaryCrossentropy()
# Calculate the loss result
bce(y_true, y_pred).numpy()
The result is:
0.5108254
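The result can be checked against the formula above with a minimal NumPy sketch (illustrative only, not the tf.keras internals):
# Import the corresponding package
import numpy as np
# Same values as the tf.keras example above
y_true = np.array([0.0, 1.0])
y_pred = np.array([0.4, 0.6])
# L = -[y*log(y_hat) + (1-y)*log(1-y_hat)], averaged over the batch
loss = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)).mean()
print(loss)  # ~0.5108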
2. Regression tasks
The loss functions commonly used in regression tasks are the following:
2.1 MAE Loss
Mean Absolute Error (MAE), also known as L1 loss, uses the absolute error as the distance:

$L = \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - f(x_i) \rvert$

Its curve is V-shaped: linear everywhere, with a sharp corner at zero error.
Its characteristics are: because the L1 loss induces sparsity, it is often added to other losses as a regularization term to penalize large values. Its biggest problem is that the gradient is not smooth at zero, so updates can skip over the minimum.
In tf.keras this is implemented by MeanAbsoluteError, as shown below:
# Import the corresponding package
import tensorflow as tf
# Set true and predicted values
y_true = [[0.], [0.]]
y_pred = [[1.], [1.]]
# Instantiation MAE Loss
mae = tf.keras.losses.MeanAbsoluteError()
# Calculate the loss result
mae(y_true, y_pred).numpy()
The result is:
1.0
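The gradient problem is easy to see numerically: the gradient of the absolute error has magnitude 1 no matter how small the error is, so the update size does not shrink as the prediction approaches the target. A small sketch using tf.GradientTape (illustrative values):
# Import the corresponding package
import tensorflow as tf
y_true = tf.constant([[0.0]])
mae = tf.keras.losses.MeanAbsoluteError()
for p in [2.0, 0.5, 0.1]:
    y_pred = tf.Variable([[p]])
    with tf.GradientTape() as tape:
        loss = mae(y_true, y_pred)
    # The gradient stays at 1.0 regardless of how close y_pred is to y_true
    print(p, tape.gradient(loss, y_pred).numpy())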
2.2 MSE Loss
Mean Squared Error (MSE), also known as L2 loss or quadratic loss, uses the squared error as the distance:

$L = \frac{1}{n}\sum_{i=1}^{n} \big(y_i - f(x_i)\big)^2$

Its curve is a smooth parabola centered at zero error.
Its characteristics are: the L2 loss is also often used as a regularization term. When the prediction differs greatly from the target, the gradient easily explodes.
In tf.keras it is implemented by MeanSquaredError:
# Import the corresponding package
import tensorflow as tf
# Set true and predicted values
y_true = [[0.], [1.]]
y_pred = [[1.], [1.]]
# Instantiation MSE Loss
mse = tf.keras.losses.MeanSquaredError()
# Calculate the loss result
mse(y_true, y_pred).numpy()
The result is:
0.5
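Conversely, the gradient of the squared error grows linearly with the error, which is why large residuals (outliers) can blow up the updates. The same kind of sketch as for MAE (illustrative values):
# Import the corresponding package
import tensorflow as tf
y_true = tf.constant([[0.0]])
mse = tf.keras.losses.MeanSquaredError()
for p in [0.5, 2.0, 10.0]:
    y_pred = tf.Variable([[p]])
    with tf.GradientTape() as tape:
        loss = mse(y_true, y_pred)
    # The gradient is 2 * error, so it grows with the residual: 1.0, 4.0, 20.0
    print(p, tape.gradient(loss, y_pred).numpy())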
2.3 Smooth L1 Loss
The smooth L1 loss function is defined as:

$\mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^2 & \text{if } \lvert x \rvert < 1 \\ \lvert x \rvert - 0.5 & \text{otherwise} \end{cases}$

where $x = f(x) - y$ is the difference between the predicted value and the true value.

As the piecewise definition shows, on $[-1, 1]$ the function is effectively an L2 loss, which fixes the non-smoothness of L1 at zero, while outside $[-1, 1]$ it is effectively an L1 loss, which fixes the exploding gradients that L2 suffers on outliers. This loss function is commonly used in object detection.
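The piecewise definition can be written directly in a few lines of NumPy (a sketch of the formula above, which reproduces the tf.keras result computed below):
# Import the corresponding package
import numpy as np
# A direct sketch of the piecewise definition above
def smooth_l1(y_true, y_pred):
    x = y_pred - y_true
    return np.where(np.abs(x) < 1.0, 0.5 * x ** 2, np.abs(x) - 0.5).mean()
print(smooth_l1(np.array([0.0, 1.0]), np.array([0.6, 0.4])))  # 0.18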
In tf.keras, the Huber loss is used to compute it, as shown below:
# Import the corresponding package
import tensorflow as tf
# Set true and predicted values
y_true = [[0], [1]]
y_pred = [[0.6], [0.4]]
# Instantiation smooth L1 Loss
h = tf.keras.losses.Huber()
# Calculate the loss result
h(y_true, y_pred).numpy()
The result is:
0.18
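Note that smooth L1 is the Huber loss with its threshold parameter delta fixed at 1.0, which happens to be the default of tf.keras.losses.Huber; passing a different value (e.g. Huber(delta=2.0)) moves the point where the loss switches from quadratic to linear.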
Summary
- Know the loss functions for classification tasks: the multi-class cross-entropy loss and the binary cross-entropy loss
- Know the loss functions for regression tasks: the MAE, MSE, and smooth L1 loss functions