Common loss functions in deep learning
2022-07-02 00:35:00 【Falling flowers and rain】

In deep learning, the loss function measures the quality of the model parameters by comparing the difference between the network's output and the true output. It goes by different names in different literature; common ones include loss function, cost function, objective function, and error function.
1. Classification tasks
The cross-entropy loss is the most widely used loss in deep learning classification tasks, so this section focuses on it.
1.1 Multi-class classification tasks
In multi-class classification tasks we usually use softmax to convert the logits into a probability distribution, so the multi-class cross-entropy loss is also called the softmax loss. It is computed as:

$L = -\sum_{i} y_i \log\big(S(f(x))_i\big)$

where $y$ is the true probability that sample $x$ belongs to each category (a one-hot vector), $f(x)$ is the predicted score for each category, $S$ is the softmax function, and $L$ measures the difference between the true distribution $p$ and the predicted distribution $q$.
Example: suppose the true label of a sample is the one-hot vector [0, 1, 0] and the predicted probabilities after softmax are [0.05, 0.95, 0]. The cross-entropy loss is then $-\log(0.95) \approx 0.05$. Understood from a probabilistic perspective, our goal is to minimize the negative logarithm of the predicted probability assigned to the correct category.
In tf.keras this is implemented with CategoricalCrossentropy, as shown below:
# Import the corresponding package
import tensorflow as tf
# Set true and predicted values
y_true = [[0, 1, 0], [0, 0, 1]]
y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]
# Instantiation cross entropy loss
cce = tf.keras.losses.CategoricalCrossentropy()
# Calculate the loss result
cce(y_true, y_pred).numpy()
The result is:
1.176939
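To make the computation transparent, here is a minimal NumPy sketch (assuming NumPy is installed; not part of the original example) that reproduces the same number by averaging the per-sample losses $-\sum_i y_i \log p_i$ over the batch:

# Hand computation of the categorical cross entropy with NumPy
import numpy as np
y_true = np.array([[0., 1., 0.], [0., 0., 1.]])
y_pred = np.array([[0.05, 0.95, 0.], [0.1, 0.8, 0.1]])
# Clip predictions away from 0 and 1 to avoid log(0), as Keras does internally
y_pred = np.clip(y_pred, 1e-7, 1 - 1e-7)
# Per-sample loss -sum(y * log(p)), then the mean over the batch
print((-np.sum(y_true * np.log(y_pred), axis=1)).mean())  # ≈ 1.1769392

The first sample contributes $-\log(0.95) \approx 0.051$ and the second $-\log(0.1) \approx 2.303$; their mean is the value above.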
1.2 Binary classification tasks
For binary classification tasks we do not use the softmax activation function but the sigmoid activation function, and the loss function is adjusted accordingly to the binary cross-entropy loss:

$L = -\big[y \log \hat{y} + (1-y) \log(1-\hat{y})\big]$

where $y$ is the true probability that sample $x$ belongs to the positive category, $\hat{y}$ is the predicted probability, and $L$ measures the difference between the true and predicted values.
In tf.keras this is implemented with BinaryCrossentropy(), as shown below:
# Import the corresponding package
import tensorflow as tf
# Set true and predicted values
y_true = [[0], [1]]
y_pred = [[0.4], [0.6]]
# Cross entropy loss of instantiated binary classification
bce = tf.keras.losses.BinaryCrossentropy()
# Calculate the loss result
bce(y_true, y_pred).numpy()
The result is:
0.5108254
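The same check for the binary case, as a minimal NumPy sketch of $-[y \log \hat{y} + (1-y)\log(1-\hat{y})]$ averaged over the batch:

# Hand computation of the binary cross entropy with NumPy
import numpy as np
y_true = np.array([0., 1.])
y_pred = np.array([0.4, 0.6])
per_sample = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
print(per_sample.mean())  # ≈ 0.5108256

Both samples assign probability 0.6 to the correct class, so each contributes $-\log(0.6) \approx 0.511$.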
2. Regression tasks
The loss functions commonly used in regression tasks are as follows:
2.1 MAE Loss
Mean Absolute Error (MAE), also known as the L1 loss, uses the absolute error as the distance:

$L = \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - f(x_i) \rvert$
Its curve is V-shaped, with a gradient of constant magnitude everywhere except at zero. Characteristics: because the L1 loss induces sparsity, it is often added to other losses as a regularization term to penalize large values. Its biggest problem is that the gradient is not smooth at zero, which can cause the optimizer to skip over the minimum.
In tf.keras it is implemented with MeanAbsoluteError, as shown below:
# Import the corresponding package
import tensorflow as tf
# Set true and predicted values
y_true = [[0.], [0.]]
y_pred = [[1.], [1.]]
# Instantiation MAE Loss
mae = tf.keras.losses.MeanAbsoluteError()
# Calculate the loss result
mae(y_true, y_pred).numpy()
The result is:
1.0
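The same value by hand, as a one-line NumPy check of $\frac{1}{n}\sum_i \lvert y_i - f(x_i) \rvert$:

# Hand computation of MAE with NumPy
import numpy as np
print(np.abs(np.array([0., 0.]) - np.array([1., 1.])).mean())  # 1.0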
2.2 MSE Loss
Mean Squared Error (MSE), also called the quadratic loss, L2 loss, or Euclidean distance, uses the squared error as the distance:

$L = \frac{1}{n}\sum_{i=1}^{n} \big(y_i - f(x_i)\big)^2$
Its curve is a parabola, smooth everywhere. Characteristics: the L2 loss is also often used as a regularization term. When the predicted value differs greatly from the target value, the gradient can easily explode.
In tf.keras it is implemented with MeanSquaredError:
# Import the corresponding package
import tensorflow as tf
# Set true and predicted values
y_true = [[0.], [1.]]
y_pred = [[1.], [1.]]
# Instantiation MSE Loss
mse = tf.keras.losses.MeanSquaredError()
# Calculate the loss result
mse(y_true, y_pred).numpy()
The result is:
0.5
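And the corresponding hand computation of $\frac{1}{n}\sum_i \big(y_i - f(x_i)\big)^2$:

# Hand computation of MSE with NumPy
import numpy as np
print(((np.array([0., 1.]) - np.array([1., 1.])) ** 2).mean())  # 0.5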
2.3 Smooth L1 Loss
The smooth L1 loss function is defined as:

$\text{smooth}_{L1}(x) = \begin{cases} 0.5x^2 & \text{if } \lvert x \rvert < 1 \\ \lvert x \rvert - 0.5 & \text{otherwise} \end{cases}$

where $x = f(x) - y$ is the difference between the predicted value and the true value.
This is a piecewise function: on [-1, 1] it is essentially the L2 loss, which solves L1's problem of being non-smooth at zero; outside [-1, 1] it is essentially the L1 loss, which solves the problem of exploding gradients on outliers. This loss function is commonly used in object detection.
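Since the piecewise definition is the key point here, a minimal NumPy sketch of it may help (the function name smooth_l1 is illustrative, not a library API):

# NumPy sketch of the piecewise smooth L1 definition above
import numpy as np
def smooth_l1(y_true, y_pred):
    x = np.abs(y_pred - y_true)  # |f(x) - y|
    # 0.5 * x^2 inside [-1, 1], |x| - 0.5 outside
    return np.where(x < 1.0, 0.5 * x ** 2, x - 0.5).mean()
print(smooth_l1(np.array([0., 1.]), np.array([0.6, 0.4])))  # 0.18

With its default delta of 1.0, tf.keras.losses.Huber computes exactly this piecewise function, which is why it is used below.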
In tf.keras, the Huber loss can be used to compute it, as shown below:
# Import the corresponding package
import tensorflow as tf
# Set true and predicted values
y_true = [[0], [1]]
y_pred = [[0.6], [0.4]]
# Instantiation smooth L1 Loss
h = tf.keras.losses.Huber()
# Calculate the loss result
h(y_true, y_pred).numpy()
The result is:
0.18
Summary

- Loss functions for classification tasks: the multi-class cross-entropy loss and the binary cross-entropy loss
- Loss functions for regression tasks: MAE, MSE, and the smooth L1 loss