[Deep Learning] A Review of PyTorch's 19 Loss Functions
2022-07-04 20:27:00 【Demeanor 78】
For academic sharing only; this does not represent the position of this account. Contact us for deletion in case of infringement.
Reprinted from author: mingo_敏
Original link: https://blog.csdn.net/shanglianlm/article/details/85019768
Introduction
This article summarizes nineteen loss functions, introducing their mathematical formulas and code implementations. I hope you find them useful.
01
Basic usage
criterion = LossCriterion() # the constructor takes its own arguments
loss = criterion(x, y) # calling the criterion also takes arguments
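To make the pattern concrete, here is a minimal runnable sketch using MSELoss as an arbitrary stand-in for LossCriterion (shapes and values are illustrative):
import torch
import torch.nn as nn

criterion = nn.MSELoss(reduction='mean')    # constructor argument
x = torch.randn(4, 10, requires_grad=True)  # stand-in for a model's output
y = torch.randn(4, 10)                      # stand-in for the target
loss = criterion(x, y)                      # calling the criterion
loss.backward()                             # gradients flow back to x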
02
Loss functions
2-1 L1 norm loss: L1Loss
Computes the absolute difference between the output and the target.
torch.nn.L1Loss(reduction='mean')
Parameters:
reduction – one of three values. none: apply no reduction; mean: return the mean of the losses; sum: return the sum of the losses. Default: mean.
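A minimal usage sketch (tensor shapes are arbitrary examples):
import torch
import torch.nn as nn

loss_fn = nn.L1Loss(reduction='mean')
pred = torch.randn(3, 5, requires_grad=True)
target = torch.randn(3, 5)
loss = loss_fn(pred, target)  # mean of |pred - target| over all elements
loss.backward()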
2-2 Mean squared error loss: MSELoss
Computes the mean squared error between the output and the target.
torch.nn.MSELoss(reduction='mean')
Parameters:
reduction – one of three values. none: apply no reduction; mean: return the mean of the losses; sum: return the sum of the losses. Default: mean.
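A minimal usage sketch (shapes are illustrative; reduction='sum' is shown just to demonstrate a non-default reduction):
import torch
import torch.nn as nn

loss_fn = nn.MSELoss(reduction='sum')
pred = torch.randn(3, 5, requires_grad=True)
target = torch.randn(3, 5)
loss = loss_fn(pred, target)  # sum of (pred - target)^2 over all elements
loss.backward()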
2-3 Cross entropy loss: CrossEntropyLoss
Effective when training a classification problem with C classes. The optional argument weight must be a 1-D Tensor that assigns a weight to each class; this is very useful for imbalanced training sets.
In multi-class tasks, the softmax activation and the cross entropy loss are usually used together: cross entropy measures the difference between two probability distributions, but a neural network outputs raw score vectors that are not probability distributions, so softmax first "normalizes" the output into a probability distribution, and the cross entropy is then computed against the target. Note that PyTorch's CrossEntropyLoss already applies a log-softmax internally, so it should be given raw, unnormalized logits.
For a score vector x and a true class class, the unweighted loss is:
loss(x, class) = -log( exp(x[class]) / Σ_j exp(x[j]) ) = -x[class] + log( Σ_j exp(x[j]) )
torch.nn.CrossEntropyLoss(weight=None, ignore_index=-100, reduction='mean')
Parameters:
weight (Tensor, optional) – a manual rescaling weight for each class. If given, it must be a Tensor of length C.
ignore_index (int, optional) – specifies a target value that is ignored and does not contribute to the input gradient.
reduction – one of three values. none: apply no reduction; mean: return the mean of the losses; sum: return the sum of the losses. Default: mean.
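A minimal usage sketch; the input is raw logits, since the log-softmax is applied inside the criterion (batch size and class count are arbitrary):
import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()
logits = torch.randn(4, 10, requires_grad=True)  # raw scores for C = 10 classes
target = torch.randint(0, 10, (4,))              # class indices in [0, C)
loss = loss_fn(logits, target)
loss.backward()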
2-4 KL divergence loss: KLDivLoss
Computes the KL divergence between the input and the target. KL divergence can be used to measure the distance between continuous distributions, and is often useful when performing direct regression over the space of (discretely sampled) continuous output distributions.
torch.nn.KLDivLoss(reduction='mean')
Parameters:
reduction – one of three values. none: apply no reduction; mean: return the mean of the losses; sum: return the sum of the losses. Default: mean.
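A minimal usage sketch. The input must be given as log-probabilities while the target is given as probabilities; reduction='batchmean' is the reduction that matches the mathematical definition of KL divergence (shapes are illustrative):
import torch
import torch.nn as nn
import torch.nn.functional as F

loss_fn = nn.KLDivLoss(reduction='batchmean')
log_p = F.log_softmax(torch.randn(3, 5, requires_grad=True), dim=1)  # input: log-probabilities
q = F.softmax(torch.randn(3, 5), dim=1)                              # target: probabilities
loss = loss_fn(log_p, q)
loss.backward()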
2-5 Binary cross entropy loss: BCELoss
The cross entropy for binary classification tasks. Also used to measure reconstruction error, for example in autoencoders. Note that the target values t[i] must lie in the range 0 to 1.
torch.nn.BCELoss(weight=None, reduction='mean')
Parameters:
weight (Tensor, optional) – a manual rescaling weight for the loss of each batch element. If given, it must be a Tensor of length nbatch.
reduction – one of three values. none: apply no reduction; mean: return the mean of the losses; sum: return the sum of the losses. Default: mean.
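A minimal usage sketch; the inputs must already be probabilities in (0, 1), e.g. produced by a sigmoid (shapes are illustrative):
import torch
import torch.nn as nn

loss_fn = nn.BCELoss()
probs = torch.sigmoid(torch.randn(8, requires_grad=True))  # inputs in (0, 1)
target = torch.empty(8).random_(2)                         # 0/1 targets
loss = loss_fn(probs, target)
loss.backward()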
2-6 BCEWithLogitsLoss
BCEWithLogitsLoss folds the Sigmoid layer into the BCELoss class. This version is more numerically stable than a separate Sigmoid followed by BCELoss, because merging the two operations into one layer allows the log-sum-exp trick to be used for numerical stability.
torch.nn.BCEWithLogitsLoss(weight=None, reduction='mean', pos_weight=None)
Parameters:
weight (Tensor, optional) – a manual rescaling weight for the loss of each batch element. If given, it must be a Tensor of length nbatch.
pos_weight (Tensor, optional) – a weight for the loss of positive examples. It must be a Tensor of length equal to the number of classes.
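A minimal usage sketch; unlike BCELoss, raw logits are passed in and the sigmoid is applied internally (shapes are illustrative):
import torch
import torch.nn as nn

loss_fn = nn.BCEWithLogitsLoss()
logits = torch.randn(8, requires_grad=True)  # raw scores, no sigmoid applied
target = torch.empty(8).random_(2)           # 0/1 targets
loss = loss_fn(logits, target)
loss.backward()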
2-7 MarginRankingLoss
torch.nn.MarginRankingLoss(margin=0.0, reduction='mean')
For each instance in the mini-batch, the loss is:
loss(x1, x2, y) = max(0, -y * (x1 - x2) + margin)
Parameters:
margin: default 0.
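A minimal usage sketch; y = 1 means x1 should rank higher than x2, and y = -1 the opposite (shapes and labels are illustrative):
import torch
import torch.nn as nn

loss_fn = nn.MarginRankingLoss(margin=0.0)
x1 = torch.randn(5, requires_grad=True)
x2 = torch.randn(5, requires_grad=True)
y = torch.tensor([1., -1., 1., 1., -1.])  # desired ordering per pair
loss = loss_fn(x1, x2, y)
loss.backward()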
2-8 HingeEmbeddingLoss
torch.nn.HingeEmbeddingLoss(margin=1.0, reduction='mean')
For each instance in the mini-batch, the loss is:
l_n = x_n, if y_n = 1
l_n = max(0, margin - x_n), if y_n = -1
Parameters:
margin: default 1.
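A minimal usage sketch; the inputs are typically distances (e.g. between a pair of embeddings) and the labels are 1 or -1 (values are illustrative):
import torch
import torch.nn as nn

loss_fn = nn.HingeEmbeddingLoss(margin=1.0)
x = torch.randn(6, requires_grad=True)         # e.g. pairwise distances
y = torch.tensor([1., -1., 1., -1., 1., -1.])  # 1: similar pair, -1: dissimilar pair
loss = loss_fn(x, y)
loss.backward()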
2-9 Multi-label classification loss: MultiLabelMarginLoss
torch.nn.MultiLabelMarginLoss(reduction='mean')
For each sample in the mini-batch, the loss is:
loss(x, y) = Σ_{ij} max(0, 1 - (x[y[j]] - x[i])) / x.size(0)
where i = 0 .. x.size(0)-1, j = 0 .. y.size(0)-1, y[j] >= 0, and i != y[j].
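A minimal usage sketch; for each sample the target lists the positive class indices and is terminated by -1 (values are illustrative):
import torch
import torch.nn as nn

loss_fn = nn.MultiLabelMarginLoss()
x = torch.tensor([[0.1, 0.2, 0.4, 0.8]], requires_grad=True)  # scores for 4 classes
y = torch.tensor([[3, 0, -1, 1]])  # positive labels are 3 and 0; -1 ends the list
loss = loss_fn(x, y)
loss.backward()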
2-10 Smooth L1 loss: SmoothL1Loss
Also known as the Huber loss function.
torch.nn.SmoothL1Loss(reduction='mean')
loss(x, y) = (1/n) Σ_i z_i
where z_i = 0.5 * (x_i - y_i)^2 if |x_i - y_i| < 1, and |x_i - y_i| - 0.5 otherwise.
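A minimal usage sketch (shapes are illustrative):
import torch
import torch.nn as nn

loss_fn = nn.SmoothL1Loss()
pred = torch.randn(3, 5, requires_grad=True)
target = torch.randn(3, 5)
loss = loss_fn(pred, target)  # quadratic near zero, linear for large errors
loss.backward()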
2-11 Two-class logistic loss: SoftMarginLoss
torch.nn.SoftMarginLoss(reduction='mean')
loss(x, y) = Σ_i log(1 + exp(-y[i] * x[i])) / x.nelement()
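A minimal usage sketch; the targets are 1 or -1 (values are illustrative):
import torch
import torch.nn as nn

loss_fn = nn.SoftMarginLoss()
x = torch.randn(4, requires_grad=True)
y = torch.tensor([1., -1., -1., 1.])  # binary targets in {1, -1}
loss = loss_fn(x, y)
loss.backward()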
2-12 Multi-label one-versus-all loss: MultiLabelSoftMarginLoss
torch.nn.MultiLabelSoftMarginLoss(weight=None, reduction='mean')
loss(x, y) = -(1/C) Σ_i [ y[i] * log(sigmoid(x[i])) + (1 - y[i]) * log(1 - sigmoid(x[i])) ]
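A minimal usage sketch; the target is a multi-hot 0/1 matrix with one column per label (shapes are illustrative):
import torch
import torch.nn as nn

loss_fn = nn.MultiLabelSoftMarginLoss()
x = torch.randn(3, 5, requires_grad=True)  # logits for 5 labels
y = torch.empty(3, 5).random_(2)           # multi-hot 0/1 targets
loss = loss_fn(x, y)
loss.backward()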
2-13 Cosine loss: CosineEmbeddingLoss
torch.nn.CosineEmbeddingLoss(margin=0.0, reduction='mean')
loss(x, y) = 1 - cos(x1, x2), if y = 1
loss(x, y) = max(0, cos(x1, x2) - margin), if y = -1
Parameters:
margin: default 0.
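A minimal usage sketch; y = 1 pulls the two embeddings together, y = -1 pushes them apart (shapes are illustrative):
import torch
import torch.nn as nn

loss_fn = nn.CosineEmbeddingLoss(margin=0.0)
x1 = torch.randn(4, 16, requires_grad=True)
x2 = torch.randn(4, 16, requires_grad=True)
y = torch.tensor([1., -1., 1., -1.])  # per-pair similarity label
loss = loss_fn(x1, x2, y)
loss.backward()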
2-14 Multi-class hinge loss: MultiMarginLoss
torch.nn.MultiMarginLoss(p=1, margin=1.0, weight=None, reduction='mean')
loss(x, y) = Σ_{i != y} max(0, margin - x[y] + x[i])^p / x.size(0)
Parameters:
p: 1 or 2. Default: 1.
margin: default 1.
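A minimal usage sketch (shapes are illustrative):
import torch
import torch.nn as nn

loss_fn = nn.MultiMarginLoss(p=1, margin=1.0)
x = torch.randn(3, 5, requires_grad=True)  # scores for 5 classes
y = torch.tensor([1, 0, 4])                # correct class index per sample
loss = loss_fn(x, y)
loss.backward()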
2-15 Triplet loss: TripletMarginLoss
torch.nn.TripletMarginLoss(margin=1.0, p=2.0, eps=1e-06, swap=False, reduction='mean')
L(a, p, n) = max{ d(a_i, p_i) - d(a_i, n_i) + margin, 0 }
where d(x_i, y_i) = ||x_i - y_i||_p.
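A minimal usage sketch with anchor/positive/negative embeddings (shapes are illustrative):
import torch
import torch.nn as nn

loss_fn = nn.TripletMarginLoss(margin=1.0, p=2.0)
anchor = torch.randn(8, 128, requires_grad=True)
positive = torch.randn(8, 128, requires_grad=True)  # same identity as anchor
negative = torch.randn(8, 128, requires_grad=True)  # different identity
loss = loss_fn(anchor, positive, negative)
loss.backward()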
2-16 Connectionist temporal classification loss: CTCLoss
CTC (connectionist temporal classification) loss can automatically align data that is not aligned; it is mainly used for training on serialized data without prior alignment, such as speech recognition and OCR.
torch.nn.CTCLoss(blank=0, reduction='mean')
Parameters:
blank (int, optional) – index of the blank label. Default: 0.
reduction – one of three values. none: apply no reduction; mean: return the mean of the losses; sum: return the sum of the losses. Default: mean.
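A minimal usage sketch; log_probs has shape (T, N, C) with the blank token at index 0, and per-sequence input/target lengths are passed explicitly (all sizes are illustrative):
import torch
import torch.nn as nn

T, N, C = 50, 16, 20  # input length, batch size, classes (including blank)
ctc = nn.CTCLoss(blank=0)
log_probs = torch.randn(T, N, C).log_softmax(2).requires_grad_()
targets = torch.randint(1, C, (N, 30), dtype=torch.long)  # label sequences, no blanks
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.randint(10, 30, (N,), dtype=torch.long)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()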
2-17 Negative log likelihood loss: NLLLoss
The negative log likelihood loss, used for training a classification problem with C classes.
torch.nn.NLLLoss(weight=None, ignore_index=-100, reduction='mean')
Parameters:
weight (Tensor, optional) – a manual rescaling weight for each class. If given, it must be a Tensor of length C.
ignore_index (int, optional) – specifies a target value that is ignored and does not contribute to the input gradient.
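A minimal usage sketch; NLLLoss expects log-probabilities, so a log-softmax is applied to the raw scores first (shapes are illustrative):
import torch
import torch.nn as nn
import torch.nn.functional as F

loss_fn = nn.NLLLoss()
logits = torch.randn(4, 10, requires_grad=True)
log_probs = F.log_softmax(logits, dim=1)  # NLLLoss expects log-probabilities
target = torch.randint(0, 10, (4,))       # class indices
loss = loss_fn(log_probs, target)
loss.backward()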
2-18 NLLLoss2d
The negative log likelihood loss for image inputs; it computes the negative log likelihood loss per pixel.
torch.nn.NLLLoss2d(weight=None, ignore_index=-100, reduction='mean')
Parameters:
weight (Tensor, optional) – a manual rescaling weight for each class. If given, it must be a Tensor of length C.
reduction – one of three values. none: apply no reduction; mean: return the mean of the losses; sum: return the sum of the losses. Default: mean.
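Note that in recent PyTorch versions NLLLoss2d is deprecated (and later removed) in favor of NLLLoss itself, which accepts image-shaped (N, C, H, W) input directly; a minimal per-pixel sketch under that assumption (shapes are illustrative):
import torch
import torch.nn as nn
import torch.nn.functional as F

loss_fn = nn.NLLLoss()  # handles (N, C, H, W) input in current PyTorch
logits = torch.randn(2, 3, 8, 8, requires_grad=True)  # (N, C, H, W)
log_probs = F.log_softmax(logits, dim=1)
target = torch.randint(0, 3, (2, 8, 8))  # (N, H, W) class index per pixel
loss = loss_fn(log_probs, target)
loss.backward()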
2-19 PoissonNLLLoss
The negative log likelihood loss for a target that follows a Poisson distribution.
torch.nn.PoissonNLLLoss(log_input=True, full=False, eps=1e-08, reduction='mean')
Parameters:
log_input (bool, optional) – if True, the loss is computed as exp(input) - target * input; if False, it is computed as input - target * log(input + eps).
full (bool, optional) – whether to compute the full loss, i.e. to add the Stirling approximation term target * log(target) - target + 0.5 * log(2 * pi * target).
eps (float, optional) – default: 1e-8.
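A minimal usage sketch with log_input=True, so the input is interpreted as the log of the Poisson rate (values are illustrative):
import torch
import torch.nn as nn

loss_fn = nn.PoissonNLLLoss(log_input=True)
log_rate = torch.randn(5, requires_grad=True)  # log of the predicted Poisson rate
target = torch.poisson(torch.rand(5) * 4)      # non-negative count targets
loss = loss_fn(log_rate, target)
loss.backward()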
Reference material
http://www.voidcn.com/article/p-rtzqgqkz-bpg.html
