
[Deep Learning] A Review of PyTorch's 19 Loss Functions

2022-07-04 20:27:00 Demeanor 78

Shared for academic purposes only; it does not represent the position of this account. Please contact us for removal in case of infringement.

Reproduced from author: mingo_敏

Original link: https://blog.csdn.net/shanglianlm/article/details/85019768

Reading guide

This article summarizes nineteen loss functions, introducing the mathematical formula and code implementation of each.

01

Basic usage

criterion = LossCriterion() # the constructor takes its own arguments
loss = criterion(x, y) # the call also takes arguments
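
For example, a full forward and backward pass, using L1Loss as a stand-in criterion (tensor shapes and values here are made up for illustration):

import torch
import torch.nn as nn

criterion = nn.L1Loss()                 # constructor arguments, if any, go here
x = torch.randn(3, requires_grad=True)  # model output
y = torch.randn(3)                      # target
loss = criterion(x, y)                  # the call takes (output, target)
loss.backward()                         # gradients flow back into x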

02

Loss function

2-1 L1 Norm loss L1Loss

Computes the absolute value of the difference between output and target.

torch.nn.L1Loss(reduction='mean')

Parameters:

reduction – one of three values. none: no reduction is applied; mean: return the mean of the losses; sum: return the sum of the losses. Default: mean.
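
A minimal sketch with random tensors (shapes are made up for illustration):

import torch
import torch.nn as nn

loss_fn = nn.L1Loss(reduction='mean')
x = torch.randn(2, 4, requires_grad=True)
y = torch.randn(2, 4)
loss = loss_fn(x, y)  # mean of |x - y| over all elements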

2-2 Mean squared error loss MSELoss

Computes the mean squared error between output and target.

torch.nn.MSELoss(reduction='mean')

Parameters:

reduction – one of three values. none: no reduction is applied; mean: return the mean of the losses; sum: return the sum of the losses. Default: mean.
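
The usage mirrors L1Loss; a minimal sketch:

import torch
import torch.nn as nn

loss_fn = nn.MSELoss(reduction='mean')
x = torch.randn(2, 4, requires_grad=True)
y = torch.randn(2, 4)
loss = loss_fn(x, y)  # mean of (x - y)^2 over all elements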

2-3 Cross entropy loss CrossEntropyLoss

Effective when training a classification problem with C classes. The optional argument weight must be a 1-D Tensor that assigns a weight to each class; this is very useful for an unbalanced training set.

In multi-class tasks, the softmax activation is almost always paired with the cross entropy loss: cross entropy describes the difference between two probability distributions, but a neural network outputs a raw vector, not a probability distribution. The softmax activation first "normalizes" that vector into a probability distribution, and the cross entropy loss is then computed against the target.

$$\text{loss}(x, \text{class}) = -\log\frac{\exp(x[\text{class}])}{\sum_j \exp(x[j])} = -x[\text{class}] + \log\sum_j \exp(x[j])$$

torch.nn.CrossEntropyLoss(weight=None, ignore_index=-100, reduction='mean')

Parameters:

weight (Tensor, optional) – a manual rescaling weight for each class. If given, it must be a Tensor of length C.

ignore_index (int, optional) – specifies a target value that is ignored, so that it does not contribute to the input gradient.

reduction – one of three values. none: no reduction is applied; mean: return the mean of the losses; sum: return the sum of the losses. Default: mean.
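
Note that the input should be raw, unnormalized scores (logits), since the softmax is applied internally; a minimal sketch:

import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()
logits = torch.randn(3, 5, requires_grad=True)  # 3 samples, 5 classes; no softmax applied
target = torch.tensor([1, 0, 4])                # class indices, not one-hot vectors
loss = loss_fn(logits, target)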

2-4 KL Divergence loss KLDivLoss

Computes the KL divergence between input and target. KL divergence is useful for measuring the distance between continuous distributions, and it is effective when performing direct regression over the space of (discretely sampled) continuous output distributions.

torch.nn.KLDivLoss(reduction='mean')

Parameters:

reduction – one of three values. none: no reduction is applied; mean: return the mean of the losses; sum: return the sum of the losses. Default: mean.
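
The input is expected to contain log-probabilities while the target contains probabilities. A minimal sketch (reduction='batchmean' matches the mathematical definition of KL divergence more closely than 'mean'):

import torch
import torch.nn as nn
import torch.nn.functional as F

loss_fn = nn.KLDivLoss(reduction='batchmean')
log_p = F.log_softmax(torch.randn(3, 5, requires_grad=True), dim=1)  # log-probabilities
q = F.softmax(torch.randn(3, 5), dim=1)                              # probabilities
loss = loss_fn(log_p, q)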

2-5 Binary cross entropy loss BCELoss

The binary-classification form of cross entropy. Often used to measure reconstruction error, for example in autoencoders. Note that the target values t[i] must lie in the range 0 to 1.

torch.nn.BCELoss(weight=None, reduction='mean')

Parameters:

weight (Tensor, optional) – a manual rescaling weight for the loss of each batch element. If given, it must be a Tensor of length nbatch.
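
A minimal sketch; the input must already be probabilities, e.g. passed through a sigmoid:

import torch
import torch.nn as nn

loss_fn = nn.BCELoss()
raw = torch.randn(4, requires_grad=True)
prob = torch.sigmoid(raw)                # probabilities in (0, 1)
target = torch.tensor([0., 1., 1., 0.])  # targets in [0, 1]
loss = loss_fn(prob, target)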

2-6 BCEWithLogitsLoss

BCEWithLogitsLoss combines a Sigmoid layer and BCELoss in a single class. This version is more numerically stable than using a separate Sigmoid followed by BCELoss, because merging the two operations into one layer allows the log-sum-exp trick to be used for numerical stability.

torch.nn.BCEWithLogitsLoss(weight=None, reduction='mean', pos_weight=None)

Parameters:

weight (Tensor, optional) – a manual rescaling weight for the loss of each batch element. If given, it must be a Tensor of length nbatch.

pos_weight (Tensor, optional) – a weight for positive examples. If given, it must be a Tensor of length equal to the number of classes.
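
The same computation as above, but without the explicit sigmoid; a minimal sketch:

import torch
import torch.nn as nn

loss_fn = nn.BCEWithLogitsLoss()
logits = torch.randn(4, requires_grad=True)  # raw scores; the sigmoid is applied internally
target = torch.tensor([0., 1., 1., 0.])
loss = loss_fn(logits, target)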

2-7 MarginRankingLoss

torch.nn.MarginRankingLoss(margin=0.0, reduction='mean')

For each sample in the mini-batch, the loss is:

$$\text{loss}(x_1, x_2, y) = \max(0,\ -y \cdot (x_1 - x_2) + \text{margin})$$

Parameters:

margin: The default value is 0
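
A minimal sketch; y = 1 means x1 should be ranked higher than x2, and y = -1 the opposite:

import torch
import torch.nn as nn

loss_fn = nn.MarginRankingLoss(margin=0.0)
x1 = torch.randn(3, requires_grad=True)
x2 = torch.randn(3, requires_grad=True)
y = torch.tensor([1., -1., 1.])  # desired ordering for each pair
loss = loss_fn(x1, x2, y)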

2-8 HingeEmbeddingLoss

torch.nn.HingeEmbeddingLoss(margin=1.0, reduction='mean')

For each sample in the mini-batch, the loss is:

$$l_n = \begin{cases} x_n, & \text{if } y_n = 1 \\ \max\{0,\ \text{margin} - x_n\}, & \text{if } y_n = -1 \end{cases}$$

Parameters:

margin: The default value is 1
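
A minimal sketch; x is typically a distance between a pair of inputs, and the labels must be 1 or -1:

import torch
import torch.nn as nn

loss_fn = nn.HingeEmbeddingLoss(margin=1.0)
x = torch.randn(4, requires_grad=True)  # e.g. pairwise distances
y = torch.tensor([1., -1., 1., -1.])    # 1 = similar pair, -1 = dissimilar pair
loss = loss_fn(x, y)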

2-9 Multi-label classification loss MultiLabelMarginLoss

torch.nn.MultiLabelMarginLoss(reduction='mean')

For each sample in the mini-batch, the loss is calculated as follows:

$$\text{loss}(x, y) = \sum_{ij} \frac{\max(0,\ 1 - (x[y[j]] - x[i]))}{x.\text{size}(0)}$$

where $i$ runs over the class indices with $i \ne y[j]$, and $j$ runs over the valid (non-negative) entries of $y$.
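
A minimal sketch; the target holds class indices, padded with -1 after the last valid label:

import torch
import torch.nn as nn

loss_fn = nn.MultiLabelMarginLoss()
x = torch.randn(1, 4, requires_grad=True)  # scores for 4 classes
y = torch.tensor([[3, 0, -1, -1]])         # this sample belongs to classes 3 and 0
loss = loss_fn(x, y)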

2-10 Smoothed L1 loss SmoothL1Loss

Also known as the Huber loss.

torch.nn.SmoothL1Loss(reduction='mean')

$$\text{loss}(x, y) = \frac{1}{n} \sum_i z_i$$

where

$$z_i = \begin{cases} 0.5\,(x_i - y_i)^2, & \text{if } |x_i - y_i| < 1 \\ |x_i - y_i| - 0.5, & \text{otherwise} \end{cases}$$
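
A minimal sketch:

import torch
import torch.nn as nn

loss_fn = nn.SmoothL1Loss()
x = torch.randn(2, 4, requires_grad=True)
y = torch.randn(2, 4)
loss = loss_fn(x, y)  # quadratic for small errors, linear for large ones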

2-11 Two-class logistic loss SoftMarginLoss

torch.nn.SoftMarginLoss(reduction='mean')

$$\text{loss}(x, y) = \sum_i \frac{\log(1 + \exp(-y[i] \cdot x[i]))}{x.\text{nelement}()}$$
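
A minimal sketch; the targets must be 1 or -1:

import torch
import torch.nn as nn

loss_fn = nn.SoftMarginLoss()
x = torch.randn(4, requires_grad=True)
y = torch.tensor([1., -1., 1., -1.])
loss = loss_fn(x, y)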

2-12 Multi-label one-versus-all loss MultiLabelSoftMarginLoss

torch.nn.MultiLabelSoftMarginLoss(weight=None, reduction='mean')

$$\text{loss}(x, y) = -\frac{1}{C} \sum_i \left( y[i] \cdot \log\frac{1}{1 + \exp(-x[i])} + (1 - y[i]) \cdot \log\frac{\exp(-x[i])}{1 + \exp(-x[i])} \right)$$
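
A minimal sketch; the target is a multi-hot vector marking all classes a sample belongs to:

import torch
import torch.nn as nn

loss_fn = nn.MultiLabelSoftMarginLoss()
x = torch.randn(2, 4, requires_grad=True)
y = torch.tensor([[1., 0., 1., 0.],
                  [0., 1., 0., 1.]])  # multi-hot targets
loss = loss_fn(x, y)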

2-13 Cosine loss CosineEmbeddingLoss

torch.nn.CosineEmbeddingLoss(margin=0.0, reduction='mean')

$$\text{loss}(x, y) = \begin{cases} 1 - \cos(x_1, x_2), & \text{if } y = 1 \\ \max(0,\ \cos(x_1, x_2) - \text{margin}), & \text{if } y = -1 \end{cases}$$

Parameters:

margin: The default value is 0
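
A minimal sketch with random embedding vectors:

import torch
import torch.nn as nn

loss_fn = nn.CosineEmbeddingLoss(margin=0.0)
x1 = torch.randn(3, 8, requires_grad=True)
x2 = torch.randn(3, 8, requires_grad=True)
y = torch.tensor([1., -1., 1.])  # 1 = pull the pair together, -1 = push apart
loss = loss_fn(x1, x2, y)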

2-14 Multi-class hinge loss MultiMarginLoss

torch.nn.MultiMarginLoss(p=1, margin=1.0, weight=None, reduction='mean')

$$\text{loss}(x, y) = \frac{\sum_{i \ne y} \max(0,\ \text{margin} - x[y] + x[i])^p}{x.\text{size}(0)}$$

Parameters:

p: 1 or 2. Default: 1

margin: The default value is 1
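
A minimal sketch:

import torch
import torch.nn as nn

loss_fn = nn.MultiMarginLoss(p=1, margin=1.0)
x = torch.randn(3, 5, requires_grad=True)  # scores for 5 classes
y = torch.tensor([1, 0, 4])                # correct class indices
loss = loss_fn(x, y)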

2-15 Triplet loss TripletMarginLoss

torch.nn.TripletMarginLoss(margin=1.0, p=2.0, eps=1e-06, swap=False, reduction='mean')

$$L(a, p, n) = \max\{d(a_i, p_i) - d(a_i, n_i) + \text{margin},\ 0\}$$

where:

$$d(x_i, y_i) = \lVert \mathbf{x}_i - \mathbf{y}_i \rVert_p$$
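
A minimal sketch with random embeddings (batch size and dimension are made up):

import torch
import torch.nn as nn

loss_fn = nn.TripletMarginLoss(margin=1.0, p=2.0)
anchor = torch.randn(8, 16, requires_grad=True)
positive = torch.randn(8, 16, requires_grad=True)  # same identity as the anchor
negative = torch.randn(8, 16, requires_grad=True)  # different identity
loss = loss_fn(anchor, positive, negative)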

2-16 Connectionist temporal classification loss CTCLoss

CTC (Connectionist Temporal Classification) loss can automatically align data that is not pre-aligned, so it is mainly used to train on sequence data that lacks a prior alignment, such as speech recognition and OCR.

torch.nn.CTCLoss(blank=0, reduction='mean')

Parameters:

reduction – one of three values. none: no reduction is applied; mean: return the mean of the losses; sum: return the sum of the losses. Default: mean.
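
A minimal sketch following the shapes the loss expects: log-probabilities of shape (T, N, C) plus per-sample input and target lengths (all sizes here are made up):

import torch
import torch.nn as nn

T, N, C, S = 50, 2, 20, 10  # input length, batch size, classes (0 = blank), target length
loss_fn = nn.CTCLoss(blank=0)
log_probs = torch.randn(T, N, C).log_softmax(2).requires_grad_()
targets = torch.randint(1, C, (N, S), dtype=torch.long)  # label sequences, no blanks
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), S, dtype=torch.long)
loss = loss_fn(log_probs, targets, input_lengths, target_lengths)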

2-17 Negative log likelihood loss NLLLoss

Negative log likelihood loss, used for training a classification problem with C classes.

torch.nn.NLLLoss(weight=None, ignore_index=-100, reduction='mean')

Parameters:

weight (Tensor, optional) – a manual rescaling weight for each class. If given, it must be a Tensor of length C.

ignore_index (int, optional) – specifies a target value that is ignored, so that it does not contribute to the input gradient.
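
The input must contain log-probabilities, e.g. the output of log_softmax; a minimal sketch:

import torch
import torch.nn as nn
import torch.nn.functional as F

loss_fn = nn.NLLLoss()
log_probs = F.log_softmax(torch.randn(3, 5, requires_grad=True), dim=1)
target = torch.tensor([1, 0, 4])
loss = loss_fn(log_probs, target)  # equivalent to CrossEntropyLoss on the raw scores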

2-18 NLLLoss2d

Negative log likelihood loss for image inputs; it computes the negative log likelihood loss per pixel.

torch.nn.NLLLoss2d(weight=None, ignore_index=-100, reduction='mean')

Parameters:

weight (Tensor, optional) – a manual rescaling weight for each class. If given, it must be a Tensor of length C.

reduction – one of three values. none: no reduction is applied; mean: return the mean of the losses; sum: return the sum of the losses. Default: mean.
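
In recent PyTorch releases this class has been folded into nn.NLLLoss, which accepts (N, C, H, W) input directly; a sketch of the per-pixel form:

import torch
import torch.nn as nn
import torch.nn.functional as F

loss_fn = nn.NLLLoss()  # handles (N, C, H, W) input in recent versions
log_probs = F.log_softmax(torch.randn(1, 3, 4, 4, requires_grad=True), dim=1)
target = torch.randint(0, 3, (1, 4, 4))  # one class index per pixel
loss = loss_fn(log_probs, target)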

2-19 PoissonNLLLoss

Negative log likelihood loss with targets assumed to follow a Poisson distribution.

torch.nn.PoissonNLLLoss(log_input=True, full=False, eps=1e-08, reduction='mean')

Parameters:

log_input (bool, optional) – if True, the loss is computed as exp(input) - target * input; if False, it is computed as input - target * log(input + eps).

full (bool, optional) – whether to compute the full loss, i.e. add the Stirling approximation term: target * log(target) - target + 0.5 * log(2 * pi * target).

eps (float, optional) – Default: 1e-8
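
A minimal sketch with log_input=True, so the input is interpreted as the log of the Poisson rate:

import torch
import torch.nn as nn

loss_fn = nn.PoissonNLLLoss(log_input=True)
log_rate = torch.randn(4, requires_grad=True)  # log of the predicted Poisson rate
target = torch.poisson(torch.rand(4) * 5)      # observed counts
loss = loss_fn(log_rate, target)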



