1、 Focal loss theory and code implementation
2022-07-29 06:08:00 【My hair is messy】
Preface
Reference: this article draws on a post by the blogger "When can I see Qingmeng".
Original post: https://www.jianshu.com/p/30043bcc90b6
I. Basic theory
1. Soft gamma: increasing gamma gradually over the course of training may bring a further performance gain.
2. alpha is tied to the frequency of each class in the training data.
3. F.nll_loss(torch.log(F.softmax(inputs, dim=1)), target) is functionally identical to F.cross_entropy(inputs, target).
F.nll_loss one-hot encodes target into a tensor with the same shape as its first input, then takes the element-wise product of that mask with the first input (the log-probabilities); a sketch verifying this equivalence follows this list.

[Figure: experimental results with alpha = 1 and different gamma values]
4. What problems does focal loss solve?
(1) Imbalance between classes.
(2) Imbalance between hard and easy samples.
5. In RetinaNet, besides using focal loss, initialization is also handled specially. How?
In RetinaNet, the bias b of the last conv layer of the classification subnet is set to b = -log((1 - π)/π) with prior π = 0.01, so that at the start of training every anchor is predicted as foreground with probability about π; a sketch of this follows the list.
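
As a quick check of point 3 above, the snippet below (a minimal sketch with made-up shapes) shows that nll_loss applied to log-softmax matches cross_entropy:

```python
import torch
import torch.nn.functional as F

inputs = torch.randn(4, 3)            # logits: 4 samples, 3 classes
target = torch.tensor([0, 2, 1, 2])   # ground-truth class indices

loss_a = F.cross_entropy(inputs, target)
loss_b = F.nll_loss(torch.log(F.softmax(inputs, dim=1)), target)
loss_c = F.nll_loss(F.log_softmax(inputs, dim=1), target)  # same result, numerically safer

print(torch.allclose(loss_a, loss_b), torch.allclose(loss_a, loss_c))  # True True
```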
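
And a minimal sketch of the initialization in point 5, using the paper's prior π = 0.01 (the layer name `cls_head` and its channel counts are hypothetical here, not RetinaNet's exact configuration):

```python
import math
import torch.nn as nn

pi = 0.01  # prior foreground probability, value from the RetinaNet paper
cls_head = nn.Conv2d(256, 9 * 80, kernel_size=3, padding=1)  # hypothetical final classification conv
nn.init.constant_(cls_head.bias, -math.log((1 - pi) / pi))   # sigmoid(b) = π at initialization
```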
II. Implementation
1. The formula
The standard cross entropy (CE) and focal loss (FL) are:

CE(p_t) = -log(p_t)
FL(p_t) = -α_t (1 - p_t)^γ log(p_t)

where p_t is the model's estimated probability of the true class.
For the forward and backward derivations, see this Zhihu post: https://zhuanlan.zhihu.com/p/32631517
2. Code implementation
1. Based on binary cross entropy.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# 1. Focal loss built on top of binary cross entropy
class FocalLoss(nn.Module):
    def __init__(self, alpha=1, gamma=2, logits=False, reduce=True):
        super(FocalLoss, self).__init__()
        self.alpha = alpha
        self.gamma = gamma
        self.logits = logits  # True if inputs are raw scores rather than probabilities
        self.reduce = reduce

    def forward(self, inputs, targets):
        if self.logits:
            BCE_loss = F.binary_cross_entropy_with_logits(inputs, targets, reduction='none')
        else:
            BCE_loss = F.binary_cross_entropy(inputs, targets, reduction='none')
        pt = torch.exp(-BCE_loss)  # pt = probability assigned to the true class
        F_loss = self.alpha * (1 - pt) ** self.gamma * BCE_loss
        if self.reduce:
            return torch.mean(F_loss)
        return F_loss
```
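
A minimal usage sketch for this class, continuing from the block above (the shapes are illustrative):

```python
criterion = FocalLoss(alpha=1, gamma=2, logits=True)
logits = torch.randn(8)                      # raw scores for 8 samples
targets = torch.randint(0, 2, (8,)).float()  # binary labels
loss = criterion(logits, targets)
print(loss.item())
```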
2. The implementation from the Zhihu post above.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable


class FocalLoss(nn.Module):
    r"""
    This criterion is an implementation of Focal Loss, which is proposed in
    "Focal Loss for Dense Object Detection":

        Loss(x, class) = - \alpha (1 - softmax(x)[class])^\gamma \log(softmax(x)[class])

    The losses are averaged across observations for each minibatch.

    Args:
        alpha (1D Tensor, Variable): the scalar factor for this criterion.
        gamma (float, double): gamma > 0; reduces the relative loss for
            well-classified examples (p > .5), putting more focus on hard,
            misclassified examples.
        size_average (bool): By default, the losses are averaged over
            observations for each minibatch. If set to False, the losses
            are instead summed for each minibatch.
    """

    def __init__(self, class_num, alpha=None, gamma=2, size_average=True):
        super(FocalLoss, self).__init__()
        if alpha is None:
            self.alpha = Variable(torch.ones(class_num, 1))
        else:
            if isinstance(alpha, Variable):
                self.alpha = alpha
            else:
                self.alpha = Variable(alpha)
        self.gamma = gamma
        self.class_num = class_num
        self.size_average = size_average

    def forward(self, inputs, targets):
        N = inputs.size(0)
        C = inputs.size(1)
        P = F.softmax(inputs, dim=1)  # class probabilities, shape (N, C)

        # Build a one-hot mask of shape (N, C) from the target class indices.
        class_mask = inputs.data.new(N, C).fill_(0)
        class_mask = Variable(class_mask)
        ids = targets.view(-1, 1)
        class_mask.scatter_(1, ids.data, 1.)

        if inputs.is_cuda and not self.alpha.is_cuda:
            self.alpha = self.alpha.cuda()
        alpha = self.alpha[ids.data.view(-1)]  # per-sample alpha, shape (N, 1)

        probs = (P * class_mask).sum(1).view(-1, 1)  # p_t: probability of the true class
        log_p = probs.log()

        batch_loss = -alpha * (torch.pow((1 - probs), self.gamma)) * log_p

        if self.size_average:
            loss = batch_loss.mean()
        else:
            loss = batch_loss.sum()
        return loss
```
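
And a minimal usage sketch for this multi-class version, continuing from the block above (again, shapes are illustrative):

```python
criterion = FocalLoss(class_num=5, gamma=2)
inputs = torch.randn(4, 5)             # logits: 4 samples, 5 classes
targets = torch.tensor([0, 2, 4, 1])   # ground-truth class indices
loss = criterion(inputs, targets)
print(loss.item())
```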