[Object Detection] Generalized Focal Loss V1
2022-07-29 06:04:00 【Dull cat】

Paper: https://arxiv.org/pdf/2006.04388.pdf
Code: https://github.com/open-mmlab/mmdetection/tree/master/configs/gfl
Venue: NeurIPS 2020
Key points:
- A new way of determining bounding-box locations is proposed: model them with a general distribution. The clearer a boundary is, the easier it is to learn and the sharper the learned distribution becomes; the blurrier the boundary, the harder it is to learn and the flatter the distribution.
1. Background
One-stage detectors basically model object detection as dense classification plus localization.
Classification is usually optimized with Focal Loss, while localization is usually learned as a Dirac delta distribution.
FCOS, for example, predicts a localization-quality estimate (an IoU score or centerness score); when ranking candidates for NMS, the classification score is multiplied by this box-quality score.
Current one-stage detectors therefore usually introduce a separate branch to predict localization quality; this prediction helps classification and thus improves detection performance.
This paper considers three basic elements:
- quality estimation of the detected box (e.g. an IoU score or FCOS's centerness score)
- classification
- localization
There are two main problems in current implementations:
1. The classification score and the box-quality estimate are inconsistent between training and testing

Inconsistent usage: classification and quality estimation are trained separately, but at test time they are multiplied together as the NMS ranking score, so there is a gap between the two phases.
Different training samples: powered by Focal Loss, the classification branch can be trained on a few positive samples together with a large number of negatives, but the box-quality estimate is trained only on positive samples.
For a one-stage detector, NMS ranks every sample by the product of its classification score and box-quality score. Consequently, the quality predictions of many low-scoring negatives receive no supervision signal during training; the quality of a large number of negatives is never measured. A negative sample with a low classification score can then predict an extremely high quality score and end up ranked ahead of true positives.

2. The bbox regression representation is inflexible (a Dirac delta distribution) and cannot model the uncertainty of complex scenes
- In complex scenes the bounding-box boundary can be highly uncertain, yet existing box regression essentially models a single Dirac delta distribution, which is very inflexible. The authors therefore propose a general distribution to model the box representation. The problem is illustrated in Figure 3 (e.g. a skateboard blurred by water splashes, or a heavily occluded elephant):

2. Method
For the two problems above:
① training and testing are inconsistent;
② the modeling of the box-location distribution is not general;
the authors propose the following solutions.
Solution to problem 1: build a joint classification-IoU representation
To keep training and testing consistent, while letting both classification and box-quality prediction be trained on all positive and negative samples, the authors merge the box-quality estimate into the classification score.
Method:
When the predicted category is the ground-truth category, use the localization-quality score as the confidence; in this paper localization quality is measured by the IoU score.
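As a minimal sketch of this joint label (the function name and index convention are illustrative, not from the paper): the 1 at the ground-truth position of a one-hot vector is replaced by the predicted box's IoU against its ground truth.

```python
def joint_label(num_classes, gt_class, iou):
    """Classification-IoU joint label: a soft one-hot vector whose peak
    is the IoU score instead of 1 (zeros for all other classes)."""
    label = [0.0] * num_classes
    label[gt_class] = iou
    return label
```

Negative samples simply keep the all-zero vector, which is why they can also supervise the quality score.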

Solution to problem 2: directly regress an arbitrary distribution to model the box representation
Method: implemented with softmax, by generalizing the integral form of the Dirac delta distribution to the integral of a general distribution over discretized box offsets.
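A sketch of the discretized general distribution (the bin range and count here are assumed for illustration): the network outputs logits over discrete offset positions, softmax turns them into a distribution, and the predicted offset is its expectation.

```python
import math

def box_expectation(logits, y_min=0.0, y_max=16.0):
    """Predict a box offset as the expectation of a discrete distribution
    over uniformly spaced positions in [y_min, y_max] (range is an assumed
    example)."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]   # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    n = len(logits) - 1
    ys = [y_min + i * (y_max - y_min) / n for i in range(n + 1)]
    return sum(p * y for p, y in zip(probs, ys))
```

A flat distribution (uncertain boundary) and a sharp one (clear boundary) can yield the same expectation, but the shape now carries the uncertainty.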
This removes the inconsistency between training and testing and establishes the strong correlation between classification and localization shown in Figure 2b.
Besides, negative samples can now be supervised with a quality score of 0.

Generalized Focal Loss consists of:
- QFL (Quality Focal Loss): learns the joint representation of classification score and localization quality
- DFL (Distribution Focal Loss): models the box location as a general distribution and makes the network quickly concentrate probability on values near the target location
How Generalized Focal Loss is derived:
① Original FL:
Dense prediction tasks generally use Focal Loss, $FL(p_t) = -(1-p_t)^\gamma \log(p_t)$, to optimize the classification branch; it handles problems such as the extreme foreground/background imbalance, but it only supports discrete 0/1 category labels.
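For reference, a scalar sketch of the original Focal Loss for a binary label (function name mine):

```python
import math

def focal_loss(p, y, gamma=2.0):
    """Focal Loss for a discrete label y in {0, 1};
    p is the predicted probability of the positive class, in (0, 1)."""
    p_t = p if y == 1 else 1.0 - p   # probability assigned to the true class
    return -((1.0 - p_t) ** gamma) * math.log(p_t)
```

The $(1-p_t)^\gamma$ factor down-weights easy, well-classified samples so the many easy negatives do not dominate the loss.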

② Proposed QFL: Quality Focal Loss
In standard one-hot encoding the ground-truth position is 1 and all other positions are 0.
With the joint classification-IoU representation, the one-hot label is softened: the learning target becomes $y \in [0, 1]$ rather than the hard target 1.
Under this joint representation the label is a continuous value in [0, 1], so FL no longer applies directly:
- y = 0: negative sample, with quality score 0
- 0 < y ≤ 1: positive sample, whose localization-quality label y is its IoU score, lying in (0, 1]

To preserve Focal Loss's ability to balance hard/easy and positive/negative samples while supporting continuous-valued supervision, FL is extended in two places:
- the cross-entropy term $-\log(p_t)$ is extended to $-((1-y)\log(1-\sigma) + y\log(\sigma))$
- the modulation factor $(1-p_t)^\gamma$ is extended to $|y-\sigma|^\beta$ ($\beta \ge 0$)
Quality Focal Loss (QFL) is therefore:

$QFL(\sigma) = -|y-\sigma|^\beta \left((1-y)\log(1-\sigma) + y\log(\sigma)\right)$

- $\sigma = y$ is the global minimum of QFL
- Figure 5a shows the effect of different $\beta$ (with y = 0.5)
- $|y-\sigma|^\beta$ is the modulation factor: when a sample's quality estimate is inaccurate the factor is large, so the network focuses on this hard sample; as the estimate becomes accurate, i.e. $\sigma \to y$, the factor tends to 0 and the sample's weight in the loss shrinks. $\beta$ controls how fast the weight decays; $\beta = 2$ works best in this paper.
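A scalar sketch of QFL under these definitions (function name mine; $\sigma$ must lie strictly inside (0, 1)):

```python
import math

def quality_focal_loss(sigma, y, beta=2.0):
    """Quality Focal Loss for a continuous label y in [0, 1].
    sigma: predicted sigmoid score, strictly inside (0, 1)."""
    ce = -((1.0 - y) * math.log(1.0 - sigma) + y * math.log(sigma))
    return (abs(y - sigma) ** beta) * ce
```

With y = 1 and $\beta = \gamma$ this reduces exactly to the original FL, and the loss vanishes when $\sigma$ hits the label.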

③ Proposed DFL: Distribution Focal Loss
Localization in this paper regresses relative offsets. Previous work is generally guided by a Dirac delta distribution $\delta(x-y)$, which satisfies $\int_{-\infty}^{+\infty}\delta(x-y)\,dx = 1$ and is usually realized with a fully connected layer.
Considering the diversity of real distributions, this paper instead represents the location with a more general distribution.
Since the true distribution is usually not far from the annotated location, an extra loss, DFL, is added:

$DFL(S_i, S_{i+1}) = -\left((y_{i+1}-y)\log(S_i) + (y-y_i)\log(S_{i+1})\right)$

- DFL makes the network focus faster on values near the target y, increasing their probability
- it optimizes, in cross-entropy form, the probabilities of the two positions closest to the continuous label y (its left and right neighbours $y_i$ and $y_{i+1}$), so the network quickly concentrates on the neighbourhood of the target location
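A sketch of DFL assuming integer bin positions $y_i = i$ with unit spacing (the indexing convention is mine; all probabilities must be positive):

```python
import math

def distribution_focal_loss(probs, y):
    """Distribution Focal Loss over discrete positions 0, 1, ..., n.
    probs: softmax probabilities (all > 0); y: continuous target in [0, n).
    Pushes probability mass onto the two bins bracketing y."""
    i = int(y)                 # left neighbour y_i = i
    w_left = (i + 1) - y       # weight grows as y gets closer to that side
    w_right = y - i
    return -(w_left * math.log(probs[i]) + w_right * math.log(probs[i + 1]))
```

Concentrating mass on the two bins around y lowers the loss, which is exactly the "sharp near the target" behaviour the paper wants.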
QFL and DFL can be unified as GFL:

$GFL(p_{y_l}, p_{y_r}) = -\left|y-(y_l p_{y_l}+y_r p_{y_r})\right|^\beta \left((y_r-y)\log(p_{y_l}) + (y-y_l)\log(p_{y_r})\right)$

- the variables are $y_l$ and $y_r$
- their predicted probabilities are $p_{y_l}$ and $p_{y_r}$, with $p_{y_l} + p_{y_r} = 1$
- the final prediction is $\hat{y} = y_l p_{y_l} + y_r p_{y_r}$, with $y_l \le \hat{y} \le y_r$
The overall training loss combines QFL (on all samples) with DFL and a GIoU box loss (on positive samples only), normalized by the number of positives.
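A scalar sketch of the unified GFL for one pair of adjacent positions (names mine; keeping $p_l + p_r = 1$ is the caller's responsibility):

```python
import math

def generalized_focal_loss(p_l, p_r, y_l, y_r, y, beta=2.0):
    """Generalized Focal Loss for two adjacent positions y_l <= y <= y_r
    with predicted probabilities p_l, p_r (p_l + p_r = 1, both > 0)."""
    y_hat = y_l * p_l + y_r * p_r   # predicted value (expectation)
    ce = -((y_r - y) * math.log(p_l) + (y - y_l) * math.log(p_r))
    return (abs(y - y_hat) ** beta) * ce
```

When the expectation $\hat{y}$ matches the label y exactly, the modulation factor, and hence the loss, is 0: the global minimum.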
3. Results

