Learn CV from Scratch, Part 2: Loss Functions (2)
2022-07-08 02:19:00 【pogg_】

Note: most of the content of this blog is not original; I have organized material I collected earlier and combined it with my own rough explanations for easy review. The references are listed at the end, and I have liked and bookmarked all of them~
Preface: this continues Learn CV from Scratch, Part 2: Loss Functions (1).
2.2.3 IoU Loss (Intersection over Union)
This method was proposed by Megvii (Kuangshi) and published at ACM MM 2016. Paper link:
https://arxiv.org/pdf/1608.01471.pdf
Shortcomings of building boxes by regressing the four coordinate points independently:
(1) Detection is evaluated with IoU, but the box is regressed through four coordinates; the two are not equivalent. As shown in the figure below, boxes with the same L1 or L2 loss can have different IoU values.
(2) Regressing the box through four points assumes the four coordinates are independent of each other and ignores their correlation; in reality the four coordinates are correlated.
(3) Losses based on L1 and L2 distance are not scale-invariant.
Based on this, the paper proposes IoU Loss, which regresses the box formed by the four points as a whole:
The red dot in the figure above denotes a point (i, j) on the Head of the detection network, the green box denotes the ground-truth box, and the blue box denotes the predicted box. IoU loss is defined as above: first compute the IoU of the two boxes, then take -ln(IoU). In practice it is often defined directly as IoU Loss = 1 - IoU.

Image explanation:
IoU = green area / (blue area + green area + orange area)
IoU Loss can then be simply expressed as:
$L_{IoU} = 1 - IoU$ or $L_{IoU} = -\ln(IoU)$
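As a quick illustration, both forms can be sketched in plain Python. This is a minimal sketch under my own conventions: boxes are assumed to be (x1, y1, x2, y2) corner tuples, and the function names are illustrative, not from the paper.

```python
import math

def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Intersection width/height, clamped to 0 when the boxes do not overlap
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def iou_loss(pred, gt):
    return 1.0 - iou(pred, gt)          # L_IoU = 1 - IoU

def iou_loss_ln(pred, gt):
    return -math.log(iou(pred, gt))     # L_IoU = -ln(IoU); undefined when IoU = 0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # intersection 1, union 7 -> 1/7
```

Note that the -ln(IoU) form blows up as IoU approaches 0, which is one reason the 1 - IoU form is common in practice.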
2.2.4 GIoU Loss (Generalized Intersection over Union)
This method was proposed by researchers at Stanford and published at CVPR 2019. Paper link:
https://arxiv.org/pdf/1902.09630.pdf
Shortcomings of IoU Loss:
(1) When the predicted box and the target box do not intersect, IoU(A, B) = 0, which cannot reflect the distance between A and B; the loss function is not differentiable there, so IoU Loss cannot optimize the case where the two boxes do not intersect.
(2) With the sizes of the predicted box and the target box fixed, any configuration with the same intersection area gives the same IoU value; IoU therefore cannot reflect how the two boxes intersect.
As shown in the figure above, three boxes with different relative positions all have IoU = 0.33 but different GIoU values of 0.33, 0.24 and -0.1; the better the boxes are aligned, the higher the GIoU value.
The red box is the smallest enclosing rectangle of A and B.
GIoU is computed as follows, where C is the smallest enclosing rectangle of A and B: subtract the union of A and B from C, divide by the area of C, then subtract that ratio from the IoU of A and B to obtain the GIoU value.
Image explanation:
On top of IoU, find a smallest enclosing rectangle C that just contains both boxes, and let C_ denote the extra area of C not covered by the two boxes.
From the figure above: $GIoU = IoU - C_{-}/C$
GIoU Loss can then be simply expressed as:
$L_{GIoU} = 1 - GIoU$
When the two boxes do not intersect, GIoU still changes with the distance between the two boxes, so this is reflected in the loss and can guide the direction of the predicted box.
Properties of GIoU:
- Like IoU, GIoU can be used as a distance measure, with $L_{GIoU} = 1 - GIoU$
- GIoU is scale-invariant
- For any A, B, $GIoU(A,B) \leq IoU(A,B)$ and $0 \leq IoU(A,B) \leq 1$, so $-1 \leq GIoU(A,B) \leq 1$. When $A \rightarrow B$, both equal 1. When A and B do not intersect and the distance between them grows without bound, $GIoU(A,B) \rightarrow -1$.
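A minimal Python sketch of this computation, using the same illustrative (x1, y1, x2, y2) box convention as before (function names are my own):

```python
def giou(box_a, box_b):
    """GIoU(A, B) = IoU(A, B) - |C minus (A union B)| / |C|, with C the smallest enclosing box."""
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    # Smallest enclosing rectangle C of A and B
    area_c = (max(box_a[2], box_b[2]) - min(box_a[0], box_b[0])) * \
             (max(box_a[3], box_b[3]) - min(box_a[1], box_b[1]))
    return inter / union - (area_c - union) / area_c

def giou_loss(pred, gt):
    return 1.0 - giou(pred, gt)

# Unlike IoU, GIoU still responds to distance when the boxes do not overlap:
print(giou((0, 0, 1, 1), (2, 0, 3, 1)))  # IoU = 0, GIoU = -1/3
print(giou((0, 0, 1, 1), (4, 0, 5, 1)))  # farther apart -> GIoU = -3/5
```

The two prints show the property discussed above: both pairs have IoU = 0, but the farther pair gets a lower GIoU, so the loss can still guide the predicted box toward the target.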
Shortcomings of GIoU Loss:
When the target box completely encloses the predicted box, IoU and GIoU take the same value; GIoU degenerates to IoU and cannot distinguish the relative positions of the boxes. For this case the authors propose DIoU, which adds a normalized center-point distance and can therefore optimize such cases better.
Inspiration:
Based on the problems of IoU and GIoU, the authors raise two questions:
- First: is it feasible to directly minimize the normalized distance between the predicted box and the target box, so as to achieve faster convergence?
- Second: when the predicted box overlaps or even contains the target box, how can the regression be made more accurate and faster?
A good bounding-box regression loss should consider three important geometric factors: overlap area, center-point distance, and aspect ratio. For the first question, the authors propose DIoU Loss, which converges faster than GIoU Loss; it considers the overlap area and the center-point distance but not the aspect ratio. For the second question, the authors propose CIoU Loss, which converges with higher accuracy and takes all three factors into account.
2.2.5 DIoU Loss (Distance-IoU Loss)
This paper was published at AAAI 2020. Paper link:
https://arxiv.org/pdf/1911.08287.pdf
An IoU-based loss can generally be defined as $L = 1 - IoU + R(B, B^{gt})$, where $R(B, B^{gt})$ is a penalty term on the predicted box $B$ and the target box $B^{gt}$.
The DIoU penalty term is $R_{DIoU} = \frac{\rho^{2}(b, b^{gt})}{c^{2}}$, where $b$ and $b^{gt}$ are the centers of $B$ and $B^{gt}$, $\rho(\cdot)$ denotes Euclidean distance, and $c$ is the diagonal length of the smallest enclosing rectangle of $B$ and $B^{gt}$, as shown in the figure below. DIoU can also replace IoU in the NMS algorithm: the paper proposes DIoU-NMS, and the experiments show a measurable improvement.
The DIoU loss function is defined as: $L_{DIoU} = 1 - IoU + \frac{\rho^{2}(b, b^{gt})}{c^{2}}$

In the figure above, the green box is the target box, the black box is the predicted box, and the gray box is the smallest enclosing rectangle of both; d is the distance between the centers of the predicted box and the target box, and c is the diagonal length of the smallest enclosing rectangle.
Properties of DIoU:
- Scale invariance. When the two boxes completely coincide, $L_{IoU} = L_{GIoU} = L_{DIoU} = 0$; when the two boxes are far apart without intersecting, $L_{DIoU} \rightarrow 2$
- DIoU Loss directly optimizes the distance between the two boxes and converges faster than GIoU Loss
- When the target box encloses the predicted box, DIoU Loss still converges quickly, while GIoU Loss degenerates to IoU Loss there and converges slowly
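The DIoU-NMS idea mentioned above, i.e. using DIoU instead of plain IoU as the suppression criterion in greedy NMS, can be sketched like this (my own illustrative implementation under the same box convention, not the paper's code):

```python
def diou(a, b):
    """DIoU metric = IoU - rho^2 / c^2, used here as the suppression criterion."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    rho2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 + \
           ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    return inter / union - rho2 / (cw ** 2 + ch ** 2)

def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes whose DIoU with it exceeds the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep

print(diou_nms([(0, 0, 2, 2), (0.1, 0, 2.1, 2), (5, 5, 6, 6)], [0.9, 0.8, 0.7]))  # [0, 2]
```

Because DIoU subtracts the normalized center distance, a neighboring box whose center is far from the kept box is penalized less and is more likely to survive, which helps in crowded scenes.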
2.2.6 CIoU Loss (Complete-IoU Loss)
The CIoU penalty adds an influence factor $\alpha\upsilon$ to the DIoU penalty term; this factor accounts for fitting the aspect ratio of the predicted box to the aspect ratio of the target box. The CIoU loss function is defined as:
$L_{CIoU} = 1 - IoU + \frac{\rho^{2}(b, b^{gt})}{c^{2}} + \alpha\upsilon$
where $\alpha$ is a trade-off parameter:
$\alpha = \frac{\upsilon}{(1 - IoU) + \upsilon}$
From $\alpha$ it can be seen that when IoU is less than 0.5, CIoU reduces to DIoU, and the larger the IoU, the closer $\alpha$ is to 1.
$\upsilon$ is a parameter that measures the consistency of the aspect ratios, defined as:
$\upsilon = \frac{4}{\pi^{2}}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^{2}$
So when IoU is large and the center points coincide, $\frac{\rho^{2}(b, b^{gt})}{c^{2}}$ becomes 0 and it is time to adjust the aspect ratio. At that point the DIoU loss and its gradient become small (only the IoU term still propagates a gradient), while CIoU keeps a loss gradient through the last term, so the detector can quickly adjust itself to the same aspect ratio as the GT box.
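Putting the pieces together, a CIoU loss sketch (illustrative Python under the same box convention; switching $\alpha$ off below IoU = 0.5 follows the description above):

```python
import math

def ciou_loss(pred, gt):
    """L_CIoU = 1 - IoU + rho^2/c^2 + alpha * v."""
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    iou = inter / (w * h + wg * hg - inter)
    rho2 = ((pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 + \
           ((pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2
    # v measures aspect-ratio inconsistency; alpha gives it weight only once IoU >= 0.5
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v) if iou >= 0.5 and v > 0 else 0.0
    return 1.0 - iou + rho2 / c2 + alpha * v

# Same center and aspect ratio, different size: only the IoU term is non-zero
print(ciou_loss((0, 0, 2, 2), (0.5, 0.5, 1.5, 1.5)))  # 1 - 0.25 = 0.75
```

The `v > 0` guard avoids a 0/0 when the boxes already have identical aspect ratios; gradient-based implementations typically add a small epsilon instead.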
A comparison figure illustrates this:
The first row is GIoU and the second row is CIoU; the green box at the origin is the GT box, the black box is the anchor box, and the red box is the predicted box. When the predicted box and the GT box do not intersect (i.e., IoU = 0), both GIoU and CIoU can guide the movement of the detection box. Here GIoU adjusts the predicted box in position, aspect ratio and size all at once, while CIoU first pulls the box back quickly (without changing its shape much), so CIoU makes IoU > 0 faster than GIoU. Once IoU > 0, CIoU rapidly adjusts the size scale; once IoU > 0.5, the aspect-ratio term becomes the main part of the gradient propagation, making the predicted box take on the same aspect ratio as the GT box.
This is a simulation comparison made by the authors in the paper; it can be seen that CIoU Loss performs best.
References
[1] https://mp.weixin.qq.com/s/ZbryNlV3EnODofKs2d01RA
[2] https://blog.csdn.net/leviopku/article/details/114655338?spm=1001.2014.3001.5501
[3] https://blog.csdn.net/u013069552/article/details/113804323?utm_source=app&app_version=4.9.0&code=app_1562916241&uLinkId=usr1mkqgl919blen