Learning CV from Scratch, Part 2: Loss Functions (2)
2022-07-08 02:19:00 【pogg_】
Note: Most of the content of this post is not original; it is material I collected earlier, organized and combined with my own rough explanations for easy review. All references are cited at the end.
Preface: This continues from "Learning CV from Scratch, Part 2: Loss Functions (1)".
2.2.3 IoU Loss (Intersection over Union)
This method was proposed by Megvii and published at ACM MM 2016. Paper link:
https://arxiv.org/pdf/1608.01471.pdf
Shortcomings of building boxes by regressing the 4 coordinate points independently:
(1) Detection is evaluated with IoU, but the box is actually regressed with 4 coordinates; as shown in the figure below, the two are not equivalent. Boxes with the same L1 or L2 loss can have different IoU values.
(2) Regressing the box through 4 points assumes the 4 coordinates are independent of each other and ignores their correlation, while in reality the 4 coordinates are correlated.
(3) Losses based on L1 and L2 distance are not scale invariant.
Based on this, the paper proposes IoU Loss, which regresses the box formed by the 4 points as a whole:
The red dot in the figure above is a point (i, j) on the Head of the detection network, the green box is the ground-truth box, and the blue box is the predicted box. IoU loss is defined as in the figure above: first compute the IoU of the two boxes, then take -ln(IoU); in practice it is often defined directly as IoU Loss = 1 - IoU.
Explanation of the figure:
IoU = green area / (blue area + green area + orange area)
and IoU Loss can be simply expressed as:
$L_{IoU} = 1 - IoU$ or $L_{IoU} = -\ln(IoU)$
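To make this concrete, here is a minimal plain-Python sketch of IoU and the two loss forms. The `(x1, y1, x2, y2)` box layout and the function names are my own choices for illustration, not from the paper:

```python
import math

def iou(box_a, box_b):
    """Compute the IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def iou_loss(pred, target, eps=1e-7):
    """Return (-ln(IoU), 1 - IoU): the paper's form and the common simplified form."""
    i = iou(pred, target)
    return -math.log(i + eps), 1.0 - i

# Example with made-up coordinates: a prediction shifted away from the target
print(iou_loss((10, 10, 50, 50), (20, 20, 60, 60)))  # roughly (0.94, 0.61)
```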
2.2.4 GIoU Loss (Generalized Intersection over Union)
This method was proposed by researchers at Stanford and published at CVPR 2019. Paper link:
https://arxiv.org/pdf/1902.09630.pdf
Shortcomings of IoU Loss:
(1) When the predicted box and the target box do not intersect, IoU(A, B) = 0 and cannot reflect the distance between A and B; the loss function provides no useful gradient in this case, so IoU Loss cannot be optimized when the two boxes do not intersect.
(2) When the sizes of the predicted box and the target box are fixed, any configuration with the same intersection area yields the same IoU; the IoU value therefore does not reflect how the two boxes intersect.
As shown in the figure above, three pairs of boxes with different relative positions all have IoU = 0.33, but their GIoU values differ (0.33, 0.24, -0.1). The better the boxes are aligned, the higher the GIoU value.
The red box is the smallest enclosing rectangle of A and B.
GIoU is computed as follows, where C is the smallest enclosing rectangle of A and B: subtract the union of A and B from C and divide by the area of C, then subtract this value from the IoU of boxes A and B to obtain GIoU.
Explanation of the figure:
On top of IoU, find a smallest enclosing rectangle C that can just contain both bounding boxes; the extra area of C not covered by the two boxes is denoted $C_{-}$.
From the figure above: $GIoU = IoU - C_{-}/C$
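As a minimal sketch of this computation (same assumed `(x1, y1, x2, y2)` box layout as above; `area_c - union` plays the role of $C_{-}$):

```python
def giou(box_a, box_b):
    """GIoU = IoU - (area of C not covered by A and B) / area of C,
    where C is the smallest box enclosing A and B."""
    # Intersection and union of the two boxes
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    # Smallest enclosing box C
    cx1, cy1 = min(box_a[0], box_b[0]), min(box_a[1], box_b[1])
    cx2, cy2 = max(box_a[2], box_b[2]), max(box_a[3], box_b[3])
    area_c = (cx2 - cx1) * (cy2 - cy1)
    return inter / union - (area_c - union) / area_c

# Two non-overlapping pairs: IoU is 0 for both, but GIoU still reflects distance
print(giou((0, 0, 10, 10), (12, 0, 22, 10)))    # close boxes -> about -0.09
print(giou((0, 0, 10, 10), (100, 0, 110, 10)))  # far boxes   -> about -0.82
```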
GIoU Loss can then be simply expressed as $L_{GIoU} = 1 - GIoU$. When the two bounding boxes do not intersect, GIoU still **changes with the distance between the two boxes**, so the loss can still guide the predicted box in the right direction.
Properties of GIoU:
- Like IoU, GIoU can be used as a distance measure: $L_{GIoU} = 1 - GIoU$
- GIoU is scale invariant
- For $\forall A, B$, $GIoU(A,B) \leq IoU(A,B)$ and $0 \leq IoU(A,B) \leq 1$, so $-1 \leq GIoU(A,B) \leq 1$. When $A \rightarrow B$, both equal 1; when A and B do not intersect and move far apart, $GIoU(A,B) \rightarrow -1$.
Shortcomings of GIoU Loss
When the target box completely encloses the predicted box, IoU and GIoU take the same value and GIoU degenerates to IoU, so it cannot distinguish their relative positions. The authors therefore propose DIoU, which adds the normalized distance between center points and can optimize such cases better.
Motivation:
Based on the problems of IoU and GIoU, the authors raise two questions:
- First: is it feasible to directly minimize the normalized distance between the predicted box and the target box in order to achieve faster convergence?
- Second: how can regression be made more accurate and faster when the predicted box overlaps with, or even contains, the target box?
A good box regression loss should consider three important geometric factors: overlap area, center-point distance, and aspect ratio. For the first question, the authors propose DIoU Loss, which converges faster than GIoU Loss; it considers the overlap area and the center-point distance, but not the aspect ratio. For the second question, the authors propose CIoU Loss, which converges with higher accuracy and takes all three factors into account.
2.2.5 DIoU Loss (Distance-IoU Loss)
This paper was published at AAAI 2020. Paper link:
https://arxiv.org/pdf/1911.08287.pdf
An IoU-based loss can generally be defined as $L = 1 - IoU + R(B, B^{gt})$, where $R(B, B^{gt})$ is a penalty term on the predicted box $B$ and the target box $B^{gt}$.
The DIoU penalty term is $R_{DIoU} = \frac{\rho^{2}(b, b^{gt})}{c^{2}}$, where $b$ and $b^{gt}$ are the centers of $B$ and $B^{gt}$, $\rho(\cdot)$ denotes Euclidean distance, and $c$ is the diagonal length of the smallest enclosing rectangle of $B$ and $B^{gt}$, as shown in the figure below. DIoU can also replace IoU in the NMS algorithm; the paper proposes DIoU-NMS, and experiments show a certain improvement.
The DIoU loss function is defined as: $L_{DIoU} = 1 - IoU + \frac{\rho^{2}(b, b^{gt})}{c^{2}}$
In the figure above, the green box is the target box, the black box is the predicted box, and the gray box is the smallest enclosing rectangle of both; d is the distance between the center points of the predicted box and the ground-truth box, and c is the diagonal length of the smallest enclosing rectangle.
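A minimal sketch of this loss under the same assumed box layout (not the authors' implementation):

```python
def diou_loss(pred, target):
    """L_DIoU = 1 - IoU + rho^2(b, b_gt) / c^2 for boxes given as (x1, y1, x2, y2)."""
    # IoU term
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((pred[2] - pred[0]) * (pred[3] - pred[1])
             + (target[2] - target[0]) * (target[3] - target[1]) - inter)
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centers (rho^2)
    px, py = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    tx, ty = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    rho2 = (px - tx) ** 2 + (py - ty) ** 2
    # Squared diagonal length of the smallest enclosing box (c^2)
    cx1, cy1 = min(pred[0], target[0]), min(pred[1], target[1])
    cx2, cy2 = max(pred[2], target[2]), max(pred[3], target[3])
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    return 1.0 - iou + rho2 / c2
```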
Properties of DIoU:
- Scale invariance. When the two boxes completely coincide, $L_{IoU} = L_{GIoU} = L_{DIoU} = 0$; when the two boxes do not intersect and move far apart, $L_{DIoU} \rightarrow 2$.
- DIoU Loss directly optimizes the distance between the two boxes and converges faster than GIoU Loss.
- When the target box encloses the predicted box, DIoU Loss can still converge quickly, while GIoU Loss degenerates to IoU Loss and converges slowly (see the numeric check below).
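A quick numeric check of this enclosure case, reusing the `iou`, `giou`, and `diou_loss` sketches from above (coordinates are made up for illustration):

```python
target = (0, 0, 100, 100)      # ground-truth box
centered = (40, 40, 60, 60)    # prediction near the target's center
corner = (0, 0, 20, 20)        # same-size prediction stuck in a corner

for name, pred in [("centered", centered), ("corner", corner)]:
    print(name,
          "IoU loss:", round(1 - iou(pred, target), 3),
          "GIoU loss:", round(1 - giou(pred, target), 3),
          "DIoU loss:", round(diou_loss(pred, target), 3))
# Both predictions give identical IoU and GIoU losses (0.96), but the DIoU
# loss is larger for the corner prediction, so it can still guide regression.
```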
2.2.6 CIoU Loss (Complete-IoU Loss)
The CIoU penalty adds an influence factor $\alpha\upsilon$ to the DIoU penalty term; this factor accounts for how well the aspect ratio of the predicted box fits the aspect ratio of the target box. The CIoU loss function is defined as: $L_{CIoU} = 1 - IoU + \frac{\rho^{2}(b, b^{gt})}{c^{2}} + \alpha\upsilon$
where $\alpha$ is a trade-off parameter: $\alpha = 0$ when $IoU < 0.5$, and $\alpha = \frac{\upsilon}{(1-IoU)+\upsilon}$ when $IoU \geq 0.5$.
From the definition of $\alpha$ it can be seen that when IoU is less than 0.5, CIoU reduces to DIoU; the larger the IoU, the closer $\alpha$ is to 1.
$\upsilon$ is a parameter that measures the consistency of the aspect ratios, defined as: $\upsilon = \frac{4}{\pi^{2}}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^{2}$
Then, when the IoU is large, $\frac{\rho^{2}(b, b^{gt})}{c^{2}}$ goes to 0 (the center points coincide) and it is time to adjust the aspect ratio. At this point the DIoU loss and its gradient become small (only the IoU term still passes gradient), whereas CIoU can rely on the last term to keep the loss gradient alive, so the detector can quickly adjust itself to have the same aspect ratio as the GT box.
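A sketch of the full CIoU loss using the piecewise $\alpha$ described above (the original paper's $\alpha = \frac{\upsilon}{(1-IoU)+\upsilon}$ without the IoU < 0.5 cutoff is an equally common choice); the box layout and the epsilon handling are my own assumptions:

```python
import math

def ciou_loss(pred, target, eps=1e-7):
    """L_CIoU = 1 - IoU + rho^2/c^2 + alpha*v, boxes given as (x1, y1, x2, y2)."""
    # IoU term
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    wgt, hgt = target[2] - target[0], target[3] - target[1]
    union = w * h + wgt * hgt - inter
    iou = inter / union if union > 0 else 0.0
    # Center-distance term rho^2 / c^2 (same as DIoU)
    rho2 = (((pred[0] + pred[2]) - (target[0] + target[2])) ** 2
            + ((pred[1] + pred[3]) - (target[1] + target[3])) ** 2) / 4
    cx1, cy1 = min(pred[0], target[0]), min(pred[1], target[1])
    cx2, cy2 = max(pred[2], target[2]), max(pred[3], target[3])
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2 + eps
    # Aspect-ratio consistency v and trade-off factor alpha
    v = (4 / math.pi ** 2) * (math.atan(wgt / (hgt + eps)) - math.atan(w / (h + eps))) ** 2
    alpha = 0.0 if iou < 0.5 else v / ((1.0 - iou) + v + eps)
    return 1.0 - iou + rho2 / c2 + alpha * v
```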
A comparison chart illustrates this:
The first row is GIoU, the second row is CIoU; the green box at the origin is the GT box, the black box is the anchor box, and the red box is the predicted box. When the predicted box and the GT box do not intersect (i.e. IoU = 0), both GIoU and CIoU can guide the predicted box to move. However, GIoU adjusts the predicted box's position, aspect ratio, and size all at the same time, whereas CIoU first pulls the box back quickly (without changing its shape much), so CIoU makes IoU > 0 sooner than GIoU. Once IoU > 0, CIoU quickly adjusts the size/scale. Once IoU > 0.5, the aspect-ratio term becomes the main source of gradient and makes the predicted box take on the same aspect ratio as the GT box.
This is a simulation comparison from the paper; it shows that CIoU Loss performs best.
References
[1] https://mp.weixin.qq.com/s/ZbryNlV3EnODofKs2d01RA
[2] https://blog.csdn.net/leviopku/article/details/114655338?spm=1001.2014.3001.5501
[3] https://blog.csdn.net/u013069552/article/details/113804323?utm_source=app&app_version=4.9.0&code=app_1562916241&uLinkId=usr1mkqgl919blen