Some thoughts on, and a code implementation of, SIoU Loss: More Powerful Learning for Bounding Box Regression (Zhora Gevorgyan)
2022-07-07 11:32:00 【Optimistic, medium】
Recently many WeChat official accounts have been pushing this paper, but I ran into some problems while reading it. Because the code has not been open-sourced, my understanding may not be correct, so I am recording my thoughts first; once the code is released I can compare against it and understand the paper more deeply. I also hope that any experts who see this post can give me some advice on my immature views.
The final loss function used in the experiments is calculated as follows:
$L = W_{box} L_{box} + W_{cls} L_{cls}$
where $L_{cls}$ is focal loss, the weights $W_{box}$ and $W_{cls}$ are obtained with a genetic algorithm, and $L_{box}$ is the SIoU loss proposed in the paper, calculated as follows:
$L_{box} = 1 - IoU + \frac{\Delta + \Omega}{2}$
It mainly involves four parts: angle loss ($\Lambda$), distance loss ($\Delta$), shape loss ($\Omega$) and IoU loss.
1. Angle loss 
Here the author argues that the angle can be taken into account: first bring the prediction box onto the same horizontal or vertical line as the ground-truth box. I agree with this, since it can accelerate convergence. The author measures this loss with the following formula:
$\Lambda = 1 - 2\sin^2\left(\arcsin(x) - \frac{\pi}{4}\right)$
where $x = \sin(\alpha)$ is the vertical distance between the two box centres divided by the distance $\sigma$ between them.
The formula consists of two parts. The first part is $1 - 2\sin^2(x)$, which is simply $\cos(2x)$. On the range that matters here, $|x| \le \pi/4$, it takes its minimum, 0, at $|x| = \pi/4$ and its maximum, 1, at $x = 0$. The second part is the argument $\arcsin(x) - \pi/4$, where $\arcsin(x)$ is the angle $\alpha$. Subtracting $\pi/4$ lets the prediction box move toward whichever axis makes the smaller angle: since $\beta = \pi/2 - \alpha$, the quantities $\alpha - \pi/4$ and $\beta - \pi/4$ are negatives of each other, and the cosine gives the same value for both. The loss is therefore smallest (0) when $\alpha$ is 0 and largest (1) when $\alpha$ is $\pi/4$. The net effect is that the prediction box moves more quickly onto the horizontal or vertical line on which the ground-truth box lies.
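The behaviour described above can be checked numerically. Below is a minimal sketch, assuming the angle loss has the form $1 - 2\sin^2(\arcsin(x) - \pi/4)$ with $x = \sin(\alpha)$ as discussed; the function name `angle_cost` and its centre-point arguments are my own illustrative choices, not something taken from the paper.

```python
import math

def angle_cost(pred_center, gt_center):
    """Angle loss 1 - 2*sin^2(arcsin(x) - pi/4), where x = sin(alpha).

    Illustrative helper (hypothetical name): centres are (x, y) tuples.
    """
    dx = abs(gt_center[0] - pred_center[0])
    dy = abs(gt_center[1] - pred_center[1])
    sigma = math.hypot(dx, dy)          # distance between the two centres
    if sigma == 0:
        return 0.0                      # coincident centres: no angle to correct
    x = min(dy / sigma, 1.0)            # sin(alpha), clamped against rounding
    return 1 - 2 * math.sin(math.asin(x) - math.pi / 4) ** 2

# Horizontal, diagonal and vertical offsets, plus a mirrored pair (alpha vs. beta)
for offset in [(1, 0), (1, 1), (0, 1), (2, 1), (1, 2)]:
    print(offset, round(angle_cost((0, 0), offset), 4))
```

The mirrored offsets (2, 1) and (1, 2) correspond to $\alpha$ and $\beta = \pi/2 - \alpha$, so they should print the same value, which is the symmetry argued for above.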
2. Distance loss 
The distance loss is $\Delta = \sum_{t=x,y}\left(1 - e^{-\gamma\rho_t}\right)$, where $\rho_x$ and $\rho_y$ are the squared centre offsets in x and y divided by $c_w$ and $c_h$ respectively, and $\gamma = 2 - \Lambda$.
(1) On the calculation of $\rho_x$ and $\rho_y$: at first I wondered whether these would always evaluate to 1, but then I saw from Figure 3 of the paper that $c_w$ and $c_h$ here are the width and height of the smallest enclosing box.
(2) On the calculation of $\gamma$: my understanding is that the range of $\Lambda$ is [0, 1]. Using $2 - \Lambda$ first prevents $\gamma$ from becoming 0, in which case $\rho_t$ would have no effect, and second makes a change in $\rho_t$ affect the loss more strongly the smaller $\Lambda$ is.
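The two points above can be illustrated with a small sketch. It assumes the distance loss is $\sum_{t=x,y}(1 - e^{-\gamma\rho_t})$ with $\gamma = 2 - \Lambda$, which is also what the reproduction code at the end of this post uses; the function name `distance_cost` and its arguments are hypothetical.

```python
import math

def distance_cost(pred_center, gt_center, cw, ch, angle_cost):
    """Distance loss: sum over x and y of (1 - exp(-gamma * rho_t)).

    cw, ch: width and height of the smallest enclosing box.
    angle_cost: the angle loss Lambda, in [0, 1].
    """
    rho_x = ((gt_center[0] - pred_center[0]) / cw) ** 2
    rho_y = ((gt_center[1] - pred_center[1]) / ch) ** 2
    gamma = 2 - angle_cost              # stays within [1, 2], never 0
    return (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

# Prediction (0, 0, 2, 2) and ground truth (1, 1, 3, 3): centres (1, 1) and (2, 2),
# enclosing box 3 x 3, so rho_x = rho_y = (1 / 3) ** 2 rather than 1.
for lam in (0.0, 0.5, 1.0):
    print(lam, round(distance_cost((1, 1), (2, 2), 3, 3, lam), 4))
```

For the same centre offset, a smaller $\Lambda$ gives a larger $\gamma$ and therefore a larger distance loss, which matches point (2).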
3. Shape loss 
The shape loss is $\Omega = \sum_{t=w,h}\left(1 - e^{-\omega_t}\right)^{\theta}$, where $\omega_w$ is the absolute difference between the two box widths divided by the larger width, and $\omega_h$ is defined the same way for the heights.
As in EIoU, this takes into account the actual width and height mismatch between the prediction box and the ground-truth box. The difference lies in how $\omega_w$ and $\omega_h$ are calculated: EIoU also needs the width and height of the smallest box enclosing both boxes, whereas here only the widths and heights of the ground-truth box and the prediction box are used. That means less computation and should be faster, but the actual effect is not clear. In addition, $\omega_t$ already lies in [0, 1), so I do not think the extra $1 - e^{-\omega_t}$ transform is necessary; using $\omega_t$ directly might even work better. Finally, I do not really understand the introduction of $\theta$:
(1) First, I do not understand why introducing this factor would make things better.
(2) Second, the distance loss could take such a factor as well, so why is it not introduced there?
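To get a feel for what $\theta$ does, here is a small sketch. It assumes the shape loss is $\sum_{t=w,h}(1 - e^{-\omega_t})^{\theta}$ with $\omega_t$ the relative width or height difference, as in the reproduction code at the end of this post; `shape_cost` is a hypothetical helper name.

```python
import math

def shape_cost(w, h, w_gt, h_gt, theta=4.0):
    """Shape loss: sum over w and h of (1 - exp(-omega_t)) ** theta."""
    omega_w = abs(w - w_gt) / max(w, w_gt)   # relative width mismatch, in [0, 1)
    omega_h = abs(h - h_gt) / max(h, h_gt)   # relative height mismatch, in [0, 1)
    return (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

# Same boxes (width 2 vs. 4, equal heights), different exponents
for theta in (1, 2, 4, 6):
    print(theta, round(shape_cost(2, 1, 4, 1, theta), 5))
```

Because $1 - e^{-\omega_t}$ is always below 1, a larger $\theta$ shrinks the penalty, and it shrinks small mismatches much more than large ones. This is only an observation about the formula, not an answer to whether the factor actually helps training.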
4. IoU loss
As in GIoU, this is calculated as $1 - IoU$.
The weights and the parameter $\theta$ in the paper are obtained with a genetic algorithm on the dataset. I do not know how much this part contributes. Because the code is not open source and I still have doubts about some of the calculations, I cannot verify the real effect of each improvement.
Below is my simple reproduction of SIoU. If there are any mistakes, please point them out.
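import math
import torch

def siou(box1, box2, theta=4, eps=1e-7):
    # The original post shows only the part from the centre coordinates onward.
    # The setup below (xyxy boxes, plain IoU, enclosing box) is an assumed reconstruction.
    b1_x1, b1_y1, b1_x2, b1_y2 = box1.unbind(-1)  # prediction box
    b2_x1, b2_y1, b2_x2, b2_y2 = box2.unbind(-1)  # ground-truth box
    w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1
    w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1
    inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \
            (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0)
    iou = inter / (w1 * h1 + w2 * h2 - inter + eps)
    # cw, ch: width and height of the smallest enclosing box
    cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1) + eps
    ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1) + eps
    # (x1, y1) and (x2, y2) are the centre coordinates of the prediction box and the ground-truth box
    x1 = (b1_x1 + b1_x2) / 2
    x2 = (b2_x1 + b2_x2) / 2
    y1 = (b1_y1 + b1_y2) / 2
    y2 = (b2_y1 + b2_y2) / 2
    x_dis = torch.abs(x1 - x2)
    y_dis = torch.abs(y1 - y2)
    sigma = torch.sqrt(x_dis ** 2 + y_dis ** 2) + eps
    alpha = y_dis / sigma  # this is sin(alpha), not the angle itself
    beta = x_dis / sigma   # this is sin(beta), with beta = pi/2 - alpha
    threshold = math.sqrt(2) / 2  # sin(pi/4)
    # Work with the smaller of the two angles
    sin_alpha = torch.where(alpha > threshold, beta, alpha)
    # 1 - 2 * sin(x) ** 2 is equivalent to cos(2x)
    angle_cost = torch.cos(torch.arcsin(sin_alpha) * 2 - math.pi / 2)
    rho_x = (x_dis / cw) ** 2
    rho_y = (y_dis / ch) ** 2
    gamma = 2 - angle_cost
    distance_cost = 2 - torch.exp(-gamma * rho_x) - torch.exp(-gamma * rho_y)
    omega_w = torch.abs(w1 - w2) / (torch.max(w1, w2) + eps)
    omega_h = torch.abs(h1 - h2) / (torch.max(h1, h2) + eps)
    # In the original paper theta is around 4, within the range 2 to 6
    shape_cost = torch.pow(1 - torch.exp(-omega_w), theta) + torch.pow(1 - torch.exp(-omega_h), theta)
    # Returns IoU minus the penalty terms; the box loss is 1 minus this value
    return iou - 0.5 * (distance_cost + shape_cost)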
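Since there is no official code to compare against, one way to sanity-check the tensor code is a plain scalar version that can be evaluated by hand on a few boxes. The sketch below is my own cross-check under the same assumptions as the reproduction above (boxes given as `(x1, y1, x2, y2)`, loss equal to $1 - IoU$ plus half the sum of distance and shape losses); `siou_loss_scalar` is a hypothetical name. Note that it returns the loss itself, whereas the fragment above returns IoU minus the penalties, so the loss is one minus that value.

```python
import math

def siou_loss_scalar(pred, gt, theta=4.0):
    """Scalar SIoU box loss for two (x1, y1, x2, y2) boxes with positive area."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    w, h, w_gt, h_gt = px2 - px1, py2 - py1, gx2 - gx1, gy2 - gy1

    # Plain IoU
    inter = max(0.0, min(px2, gx2) - max(px1, gx1)) * max(0.0, min(py2, gy2) - max(py1, gy1))
    iou = inter / (w * h + w_gt * h_gt - inter)

    # Centre offsets and the smallest enclosing box
    dx = abs((gx1 + gx2) / 2 - (px1 + px2) / 2)
    dy = abs((gy1 + gy2) / 2 - (py1 + py2) / 2)
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)

    # Angle loss, symmetric in alpha and pi/2 - alpha
    sigma = math.hypot(dx, dy)
    sin_alpha = min(dy / sigma, 1.0) if sigma > 0 else 0.0
    angle = 1 - 2 * math.sin(math.asin(sin_alpha) - math.pi / 4) ** 2

    # Distance loss
    gamma = 2 - angle
    dist = 2 - math.exp(-gamma * (dx / cw) ** 2) - math.exp(-gamma * (dy / ch) ** 2)

    # Shape loss
    omega_w = abs(w - w_gt) / max(w, w_gt)
    omega_h = abs(h - h_gt) / max(h, h_gt)
    shape = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1 - iou + (dist + shape) / 2

# Identical boxes, a partly overlapping box, and a box that only touches the edge
for gt in [(0, 0, 2, 2), (1, 0, 3, 2), (2, 0, 4, 2)]:
    print(gt, round(siou_loss_scalar((0, 0, 2, 2), gt), 4))
```

The loss should be zero for identical boxes and grow as the ground-truth box moves away; if the tensor version disagrees with this on the same boxes (beyond the small `eps` terms), one of the two has a mistake.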