Some thoughts on, and a code implementation of, "SIoU Loss: More Powerful Learning for Bounding Box Regression" (Zhora Gevorgyan)
2022-07-07 11:32:00 【Optimistic, medium】
Recently many official accounts have been pushing this paper, but I ran into some problems while reading it. Because the code is not open source, my understanding may not be correct, so I am recording my thoughts here first; once the code is released, I can compare against it and understand the paper more deeply. I also hope that any experts who see this post can give me some advice on my immature views.
The final loss function used in the experiments is calculated as follows:
$$L = W_{box} L_{box} + W_{cls} L_{cls}$$
Here $L_{cls}$ is focal loss, the weights $W_{box}$ and $W_{cls}$ are computed with a genetic algorithm, and $L_{box}$ is the SIoU loss proposed in the paper, calculated as follows:
$$L_{box} = 1 - IoU + \frac{\Delta + \Omega}{2}$$
It involves four components: the angle loss $\Lambda$ (which enters through the distance loss), the distance loss $\Delta$, the shape loss $\Omega$, and the IoU loss.
1. Angle loss
Here the author argues that the angle can be taken into account: first drive the predicted box onto the same horizontal or vertical line as the ground-truth box. I agree with this; it can speed up convergence. The author measures this cost with the following formula:
$$\Lambda = 1 - 2\sin^2\left(\arcsin(x) - \frac{\pi}{4}\right)$$
The formula has two parts. The outer part has the form $1 - 2\sin^2(u)$, which is simply $\cos(2u)$; on the range that matters here, $|u| \le \pi/4$, it takes its minimum, 0, at $|u| = \pi/4$ and its maximum, 1, at $u = 0$. The inner part is $u = \arcsin(x) - \pi/4$, where $\arcsin(x)$ is the angle $\alpha$. Subtracting $\pi/4$ accounts for moving the predicted box toward whichever side has the smaller angle: since $\beta = \pi/2 - \alpha$, the quantities $\alpha - \pi/4$ and $\beta - \pi/4$ are opposite numbers, so the cosine gives the same value for both. The loss is smallest when $\alpha$ is 0 and largest when $\alpha$ is $\pi/4$. As a result, the predicted box moves more quickly toward the horizontal or vertical line on which the ground-truth box lies.
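The two claims in this paragraph (the cost is 0 at $\alpha = 0$ and 1 at $\alpha = \pi/4$, and $\alpha$ and $\beta = \pi/2 - \alpha$ give the same value) can be checked numerically. The sketch below is only an illustration; the helper name `angle_cost` is an assumption, and it takes $x = \sin\alpha$ as input. It also shows that the formula collapses to $\sin(2\alpha)$.

```python
import math

def angle_cost(x: float) -> float:
    """Angle cost 1 - 2*sin^2(arcsin(x) - pi/4), where x = sin(alpha).

    Hypothetical helper name, used only for this check.
    """
    return 1.0 - 2.0 * math.sin(math.asin(x) - math.pi / 4) ** 2

for deg in (0, 15, 30, 45, 60, 75, 90):
    alpha = math.radians(deg)
    beta = math.pi / 2 - alpha
    cost_alpha = angle_cost(math.sin(alpha))
    cost_beta = angle_cost(math.sin(beta))
    # The cost equals sin(2 * alpha) ...
    assert math.isclose(cost_alpha, math.sin(2 * alpha), abs_tol=1e-9)
    # ... and is the same whether alpha or beta is plugged in.
    assert math.isclose(cost_alpha, cost_beta, abs_tol=1e-9)
    print(f"alpha = {deg:2d} deg   angle cost = {cost_alpha:.3f}")
```

Because of this symmetry, choosing between $\alpha$ and $\beta$ does not change the value of the cost; it only changes which angle is being described.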
2. Distance loss
(1) Regarding the calculation of $\rho_x$ and $\rho_y$: at first I wondered whether they would always be identically 1. Then I found that Figure 3 of the paper gives the diagram: here $c_w$ and $c_h$ are the width and height of the smallest enclosing box.
(2) Regarding the calculation of $\gamma$: my understanding is that $\Lambda$ lies in $[0,1]$. Using $2 - \Lambda$ first prevents $\gamma$ from being 0, in which case $\rho_t$ would have no effect, and second makes changes in $\rho_t$ affect the loss more strongly the smaller $\Lambda$ is.
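Both points can be made concrete with a small sketch. It assumes the distance cost has the form used in the reproduction code at the end of this post, $\Delta = \sum_{t=x,y} (1 - e^{-\gamma \rho_t})$ with $\gamma = 2 - \Lambda$; the function name and the example numbers are made up for illustration. The angle cost is passed in as a free parameter so that the effect of $\gamma$ can be seen in isolation.

```python
import math

def distance_cost(dx, dy, cw, ch, angle_cost):
    """Sum over x and y of 1 - exp(-gamma * rho_t), with gamma = 2 - angle_cost.

    dx, dy: absolute center offsets; cw, ch: width and height of the
    smallest enclosing box; angle_cost: Lambda in [0, 1].
    """
    gamma = 2.0 - angle_cost
    rho_x = (dx / cw) ** 2
    rho_y = (dy / ch) ** 2
    return (1.0 - math.exp(-gamma * rho_x)) + (1.0 - math.exp(-gamma * rho_y))

# Two 10x10 boxes whose centers are 5 apart horizontally:
# the smallest enclosing box is 15 wide and 10 high.
dx, dy, cw, ch = 5.0, 0.0, 15.0, 10.0
rho_x = (dx / cw) ** 2  # 1/9, not 1, because cw is the enclosing width
print(f"rho_x = {rho_x:.4f}")

# Same offsets with two hypothetical angle costs, to isolate the effect of gamma.
for lam in (0.0, 1.0):
    cost = distance_cost(dx, dy, cw, ch, lam)
    print(f"Lambda = {lam:.0f}   gamma = {2 - lam:.0f}   distance cost = {cost:.4f}")
```

Since $\gamma$ stays in $[1, 2]$, the exponent never vanishes, and the same $\rho_t$ is penalized more when $\Lambda$ is small, which matches the reading in (2).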
3. Shape loss
Like EIoU, this term considers the real width and height differences between the predicted box and the ground-truth box. However, $\omega_w$ and $\omega_h$ are computed differently from EIoU, which also needs the width and height of the smallest box enclosing the two boxes. Here only the widths and heights of the ground-truth box and the predicted box are used, so there is less computation and it should be faster, though the actual effect is unclear. In addition, $\omega_t$ already lies in $[0,1]$, so I don't think the further mapping through $1 - e^{-\omega_t}$ is necessary; the result might even be better without it. Finally, I don't really understand the introduction of $\theta$:
(1) First, I don't see why introducing this exponent would work better.
(2) Second, the distance loss could take such an exponent too, so why is it not introduced there?
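A small experiment may help with the question about $\theta$. The sketch assumes the shape cost has the form used in the reproduction code at the end of this post, $\Omega = \sum_{t=w,h} (1 - e^{-\omega_t})^{\theta}$; the function name and the box sizes are made up. It is offered as one possible reading, not as the paper's stated rationale: because $1 - e^{-\omega_t}$ stays below $1 - 1/e \approx 0.632$, raising it to a larger power shrinks small mismatches much faster than large ones.

```python
import math

def shape_cost(w, h, w_gt, h_gt, theta=4.0):
    """Sum over width and height of (1 - exp(-omega_t)) ** theta."""
    omega_w = abs(w - w_gt) / max(w, w_gt)
    omega_h = abs(h - h_gt) / max(h, h_gt)
    return (1.0 - math.exp(-omega_w)) ** theta + (1.0 - math.exp(-omega_h)) ** theta

for theta in (2, 4, 6):
    mild = shape_cost(9.0, 10.0, 10.0, 10.0, theta)    # 10% width mismatch
    severe = shape_cost(2.0, 10.0, 10.0, 10.0, theta)  # 80% width mismatch
    print(f"theta = {theta}   mild = {mild:.2e}   severe = {severe:.2e}   "
          f"severe / mild = {severe / mild:.1f}")
```

Under this reading, a larger $\theta$ makes the loss nearly ignore small shape errors while still penalizing large ones, so $\theta$ acts as a knob for how much attention the shape term receives.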
4. IoU loss
As in GIoU, this term is computed as $1 - IoU$.
Some of the weights and the $\theta$ parameter in the paper are computed on the dataset with a genetic algorithm. I don't know how much this part improves the results. Because the code is not open source and I have doubts about some of the calculations, I cannot verify the real effect of each improvement.
Below is my simple reproduction of SIoU. If there are any mistakes, please correct me.
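import math
import torch

def siou(box1, box2, theta=4, eps=1e-7):
    # box1: predicted boxes, box2: ground-truth boxes, shape (..., 4) in (x1, y1, x2, y2) format.
    # This wrapper and the IoU / enclosing-box lines are assumed; the original snippet
    # expected b*_x*, w*, h*, cw, ch, iou and eps to be defined by the surrounding function.
    b1_x1, b1_y1, b1_x2, b1_y2 = box1.unbind(-1)
    b2_x1, b2_y1, b2_x2, b2_y2 = box2.unbind(-1)
    w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1
    w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1
    inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(min=0) * \
            (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(min=0)
    iou = inter / (w1 * h1 + w2 * h2 - inter + eps)
    # Width and height of the smallest enclosing box
    cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1) + eps
    ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1) + eps
    # (x1, y1) and (x2, y2) are the center coordinates of the predicted box and the ground-truth box
    x1 = (b1_x1 + b1_x2) / 2
    x2 = (b2_x1 + b2_x2) / 2
    y1 = (b1_y1 + b1_y2) / 2
    y2 = (b2_y1 + b2_y2) / 2
    x_dis = torch.abs(x1 - x2)
    y_dis = torch.abs(y1 - y2)
    # eps goes inside the square root so the gradient stays finite when the centers coincide
    sigma = torch.sqrt(x_dis ** 2 + y_dis ** 2 + eps)
    alpha = y_dis / sigma  # sin(alpha)
    beta = x_dis / sigma   # sin(beta), with beta = pi/2 - alpha
    threshold = math.sqrt(2) / 2
    # Use the smaller of the two angles
    sin_alpha = torch.where(alpha > threshold, beta, alpha)
    # 1 - 2 * sin(x) ** 2 is equivalent to cos(2x), applied here with x = arcsin(sin_alpha) - pi/4
    angle_cost = torch.cos(torch.arcsin(sin_alpha) * 2 - math.pi / 2)
    rho_x = (x_dis / cw) ** 2
    rho_y = (y_dis / ch) ** 2
    gamma = 2 - angle_cost
    distance_cost = 2 - torch.exp(-gamma * rho_x) - torch.exp(-gamma * rho_y)
    omega_w = torch.abs(w1 - w2) / (torch.max(w1, w2) + eps)
    omega_h = torch.abs(h1 - h2) / (torch.max(h1, h2) + eps)
    # In the original paper theta is close to 4, in the range 2 to 6
    shape_cost = torch.pow(1 - torch.exp(-omega_w), theta) + torch.pow(1 - torch.exp(-omega_h), theta)
    # This is an IoU-like score; the box loss is 1 minus this value
    return iou - 0.5 * (distance_cost + shape_cost)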
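To sanity-check the numbers without torch, the same steps can be written for a single pair of boxes in plain Python. This is only a sketch under assumptions: the function name `siou_scalar` and the `(x1, y1, x2, y2)` box format are made up, and `min(x_dis, y_dis) / sigma` is used as a shortcut for picking the smaller of the two angles. Note that the value returned is an IoU-like score, so the box loss is 1 minus that value.

```python
import math

def siou_scalar(box_pred, box_gt, theta=4.0, eps=1e-7):
    """SIoU-style score for one pair of (x1, y1, x2, y2) boxes; loss = 1 - score."""
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt
    w1, h1 = px2 - px1, py2 - py1
    w2, h2 = gx2 - gx1, gy2 - gy1

    # Plain IoU
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    iou = inter / (w1 * h1 + w2 * h2 - inter + eps)

    # Smallest enclosing box and center offsets
    cw = max(px2, gx2) - min(px1, gx1) + eps
    ch = max(py2, gy2) - min(py1, gy1) + eps
    x_dis = abs((px1 + px2) / 2 - (gx1 + gx2) / 2)
    y_dis = abs((py1 + py2) / 2 - (gy1 + gy2) / 2)

    # Angle cost, using the smaller of the two angles
    sigma = math.hypot(x_dis, y_dis) + eps
    sin_alpha = min(x_dis, y_dis) / sigma
    angle = math.cos(2 * math.asin(sin_alpha) - math.pi / 2)

    # Distance cost
    gamma = 2 - angle
    distance = 2 - math.exp(-gamma * (x_dis / cw) ** 2) - math.exp(-gamma * (y_dis / ch) ** 2)

    # Shape cost
    omega_w = abs(w1 - w2) / (max(w1, w2) + eps)
    omega_h = abs(h1 - h2) / (max(h1, h2) + eps)
    shape = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return iou - 0.5 * (distance + shape)

gt = (0.0, 0.0, 10.0, 10.0)
for pred in (gt, (5.0, 0.0, 15.0, 10.0), (20.0, 20.0, 30.0, 30.0)):
    score = siou_scalar(pred, gt)
    print(f"pred = {pred}   score = {score:.4f}   loss = {1 - score:.4f}")
```

A perfect match scores close to 1 (loss close to 0), and the score can go below 0 for boxes that do not overlap, because the distance and shape costs are subtracted from an IoU of 0.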