Some opinions on, and a code implementation of, "SIoU Loss: More Powerful Learning for Bounding Box Regression" (Zhora Gevorgyan)
2022-07-07 11:32:00 【Optimistic, medium】
Many official accounts have been pushing this paper recently, but I ran into some problems while reading it. Because the code has not been open-sourced, my understanding may not be correct, so I am recording my thoughts here first. Once the code is released I can compare against it and understand the paper more deeply. I also hope that any experts who see this post can give me some advice on my immature views.
The final loss function used in the experiments is calculated as follows:
$L = W_{box} L_{box} + W_{cls} L_{cls}$
Here $L_{cls}$ is focal loss, the weights $W_{box}$ and $W_{cls}$ are obtained with a genetic algorithm, and $L_{box}$ is the SIoU loss proposed in the paper, calculated as follows:
$L_{box} = 1 - IoU + \frac{\Delta + \Omega}{2}$
It involves four parts: angle loss, distance loss ($\Delta$), shape loss ($\Omega$), and IoU loss. The angle loss enters through the distance loss.
1. Angle loss
The author argues that the angle can be taken into account: first bring the predicted box onto the same horizontal or vertical line as the ground-truth box. I agree with this, since it can accelerate convergence. The author evaluates this loss with the formula $\Lambda = 1 - 2\sin^2(\arcsin(x) - \pi/4)$.
The formula consists of two parts. The first part is $1 - 2\sin^2(x)$, which is simply $\cos(2x)$. For $x \ge 0$ (up to $\pi/4$), it reaches its minimum, 0, only at $x = \pi/4$ and its maximum, 1, at $x = 0$. The second part is $\arcsin(x) - \pi/4$, where $\arcsin(x)$ is the angle $\alpha$. The $-\pi/4$ shift is there so that the predicted box moves toward whichever side has the smaller angle: since $\beta = \pi/2 - \alpha$, after subtracting $\pi/4$ the two angles are opposite numbers, so the $\cos$ function gives the same value for both. The loss is smallest when $\alpha$ is 0 and largest when $\alpha$ is $\pi/4$. As a result, the predicted box moves more quickly toward the horizontal or vertical line on which the ground-truth box lies.
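To make the behaviour of the angle term concrete, here is a minimal scalar sketch in plain Python. It follows the same steps as my reproduction code later in this post; the function name `angle_cost` and the `(x, y)` tuple inputs are my own choices for illustration, not something defined in the paper.

```python
import math

def angle_cost(pred_center, gt_center, eps=1e-7):
    """Angle term: 1 - 2*sin^2(alpha - pi/4), which equals sin(2*alpha)."""
    dx = abs(gt_center[0] - pred_center[0])
    dy = abs(gt_center[1] - pred_center[1])
    sigma = math.hypot(dx, dy) + eps      # distance between the two centers
    sin_alpha = dy / sigma                # angle measured from the horizontal line
    sin_beta = dx / sigma                 # complementary angle, beta = pi/2 - alpha
    # Work with the smaller of the two angles (alpha if alpha <= pi/4, else beta)
    x = sin_beta if sin_alpha > math.sqrt(2) / 2 else sin_alpha
    return 1 - 2 * math.sin(math.asin(x) - math.pi / 4) ** 2

print(angle_cost((0, 0), (5, 0)))  # centers on one horizontal line: close to 0
print(angle_cost((0, 0), (3, 3)))  # 45 degree offset: close to 1
```

Swapping the x and y offsets gives the same value, which is the symmetry between $\alpha$ and $\beta$ described above.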
2. Distance loss
(1) On the calculation of $\rho_x$ and $\rho_y$: at first I wondered whether this would always evaluate to exactly 1, but then I found the illustration in Figure 3 of the paper. The $c_w$ and $c_h$ here are the side lengths of the smallest enclosing box.
(2) On the calculation of $\gamma$: my understanding is that $\Lambda$ lies in [0,1]. Using $2 - \Lambda$ first prevents $\rho_t$ from having no effect when $\gamma$ would otherwise be 0, and second makes a change in $\rho_t$ affect the loss more strongly the smaller $\Lambda$ is.
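The two points above can be checked numerically with a small scalar sketch. It mirrors the distance term in my reproduction code later in this post, with $\gamma = 2 - \Lambda$; the function name `distance_cost` and its argument layout are assumptions made for this example only.

```python
import math

def distance_cost(pred_center, gt_center, cw, ch, angle, eps=1e-7):
    """Distance term: sum over t in {x, y} of (1 - exp(-gamma * rho_t))."""
    # rho_t: squared center offset, normalised by the enclosing box side
    rho_x = ((gt_center[0] - pred_center[0]) / (cw + eps)) ** 2
    rho_y = ((gt_center[1] - pred_center[1]) / (ch + eps)) ** 2
    gamma = 2 - angle                     # angle is Lambda in [0, 1], so gamma is in [1, 2]
    return (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

# Same offset, different angle terms: the smaller Lambda gives the larger cost
print(distance_cost((0, 0), (5, 0), cw=10, ch=10, angle=0.0))
print(distance_cost((0, 0), (5, 0), cw=10, ch=10, angle=1.0))
```

Because $\gamma$ never drops below 1, the exponent never vanishes, which is the first point in (2).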
3. Shape loss
As in EIoU, the true aspect ratio between the predicted box and the ground-truth box is taken into account. The calculation of $\omega_x$ and $\omega_y$ differs, however: EIoU also needs the width and height of the smallest box enclosing both boxes, whereas here only the widths and heights of the ground-truth and predicted boxes are used. That means less computation and should be faster, but the actual effect is unclear. In addition, $\omega_t$ already lies in [0,1], so I do not think the further $1 - e^{-\omega_t}$ step is necessary, and leaving it out might even work better. Finally, I do not really understand the introduction of $\theta$:
(1) First, I do not understand why introducing this factor would make things better.
(2) Second, the distance loss could also take such a factor, so why is it not introduced there?
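To get a feel for what $1 - e^{-\omega_t}$ and $\theta$ do to the numbers, here is a minimal scalar sketch of the shape term. It follows my reproduction code later in this post; the function name `shape_cost` and the default `theta=4` are choices made for this example.

```python
import math

def shape_cost(w, h, w_gt, h_gt, theta=4, eps=1e-7):
    """Shape term: sum over t in {w, h} of (1 - exp(-omega_t)) ** theta."""
    # omega_t: relative size difference, in [0, 1)
    omega_w = abs(w - w_gt) / (max(w, w_gt) + eps)
    omega_h = abs(h - h_gt) / (max(h, h_gt) + eps)
    return (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

print(shape_cost(5, 5, 10, 10, theta=2))
print(shape_cost(5, 5, 10, 10, theta=4))
print(shape_cost(5, 5, 10, 10, theta=6))
```

Since $1 - e^{-\omega_t}$ stays below $1 - 1/e \approx 0.63$, raising it to a larger $\theta$ shrinks the term quickly, so in this sketch $\theta$ acts as a knob for how much weight the shape mismatch gets.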
4. IoU loss
As in GIoU, this is computed as $1 - IoU$.
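For completeness, a minimal sketch of the $1 - IoU$ term for two axis-aligned boxes. The `(x1, y1, x2, y2)` corner format and the function names `iou_xyxy` and `iou_cost` are assumptions for this example; the paper does not prescribe a box format.

```python
def iou_xyxy(box1, box2, eps=1e-7):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) corners."""
    # Corners of the intersection rectangle
    ix1, iy1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    ix2, iy2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)   # 0 if the boxes do not overlap
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - inter + eps
    return inter / union

def iou_cost(box1, box2):
    """IoU term of the loss: 1 - IoU."""
    return 1 - iou_xyxy(box1, box2)

print(iou_cost((0, 0, 10, 10), (5, 0, 15, 10)))  # half-overlapping boxes, about 2/3
```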
Some of the weights in the paper, as well as the $\theta$ parameter, are computed on the dataset with a genetic algorithm. I do not know how much this part improves the results. Because the code is not open source, and I also have doubts about some of these calculations, it is impossible to verify the real effect of each improvement.
Below is my simple reproduction of SIoU. If there are any mistakes, please correct me.
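import math
import torch

# Fragment from inside an IoU function. The following names are assumed to be defined already:
# b1_* / b2_* (corners of the predicted / ground-truth box), w1, h1, w2, h2 (box sizes),
# cw, ch (width and height of the smallest enclosing box), iou and eps.

# (x1, y1) and (x2, y2) are the center coordinates of the predicted box and the ground-truth box
x1 = (b1_x1 + b1_x2) / 2
x2 = (b2_x1 + b2_x2) / 2
y1 = (b1_y1 + b1_y2) / 2
y2 = (b2_y1 + b2_y2) / 2
# Absolute center offsets and center distance
x_dis = torch.max(x1, x2) - torch.min(x1, x2)
y_dis = torch.max(y1, y2) - torch.min(y1, y2)
sigma = torch.pow(x_dis ** 2 + y_dis ** 2, 0.5) + eps
# sin(alpha) and sin(beta); keep whichever belongs to the smaller angle
alpha = y_dis / sigma
beta = x_dis / sigma
threshold = math.sqrt(2) / 2
sin_alpha = torch.where(alpha > threshold, beta, alpha)
# 1 - 2 * sin(x) ** 2 is equivalent to cos(2x)
angle_cost = torch.cos(torch.arcsin(sin_alpha) * 2 - math.pi / 2)
# Distance cost, with offsets normalised by the enclosing box
cw += eps
ch += eps
rho_x = (x_dis / cw) ** 2
rho_y = (y_dis / ch) ** 2
gamma = 2 - angle_cost
distance_cost = 2 - torch.exp(-1 * gamma * rho_x) - torch.exp(-1 * gamma * rho_y)
# Shape cost, using only the widths and heights of the two boxes
omega_w = torch.abs(w1 - w2) / (torch.max(w1, w2) + eps)
omega_h = torch.abs(h1 - h2) / (torch.max(h1, h2) + eps)
# In the original paper theta is close to 4, with a range of 2 to 6
theta = 4
shape_cost = torch.pow(1 - torch.exp(-1 * omega_w), theta) + torch.pow(1 - torch.exp(-1 * omega_h), theta)
# Returns the penalised IoU; the box loss is 1 minus this value
return iou - 0.5 * (distance_cost + shape_cost)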