SIoU: principle and code implementation (with a review of IoU, GIoU, DIoU, and CIoU)
2022-06-11 06:10:00 【小姜贼菜】
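Paper: https://arxiv.org/pdf/2205.12740.pdf
Code (unofficial): https://github.com/xialuxi/yolov5-car-plate/commit/aa41d1819b1fb03b4dc73e8a3e0000c46cfc370b
Figures are taken from a video tutorial (this tutorial is outstanding): https://www.bilibili.com/video/BV1yi4y1g7ro?p=4
Principle:
Box regression losses have evolved from the original IoU to GIoU, then to DIoU and CIoU, and now SIoU has appeared.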
Comparison of L2 loss and IoU loss
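A minimal sketch of why the two losses can disagree. The boxes and the helper names `iou` and `l2` are my own illustration, not taken from the tutorial figure: two predictions sit at the same L2 distance from the GT box, yet their IoU differs.

```python
import math

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def l2(a, b):
    """Euclidean distance between the two coordinate vectors."""
    return math.dist(a, b)

gt = (0, 0, 2, 2)
pred_shifted = (1, 0, 3, 2)  # same size, shifted right by 1
pred_grown = (0, 0, 3, 3)    # same top-left corner, one unit larger

print(l2(gt, pred_shifted), l2(gt, pred_grown))    # both sqrt(2)
print(iou(gt, pred_shifted), iou(gt, pred_grown))  # 1/3 vs 4/9
```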
GIoU loss

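A^c denotes the area of the blue box, the outermost rectangle, i.e. the smallest box enclosing both the GT box and the predicted box. u denotes the area of the union of the GT box and the predicted box.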
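With these two quantities, GIoU is usually written as GIoU = IoU - (A^c - u) / A^c. A minimal pure-Python sketch follows; the function name `giou` and the (x1, y1, x2, y2) box layout are my own assumptions.

```python
def giou(a, b):
    """GIoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    # u: area of the union of the two boxes.
    u = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # a_c: area of the smallest box enclosing both a and b.
    a_c = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / u - (a_c - u) / a_c

# Disjoint boxes: IoU is 0, but GIoU still reflects how far apart they are.
print(giou((0, 0, 2, 2), (3, 0, 5, 2)))  # about -0.2
```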
DIoU loss

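On the left of the first figure, the top row shows GIoU and the bottom row shows DIoU. Black is the anchor, blue is the predicted box, and green is the GT box.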
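DIoU subtracts a normalized center distance from IoU: DIoU = IoU - ρ² / c², where ρ is the distance between the two box centers and c is the diagonal of the smallest enclosing box. The sketch below is my own illustration (the name `diou` and the sample boxes are assumptions). It uses a prediction nested inside the GT box, a case in which GIoU reduces to plain IoU while DIoU still distinguishes the two positions.

```python
def diou(a, b):
    """DIoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union
    # Squared distance between the two box centers.
    rho2 = ((a[0] + a[2] - b[0] - b[2]) ** 2 + (a[1] + a[3] - b[1] - b[3]) ** 2) / 4
    # Squared diagonal of the smallest enclosing box.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - rho2 / c2

gt = (0, 0, 4, 4)
print(diou(gt, (1, 1, 3, 3)))  # centered inside GT: 0.25
print(diou(gt, (0, 0, 2, 2)))  # in a corner of GT: 0.1875
```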

CIoU loss
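CIoU adds an aspect-ratio term to DIoU: CIoU = IoU - ρ² / c² - αv, with v = (4 / π²) (arctan(w_gt / h_gt) - arctan(w / h))² and α = v / ((1 - IoU) + v). A minimal sketch under my own assumptions (the name `ciou`, the box layout, and the small constant that guards against 0/0 are not from the tutorial):

```python
import math

def ciou(a, b):
    """CIoU of two boxes given as (x1, y1, x2, y2)."""
    w1, h1 = a[2] - a[0], a[3] - a[1]
    w2, h2 = b[2] - b[0], b[3] - b[1]
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    iou = inter / (w1 * h1 + w2 * h2 - inter)
    # Squared center distance and squared diagonal of the enclosing box.
    rho2 = ((a[0] + a[2] - b[0] - b[2]) ** 2 + (a[1] + a[3] - b[1] - b[3]) ** 2) / 4
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    # Aspect-ratio consistency term and its weight.
    v = (4 / math.pi ** 2) * (math.atan(w2 / h2) - math.atan(w1 / h1)) ** 2
    alpha = v / (1 - iou + v + 1e-7)  # 1e-7 avoids 0/0 for identical boxes
    return iou - rho2 / c2 - alpha * v

gt = (0, 0, 4, 4)
print(ciou(gt, (0, 0, 2, 2)))  # same aspect ratio: v = 0, so this equals DIoU
print(ciou(gt, (0, 0, 4, 2)))  # different aspect ratio: extra penalty applies
```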


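SIoU loss
On top of the losses above, SIoU also takes the angle between the two boxes into account.
The paper also redefines the distance cost and the shape cost.
The angle cost is defined as follows: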
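As I read the paper, the definition can be transcribed as below. The symbols d_x and d_y for the center offsets are my own notation (the paper writes the vertical offset as c_h), and (b_{c_x}, b_{c_y}) is the center of the predicted box.

```latex
\[
\begin{aligned}
\Lambda &= 1 - 2\sin^{2}\!\left(\arcsin(x) - \frac{\pi}{4}\right), \qquad
x = \frac{d_y}{\sigma} = \sin\alpha, \\
\sigma &= \sqrt{d_x^{2} + d_y^{2}}, \qquad
d_x = \left|b^{gt}_{c_x} - b_{c_x}\right|, \qquad
d_y = \left|b^{gt}_{c_y} - b_{c_y}\right|
\end{aligned}
\]
```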
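One thing that struck me as odd: why is α passed through sin and then straight back through arcsin? Isn't that redundant? A likely reason is that α is never measured directly. Only x = sin α is available from the box coordinates, so arcsin(x) simply recovers α, and the whole expression simplifies to sin(2α). σ is just the distance between the centers of the two boxes.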
The distance cost is defined as follows:
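My transcription of the paper's definition, where Λ is the angle cost and c_w, c_h are here the width and height of the smallest box enclosing the GT box and the predicted box (the symbol layout is my own):

```latex
\[
\Delta = \sum_{t = x, y} \left(1 - e^{-\gamma \rho_t}\right), \qquad
\rho_x = \left(\frac{b^{gt}_{c_x} - b_{c_x}}{c_w}\right)^{2}, \qquad
\rho_y = \left(\frac{b^{gt}_{c_y} - b_{c_y}}{c_h}\right)^{2}, \qquad
\gamma = 2 - \Lambda
\]
```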
The shape cost is defined as follows:
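My transcription of the paper's definition, with (w, h) the size of the predicted box and (w^{gt}, h^{gt}) the size of the GT box. The exponent θ is a hyperparameter; the code further down fixes it at 4.

```latex
\[
\Omega = \sum_{t = w, h} \left(1 - e^{-\omega_t}\right)^{\theta}, \qquad
\omega_w = \frac{\left|w - w^{gt}\right|}{\max\left(w, w^{gt}\right)}, \qquad
\omega_h = \frac{\left|h - h^{gt}\right|}{\max\left(h, h^{gt}\right)}
\]
```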
The overall loss is defined as:
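My transcription of the paper's box loss, where Δ is the distance cost and Ω is the shape cost. Note that the code further down returns IoU - (Δ + Ω) / 2, which is 1 minus this loss.

```latex
\[
L_{box} = 1 - IoU + \frac{\Delta + \Omega}{2}
\]
```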
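There are still many details I have not analyzed or explored. This is only a quick write-up for the record.
Code implementation:
Important, and worth saying three times: I did not implement this, I did not implement this, I did not implement this. It comes from the author of the repository linked at the top: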
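if SIoU:  # SIoU loss, https://arxiv.org/pdf/2205.12740.pdf
    # Fragment of a YOLOv5-style bbox_iou(): iou, cw and ch (width and height of the
    # smallest enclosing box), w1, h1, w2, h2 and the corner coordinates come from the
    # enclosing function; numpy must be imported as np.
    # Offsets between the two box centers; sigma is the center distance from the paper.
    s_cw = (b2_x1 + b2_x2 - b1_x1 - b1_x2) * 0.5
    s_ch = (b2_y1 + b2_y2 - b1_y1 - b1_y2) * 0.5
    sigma = torch.pow(s_cw ** 2 + s_ch ** 2, 0.5) + 1e-7  # avoid 0/0 for coincident centers
    sin_alpha_1 = torch.abs(s_ch) / sigma
    sin_alpha_2 = torch.abs(s_cw) / sigma
    threshold = pow(2, 0.5) / 2
    # Work with the smaller of alpha and (pi/2 - alpha).
    sin_alpha = torch.where(sin_alpha_1 > threshold, sin_alpha_2, sin_alpha_1)
    # Equivalent form: 1 - 2 * torch.pow(torch.sin(torch.arcsin(sin_alpha) - np.pi / 4), 2)
    angle_cost = torch.cos(torch.arcsin(sin_alpha) * 2 - np.pi / 2)
    rho_x = (s_cw / cw) ** 2
    rho_y = (s_ch / ch) ** 2
    gamma = 2 - angle_cost
    distance_cost = 2 - torch.exp(-1 * gamma * rho_x) - torch.exp(-1 * gamma * rho_y)
    omiga_w = torch.abs(w1 - w2) / torch.max(w1, w2)
    omiga_h = torch.abs(h1 - h2) / torch.max(h1, h2)
    shape_cost = torch.pow(1 - torch.exp(-1 * omiga_w), 4) + torch.pow(1 - torch.exp(-1 * omiga_h), 4)
    # SIoU score; the box loss is 1 minus this value.
    return iou - 0.5 * (distance_cost + shape_cost)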
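To try the formulas without PyTorch, here is a pure-Python sketch for a single pair of boxes. It is my own illustration, not the linked repository's code: the name `siou`, the parameters `theta` and `eps`, and the (x1, y1, x2, y2) layout are assumptions. It follows the paper's definitions, taking σ as the distance between box centers.

```python
import math

def siou(box1, box2, theta=4, eps=1e-7):
    """SIoU score for two boxes given as (x1, y1, x2, y2); the loss is 1 - siou."""
    b1_x1, b1_y1, b1_x2, b1_y2 = box1
    b2_x1, b2_y1, b2_x2, b2_y2 = box2
    w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1
    w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1

    # Plain IoU.
    inter_w = max(0.0, min(b1_x2, b2_x2) - max(b1_x1, b2_x1))
    inter_h = max(0.0, min(b1_y2, b2_y2) - max(b1_y1, b2_y1))
    inter = inter_w * inter_h
    iou = inter / (w1 * h1 + w2 * h2 - inter + eps)

    # Width and height of the smallest enclosing box.
    cw = max(b1_x2, b2_x2) - min(b1_x1, b2_x1)
    ch = max(b1_y2, b2_y2) - min(b1_y1, b2_y1)

    # Offsets between the two box centers and their distance sigma.
    dx = (b2_x1 + b2_x2 - b1_x1 - b1_x2) / 2
    dy = (b2_y1 + b2_y2 - b1_y1 - b1_y2) / 2
    sigma = math.hypot(dx, dy) + eps

    # Angle cost: 1 - 2 sin^2(alpha - pi/4) = sin(2 alpha), with sin(alpha) = |dy| / sigma.
    angle_cost = math.sin(2 * math.asin(abs(dy) / sigma))

    # Distance cost, weighted by the angle cost through gamma.
    gamma = 2 - angle_cost
    rho_x = (dx / cw) ** 2
    rho_y = (dy / ch) ** 2
    distance_cost = 2 - math.exp(-gamma * rho_x) - math.exp(-gamma * rho_y)

    # Shape cost on relative width and height differences.
    omega_w = abs(w1 - w2) / max(w1, w2)
    omega_h = abs(h1 - h2) / max(h1, h2)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return iou - 0.5 * (distance_cost + shape_cost)

print(siou((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes: very close to 1
print(siou((0, 0, 2, 2), (1, 0, 3, 2)))  # shifted box: lower score
```

For the shifted box, the centers are aligned horizontally, so the angle cost is 0 and only the x component of the distance cost contributes.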