(4) Converting roLabelImg rotated object detection annotations to DOTA format
2022-08-05 05:54:00 【恒友成】
Welcome to my personal blog, 知行空间.
roLabelImg tool repository: https://github.com/cgvict/roLabelImg
1. Entering the mode for drawing rotated boxes
2. Annotation file format
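In roLabelImg, a rotated box is annotated by first drawing an ordinary axis-aligned rectangle and then rotating it clockwise or counterclockwise about its center by some angle. The annotation file stores a rotated box as (cx, cy, width, height, angle), for example: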
<robndbox>
<cx>1178.4388</cx>
<cy>1004.6478</cy>
<w>319.635</w>
<h>273.2016</h>
<angle>0.46</angle>
</robndbox>
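(cx, cy) is the pixel coordinate of the center of the rotated box. w is the length of the edge that lay along the image x direction when the initial rectangle was drawn in roLabelImg, and h is the length of the other edge. Once the initial rectangle has been drawn, the edges that w and h refer to never change, no matter how the box is rotated afterwards. angle is the clockwise angle from the positive X axis to the w edge of the rotated box, in radians, with range [0, pi).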
[Figure: initial rectangular box]
[Figure: box after adjusting its pose]
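As a minimal sketch of reading these five fields, the snippet below parses one annotation object with Python's standard `xml.etree.ElementTree`. The `<type>` and `<name>` wrapper elements follow the layout that the conversion code in this post relies on; the class name `ship` and the helper `read_robndbox` are made up for illustration.

```python
import math
import xml.etree.ElementTree as ET

# One <object> entry; the class name "ship" is a hypothetical example.
SAMPLE = """<object>
  <type>robndbox</type>
  <name>ship</name>
  <robndbox>
    <cx>1178.4388</cx>
    <cy>1004.6478</cy>
    <w>319.635</w>
    <h>273.2016</h>
    <angle>0.46</angle>
  </robndbox>
</object>"""


def read_robndbox(obj):
    """Return the five robndbox fields of one <object> element as floats."""
    node = obj.find("robndbox")
    return {key: float(node.find(key).text) for key in ("cx", "cy", "w", "h", "angle")}


obj = ET.fromstring(SAMPLE)
rbox = read_robndbox(obj)
# The angle is stored in radians; convert only for display.
angle_deg = math.degrees(rbox["angle"])
print(rbox, f"angle = {angle_deg:.2f} deg")
```

Keeping the angle in radians matches the [0, pi) range described above, so it can be passed to `math.cos` and `math.sin` without conversion.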
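3. DOTA data format
DOTA is a rotated object detection dataset open-sourced by Wuhan University; its home page is https://captain-whu.github.io/DOTA/dataset.html. A DOTA annotation file uses the format:
x1, y1, x2, y2, x3, y3, x4, y4, category, difficult
(x1, y1, x2, y2, x3, y3, x4, y4) are the coordinates of the four vertices of the rotated box, category is the class of the object in the box, and difficult marks whether the instance is hard to detect (1) or not (0).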
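For a quick sanity check of label files, a line in this format can be split back into its parts. This is only a sketch: it assumes the ten fields are separated by whitespace (commas are tolerated as well), and `parse_dota_line` is a hypothetical helper, not part of any DOTA toolkit.

```python
def parse_dota_line(line):
    """Split one DOTA label line into (vertices, category, difficult)."""
    # Accept either "x1 y1 ..." or "x1, y1, ..." (assumed separators).
    parts = line.replace(",", " ").split()
    if len(parts) != 10:
        raise ValueError(f"expected 10 fields, got {len(parts)}")
    coords = [float(v) for v in parts[:8]]
    # Pair the flat list up as [(x1, y1), (x2, y2), (x3, y3), (x4, y4)].
    vertices = list(zip(coords[0::2], coords[1::2]))
    return vertices, parts[8], int(parts[9])


# "ship" is a made-up category used only for this example.
vertices, category, difficult = parse_dota_line("120 60 80 60 80 40 120 40 ship 0")
print(vertices, category, difficult)
```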
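4. Converting roLabelImg annotation files to DOTA format
The data pipelines of most open-source rotated object detection algorithms support the DOTA format, for example SenseTime's open-source mmrotate. To validate an algorithm on your own dataset quickly, the most convenient approach is to convert the xml files annotated with roLabelImg into the label format above. The conversion from a roLabelImg annotation to DOTA falls into four cases.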
- 1) $\theta \in (\pi/2, \pi)$, and the center point C lies to the right of point 1
- 2) $\theta \in (\pi/2, \pi)$, and the center point C lies to the left of point 1
- 3) $\theta \in [0, \pi/2]$, and the center point C lies to the left of point 1
- 4) $\theta \in [0, \pi/2]$, and the center point C lies to the right of point 1
Take $\theta \in (\pi/2, \pi)$ with the center point C to the right of point 1 as an example. The coordinates of the points A(x1, y1), B(x3, y3), D(x2, y2), E(x4, y4) follow from the relations between the triangles:
$$\beta = \angle CAV_2 = \arctan\frac{h}{w} + \pi - \theta$$
$$d = \frac{\sqrt{w^2 + h^2}}{2}$$
$$
\begin{aligned}
x_1 &= c_x - d\cos\beta \\
y_1 &= c_y + d\sin\beta \\
x_2 &= c_x + d\cos\beta \\
y_2 &= c_y - d\sin\beta \\
x_3 &= x_1 - h\cos\left(\theta - \frac{\pi}{2}\right) \\
y_3 &= y_1 - h\sin\left(\theta - \frac{\pi}{2}\right) \\
x_4 &= x_2 + h\cos\left(\theta - \frac{\pi}{2}\right) \\
y_4 &= y_2 + h\sin\left(\theta - \frac{\pi}{2}\right)
\end{aligned}
$$
The other three cases can be derived in the same way.
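As a cross-check on the case analysis, all four cases can also be covered by rotating the four corner offsets about the center. This is an alternative formulation, not the derivation used in this post; it assumes the same convention (image y axis pointing down, so a positive angle appears clockwise on screen), and `rbox_to_corners` is a hypothetical name.

```python
import math


def rbox_to_corners(cx, cy, w, h, angle):
    """Return the four corners of a rotated box as [(x, y), ...].

    angle is in radians. With the image y axis pointing down, the
    rotation below appears clockwise on screen.
    """
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Corner offsets of the unrotated box, listed so that consecutive
    # entries are adjacent corners.
    offsets = [(w / 2, h / 2), (-w / 2, h / 2), (-w / 2, -h / 2), (w / 2, -h / 2)]
    return [
        (cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
        for dx, dy in offsets
    ]


corners = rbox_to_corners(100, 50, 40, 20, 0.0)
print(corners)  # [(120.0, 60.0), (80.0, 60.0), (80.0, 40.0), (120.0, 40.0)]
```

For an angle in $(\pi/2, \pi)$, the last corner returned here should coincide with point A of the worked example, which gives a simple way to test either implementation against the other.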
The conversion code is as follows:
import math
import xml.etree.ElementTree as ET

import numpy as np


def convert_rolabelimg2dota(xml_path: str) -> list:
    """
    Args:
        - `xml_path` (str) : path to roLabelImg label file, like /xx/xx.xml

    Returns:
        - `box_points` (list): shape (N, 8 + 1), N is the number of objects, 8 + 1 is
            `(x1, y1, x2, y2, x3, y3, x4, y4, class_name)`
    """
    with open(xml_path) as f:
        tree = ET.parse(f)
    root = tree.getroot()
    size = root.find('size')
    width = int(size.find('width').text)
    height = int(size.find('height').text)
    objects = root.iter('object')
    boxes = []  # list of [cx, cy, w, h, angle, class_name], angle is in [0, pi)
    for obj in objects:
        # Only rotated boxes are converted; plain bndbox objects are skipped.
        if obj.find('type').text == 'robndbox':
            rbox_node = obj.find('robndbox')
            cat = obj.find('name').text
            rbox = dict()
            for key in ['cx', 'cy', 'w', 'h', 'angle']:
                rbox[key] = float(rbox_node.find(key).text)
            boxes.append(list((*rbox.values(), cat)))
    print(f"bboxes: {boxes}")
    box_points = []  # list of box defined with four vertices
    for box in boxes:
        cx, cy, w, h, ag, cat = box
        # atan2 equals atan(w / h) for positive sides and avoids dividing by zero.
        alpha_w = math.atan2(w, h)
        alpha_h = math.atan2(h, w)
        d = math.sqrt(w**2 + h**2) / 2  # half of the diagonal
        if ag > math.pi / 2:
            beta = ag - math.pi / 2 + alpha_w
            if beta <= math.pi / 2:
                x1, y1 = cx + d * math.cos(beta), cy + d * math.sin(beta)
                x2, y2 = cx - d * math.cos(beta), cy - d * math.sin(beta)
            else:
                beta = math.pi - beta
                x1, y1 = cx - d * math.cos(beta), cy + d * math.sin(beta)
                x2, y2 = cx + d * math.cos(beta), cy - d * math.sin(beta)
            x3, y3 = x1 - h * math.cos(ag - math.pi / 2), y1 - h * math.sin(ag - math.pi / 2)
            x4, y4 = x2 + h * math.cos(ag - math.pi / 2), y2 + h * math.sin(ag - math.pi / 2)
        else:
            beta = ag + alpha_h
            if beta <= math.pi / 2:
                x1, y1 = cx + d * math.cos(beta), cy + d * math.sin(beta)
                x2, y2 = cx - d * math.cos(beta), cy - d * math.sin(beta)
            else:
                beta = math.pi - beta
                x1, y1 = cx - d * math.cos(beta), cy + d * math.sin(beta)
                x2, y2 = cx + d * math.cos(beta), cy - d * math.sin(beta)
            x3, y3 = x1 - w * math.cos(ag), y1 - w * math.sin(ag)
            x4, y4 = x2 + w * math.cos(ag), y2 + w * math.sin(ag)
        # Points 1 and 2 are opposite corners, so emit 1, 3, 2, 4 to keep
        # consecutive vertices adjacent. Casting to int32 truncates the decimals.
        points = np.array([x1, y1, x3, y3, x2, y2, x4, y4], dtype=np.int32)
        # Clamp the vertices to the image bounds.
        points[0::2] = np.clip(points[0::2], 0, width)
        points[1::2] = np.clip(points[1::2], 0, height)
        box_points.append([*points, cat])
    return box_points
The complete code is in the gitee repository object_detection_task.
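To feed the result to a DOTA-style data loader, the returned list still has to be written to a text file. A minimal sketch follows, assuming one object per line with whitespace-separated fields and a constant `difficult` flag of 0; `write_dota_txt` and the example values are illustrative and not taken from the repository above.

```python
from pathlib import Path


def write_dota_txt(box_points, txt_path, difficult=0):
    """Write rows of (x1, y1, ..., x4, y4, class_name) as DOTA label lines.

    Returns the number of lines written. The separator and the constant
    difficult flag are assumptions of this sketch.
    """
    lines = []
    for *coords, cat in box_points:
        # int() also accepts NumPy integer scalars such as np.int32.
        coord_text = " ".join(str(int(v)) for v in coords)
        lines.append(f"{coord_text} {cat} {difficult}")
    text = "\n".join(lines) + ("\n" if lines else "")
    Path(txt_path).write_text(text, encoding="utf-8")
    return len(lines)


# Example rows in the shape documented for box_points; "ship" is made up.
example_rows = [[120, 60, 80, 60, 80, 40, 120, 40, "ship"]]
```

A call such as `write_dota_txt(example_rows, "P0001.txt")` would then produce one label file per image, where the file name `P0001.txt` is again only a placeholder.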