(4) Converting roLabelImg Rotated Object Detection Annotations to DOTA Format
2022-08-05 05:54:00 【恒友成】
Welcome to my personal blog, 知行空间.
Repository for the roLabelImg tool: https://github.com/cgvict/roLabelImg
1. Entering the mode for drawing rotated bounding boxes
2. Annotation file format
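roLabelImg draws a rotated bounding box in two steps: first draw an ordinary axis-aligned rectangle, then rotate it clockwise or counterclockwise about its center by some angle. In the annotation file, a rotated box is stored as (cx, cy, width, height, angle), as follows: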
<robndbox>
<cx>1178.4388</cx>
<cy>1004.6478</cy>
<w>319.635</w>
<h>273.2016</h>
<angle>0.46</angle>
</robndbox>
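(cx, cy) is the pixel coordinate of the center of the rotated box. w is the length of the side that lay along the image x direction when the initial rectangle was drawn in roLabelImg, and h is the length of the other side. Once the initial rectangle is drawn, the sides that w and h refer to never change, no matter how the box is rotated afterwards. angle is the clockwise angle from the positive X axis to the w side of the rotated box, in radians, with values in [0, pi).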
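As a minimal sketch of reading these fields, the snippet below parses the `<robndbox>` element shown above with Python's standard `xml.etree.ElementTree`. The helper name `parse_robndbox` is made up for illustration and is not part of roLabelImg.

```python
import math
import xml.etree.ElementTree as ET

# The <robndbox> element from the annotation example above.
SAMPLE = """
<robndbox>
  <cx>1178.4388</cx>
  <cy>1004.6478</cy>
  <w>319.635</w>
  <h>273.2016</h>
  <angle>0.46</angle>
</robndbox>
"""


def parse_robndbox(node: ET.Element) -> dict:
    """Read cx, cy, w, h, angle from a <robndbox> element as floats."""
    return {key: float(node.find(key).text) for key in ("cx", "cy", "w", "h", "angle")}


rbox = parse_robndbox(ET.fromstring(SAMPLE.strip()))
# angle is stored in radians; convert to degrees only for display.
print(rbox, round(math.degrees(rbox["angle"]), 2))
```

Keeping the angle in radians after parsing avoids a second conversion later, since the trigonometric functions in `math` expect radians.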
Initial rectangular box:
Box after adjusting its pose:
3. DOTA data format
DOTA is a rotated object detection dataset open-sourced by Wuhan University; its home page is https://captain-whu.github.io/DOTA/dataset.html. The DOTA annotation format is:
x1, y1, x2, y2, x3, y3, x4, y4, category, difficult
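(x1, y1, x2, y2, x3, y3, x4, y4) are the coordinates of the four vertices of the rotated bounding box, category is the class of the object in the box, and difficult flags whether the instance is hard to detect (1) or not (0).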
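To make the field layout concrete, here is a small sketch that splits one annotation line into vertices, category, and difficult flag. The name `parse_dota_line` and the sample line are invented for illustration. It accepts both the comma-separated form written above and whitespace-separated fields, on the assumption that a label file may use either.

```python
import re
from typing import List, NamedTuple, Tuple


class DotaBox(NamedTuple):
    points: List[Tuple[float, float]]  # four (x, y) vertices
    category: str
    difficult: int


def parse_dota_line(line: str) -> DotaBox:
    """Split one DOTA annotation line into 8 coordinates, category, and difficult flag."""
    # Split on commas and/or whitespace, dropping empty pieces.
    fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}: {line!r}")
    coords = [float(v) for v in fields[:8]]
    points = list(zip(coords[0::2], coords[1::2]))
    return DotaBox(points, fields[8], int(fields[9]))


# Hypothetical sample line, not taken from the DOTA dataset.
box = parse_dota_line("10, 20, 110, 20, 110, 70, 10, 70, plane, 0")
print(box.category, box.points)
```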
4. Converting roLabelImg annotation files to DOTA format
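Most open-source rotated object detection algorithms support the DOTA format in their data pipelines, for example mmrotate, open-sourced by SenseTime. To validate an algorithm on your own dataset quickly, the most convenient approach is to convert the xml files annotated with roLabelImg into the label format above. The conversion from a roLabelImg annotation to DOTA splits into four cases.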
- 1) $\theta \in (\pi/2, \pi)$, and the center C lies to the right of point 1
- 2) $\theta \in (\pi/2, \pi)$, and the center C lies to the left of point 1
- 3) $\theta \in [0, \pi/2]$, and the center C lies to the left of point 1
- 4) $\theta \in [0, \pi/2]$, and the center C lies to the right of point 1
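Take the case $\theta \in (\pi/2, \pi)$ with the center C to the right of point 1 as an example. The coordinates of the points A(x1, y1), B(x3, y3), D(x2, y2), E(x4, y4) follow from the relations between the triangles in the figure: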
$$\beta = \angle CAV_2 = \arctan\frac{h}{w} + \pi - \theta$$
$$d = \frac{\sqrt{w^2 + h^2}}{2}$$
$$
\begin{aligned}
x1 &= cx - d\cos\beta \\
y1 &= cy + d\sin\beta \\
x2 &= cx + d\cos\beta \\
y2 &= cy - d\sin\beta \\
x3 &= x1 - h\cos(\theta - \frac{\pi}{2}) \\
y3 &= y1 - h\sin(\theta - \frac{\pi}{2}) \\
x4 &= x2 + h\cos(\theta - \frac{\pi}{2}) \\
y4 &= y2 + h\sin(\theta - \frac{\pi}{2})
\end{aligned}
$$
The other three cases can be derived in the same way.
The conversion code:
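The four cases can also be cross-checked against a single closed form: rotate the corner offsets (±w/2, ±h/2) by θ about the center. The sketch below assumes image coordinates with y pointing down, so a clockwise rotation on screen is x' = cx + dx·cos θ − dy·sin θ, y' = cy + dx·sin θ + dy·cos θ. The function name `rbox_to_corners` is made up here, and this is an alternative formulation for checking results, not the derivation used in this post.

```python
import math
from typing import List, Tuple


def rbox_to_corners(cx: float, cy: float, w: float, h: float,
                    angle: float) -> List[Tuple[float, float]]:
    """Return the four corners of a rotated box, in order around the box.

    Assumes image coordinates (y down) and `angle` in radians, measured
    clockwise on screen from the positive x axis to the w side.
    """
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    corners = []
    # Corner offsets from the center before rotation.
    for dx, dy in ((w / 2, h / 2), (-w / 2, h / 2), (-w / 2, -h / 2), (w / 2, -h / 2)):
        x = cx + dx * cos_a - dy * sin_a
        y = cy + dx * sin_a + dy * cos_a
        corners.append((x, y))
    return corners


# Cross-check with the worked case: theta in (pi/2, pi), C to the right of point 1.
cx, cy, w, h, theta = 100.0, 100.0, 40.0, 20.0, 2.5
beta = math.atan(h / w) + math.pi - theta
d = math.sqrt(w ** 2 + h ** 2) / 2
x1, y1 = cx - d * math.cos(beta), cy + d * math.sin(beta)
corners = rbox_to_corners(cx, cy, w, h, theta)
# (x1, y1) from the worked case coincides with the corner at offset (w/2, -h/2).
print((x1, y1), corners[3])
```

For these sample values, point 1 of the worked case is the corner at offset (w/2, −h/2), so only the starting vertex differs between the two formulations, not the set of vertices.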
import math
import xml.etree.ElementTree as ET

import numpy as np


def convert_rolabelimg2dota(xml_path: str) -> list:
    """
    Args:
        - `xml_path` (str) : path to roLabelImg label file, like /xx/xx.xml

    Returns:
        - `box_points` (list): shape (N, 8 + 1), N is the number of objects, 8 + 1 is
            `(x1, y1, x2, y2, x3, y3, x4, y4, class_name)`
    """
    with open(xml_path) as f:
        tree = ET.parse(f)
    root = tree.getroot()
    size = root.find('size')
    width = int(size.find('width').text)
    height = int(size.find('height').text)
    objects = root.iter('object')
    boxes = []  # list of [cx, cy, w, h, angle, cat], angle is in [0, pi)
    for obj in objects:
        # Only rotated boxes are converted; other object types are skipped.
        if obj.find('type').text == 'robndbox':
            rbox_node = obj.find('robndbox')
            cat = obj.find('name').text
            rbox = dict()
            for key in ['cx', 'cy', 'w', 'h', 'angle']:
                rbox[key] = float(rbox_node.find(key).text)
            boxes.append([*rbox.values(), cat])
    print(f"bboxes: {boxes}")
    box_points = []  # list of boxes, each defined by its four vertices
    for box in boxes:
        cx, cy, w, h, ag, cat = box
        alpha_w = math.atan(w / h)
        alpha_h = math.atan(h / w)
        d = math.sqrt(w**2 + h**2) / 2  # half of the diagonal
        if ag > math.pi / 2:
            beta = ag - math.pi / 2 + alpha_w
            if beta <= math.pi / 2:
                x1, y1 = cx + d * math.cos(beta), cy + d * math.sin(beta)
                x2, y2 = cx - d * math.cos(beta), cy - d * math.sin(beta)
            else:  # beta > pi / 2
                beta = math.pi - beta
                x1, y1 = cx - d * math.cos(beta), cy + d * math.sin(beta)
                x2, y2 = cx + d * math.cos(beta), cy - d * math.sin(beta)
            # The remaining two vertices lie one h side away from points 1 and 2.
            x3, y3 = x1 - h * math.cos(ag - math.pi / 2), y1 - h * math.sin(ag - math.pi / 2)
            x4, y4 = x2 + h * math.cos(ag - math.pi / 2), y2 + h * math.sin(ag - math.pi / 2)
        else:  # ag <= pi / 2
            beta = ag + alpha_h
            if beta <= math.pi / 2:
                x1, y1 = cx + d * math.cos(beta), cy + d * math.sin(beta)
                x2, y2 = cx - d * math.cos(beta), cy - d * math.sin(beta)
            else:  # beta > pi / 2
                beta = math.pi - beta
                x1, y1 = cx - d * math.cos(beta), cy + d * math.sin(beta)
                x2, y2 = cx + d * math.cos(beta), cy - d * math.sin(beta)
            # The remaining two vertices lie one w side away from points 1 and 2.
            x3, y3 = x1 - w * math.cos(ag), y1 - w * math.sin(ag)
            x4, y4 = x2 + w * math.cos(ag), y2 + w * math.sin(ag)
        # Order the vertices around the box and truncate to integer pixels.
        points = np.array([x1, y1, x3, y3, x2, y2, x4, y4], dtype=np.int32)
        # Clamp the vertices to the image bounds.
        points[0::2] = np.clip(points[0::2], 0, width)
        points[1::2] = np.clip(points[1::2], 0, height)
        box_points.append([*points, cat])
    return box_points
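A possible follow-up step, sketched under assumptions: writing the returned `box_points` to a DOTA-style label file. The helper `write_dota_txt`, the sample boxes, and the fixed `difficult` value of 0 are all illustrative choices. Fields are joined with spaces on the assumption that the downstream DOTA loader splits on whitespace; adjust the separator if yours expects commas.

```python
import os
import tempfile
from typing import List, Sequence


def write_dota_txt(box_points: Sequence[Sequence], txt_path: str,
                   difficult: int = 0) -> List[str]:
    """Write rows of (x1, y1, ..., x4, y4, class_name) as DOTA-style lines."""
    lines = []
    for *coords, cat in box_points:
        if len(coords) != 8:
            raise ValueError(f"expected 8 coordinates, got {len(coords)}")
        # One line per object: eight coordinates, category, difficult flag.
        fields = [str(int(v)) for v in coords] + [str(cat), str(difficult)]
        lines.append(" ".join(fields))
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + ("\n" if lines else ""))
    return lines


# Hypothetical boxes in the same layout that convert_rolabelimg2dota returns.
sample = [[10, 20, 110, 20, 110, 70, 10, 70, "plane"]]
out_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
written = write_dota_txt(sample, out_path)
print(written[0])
```

Class names that contain spaces would break a whitespace-separated file, so they may need to be renamed before writing.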
The complete code is in the Gitee repository object_detection_task.