
(4) Rotated object detection data: converting roLabelImg annotations to DOTA format

2022-08-05 06:57:00 【Hengyoucheng】

Welcome to my personal blog, 知行空间.


roLabelImg repository: https://github.com/cgvict/roLabelImg

1. Entering the mode for drawing rotated boxes

[Figure: roLabelImg in rotated-box drawing mode]

2. Annotation file format

To annotate a rotated box in roLabelImg, you first draw an ordinary axis-aligned rectangle and then rotate it clockwise or counterclockwise about its center. The annotation file defines the rotated box in the form (cx, cy, width, height, angle), as follows:

<robndbox>
     <cx>1178.4388</cx>
     <cy>1004.6478</cy>
     <w>319.635</w>
     <h>273.2016</h>
     <angle>0.46</angle>
</robndbox>

(cx, cy) is the pixel coordinate of the center of the rotated box. w is the side length along the image x direction at the moment the initial rectangle is drawn in roLabelImg, and h is the other side. Once the initial rectangle is drawn, the edges that w and h refer to stay fixed regardless of any later rotation. angle is the clockwise angle from the positive X axis to the w edge of the rotated box, in radians, with range [0, pi).
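
As a minimal sketch of reading this structure, the snippet below parses a `robndbox` element with Python's standard `xml.etree.ElementTree`. The surrounding `<annotation>`/`<object>` layout follows the fields used by the conversion code later in this post; the class name `ship` and the helper name `read_rotated_boxes` are hypothetical and only serve the illustration.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical sample, trimmed to the fields used here.
SAMPLE_XML = """
<annotation>
  <object>
    <type>robndbox</type>
    <name>ship</name>
    <robndbox>
      <cx>1178.4388</cx>
      <cy>1004.6478</cy>
      <w>319.635</w>
      <h>273.2016</h>
      <angle>0.46</angle>
    </robndbox>
  </object>
</annotation>
"""


def read_rotated_boxes(xml_text):
    """Return a list of (name, cx, cy, w, h, angle) tuples."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        node = obj.find("robndbox")
        if node is None:  # skip objects without a rotated box
            continue
        values = [float(node.find(k).text) for k in ("cx", "cy", "w", "h", "angle")]
        boxes.append((obj.find("name").text, *values))
    return boxes


boxes = read_rotated_boxes(SAMPLE_XML)
name, cx, cy, w, h, angle = boxes[0]
# The angle is stored in radians; convert to degrees only for display.
print(name, cx, cy, w, h, f"{math.degrees(angle):.1f} deg")
```

The angle is kept in radians throughout, since that is what the conversion formulas below expect.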

Initial rectangular box:

[Figure: initial rectangular box]

Box after adjusting its pose:

[Figure: box after adjusting its pose]

Here $\theta$ is `2.541593` (in radians).

3. DOTA data format

DOTA is an open-source rotated object detection dataset from Wuhan University; see its homepage at https://captain-whu.github.io/DOTA/dataset.html. A DOTA annotation file has the format:

x1, y1, x2, y2, x3, y3, x4, y4, category, difficult

(x1, y1, x2, y2, x3, y3, x4, y4) are the coordinates of the four vertices of the rotated box, category is the class of the object inside the box, and difficult flags whether the instance is hard to detect (1 for difficult, 0 otherwise).
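
To make the field layout concrete, here is a small sketch that splits one label line into its parts. The helper name `parse_dota_line` and the sample line are hypothetical. The sketch accepts either spaces or commas as separators, and it assumes `difficult` defaults to 0 when the field is missing.

```python
def parse_dota_line(line):
    """Split one DOTA label line into (vertices, category, difficult)."""
    fields = line.replace(",", " ").split()
    if len(fields) < 9:
        raise ValueError(f"expected at least 9 fields, got {len(fields)}")
    coords = [float(v) for v in fields[:8]]
    # Pair up the flat list as [(x1, y1), (x2, y2), (x3, y3), (x4, y4)].
    vertices = list(zip(coords[0::2], coords[1::2]))
    category = fields[8]
    # Assumption: treat a missing difficult flag as 0 (not difficult).
    difficult = int(fields[9]) if len(fields) > 9 else 0
    return vertices, category, difficult


# Hypothetical label line for an axis-aligned box.
vertices, category, difficult = parse_dota_line("10 20 110 20 110 70 10 70 ship 0")
print(vertices, category, difficult)
```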

4. Converting roLabelImg annotation files to `DOTA` format

Most open-source rotated object detection algorithms support the DOTA format in their data pipelines, for example SenseTime's open-source mmrotate. To evaluate an algorithm on your own dataset quickly, the most convenient approach is to convert the xml files produced by roLabelImg into the label format above. The conversion from roLabelImg annotations to DOTA splits into four cases:

1) $\theta \in (\pi/2, \pi)$, and the center point C lies to the right of point 1
2) $\theta \in (\pi/2, \pi)$, and the center point C lies to the left of point 1
3) $\theta \in [0, \pi/2]$, and the center point C lies to the left of point 1
4) $\theta \in [0, \pi/2]$, and the center point C lies to the right of point 1

[Figure: the four cases]

Take the case $\theta \in (\pi/2, \pi)$ with the center point C to the right of point 1 as an example:

[Figure: geometry of the example case]

The coordinates of A(x1, y1), B(x3, y3), D(x2, y2) and E(x4, y4) follow from the triangle relations shown above, where d is half the diagonal of the box:

$\beta = \angle CAV_2 = \arctan\frac{h}{w} + \pi - \theta$
$d = \frac{\sqrt{w^2+h^2}}{2}$
$x1 = cx - d\cos\beta \\ y1 = cy + d\sin\beta \\ x2 = cx + d\cos\beta \\ y2 = cy - d\sin\beta \\ x3 = x1 - h\cos(\theta - \frac{\pi}{2}) \\ y3 = y1 - h\sin(\theta - \frac{\pi}{2}) \\ x4 = x2 + h\cos(\theta - \frac{\pi}{2}) \\ y4 = y2 + h\sin(\theta - \frac{\pi}{2})$

The other three cases can be derived in the same way.

The conversion code is:
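
As a cross-check on the case analysis, the four cases can also be collapsed into a single formula by expressing each corner as the center plus or minus half of each edge vector. This is an alternative sketch, not the derivation used in this post. It assumes image coordinates with y pointing down, so that a clockwise angle $\theta$ gives the w edge the direction $(\cos\theta, \sin\theta)$. The name `rbox_to_corners` is hypothetical.

```python
import math


def rbox_to_corners(cx, cy, w, h, angle):
    """Corners of a rotated box; angle is clockwise from +x with y pointing down."""
    ux, uy = math.cos(angle), math.sin(angle)   # unit vector along the w edge
    vx, vy = -math.sin(angle), math.cos(angle)  # unit vector along the h edge
    corners = []
    # Walk the four corners in a consistent order around the box.
    for sw, sh in ((-1, -1), (1, -1), (1, 1), (-1, 1)):
        x = cx + sw * w / 2 * ux + sh * h / 2 * vx
        y = cy + sw * w / 2 * uy + sh * h / 2 * vy
        corners.append((x, y))
    return corners


# With angle = 0 the box is axis-aligned:
# corners are (40, 35), (60, 35), (60, 45), (40, 45).
print(rbox_to_corners(50, 40, 20, 10, 0.0))
```

For an angle in $(\pi/2, \pi)$ with C to the right of point 1, the second corner returned here coincides with A(x1, y1) from the formulas above, which gives a quick way to test a case-by-case implementation.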

import math
import xml.etree.ElementTree as ET

import numpy as np


def convert_rolabelimg2dota(xml_path: str) -> list:
    """
    Args:
        - `xml_path` (str) : path to roLabelImg label file, like /xx/xx.xml
    Returns:
        - `box_points` (list): shape (N, 8 + 1), N is the number of objects,
          8 + 1 is `(x1, y1, x2, y2, x3, y3, x4, y4, class_name)`
    """

    with open(xml_path) as f:
        tree = ET.parse(f)
        root = tree.getroot()
        size = root.find('size')
        width = int(size.find('width').text)
        height = int(size.find('height').text)
        objects = root.iter('object')
        boxes = []  # list of [cx, cy, w, h, angle, class_name], angle is in [0, pi)
        for obj in objects:
            if obj.find('type').text == 'robndbox':
                rbox_node = obj.find('robndbox')
                cat = obj.find('name').text
                rbox = dict()
                for key in ['cx', 'cy', 'w', 'h', 'angle']:
                    rbox[key] = float(rbox_node.find(key).text)
                boxes.append(list((*rbox.values(), cat)))
        print(f"bboxes: {boxes}")

        box_points = [] # list of box defined with four vertices
        for box in boxes:
            cx, cy, w, h, ag, cat = box
            alpha_w = math.atan(w / h)
            alpha_h = math.atan(h / w)
            d = math.sqrt(w**2 + h**2) / 2 
            if ag > math.pi / 2:
                beta = ag - math.pi / 2 + alpha_w
                if beta <= math.pi / 2:
                    x1, y1 = cx + d * math.cos(beta), cy + d * math.sin(beta)
                    x2, y2 = cx - d * math.cos(beta), cy - d * math.sin(beta)
                elif beta > math.pi / 2:
                    beta = math.pi - beta
                    x1, y1 = cx - d * math.cos(beta), cy + d * math.sin(beta)
                    x2, y2 = cx + d * math.cos(beta), cy - d * math.sin(beta)
                x3, y3 = x1 - h * math.cos(ag - math.pi / 2), y1 - h * math.sin(ag - math.pi / 2)
                x4, y4 = x2 + h * math.cos(ag - math.pi / 2), y2 + h * math.sin(ag - math.pi / 2) 
            elif ag <= math.pi / 2:
                beta = ag + alpha_h
                if beta <= math.pi / 2:
                    x1, y1 = cx + d * math.cos(beta), cy + d * math.sin(beta)
                    x2, y2 = cx - d * math.cos(beta), cy - d * math.sin(beta)
                elif beta > math.pi / 2:
                    beta = math.pi - beta
                    x1, y1 = cx - d * math.cos(beta), cy + d * math.sin(beta)
                    x2, y2 = cx + d * math.cos(beta), cy - d * math.sin(beta)
                x3, y3 = x1 - w * math.cos(ag), y1 - w * math.sin(ag)
                x4, y4 = x2 + w * math.cos(ag), y2 + w * math.sin(ag)
            # Runs for both angle branches: collect the vertices, then clip to the image bounds
            points = np.array([x1, y1, x3, y3, x2, y2, x4, y4], dtype=np.int32)
            points[0::2] = np.clip(points[0::2], 0, width)
            points[1::2] = np.clip(points[1::2], 0, height)
            box_points.append([*points, cat])
        return box_points
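
The rows returned by the function above still need to be written out as a label file. A minimal sketch of that step follows. It assumes space-separated fields, as in the official DOTA label files, and a `difficult` flag of 0 for every object, since the rows carry no such value. The helper names `to_dota_lines` and `write_dota_file` are hypothetical.

```python
def to_dota_lines(box_points, difficult=0):
    """Format rows of (x1, y1, ..., x4, y4, class_name) as DOTA label lines."""
    lines = []
    for *coords, category in box_points:
        if len(coords) != 8:
            raise ValueError("each row needs 8 coordinates plus a class name")
        # Assumption: every object gets the same difficult flag.
        fields = [str(int(v)) for v in coords] + [str(category), str(difficult)]
        lines.append(" ".join(fields))
    return lines


def write_dota_file(box_points, txt_path, difficult=0):
    """Write one label line per object to txt_path."""
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write("\n".join(to_dota_lines(box_points, difficult)) + "\n")


# Hypothetical row in the same layout as `box_points`.
rows = [[10, 20, 110, 20, 110, 70, 10, 70, "ship"]]
print(to_dota_lines(rows))
```

Writing one text file per image, named after the image, matches the per-image layout that DOTA-style loaders generally expect.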