Mmrotate trains its dataset from scratch
2022-07-27 08:43:00 【Jiangxiaobai JLJ】
1. Virtual environment installation
step1: Download and install Anaconda. A domestic (China) mirror is available at:
https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/
It is recommended to pick a relatively recent Anaconda release. On that page the upper installers are for 32-bit systems and the lower ones for 64-bit systems (generally you want the 64-bit one).
step2: Switch to domestic package sources
The following commands are run in the Anaconda Prompt.
If you do not switch to a domestic mirror, installing some packages may fail or be extremely slow.
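For example, the Tsinghua (TUNA) mirror can be added like this. Treat the exact channel URLs as an assumption and check the mirror's own help page for the current list:

```shell
# add TUNA mirror channels and show channel URLs when installing
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --set show_channel_urls yes
```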
step3: Create a virtual environment under Anaconda
conda create --name mmrotate python=3.8
conda activate mmrotate
Here mmrotate is the name of the virtual environment and can be changed to whatever you like; python=3.8 pins the Python version.
step4: Download torch and torchvision (installing from local wheel files is more reliable)
https://download.pytorch.org/whl/torch_stable.html
The versions chosen here are torch==1.8.1 and torchvision==0.9.1. Make sure the wheel matches your Python version (e.g. cp38 for Python 3.8); my environment uses cuda10.1.
(One more thing to note: 30-series and newer graphics cards need a cuda11 or later build, otherwise training will fail.)
After downloading the whl files, cd into the download directory from inside the virtual environment, then pip install the torch wheel first and the torchvision wheel second.
step5: Install mmcv_full, mmdetection and mmrotate
With the above in place, install mmcv_full and mmdetection first, because mmrotate is built on top of those two libraries.
mmcv_full: https://download.openmmlab.com/mmcv/dist/cu101/torch1.8.0/index.html
See also: Installation — mmcv 1.6.0 documentation
Download the mmcv_full build that matches your torch/CUDA versions, then install it with pip install.
mmdetection:
pip install mmdet
Finally, install mmrotate:
pip install mmrotate
Here I use the official code release, version 0.3.0. In a cmd window, cd into the mmrotate directory and run
pip install -r requirements.txt
With that, the environment setup is complete.
2. Test whether mmrotate was installed successfully
Modify demo/image_demo.py:
# Copyright (c) OpenMMLab. All rights reserved.
"""Inference on single image.

Example:
    wget -P checkpoint https://download.openmmlab.com/mmrotate/v0.1.0/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90/oriented_rcnn_r50_fpn_1x_dota_le90-6d2b2ce0.pth  # noqa: E501, E261
    python demo/image_demo.py \
        demo/demo.jpg \
        configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90.py \
        work_dirs/oriented_rcnn_r50_fpn_1x_dota_v3/epoch_12.pth
"""
import os
from argparse import ArgumentParser

from mmdet.apis import inference_detector, init_detector, show_result_pyplot

import mmrotate  # noqa: F401

ROOT = os.getcwd()


def parse_args():
    parser = ArgumentParser()
    parser.add_argument(
        '--img', default=os.path.join(ROOT, 'demo.jpg'), help='Image file')
    parser.add_argument(
        '--config',
        default=os.path.join(
            ROOT,
            '../configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90.py'),
        help='Config file')
    parser.add_argument(
        '--checkpoint',
        default=os.path.join(
            ROOT,
            '../pre-models/oriented_rcnn_r50_fpn_1x_dota_le90-6d2b2ce0.pth'),
        help='Checkpoint file')
    parser.add_argument(
        '--device', default='cuda:0', help='Device used for inference')
    parser.add_argument(
        '--palette',
        default='dota',
        choices=['dota', 'sar', 'hrsc', 'hrsc_classwise', 'random'],
        help='Color palette used for visualization')
    parser.add_argument(
        '--score-thr', type=float, default=0.3, help='bbox score threshold')
    return parser.parse_args()


def main(args):
    # build the model from a config file and a checkpoint file
    model = init_detector(args.config, args.checkpoint, device=args.device)
    # test a single image
    result = inference_detector(model, args.img)
    # show the results
    show_result_pyplot(
        model,
        args.img,
        result,
        palette=args.palette,
        score_thr=args.score_thr)


if __name__ == '__main__':
    args = parse_args()
    main(args)
You need to download the pretraining weights yourself; the URL is in the docstring at the top of the script. If the download is slow, you can paste the link into a download manager such as Xunlei.
3. Train your own dataset
Preparing a custom dataset is actually the most troublesome part. MMRotate uses the DOTA dataset format: images are square .png files of size n×n. Don't worry, though; the official project ships toolkit scripts that handle the conversion automatically.
part1: Training dataset preparation
For this part, please refer to my previous blog post.
Below is the code for converting rolabelimg-generated xml files to the DOTA data format:
'''Convert rolabelimg xml annotations to DOTA 8-point annotations.'''
import math
import os
import xml.etree.ElementTree as ET

import cv2
import numpy as np


# rotate the point (xp, yp) by theta around the center (xc, yc)
def rotatePoint(xc, yc, xp, yp, theta):
    xoff = xp - xc
    yoff = yp - yc
    cosTheta = math.cos(theta)
    sinTheta = math.sin(theta)
    pResx = cosTheta * xoff + sinTheta * yoff
    pResy = -sinTheta * xoff + cosTheta * yoff
    return str(int(xc + pResx)), str(int(yc + pResy))


def edit_xml(xml_file):
    if ".xml" not in xml_file:
        return
    tree = ET.parse(xml_file)
    objs = tree.findall('object')
    txt = xml_file.replace(".xml", ".txt")
    png = xml_file.replace(".xml", ".png")
    src = cv2.imread(png, 1)
    with open(txt, 'w') as wf:
        wf.write("imagesource:Google\n")
        # wf.write("gsd:0.115726939386\n")
        for ix, obj in enumerate(objs):
            type = obj.find('type').text
            className = obj.find('name').text
            difficulttext = obj.find('difficult').text
            if type == 'bndbox':
                obj_bnd = obj.find('bndbox')
                xmin = float(obj_bnd.find('xmin').text)
                ymin = float(obj_bnd.find('ymin').text)
                xmax = float(obj_bnd.find('xmax').text)
                ymax = float(obj_bnd.find('ymax').text)
                # DOTA expects the four corners in polygon order:
                # top-left, top-right, bottom-right, bottom-left
                x0text, y0text = str(xmin), str(ymin)
                x1text, y1text = str(xmax), str(ymin)
                x2text, y2text = str(xmax), str(ymax)
                x3text, y3text = str(xmin), str(ymax)
            elif type == 'robndbox':
                obj_bnd = obj.find('robndbox')
                obj_bnd.tag = 'bndbox'  # rename the node
                cx = float(obj_bnd.find('cx').text)
                cy = float(obj_bnd.find('cy').text)
                w = float(obj_bnd.find('w').text)
                h = float(obj_bnd.find('h').text)
                angle = float(obj_bnd.find('angle').text)
                x0text, y0text = rotatePoint(cx, cy, cx - w / 2, cy - h / 2, -angle)
                x1text, y1text = rotatePoint(cx, cy, cx + w / 2, cy - h / 2, -angle)
                x2text, y2text = rotatePoint(cx, cy, cx + w / 2, cy + h / 2, -angle)
                x3text, y3text = rotatePoint(cx, cy, cx - w / 2, cy + h / 2, -angle)
            else:
                continue
            points = np.array(
                [[int(float(x0text)), int(float(y0text))],
                 [int(float(x1text)), int(float(y1text))],
                 [int(float(x2text)), int(float(y2text))],
                 [int(float(x3text)), int(float(y3text))]], np.int32)
            cv2.polylines(src, [points], True, (255, 0, 0))  # draw the polygon
            wf.write("{} {} {} {} {} {} {} {} {} {}\n".format(
                x0text, y0text, x1text, y1text, x2text, y2text,
                x3text, y3text, className, difficulttext))
    # cv2.imshow("ddd", src)
    # cv2.waitKey()


if __name__ == '__main__':
    dir = r"H:\duocicaiji\biaozhu_all"
    filelist = os.listdir(dir)
    for file in filelist:
        edit_xml(os.path.join(dir, file))
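As a quick sanity check on the rotation math, the helper below re-implements rotatePoint with float outputs and verifies two easy cases. This is only a sketch for illustration, not part of the conversion script:

```python
import math

def rotate_point(xc, yc, xp, yp, theta):
    # same math as rotatePoint in the script above, returning floats
    xoff, yoff = xp - xc, yp - yc
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (xc + cos_t * xoff + sin_t * yoff,
            yc - sin_t * xoff + cos_t * yoff)

# theta = 0 leaves the point unchanged
assert rotate_point(5, 5, 8, 5, 0) == (8.0, 5.0)
# rotating the offset (1, 0) about the origin by pi/2 with this
# formula yields (0, -1) in image coordinates (y grows downward)
x, y = rotate_point(0, 0, 1, 0, math.pi / 2)
assert abs(x) < 1e-9 and abs(y + 1) < 1e-9
```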
part2: Dataset splitting and preprocessing
This step divides the whole dataset into training, validation and test sets.
The file structure is as follows (I split 80% / 10% / 10%):
datasets
--train
--images
--labels
--val
--images
--labels
--test
--images
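The 80/10/10 split itself only takes a few lines of Python. This is a sketch under my own assumptions (file names and the fixed seed are illustrative, not from any official tool):

```python
import random

def split_dataset(names, seed=0):
    """Split a list of image file names 80/10/10 into train/val/test."""
    names = sorted(names)            # make the split reproducible
    random.Random(seed).shuffle(names)
    n_train = int(len(names) * 0.8)
    n_val = int(len(names) * 0.1)
    return (names[:n_train],
            names[n_train:n_train + n_val],
            names[n_train + n_val:])

train, val, test = split_dataset([f"img_{i:03d}.png" for i in range(100)])
print(len(train), len(val), len(test))  # 80 10 10
```

After splitting, copy each subset's images and matching label .txt files into the folder structure shown above.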
The next step is to crop the data into n×n tiles. This uses the cropping script shipped with the official project, ../mmrotate-0.3.0/tools/data/dota/split/img_split.py. The script reads parameters from the json files under ./mmrotate-0.3.0/tools/data/dota/split/split_configs and crops the images accordingly. We need to modify those parameters so it loads the train, test and val images and labels prepared above and crops them.
The concrete steps are as follows, using train as the example; val and test are handled the same way. Config files prefixed ss_ perform single-scale cropping, and ms_ multi-scale cropping.
Modify the ss_train.json file under the split_configs folder.
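For reference, a typical single-scale split config looks roughly like the sketch below. The fields to point at your own data are img_dirs, ann_dirs and save_dir; the paths here are hypothetical and the exact key set may differ between mmrotate versions, so compare with the file shipped in your checkout:

```json
{
  "nproc": 10,
  "img_dirs": ["datasets/train/images/"],
  "ann_dirs": ["datasets/train/labels/"],
  "sizes": [1024],
  "gaps": [200],
  "rates": [1.0],
  "img_rate_thr": 0.6,
  "iof_thr": 0.7,
  "no_padding": false,
  "padding_value": [104, 116, 124],
  "save_dir": "datasets/split/train/",
  "save_ext": ".png"
}
```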
After modifying those parameters, edit the base_json variable in img_split.py to point at this config file,
then simply run img_split.py.
Repeat the same cropping for val and test.
At this point the image-cropping preprocessing is complete.
part3: Model training and testing
Take training Rotated Faster R-CNN as the example.
Training:
First, download the model's pretraining weights:
mmrotate/README_zh-CN.md at main · open-mmlab/mmrotate · GitHub
Find the corresponding link there and download the weight file.
Next, modify ./configs/rotated_faster_rcnn/rotated_faster_rcnn_r50_fpn_1x_dota_le90.py.
The main change is the num_classes parameter: set it to the number of categories in your own dataset.
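For instance, with a hypothetical 3-class dataset the relevant override looks roughly like this sketch; the nesting follows the standard mmdet two-stage config layout, so double-check it against the config file in your checkout:

```python
# rotated_faster_rcnn_r50_fpn_1x_dota_le90.py (fragment; 3 classes is an example)
model = dict(
    roi_head=dict(
        bbox_head=dict(
            num_classes=3)))
```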
At the same time, modify the category names in ./mmrotate-0.3.0/mmrotate/datasets/dota.py.
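That means editing the CLASSES tuple in dota.py; the names below are hypothetical placeholders, so replace them with your own categories in the same order used by your annotations:

```python
# in mmrotate/datasets/dota.py, inside the DOTADataset class (example names)
CLASSES = ('car', 'truck', 'bus')
```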
You also need to modify the ./configs/_base_/datasets/dotav1.py file:
# dataset settings
dataset_type = 'DOTADataset'
# modify to the storage path of your cropped dataset
data_root = 'H:/jlj/mmrotate-0.3.0/datasets/split_TL_896/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RResize', img_scale=(1024, 1024)),
    dict(type='RRandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1024, 1024),
        flip=False,
        transforms=[
            dict(type='RResize'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    # batch size per GPU
    samples_per_gpu=2,
    # number of data-loading workers per GPU
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'train/annfiles/',
        img_prefix=data_root + 'train/images/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'val/annfiles/',
        img_prefix=data_root + 'val/images/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'test/images/',
        img_prefix=data_root + 'test/images/',
        pipeline=test_pipeline))
Also modify ./configs/_base_/schedules/schedule_1x.py:
# evaluation
evaluation = dict(interval=5, metric='mAP')  # evaluate every this many epochs
# optimizer
optimizer = dict(type='SGD', lr=0.0025, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=1.0 / 3,
    step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=100)  # total training epochs
checkpoint_config = dict(interval=10)  # save a checkpoint every this many epochs
And ./configs/_base_/default_runtime.py:
# yapf:disable
log_config = dict(
    interval=50,  # print a training log every this many iterations
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
# disable opencv multithreading to avoid system being overloaded
opencv_num_threads = 0
# set multi-process start method as `fork` to speed up the training
mp_start_method = 'fork'
Finally, modify train.py.
There are two main parameters: --config, the model config file to use (I am using Rotated Faster R-CNN), and --work-dir, the path where the trained models and configuration information are saved.
After everything is configured, run train.py and training starts.
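Equivalently, instead of editing defaults inside train.py, you can pass the arguments on the command line; the work-dir path below is a hypothetical example:

```shell
# run from the mmrotate root directory
python tools/train.py \
    configs/rotated_faster_rcnn/rotated_faster_rcnn_r50_fpn_1x_dota_le90.py \
    --work-dir work_dirs/rotated_faster_rcnn_custom
```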
Prediction:
For prediction, modify the path parameters in test.py.
There are three main parameters: --config, the model config file; --checkpoint, the model weight file produced by training; and --show-dir, the path where the prediction results are stored.
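The command-line form looks roughly like this; the checkpoint name and output directory are hypothetical and depend on your training run:

```shell
# run from the mmrotate root directory
python tools/test.py \
    configs/rotated_faster_rcnn/rotated_faster_rcnn_r50_fpn_1x_dota_le90.py \
    work_dirs/rotated_faster_rcnn_custom/epoch_100.pth \
    --show-dir work_dirs/vis_results
```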
The prediction results are then saved under the --show-dir path for inspection.
Reference posts:
【扫盲】MMRotate rotated object detection training (Bilibili video)
https://github.com/open-mmlab/mmrotate