
[Object Detection] YOLOv6: theory walkthrough and a hands-on test on the VisDrone dataset

2022-07-27 08:03:00 【zstar-_】

Preface

This post briefly summarizes how YOLOv6 works and then trains YOLOv6 on the VisDrone dataset.

Background

YOLOv6 is an object detection framework developed by Meituan's Visual Intelligence Department and aimed at industrial applications.
According to the official test results [1], YOLOv6 outperforms YOLOv5 and YOLOX overall. As the figure below shows, YOLOv6s reaches the highest mAP on the COCO validation set among the models compared.
[Figure: mAP comparison on the COCO validation set]

Network structure optimization

Because YOLOv6 has no accompanying paper, the descriptions of its innovations below follow the official introductory blog post [1].

EfficientRep Backbone

Both the Backbone and the Neck used by YOLOv5/YOLOX are built on CSPNet, which relies on multi-branch connections and residual structures. On hardware such as GPUs, this structure increases latency to some extent and lowers memory bandwidth utilization. YOLOv6 therefore redesigns both the Backbone and the Neck.

For the Backbone, it proposes a structure called EfficientRep Backbone, shown in the diagram below:
[Figure: EfficientRep Backbone structure diagram]
RepConv, RepBlock and SimSPPF in the diagram are all new structures; they are not explored in more detail here.

Rep-PAN

For the Neck, YOLOv6 proposes a structure named Rep-PAN, shown in the diagram below:

[Figure: Rep-PAN structure diagram]

Decoupled Head

For the detection head, YOLOv6 borrows the idea from YOLOX and uses a decoupled head (Decoupled Head), in a simplified form. The two are compared in the diagram below:

[Figure: comparison of the YOLOX and YOLOv6 decoupled heads]

Other optimization strategies

Anchor-free

Anchor-free detection is also borrowed from YOLOX. It drops the anchor box (Anchor) mechanism that had been in use ever since YOLOv2 and regresses the location of the target directly. The cost is that it is less stable; the benefit is faster computation.
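
To make "regressing the location directly" a little more concrete, here is a minimal sketch of how an anchor-free head can turn one raw prediction into a box. It assumes the YOLOX decoding convention (centre offsets added to the grid cell index, width and height predicted on a log scale, everything scaled by the stride). The function name `decode_anchor_free` is made up for illustration, and YOLOv6's own head may differ in detail.

```python
import math

def decode_anchor_free(tx, ty, tw, th, grid_x, grid_y, stride):
    """Decode one raw prediction into (cx, cy, w, h) in input-image pixels.

    Assumption: YOLOX-style decoding. No anchor box sizes are involved;
    the only priors are the grid cell position and the feature-map stride.
    """
    cx = (grid_x + tx) * stride   # centre = cell index + predicted offset
    cy = (grid_y + ty) * stride
    w = math.exp(tw) * stride     # size is predicted on a log scale
    h = math.exp(th) * stride
    return cx, cy, w, h

# An all-zero prediction in cell (3, 2) of a stride-8 feature map
print(decode_anchor_free(0.0, 0.0, 0.0, 0.0, grid_x=3, grid_y=2, stride=8))
```

The example prints (24.0, 16.0, 8.0, 8.0): with no anchor sizes to fall back on, an all-zero prediction is simply a box one stride wide centred on the cell's position.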

SimOTA

SimOTA is a strategy for assigning positive and negative samples. It was also proposed by YOLOX, and I covered it in my earlier post, [Object Detection] From YOLOv1 to YOLOX (theory overview).
Put simply, the problem that positive/negative assignment has to solve is this: when there are too many predicted boxes, how do we discard the low-quality ones and keep the high-quality ones (the positive samples) for the loss computation?
SimOTA defines the following cost formula:

[Figure: SimOTA cost formula]

For each predicted box, compute its IoU with the ground-truth box and its classification loss, then combine the two with weights into a total cost. Next, sort the boxes by their IoU with the ground-truth box, sum the IoUs and take the integer part; the result is the number of positive samples.
For example, in the figure below [2] the integer part of the sum is 2, so the first two boxes are selected as positive samples.

[Figure: example of SimOTA positive sample selection]

SIoU

Earlier bounding-box regression losses include IoU, GIoU, CIoU and DIoU. YOLOv6 adopts the SIoU loss, which takes into account the angle of the vector between the predicted box and the target box and redefines the distance loss.
The paper is available at: https://arxiv.org/abs/2205.12740
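
As a rough illustration of how the angle enters the loss, here is a plain-Python sketch that follows the formulas in the SIoU paper as I read them: an angle cost, a distance cost whose sharpness depends on the angle cost, and a shape cost. Boxes are given as (cx, cy, w, h) with positive sizes. The function name `siou_loss` and the default exponent `theta=4` are illustrative choices; this is not the implementation in the YOLOv6 repository.

```python
import math

def siou_loss(box, gt, theta=4.0, eps=1e-9):
    """SIoU loss sketch for two boxes given as (cx, cy, w, h)."""
    (cx, cy, w, h), (gx, gy, gw, gh) = box, gt
    # Corner coordinates of both boxes
    x1, y1, x2, y2 = cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2
    gx1, gy1, gx2, gy2 = gx - gw / 2, gy - gh / 2, gx + gw / 2, gy + gh / 2
    # Plain IoU
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    iou = inter / (w * h + gw * gh - inter + eps)
    # Width and height of the smallest box enclosing both
    enc_w = max(x2, gx2) - min(x1, gx1)
    enc_h = max(y2, gy2) - min(y1, gy1)
    # Angle cost: 0 when the centres are axis-aligned, 1 at 45 degrees
    dx, dy = gx - cx, gy - cy
    sigma = math.hypot(dx, dy)
    sin_alpha = min(abs(dy) / sigma, 1.0) if sigma > eps else 0.0
    angle_cost = 1 - 2 * math.sin(math.asin(sin_alpha) - math.pi / 4) ** 2
    # Distance cost, modulated by the angle cost through gamma
    gamma = 2 - angle_cost
    rho_x = (dx / (enc_w + eps)) ** 2
    rho_y = (dy / (enc_h + eps)) ** 2
    distance_cost = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))
    # Shape cost: penalises mismatched width and height
    omega_w = abs(w - gw) / max(w, gw)
    omega_h = abs(h - gh) / max(h, gh)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta
    return 1 - iou + (distance_cost + shape_cost) / 2

box = (0.0, 0.0, 10.0, 10.0)
print(siou_loss(box, box))                     # close to 0 for identical boxes
print(siou_loss(box, (5.0, 0.0, 10.0, 10.0)))  # larger for a shifted box
```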

On the theory side, YOLOv6 does not bring much that is new. Let's move on to practice and see how it performs.

Hands-on practice

Overall, the YOLOv6 code is broadly similar to YOLOv5's, though many small details have been changed.
For example, the training, evaluation and inference scripts are tucked away in the tools folder, which makes input file paths awkward. In inferer.py, for instance, the path is missing one directory level and has to be fixed by hand.

[Figure: the path in inferer.py that needs to be fixed]

Dataset restructuring

YOLOv6 also changes how datasets are read, so the VisDrone dataset prepared in [Object Detection] Running YOLOv5 on the VisDrone dataset cannot be used directly and needs to be reorganized as follows.

[Figure: dataset folder layout]

Images and labels each go in their own top-level folder, and each of those contains three subfolders whose names are fixed: train, test and val.

The reason can be seen in the following lines of the data-loading code.

[Figure: data-loading code]
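
The layout described above can be checked or created with a short script. This is only a sketch. It assumes that the loader finds the labels by swapping the images folder for a sibling labels folder and reusing the split name, which is what the fixed folder names suggest. The function names (`label_dir_for`, `missing_dirs`, `create_layout`) are made up for this example.

```python
from pathlib import Path

SPLITS = ("train", "val", "test")  # the fixed subfolder names

def label_dir_for(image_dir):
    """Map <root>/images/<split> to <root>/labels/<split> (assumed convention)."""
    image_dir = Path(image_dir)
    return image_dir.parent.parent / "labels" / image_dir.name

def missing_dirs(root):
    """Return the folders the assumed layout requires but that do not exist."""
    root = Path(root)
    missing = []
    for split in SPLITS:
        image_dir = root / "images" / split
        for folder in (image_dir, label_dir_for(image_dir)):
            if not folder.is_dir():
                missing.append(folder)
    return missing

def create_layout(root):
    """Create any missing images/<split> and labels/<split> folders."""
    for folder in missing_dirs(root):
        folder.mkdir(parents=True)
```

Calling `missing_dirs` on the dataset root before training gives a quick list of anything that is misnamed or absent.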

I have also packaged my processed VisDrone dataset here, so readers can download it directly:
https://pan.baidu.com/s/1u0OZ05r48Yi6Wwi7TcqI_g?pwd=8888

Note: by default VisDrone only comes with xml-format labels; the txt labels are generated by a script. See my previous post, [Object Detection] Running YOLOv5 on the VisDrone dataset, for the details.
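
For readers who have not seen such a conversion, here is a minimal sketch of turning one xml annotation into YOLO txt lines. It assumes Pascal VOC style tags (size/width, size/height, object/name, object/bndbox); the script in my previous post may differ, and the function name `voc_xml_to_yolo_lines` is only illustrative.

```python
import xml.etree.ElementTree as ET

def voc_xml_to_yolo_lines(xml_text, class_names):
    """Convert one Pascal VOC style annotation into YOLO txt lines.

    Each line is: class_id x_center y_center width height,
    with the four box values normalised by the image size.
    """
    root = ET.fromstring(xml_text)
    img_w = float(root.find("size/width").text)
    img_h = float(root.find("size/height").text)
    lines = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        if name not in class_names:
            continue  # skip classes that are not being trained
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        x_c = (xmin + xmax) / 2 / img_w
        y_c = (ymin + ymax) / 2 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{class_names.index(name)} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
    return lines

# A made-up annotation: one car in a 100 x 200 image
SAMPLE = """<annotation>
  <size><width>100</width><height>200</height></size>
  <object><name>car</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>30</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>"""
print(voc_xml_to_yolo_lines(SAMPLE, ["pedestrian", "people", "bicycle", "car"]))
```

The class index is the position of the name in the list, so the order must match the names entry of the dataset yaml file.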

Once the data is prepared, create a new file mydata.yaml in the data folder, just as with YOLOv5,
with the following content:

train: D:/Dataset/VisDrone_for_YOLOv6/images/train  # train images
val: D:/Dataset/VisDrone_for_YOLOv6/images/val  # val images
test: D:/Dataset/VisDrone_for_YOLOv6/images/test   # test images

is_coco: False

nc: 10  # number of classes
names: [ 'pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor' ]

Change the paths here to your own.

Testing the results

YOLOv6 comes in three models: yolov6s, yolov6n and yolov6t. I trained yolov6s on the VisDrone dataset for 100 epochs, which took 13 hours in total on an RTX 2060. Training is considerably faster than with YOLOv5.
In testing, AP is 32.5% at IoU=0.50 and 17.4% at IoU=0.50:0.95, which falls short of the two YOLOv5 variants I tested earlier (those figures are in my previous post, [Object Detection] TPH-YOLOv5: a transformer-based improvement of YOLOv5 for UAV object detection).

Next, let's test on a video.

It fails with an error:

Switch model to deploy modality.

Checking the official issues, it turns out that inference currently supports only images; video is not supported yet.

So I ran it on images from the VisDrone test set instead. The results are as follows:
[Figure: detection result on the first test image]

The first image is detected well, and most of the targets are identified.
The result on the second image is surprising: only three targets are detected, and all the other bicycle targets are missed!

[Figure: detection result on the second test image]

My thoughts

YOLOv6 focuses on model deployment. The project supports exporting to formats such as ONNX and TensorRT, and judging from the official comparison experiments, the test environment is mostly embedded devices such as the Nano. YOLOv6 may well have more of an edge in real production environments, but in terms of pure algorithmic performance its advantages are not obvious. Also, a lot of it is borrowed from YOLOX, so calling it "YOLOX PLUS" would not be going too far.
The team behind YOLOv4 has now released YOLOv7, which makes YOLOv6 something of a transitional work. On the whole it feels rushed: it is clearly not polished yet and was pushed out to stake a claim on the name. Still, as research from a Chinese team, I look forward to its further development and improvement.

Code backup

Note that the views in this post are only conclusions I reached after training yolov6s.pt for 100 epochs; the actual performance needs further testing. I have also backed up the code here (including the pre-trained weights for the 3 models):
https://pan.baidu.com/s/1GIOZq3EgzzVDjs3zZP_dKQ?pwd=8888