[Object detection] YOLOv6: theory overview + hands-on test on the VisDrone dataset
2022-07-27 08:03:00 【zstar-_】
Preface
This post briefly summarizes how YOLOv6 works, then trains YOLOv6 on the VisDrone dataset.
Background
YOLOv6 is an object detection framework developed by Meituan's Visual Intelligence Department, aimed at industrial applications.
According to the official test results [1], YOLOv6's overall performance exceeds that of YOLOv5 and YOLOX. As shown in the figure below, YOLOv6s achieves the highest mAP on the COCO validation set.
Network structure optimization
Because YOLOv6 has no accompanying paper, the descriptions of its innovations below follow the official introductory blog post [1].
EfficientRep Backbone
The Backbone and Neck used by YOLOv5/YOLOX are both built on CSPNet, with a multi-branch design and residual structures. On hardware such as GPUs, this structure adds some latency and lowers memory-bandwidth utilization. YOLOv6 therefore redesigns both the Backbone and the Neck.
For the Backbone, it proposes a structure called EfficientRep Backbone, shown in the diagram below:
RepConv, RepBlock, and SimSPPF in the diagram are all new structures; they are not explored in more detail here.
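The "Rep" in these names refers to structural re-parameterization: a block is trained with several parallel branches and then folded into a single convolution for inference, which is how the multi-branch latency cost mentioned above can be avoided. The sketch below is only a toy illustration of that idea in the RepVGG style (single channel, no batch norm, integer weights); it is an assumption about the general technique, not YOLOv6's actual RepConv code, and all function names are made up for this example.

```python
def conv3x3_same(image, kernel):
    """Single-channel 3x3 convolution (cross-correlation) with zero padding."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    yy, xx = y + ky - 1, x + kx - 1
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[ky][kx] * image[yy][xx]
            out[y][x] = acc
    return out


def multi_branch(image, k3, k1):
    """Training-time form: 3x3 branch + 1x1 branch + identity branch."""
    conv = conv3x3_same(image, k3)
    return [[conv[y][x] + k1 * image[y][x] + image[y][x]
             for x in range(len(image[0]))]
            for y in range(len(image))]


def fuse_branches(k3, k1):
    """Inference-time form: fold the 1x1 and identity branches into one 3x3 kernel."""
    fused = [row[:] for row in k3]
    fused[1][1] += k1 + 1  # the 1x1 weight and the identity both act on the center tap
    return fused


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
k3 = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
k1 = 3

# Three branches and the single fused convolution give identical outputs.
assert multi_branch(image, k3, k1) == conv3x3_same(image, fuse_branches(k3, k1))
```

The fused form does one convolution per layer instead of three branches plus an addition, so the inference graph is a plain stack of 3x3 convolutions.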
Rep-PAN
For the Neck, YOLOv6 proposes a structure named Rep-PAN, shown in the diagram below:

Decoupled Head
For the detection head, YOLOv6 borrows the idea from YOLOX and uses a decoupled head (Decoupled Head) structure, which it further simplifies. The two are compared in the diagram below:

Other optimization strategies
Anchor-free
Anchor-free detection is also borrowed from YOLOX. It drops the anchor box (Anchor) mechanism that had been in use since YOLOv2 and regresses the target's location directly. The cost is reduced stability; the benefit is faster computation.
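To make "regresses the location directly" concrete, here is a minimal sketch of YOLOX-style anchor-free decoding: each grid cell predicts an offset from its own position and a log-scale size, with no anchor box dimensions involved. Whether YOLOv6 uses exactly this parameterization is an assumption on my part; the function and argument names are invented for the example.

```python
import math


def decode_anchor_free(tx, ty, tw, th, grid_x, grid_y, stride):
    """Turn one cell's raw outputs into a box (cx, cy, w, h) in input-image pixels.

    tx, ty: predicted offset of the box center from the cell's top-left corner
    tw, th: predicted log-scale width and height
    grid_x, grid_y: integer cell index on the feature map
    stride: downsampling factor of that feature map (e.g. 8, 16, 32)
    """
    cx = (grid_x + tx) * stride
    cy = (grid_y + ty) * stride
    w = math.exp(tw) * stride
    h = math.exp(th) * stride
    return cx, cy, w, h


# A cell at (2, 3) on the stride-8 map, predicting a centered, stride-sized box.
print(decode_anchor_free(0.5, 0.5, 0.0, 0.0, grid_x=2, grid_y=3, stride=8))
```

With anchors, the size terms would be multiplied by a preset anchor width and height; here only the stride appears, so there is one prediction per cell instead of one per anchor.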
SimOTA
SimOTA is a positive/negative sample matching strategy, also proposed by YOLOX. I mentioned it in my earlier post 【Object detection】From YOLOv1 to YOLOX (theory review).
Simply put, assigning positive and negative samples solves one problem: when there are too many predicted boxes, how to discard the low-quality ones and keep the high-quality ones (positive samples) for the loss computation.
The formula SimOTA defines is as follows:

For each predicted box, compute its IoU with the ground-truth box and its classification loss, then weight them into a total loss. Next, sort the boxes by their IoU with the ground-truth box, sum the IoUs of all boxes, and round the sum to get the number of positive samples.
For example, in the figure below [2], the result after rounding is 2, so the first two boxes are selected as positive samples.
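The selection step can be sketched as follows for a single ground-truth box. In YOLOX, as I understand its reference implementation, the cost of a pair is the classification loss plus a weighted IoU loss, the IoU sum is taken over at most the 10 highest-IoU candidates and truncated to an integer (with a minimum of 1), and the positives are the predictions with the lowest cost. Those three details, and every name in the code, are assumptions for illustration rather than something stated in this post.

```python
def select_positives(ious, costs, max_candidates=10):
    """Pick positive predictions for ONE ground-truth box (dynamic-k selection).

    ious[i]:  IoU between prediction i and the ground-truth box
    costs[i]: weighted total loss of prediction i (lower is better)
    Returns the indices of the predictions chosen as positive samples.
    """
    # Sum the largest IoUs and truncate to an integer to get k (at least 1).
    top_ious = sorted(ious, reverse=True)[:max_candidates]
    dynamic_k = max(1, int(sum(top_ious)))
    # Keep the k predictions with the lowest cost.
    ranked = sorted(range(len(costs)), key=lambda i: costs[i])
    return ranked[:dynamic_k]


ious = [0.9, 0.8, 0.4, 0.1, 0.05]   # sums to about 2.25, so k = 2
costs = [0.3, 0.5, 1.2, 2.5, 3.0]
print(select_positives(ious, costs))
```

Because k is derived from the IoU sum, a ground-truth box that many predictions overlap well receives more positive samples than one that is poorly covered.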

SIoU
Earlier bounding-box regression losses include IoU, GIoU, CIoU, and DIoU. YOLOv6 adopts the SIoU loss, which takes into account the angle of the vector between the boxes being regressed and redefines the distance loss.
The paper is available at: https://arxiv.org/abs/2205.12740
In theory, YOLOv6 offers little that is new. Let's move on to practice and see how it performs.
Hands-on use
Overall, YOLOv6's code is broadly similar to YOLOv5's, but many small details have been changed.
For example, the training, evaluation, and inference scripts are tucked away under the tools folder, which makes the input file paths awkward. In inferer.py, for instance, the path is missing one directory level and has to be fixed by hand.
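As a rough illustration of how the angle enters the loss, below is a plain-Python sketch of SIoU for one pair of boxes, written from my reading of the paper: an angle cost modulates the distance cost, and a shape cost is added on top of 1 - IoU. It is not YOLOv6's implementation; the exponent theta=4 and the eps guard are assumptions chosen for the example.

```python
import math


def siou_loss(box, gt, theta=4.0, eps=1e-9):
    """SIoU loss for two boxes given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    iou = inter / (w * h + gw * gh - inter + eps)

    # Width and height of the smallest box enclosing both.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Vector between the two box centers.
    dx = (gx1 + gx2) / 2 - (x1 + x2) / 2
    dy = (gy1 + gy2) / 2 - (y1 + y2) / 2
    sigma = math.hypot(dx, dy) + eps

    # Angle cost: 0 when the centers are axis-aligned, 1 at 45 degrees.
    sin_alpha = min(1.0, abs(dy) / sigma)
    angle_cost = math.sin(2 * math.asin(sin_alpha))

    # Distance cost, with its sharpness controlled by the angle cost.
    gamma = 2 - angle_cost
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost: penalizes mismatched width and height.
    omega_w = abs(w - gw) / (max(w, gw) + eps)
    omega_h = abs(h - gh) / (max(h, gh) + eps)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1 - iou + (distance_cost + shape_cost) / 2


print(siou_loss((0, 0, 10, 10), (0, 0, 10, 10)))    # identical boxes: close to 0
print(siou_loss((0, 0, 10, 10), (20, 20, 30, 30)))  # disjoint boxes: greater than 1
```

Unlike plain IoU, the loss still changes with the center offset when the boxes do not overlap, so it keeps giving a usable signal for far-away predictions.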

Data set transformation
YOLOv6 also changes how the dataset is fed in, so the VisDrone dataset from my post 【Object detection】YOLOv5 runs through the VisDrone dataset cannot be used directly and needs the following restructuring.


Images and labels each go in their own top-level folder, each containing three subfolders whose names are fixed as train, test, and val.
The reason can be seen in the following lines of the data-loading code.
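As a hedged sketch of that layout (this is not the YOLOv6 source), the helper below creates the folder tree and shows how a label path could be derived from an image path. It assumes the loader finds labels by mirroring the images tree under a sibling labels folder; the function names and the sample file name are made up for the example.

```python
import os
import tempfile

SPLITS = ("train", "val", "test")


def make_layout(root):
    """Create root/images/<split> and root/labels/<split> for every split."""
    for kind in ("images", "labels"):
        for split in SPLITS:
            os.makedirs(os.path.join(root, kind, split), exist_ok=True)


def label_path_for(image_path):
    """Map .../images/<split>/name.jpg to .../labels/<split>/name.txt.

    Assumption: the labels folder mirrors the images folder one to one.
    """
    split_dir, filename = os.path.split(image_path)
    images_dir, split = os.path.split(split_dir)
    root = os.path.dirname(images_dir)
    stem = os.path.splitext(filename)[0]
    return os.path.join(root, "labels", split, stem + ".txt")


with tempfile.TemporaryDirectory() as root:
    make_layout(root)
    image = os.path.join(root, "images", "train", "0000001.jpg")  # hypothetical file
    print(os.path.relpath(label_path_for(image), root))
```

With fixed folder names, the data yaml only has to point at the image folders, and the matching labels are found by path substitution.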

I have also packaged my processed VisDrone dataset here for readers to download directly:
https://pan.baidu.com/s/1u0OZ05r48Yi6Wwi7TcqI_g?pwd=8888
Note: by default VisDrone only has xml-format labels; the txt labels are generated by a script. See my previous post 【Object detection】YOLOv5 runs through the VisDrone dataset for the details.
Once that is done, as with YOLOv5, create a new mydata.yaml under the data folder
with the following content:
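For readers who have not seen the target format: a YOLO-style txt label holds one line per object, with a class index followed by the box center and size, all normalized by the image dimensions. The function below is a minimal sketch of that last conversion step, assuming the box is already available in pixels as left, top, width, height; it is not the script from my earlier post, and its name is invented for the example.

```python
def to_yolo_line(class_id, left, top, width, height, img_w, img_h):
    """Convert one pixel-space box to a 'class cx cy w h' line in normalized units."""
    cx = (left + width / 2) / img_w
    cy = (top + height / 2) / img_h
    return (f"{class_id} {cx:.6f} {cy:.6f} "
            f"{width / img_w:.6f} {height / img_h:.6f}")


# A 100x50 box in the top-left corner of a 200x100 image, class index 3 ('car').
print(to_yolo_line(3, 0, 0, 100, 50, img_w=200, img_h=100))
```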
train: D:/Dataset/VisDrone_for_YOLOv6/images/train # train images
val: D:/Dataset/VisDrone_for_YOLOv6/images/val # val images
test: D:/Dataset/VisDrone_for_YOLOv6/images/test # test images
is_coco: False
nc: 10 # number of classes
names: [ 'pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor' ]
Change the paths here to your own.
Performance test
YOLOv6 comes in three models: yolov6s, yolov6n, and yolov6t. I trained yolov6s on the VisDrone dataset for 100 epochs, which took 13 hours in total (on an RTX 2060 graphics card). Training is noticeably faster than with YOLOv5.
On the test, AP at IoU=0.50 is 32.5% and AP at IoU=0.50:0.95 is 17.4%. These numbers fall short of the two YOLOv5 versions I tested earlier (those results are in my previous post 【Object detection】TPH-YOLOv5: improving yolov5 with transformers for UAV target detection).
Next, I tested on a video.
It failed with an error:
Switch model to deploy modality.
Checking the official issues, it turns out that inference currently supports only images; video is not supported yet.
So I ran it on images from the VisDrone test set instead. The results are as follows:
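Two slips are easy to make when editing this file: nc no longer matching the length of names, and a path that does not end in the fixed images/<split> folder. The sketch below checks both on a plain dict holding the same keys as the yaml; it is a hypothetical helper, not part of YOLOv6, and the path rule it enforces is the layout assumption described above.

```python
def check_dataset_config(cfg):
    """Return a list of human-readable problems found in a dataset config dict."""
    problems = []
    if cfg["nc"] != len(cfg["names"]):
        problems.append(f"nc is {cfg['nc']} but names has {len(cfg['names'])} entries")
    for split in ("train", "val", "test"):
        # Normalize separators so Windows and POSIX paths are treated alike.
        parts = cfg[split].replace("\\", "/").rstrip("/").split("/")
        if parts[-2:] != ["images", split]:
            problems.append(f"{split} path should end in images/{split}: {cfg[split]}")
    return problems


config = {
    "train": "D:/Dataset/VisDrone_for_YOLOv6/images/train",
    "val": "D:/Dataset/VisDrone_for_YOLOv6/images/val",
    "test": "D:/Dataset/VisDrone_for_YOLOv6/images/test",
    "is_coco": False,
    "nc": 10,
    "names": ["pedestrian", "people", "bicycle", "car", "van", "truck",
              "tricycle", "awning-tricycle", "bus", "motor"],
}
print(check_dataset_config(config))
```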

Detection on the first image is good: most of the targets are identified.
The result on the second image is surprising: only three targets were detected, and all the other bicycle targets were missed!

My thoughts
YOLOv6 focuses on model deployment. The project supports exporting to formats such as ONNX and TensorRT, and in the experimental comparisons published by the official team, the test environment is mostly embedded devices of the nano kind. YOLOv6 probably has more of an edge in real production environments, but in terms of pure algorithmic performance its advantages are not obvious. In addition, much of it is borrowed from YOLOX, so calling it "YOLOX PLUS" would not be an exaggeration.
The YOLOv4 author team has now released YOLOv7, which makes YOLOv6 a transitional work. Overall it feels rushed: it was clearly pushed out before it was finished in order to stake a claim. Still, as research from a Chinese team, I look forward to its further development and improvement.
Code backup
Note that the views in this post are only the conclusions I reached after training from yolov6s.pt for 100 epochs; actual performance needs further testing. I am also backing up the code here (including pre-trained weights for the 3 models):
https://pan.baidu.com/s/1GIOZq3EgzzVDjs3zZP_dKQ?pwd=8888
References
[1] https://blog.csdn.net/MeituanTech/article/details/125437630
[2] https://blog.csdn.net/lzzzzzzm/article/details/123133069