YOLOv6: a fast and accurate object detection framework, now open source
2022-06-26 21:45:00 【Meituan technical team】
Recently, Meituan's Visual Intelligence Department developed YOLOv6, an object detection framework dedicated to industrial applications that targets both detection accuracy and inference efficiency. During development, the department continuously explored and optimized its own designs while drawing on recent advances and research results from academia and industry. In experiments on COCO, the authoritative object detection dataset, YOLOv6 surpasses other algorithms of comparable size in both detection accuracy and speed. It also supports deployment on a variety of platforms, which greatly simplifies adaptation work during project deployment. We are open-sourcing it here in the hope that it helps more people.
1. Overview
YOLOv6 is an object detection framework developed by Meituan's Visual Intelligence Department and dedicated to industrial applications. The framework targets both detection accuracy and inference efficiency. Among the model sizes commonly used in industry, YOLOv6-nano reaches 35.0% AP on COCO with an inference speed of 1242 FPS on a T4, and YOLOv6-s reaches 43.1% AP on COCO with an inference speed of 520 FPS on a T4. For deployment, YOLOv6 supports GPU (TensorRT), CPU (OpenVINO), ARM (MNN, TNN, NCNN), and other platforms, which greatly simplifies adaptation work during project deployment.
The project is now open source on GitHub: YOLOv6. If you find it useful, feel free to star the repository so you can come back to it at any time.
A new framework that far surpasses YOLOv5 and YOLOX in accuracy and speed
Object detection is a fundamental technology in computer vision and is widely used in industry. Thanks to their strong overall performance, YOLO-series algorithms have gradually become the preferred framework for most industrial applications. The industry has produced many YOLO detection frameworks, of which YOLOv5[1], YOLOX[2], and PP-YOLOE[3] are the most representative. In practice, however, we found that these frameworks still leave considerable room for improvement in both speed and accuracy. We therefore studied and drew on existing advanced techniques in the industry and developed a new object detection framework, YOLOv6. The framework covers the full chain of industrial requirements, including model training, inference, and multi-platform deployment, and improves the algorithm in areas such as network structure and training strategy. On the COCO dataset, YOLOv6 surpasses other algorithms of comparable size in both accuracy and speed. The results are shown in Figure 1 below:

Figure 1-1 Performance comparison between YOLOv6 models of each size and other models

Figure 1-1 compares the performance of each detection algorithm across network sizes. The points on each curve represent that algorithm's performance at different network sizes (s/tiny/nano). As the figure shows, YOLOv6 surpasses the other YOLO-series algorithms of comparable size in both accuracy and speed.
Figure 1-2 compares the performance of each detection network as the input resolution changes. From left to right, the points on each curve show model performance as the image resolution increases (384/448/512/576/640). As the figure shows, YOLOv6 retains a large performance advantage at every resolution.
2. Key technologies of YOLOv6
YOLOv6 makes improvements mainly to the Backbone, the Neck, the Head, and the training strategy:
We designed a more efficient Backbone and Neck in a unified way: inspired by hardware-aware neural network design, we built the re-parameterizable, more efficient EfficientRep Backbone and Rep-PAN Neck on the RepVGG style[4].
We designed a simpler and more effective Efficient Decoupled Head that maintains accuracy while further reducing the extra latency that a typical decoupled head introduces.
For training, we adopt the anchor-free paradigm, supplemented by the SimOTA[2] label assignment strategy and the SIoU[9] bounding box regression loss, to further improve detection accuracy.
2.1 Hardware-friendly backbone network design
The Backbone and Neck used by YOLOv5 and YOLOX are both built on CSPNet[5], which uses a multi-branch structure with residual connections. On hardware such as GPUs, this structure increases latency to some extent and reduces memory bandwidth utilization. Figure 2 below is an introductory diagram of the Roofline Model[8] from computer architecture, which shows the relationship between computing power and memory bandwidth on a given piece of hardware.
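The Roofline Model bounds the throughput an operator can reach by the smaller of two limits: the hardware's peak compute, and its memory bandwidth multiplied by the operator's arithmetic intensity (FLOPs per byte moved). The sketch below only illustrates that bound. The function name and the hardware numbers are assumptions chosen for round arithmetic, not measured T4 figures.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, intensity_flops_per_byte):
    """Roofline bound: min(peak compute, memory bandwidth * arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gb_s * intensity_flops_per_byte)


# Hypothetical accelerator (illustrative numbers only, not a real device spec).
PEAK_GFLOPS = 8000.0
BANDWIDTH_GB_S = 320.0

# Arithmetic intensity above which an operator stops being memory-bound.
ridge_point = PEAK_GFLOPS / BANDWIDTH_GB_S

# A low-intensity operator, such as an element-wise add, is limited by bandwidth.
low = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GB_S, 0.5)
# A high-intensity operator, such as a dense convolution, is limited by compute.
high = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GB_S, 100.0)

print(ridge_point, low, high)
```

Under these assumed numbers, operators below the ridge point leave most of the compute units idle, which is the motivation for preferring dense, compute-intensive structures on GPUs.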

We therefore redesigned and optimized the Backbone and Neck following the idea of hardware-aware neural network design. This approach starts from the characteristics of the hardware and of the inference and compilation frameworks, and takes hardware- and compiler-friendly structures as its design principle. When building a network, it jointly considers hardware computing power, memory bandwidth, compiler optimization characteristics, and the representational capacity of the network, in order to obtain a structure that is both fast and accurate. In YOLOv6 we call the two redesigned detection components EfficientRep Backbone and Rep-PAN Neck. Their main contributions are:
Introducing the RepVGG[4]-style structure.
Redesigning the Backbone and Neck based on hardware-aware design.
The RepVGG[4]-style structure is re-parameterizable: it has a multi-branch topology during training, and at deployment time it can be fused into an equivalent single 3x3 convolution (the fusion process is shown in Figure 3 below). The fused 3x3 convolution makes effective use of compute-intensive hardware such as GPUs, and it also benefits from highly optimized libraries such as NVIDIA cuDNN on GPU and Intel MKL on CPU.
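The fusion can be checked numerically on a toy case. The sketch below is a simplified illustration, not the YOLOv6 implementation: it uses a single channel, plain Python lists, and no BatchNorm (the real RepVGG block also folds each branch's BatchNorm into the kernel and bias before summing). All function names are hypothetical. It shows that a 3x3 branch, a 1x1 branch, and an identity branch add up to one 3x3 convolution whose center tap absorbs the other two branches.

```python
def conv3x3(image, kernel, bias):
    """Single-channel 3x3 convolution, stride 1, zero padding 1."""
    h, w = len(image), len(image[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            acc = bias
            for di in range(3):
                for dj in range(3):
                    y, x = i + di - 1, j + dj - 1
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di][dj] * image[y][x]
            row.append(acc)
        out.append(row)
    return out


def training_time_block(image, k3, b3, k1, b1):
    """Multi-branch form: 3x3 conv + 1x1 conv + identity, summed."""
    y3 = conv3x3(image, k3, b3)
    return [
        [y3[i][j] + (k1 * image[i][j] + b1) + image[i][j]
         for j in range(len(image[0]))]
        for i in range(len(image))
    ]


def fuse_branches(k3, b3, k1, b1):
    """Fold the 1x1 and identity branches into the center tap of the 3x3 kernel."""
    fused = [row[:] for row in k3]
    fused[1][1] += k1 + 1.0
    return fused, b3 + b1


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k3 = [[0.5, -1.0, 0.25], [0.0, 2.0, -0.5], [1.0, 0.75, -0.25]]
b3, k1, b1 = 0.5, -1.5, 0.25

multi = training_time_block(image, k3, b3, k1, b1)
fused_kernel, fused_bias = fuse_branches(k3, b3, k1, b1)
single = conv3x3(image, fused_kernel, fused_bias)

max_diff = max(abs(a - b) for ra, rb in zip(multi, single) for a, b in zip(ra, rb))
print(max_diff)
```

Because convolution is linear, the two forms produce the same output, so the deployed network runs one 3x3 convolution per block instead of three branches.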
Experiments show that with these strategies, YOLOv6 reduces latency on hardware while significantly improving accuracy, making the detection network both faster and stronger. Taking the nano-size model as an example, compared with the network structure used by YOLOv5-nano, this approach improves speed by 21% and accuracy by 3.6% AP.

EfficientRep Backbone: for the Backbone, we designed an efficient network based on the Rep operator described above. Compared with the CSP-Backbone used by YOLOv5, this Backbone makes efficient use of the computing power of hardware such as GPUs while also having strong representational capacity.
Figure 4 below shows the detailed structure of the EfficientRep Backbone. We replaced the ordinary stride=2 Conv layers in the Backbone with stride=2 RepConv layers, and redesigned the original CSP-Blocks as RepBlocks, in which the first RepConv transforms and aligns the channel dimension. We also optimized the original SPPF into the more efficient SimSPPF.

Rep-PAN: for the Neck, to make inference on hardware more efficient and strike a better balance between accuracy and speed, we designed a more effective feature fusion network for YOLOv6 based on hardware-aware neural network design.
Rep-PAN is based on the PAN[6] topology. It replaces the CSP-Block used in YOLOv5 with RepBlock and adjusts the operators throughout the Neck, so that inference on hardware is efficient while good multi-scale feature fusion capability is preserved (the Rep-PAN structure is shown in Figure 5 below).

2.2 A simpler and more efficient Decoupled Head
YOLOv6 uses a decoupled detection head (Decoupled Head) with a simplified design. The original YOLOv5 detection head merges the classification and regression branches and shares parameters between them, whereas the YOLOX detection head decouples the two branches and adds two extra 3x3 convolution layers. This improves detection accuracy but increases network latency to some extent.
We therefore simplified the decoupled head. Weighing the representational capacity of the relevant operators against their computational cost on hardware, we used a Hybrid Channels strategy to redesign a more efficient decoupled head structure. It reduces latency while maintaining accuracy, alleviating the extra latency introduced by the 3x3 convolutions in the decoupled head. In an ablation experiment on the nano-size model, compared with a decoupled head with the same number of channels, accuracy improved by 0.2% AP and speed improved by 6.8%.

2.3 More effective training strategies
To further improve detection accuracy, we drew on advanced research from other detection frameworks in academia and industry: the anchor-free paradigm, the SimOTA label assignment strategy, and the SIoU bounding box regression loss.
Anchor-free paradigm
YOLOv6 adopts the simpler anchor-free detection approach. An anchor-based detector requires cluster analysis before training to determine the best anchor set, which adds some complexity to the detector. In addition, in some edge applications, the steps that move large numbers of detection results between hardware components introduce extra latency. The anchor-free paradigm generalizes well and has simpler decoding logic, and it has been widely adopted in recent years. In our experiments, compared with the extra latency that the complexity of an anchor-based detector introduces, the anchor-free detector was 51% faster.
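To make the "simpler decoding logic" concrete, the sketch below contrasts two common decoding schemes for a single grid cell. Both are assumptions made for illustration: the anchor-free version follows the YOLOX-style formulation and the anchor-based version follows the YOLOv5-style formulation. The text above does not spell out the exact decoding used by YOLOv6, and the function names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def decode_anchor_free(tx, ty, tw, th, grid_x, grid_y, stride):
    """YOLOX-style decoding (assumed): one prediction per cell, no anchor priors."""
    cx = (grid_x + tx) * stride
    cy = (grid_y + ty) * stride
    w = math.exp(tw) * stride
    h = math.exp(th) * stride
    return cx, cy, w, h


def decode_anchor_based(tx, ty, tw, th, grid_x, grid_y, stride, anchor_w, anchor_h):
    """YOLOv5-style decoding (assumed): repeated for every anchor in every cell."""
    cx = (sigmoid(tx) * 2.0 - 0.5 + grid_x) * stride
    cy = (sigmoid(ty) * 2.0 - 0.5 + grid_y) * stride
    w = (sigmoid(tw) * 2.0) ** 2 * anchor_w
    h = (sigmoid(th) * 2.0) ** 2 * anchor_h
    return cx, cy, w, h


# Cell (3, 2) on a stride-8 feature map.
free_box = decode_anchor_free(0.5, 0.5, 0.0, 0.0, 3, 2, 8)
based_box = decode_anchor_based(0.0, 0.0, 0.0, 0.0, 3, 2, 8, 10.0, 13.0)
print(free_box, based_box)
```

The anchor-free path needs no pre-clustered anchor sizes and produces one box per cell rather than one per anchor, so there are fewer outputs to decode and move around.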
SimOTA label assignment strategy
To obtain more high-quality positive samples, YOLOv6 introduces the SimOTA[2] algorithm to assign positive samples dynamically, which further improves detection accuracy. The label assignment strategy of YOLOv5 is based on shape matching and increases the number of positive samples through a cross-grid matching strategy so that the network converges quickly. However, this is a static assignment method: it does not adjust as training progresses.
In recent years, many methods based on dynamic label assignment have appeared. These methods assign positive samples according to the network's outputs during training, which produces more high-quality positive samples and in turn helps the network optimize. For example, OTA[7] models sample matching as an optimal transport problem and finds the best matching under global information to improve accuracy, but its use of the Sinkhorn-Knopp algorithm lengthens training time. The SimOTA[2] algorithm instead uses a Top-K approximation to find the best match for each sample, which greatly speeds up training. YOLOv6 therefore adopts the SimOTA dynamic assignment strategy. Combined with the anchor-free paradigm, it improves the average detection accuracy of the nano-size model by 1.3% AP.
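The Top-K approximation can be sketched in a few lines. The version below is a simplified illustration of the dynamic-k idea from YOLOX[2], not the YOLOv6 code: for each ground-truth box it sums the largest IoUs to pick k, takes the k lowest-cost predictions, and gives any prediction claimed by several ground truths to the cheapest one. The cost used in the example (1 - IoU) is an assumption made to keep the example small; the original method also includes a classification term and a center prior. The function name is hypothetical.

```python
def simota_assign(cost, iou, top_candidates=10):
    """Return {prediction index: ground-truth index} for the positive samples.

    cost[g][p] and iou[g][p] describe ground truth g against prediction p.
    """
    num_gt, num_pred = len(cost), len(cost[0])
    claimed = [[False] * num_pred for _ in range(num_gt)]
    for g in range(num_gt):
        # Dynamic k: the sum of the largest IoUs, at least 1.
        top_ious = sorted(iou[g], reverse=True)[:top_candidates]
        dynamic_k = max(1, int(sum(top_ious)))
        # Take the k predictions with the lowest cost for this ground truth.
        best = sorted(range(num_pred), key=lambda p: cost[g][p])[:dynamic_k]
        for p in best:
            claimed[g][p] = True
    assignment = {}
    for p in range(num_pred):
        owners = [g for g in range(num_gt) if claimed[g][p]]
        if owners:
            # Conflict resolution: keep only the ground truth with the lowest cost.
            assignment[p] = min(owners, key=lambda g: cost[g][p])
    return assignment


iou = [
    [0.9, 0.8, 0.75, 0.6, 0.0],  # ground truth 0 against predictions 0..4
    [0.0, 0.1, 0.8, 0.7, 0.6],   # ground truth 1 against predictions 0..4
]
cost = [[1.0 - v for v in row] for row in iou]  # illustrative cost only

result = simota_assign(cost, iou)
print(result)
```

In this toy input, prediction 2 is a candidate for both ground truths and ends up with ground truth 1, whose cost is lower. Prediction 4 is not selected by either and stays a negative sample.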
SIoU bounding box regression loss
To further improve regression accuracy, YOLOv6 uses the SIoU[9] bounding box regression loss to supervise network training. Training an object detection network usually requires at least two loss functions, a classification loss and a bounding box regression loss, and how these are defined often has a large impact on detection accuracy and training speed.
Common bounding box regression losses in recent years include the IoU, GIoU, CIoU, and DIoU losses. These measure the gap between the predicted box and the target box using factors such as their degree of overlap, center distance, and aspect ratio, and guide the network to minimize the loss and so improve regression accuracy. However, they do not consider whether the direction of the predicted box matches that of the target box. The SIoU loss introduces the angle of the vector between the two boxes being regressed and redefines the distance loss accordingly, which effectively reduces the degrees of freedom of the regression, speeds up convergence, and further improves regression accuracy. In experiments with the SIoU loss on YOLOv6s, average detection accuracy improved by 0.3% AP compared with the CIoU loss.
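The role of the angle term is easier to see in code. The sketch below follows the formulation in the SIoU paper[9] as best understood here and is written for illustration, not taken from the YOLOv6 code. Boxes are (cx, cy, w, h). The value theta=4 and the small eps are assumed, and the function names are hypothetical.

```python
import math


def angle_cost(dx, dy, eps=1e-9):
    """Angle term: 0 when the centers are axis-aligned, 1 when they sit at 45 degrees."""
    sigma = math.hypot(dx, dy)
    sin_alpha = min(1.0, abs(dy) / (sigma + eps))
    return 1.0 - 2.0 * math.sin(math.asin(sin_alpha) - math.pi / 4.0) ** 2


def siou_loss(pred, target, theta=4.0, eps=1e-9):
    px, py, pw, ph = pred
    tx, ty, tw, th = target
    # Overlap (IoU) between the two boxes.
    inter_w = max(0.0, min(px + pw / 2, tx + tw / 2) - max(px - pw / 2, tx - tw / 2))
    inter_h = max(0.0, min(py + ph / 2, ty + th / 2) - max(py - ph / 2, ty - th / 2))
    inter = inter_w * inter_h
    iou = inter / (pw * ph + tw * th - inter + eps)
    # Width and height of the smallest box enclosing both.
    cw = max(px + pw / 2, tx + tw / 2) - min(px - pw / 2, tx - tw / 2)
    ch = max(py + ph / 2, ty + th / 2) - min(py - ph / 2, ty - th / 2)
    # Distance term, weighted by the angle between the box centers.
    dx, dy = tx - px, ty - py
    gamma = 2.0 - angle_cost(dx, dy, eps)
    distance = (1.0 - math.exp(-gamma * (dx / (cw + eps)) ** 2)) + \
               (1.0 - math.exp(-gamma * (dy / (ch + eps)) ** 2))
    # Shape term: relative mismatch in width and height.
    omega_w = abs(pw - tw) / max(pw, tw)
    omega_h = abs(ph - th) / max(ph, th)
    shape = (1.0 - math.exp(-omega_w)) ** theta + (1.0 - math.exp(-omega_h)) ** theta
    return 1.0 - iou + (distance + shape) / 2.0


target_box = (10.0, 10.0, 4.0, 4.0)
same = siou_loss(target_box, target_box)
near = siou_loss((12.0, 10.0, 4.0, 4.0), target_box)
far = siou_loss((13.0, 10.0, 4.0, 4.0), target_box)
print(same, near, far)
```

A predicted box that coincides with the target gives a loss of essentially zero, and the loss grows as the predicted center moves away. The angle term changes how strongly the distance term is weighted depending on the direction of the offset.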
3. Experimental results
With the optimization strategies and improvements above, YOLOv6 achieves excellent performance across several model sizes. Table 1 below shows the ablation results for YOLOv6-nano. They show that our self-designed detection network brings large gains in both accuracy and speed.

Table 2 below compares YOLOv6 with other current mainstream YOLO-series algorithms. The table shows that:

YOLOv6-nano reaches 35.0% AP on COCO val and 1242 FPS on a T4 using TRT FP16 inference with batchsize=32. Compared with YOLOv5-nano, accuracy improves by 7% AP and speed by 85%.
YOLOv6-tiny reaches 41.3% AP on COCO val and 602 FPS on a T4 using TRT FP16 inference with batchsize=32. Compared with YOLOv5-s, accuracy improves by 3.9% AP and speed by 29.4%.
YOLOv6-s reaches 43.1% AP on COCO val and 520 FPS on a T4 using TRT FP16 inference with batchsize=32. Compared with YOLOX-s, accuracy improves by 2.6% AP and speed by 38.6%. Compared with PP-YOLOE-s, accuracy improves by 0.4% AP, and single-batch TRT FP16 inference on a T4 is 71.3% faster.
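The relative speedups above can be turned back into approximate baseline throughputs. This is only a worked arithmetic sketch under one assumption: that "speed improves by X%" means the FPS ratio measured under the same batchsize=32 setting. The resulting values are implied by the figures quoted above, not measured results, and the PP-YOLOE-s comparison is left out because it uses single-batch inference. The function name is hypothetical.

```python
def implied_baseline_fps(fps, speedup_percent):
    """Baseline FPS implied by a reported FPS and a relative speedup in percent."""
    return fps / (1.0 + speedup_percent / 100.0)


# (comparison, reported YOLOv6 FPS, reported speedup in percent), as quoted above.
reported = [
    ("YOLOv6-nano vs YOLOv5-nano", 1242, 85.0),
    ("YOLOv6-tiny vs YOLOv5-s", 602, 29.4),
    ("YOLOv6-s vs YOLOX-s", 520, 38.6),
]

implied = {name: implied_baseline_fps(fps, pct) for name, fps, pct in reported}
for name, value in implied.items():
    print(f"{name}: implied baseline of about {value:.0f} FPS")
```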
4. Summary and outlook
This article describes the optimization work and practical experience of Meituan's Visual Intelligence Department on object detection frameworks. For the YOLO family of frameworks, we examined and optimized the training strategy, backbone network, multi-scale feature fusion, and detection head, and designed a new detection framework, YOLOv6. Its original motivation was to solve practical problems encountered when deploying industrial applications.
While building YOLOv6, we explored and optimized several new methods, such as the EfficientRep Backbone, Rep-PAN Neck, and Efficient Decoupled Head, which we developed based on hardware-aware neural network design. We also drew on recent progress and results from academia and industry, such as the anchor-free paradigm, SimOTA, and the SIoU regression loss. Experimental results on the COCO dataset show that YOLOv6 is among the best in both detection accuracy and speed.
Going forward, we will continue to build and improve the YOLOv6 ecosystem. The main work includes:
Completing the full range of YOLOv6 models and continuing to improve detection performance.
Designing hardware-friendly models for a variety of hardware platforms.
Supporting full-chain adaptation, including ARM platform deployment, quantization, and distillation.
Expanding horizontally and introducing related techniques such as semi-supervised and self-supervised learning.
Exploring the generalization performance of YOLOv6 in more, previously unseen business scenarios.
We also welcome the community to join us in building a faster and more accurate object detection framework for industrial applications. (See the end of the article for how to join the YOLOv6 technical exchange group.)
5. References
[1] YOLOv5, https://github.com/ultralytics/yolov5
[2] YOLOX: Exceeding YOLO Series in 2021, https://arxiv.org/abs/2107.08430
[3] PP-YOLOE: An evolved version of YOLO, https://arxiv.org/abs/2203.16250
[4] RepVGG: Making VGG-style ConvNets Great Again, https://arxiv.org/pdf/2101.03697
[5] CSPNet: A New Backbone that can Enhance Learning Capability of CNN, https://arxiv.org/abs/1911.11929
[6] Path Aggregation Network for Instance Segmentation, https://arxiv.org/abs/1803.01534
[7] OTA: Optimal Transport Assignment for Object Detection, https://arxiv.org/abs/2103.14259
[8] Computer Architecture: A Quantitative Approach
[9] SIoU Loss: More Powerful Learning for Bounding Box Regression, https://arxiv.org/abs/2205.12740
6. About the authors
Chu Yi, Kaiheng, nor, Chengmeng, Qin Hao, Yiming, Hongliang, Lin Yuan, and others, all from the Visual Intelligence Department of Meituan's Basic R&D Platform.
--- Welcome to join the YOLOv6 open source technical exchange group ---

After joining the group, you can talk directly with the authors of the open source project. We hope this project helps more people.
---------- END ----------
Meituan research collaboration
Meituan's research collaboration program is committed to building a bridge and platform for cooperation between Meituan's departments and universities, research institutions, and think tanks. Drawing on Meituan's rich business scenarios, data resources, and real industry problems, it pursues open innovation and brings people together around artificial intelligence, big data, the Internet of Things, unmanned systems, operations research and optimization, the digital economy, public affairs, and other fields, to jointly explore cutting-edge technology and key industry issues, promote industry-university-research cooperation and the transfer of results, and foster outstanding talent. Looking ahead, we hope to work with faculty and students at more universities and research institutes. Faculty and students are welcome to email: [email protected] .