How to deploy yolov6 with tensorrt
2022-07-27 08:53:00 【DeepDriving】
This article was first published on the WeChat official account 【DeepDriving】. Follow the account and reply 【YOLOv6】 in the background to get the code link for this article.
Preface
YOLOv6 is an object detection algorithm for industrial applications developed by Meituan's Vision Intelligence Department; the framework focuses on both detection accuracy and inference efficiency. The official article claims that YOLOv6 far exceeds YOLOv5 and YOLOX in both accuracy and speed. For deployment, YOLOv6 supports GPU (TensorRT), CPU (OpenVINO), and ARM (MNN, TNN, NCNN) platforms, which greatly simplifies the adaptation work during project deployment.

For the specific implementation details of YOLOv6, refer to the official article; I won't cover them here. This article mainly introduces how to deploy YOLOv6 with TensorRT (a C++ implementation).
Implementation process
1. Download the ONNX model
The YOLOv6 ONNX models can be downloaded from the following release page:
https://github.com/meituan/YOLOv6/releases/tag/0.1.0
You can also download the PyTorch model file and convert it to ONNX with the official script:
python deploy/ONNX/export_onnx.py --weights yolov6s.pt --img 640 --batch 1
It is best to further process the ONNX model obtained in either way with the onnx-simplifier tool; the resulting model is more streamlined and much easier to read.
import onnx
from onnxsim import simplify

# Load the exported model, simplify its graph, and save the result
model = onnx.load('yolov6s.onnx')
model_simple, check = simplify(model)
assert check, 'Failed to simplify model'
onnx.save(model_simple, 'yolov6s_simplify.onnx')
print('Succeeded to simplify model')
2. Parse the ONNX model with TensorRT
This part of the code is the same as in my earlier YOLOX deployment; interested readers can refer to my previous article.
This step first checks whether the .engine file corresponding to the ONNX model exists. If it does, the engine is loaded directly from the .engine file; otherwise, the TensorRT interface is called to create an ONNX parser to parse the model, and the resulting engine is serialized to a .engine file for convenient reuse next time.
if (!isFileExists(engine_path)) {
  std::cout << "The engine file " << engine_path
            << " has not been generated, try to generate..." << std::endl;
  engine_ = SerializeToEngineFile(model_path_, engine_path);
  std::cout << "Succeeded to generate engine file: " << engine_path
            << std::endl;
} else {
  std::cout << "Use the existing engine file: " << engine_path << std::endl;
  engine_ = LoadFromEngineFile(engine_path);
}
3. Image preprocessing
Unlike the official implementation, my preprocessing does not scale the image while keeping its aspect ratio and then pad the empty regions; it simply resizes the image directly:
cv::Mat resize_image;
cv::resize(input_image, resize_image, cv::Size(model_width_, model_height_));
Comparison of the two preprocessing methods:


As you can see, a direct resize deforms the objects in the image, so it is not recommended; I only did it this way out of laziness.
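For reference, the aspect-ratio-preserving approach (often called letterboxing) first computes a uniform scale and then pads the leftover area. Below is a minimal sketch of just the parameter computation, without OpenCV; the struct and function names are illustrative, not from the official code:

```cpp
#include <algorithm>
#include <cassert>

// Letterbox parameters: a uniform scale plus padding so that a
// src_w x src_h image fits a dst_w x dst_h canvas without distortion.
struct LetterboxParams {
  float scale;  // uniform scale factor applied to the source image
  int new_w;    // scaled image width
  int new_h;    // scaled image height
  int pad_x;    // horizontal padding on the left
  int pad_y;    // vertical padding on the top
};

LetterboxParams ComputeLetterbox(int src_w, int src_h, int dst_w, int dst_h) {
  // Pick the smaller ratio so the scaled image fits in both dimensions.
  const float scale = std::min(static_cast<float>(dst_w) / src_w,
                               static_cast<float>(dst_h) / src_h);
  LetterboxParams p;
  p.scale = scale;
  p.new_w = static_cast<int>(src_w * scale);
  p.new_h = static_cast<int>(src_h * scale);
  // Center the scaled image; the remainder is filled with a border color.
  p.pad_x = (dst_w - p.new_w) / 2;
  p.pad_y = (dst_h - p.new_h) / 2;
  return p;
}
```

With OpenCV one would then call cv::resize to (new_w, new_h) and cv::copyMakeBorder with the computed padding; detected boxes are mapped back by subtracting the padding and dividing by the scale.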
After the resize, each pixel of the image must be divided by 255 for normalization, and the image data must be arranged in CHW order in memory.
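The normalization and channel rearrangement can be sketched as follows. This is a plain-array version for illustration; in the actual code the source buffer would come from the resized cv::Mat, and the function name here is an assumption:

```cpp
#include <cstdint>
#include <vector>

// Convert an interleaved HWC uint8 image (as OpenCV stores it) into a
// planar CHW float tensor normalized to [0, 1] by dividing by 255.
std::vector<float> HwcToChwNormalized(const std::vector<std::uint8_t>& hwc,
                                      int height, int width, int channels) {
  std::vector<float> chw(hwc.size());
  for (int c = 0; c < channels; ++c) {
    for (int h = 0; h < height; ++h) {
      for (int w = 0; w < width; ++w) {
        // HWC index: (h * width + w) * channels + c
        // CHW index: c * height * width + h * width + w
        chw[c * height * width + h * width + w] =
            hwc[(h * width + w) * channels + c] / 255.0f;
      }
    }
  }
  return chw;
}
```

Note that if the model expects RGB input while OpenCV reads images as BGR, the channel order also needs to be swapped at this step.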
4. Post-processing
Like YOLOX, YOLOv6 is an anchor-free object detection algorithm. The model still detects on 3 scales, each cell of each feature-map layer predicts only one box, and each cell outputs 5 values (x, y, w, h, objectness) plus the probability of each class. You can use Netron to inspect the last few layers of the model:

As you can see, with a model input size of 640x640, downsampling by factors of 8, 16, and 32 produces feature maps of 80x80, 40x40, and 20x20 respectively. Since the COCO dataset has 80 classes, the output of each feature-map cell has length 5+80=85. The results from the 3 feature maps are finally concatenated into one output, so the final output dimension is (80x80+40x40+20x20)x85 = 8400x85.
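The dimension arithmetic above is easy to verify with a few lines of code (the helper function is purely illustrative):

```cpp
#include <vector>

// Number of prediction cells for a square input across the given strides:
// each stride s yields an (input/s) x (input/s) feature map with one box
// per cell. For a 640 input and strides {8, 16, 32}: 80^2 + 40^2 + 20^2.
int TotalCells(int input, const std::vector<int>& strides) {
  int cells = 0;
  for (int s : strides) {
    const int g = input / s;  // grid size at this stride
    cells += g * g;
  }
  return cells;
}
```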
Note that the operators in the red box above perform part of the post-processing: they transform x, y, w, h back to coordinates relative to the model's input size. Since these operations are already done during inference, only simple processing is needed after obtaining the model's inference result:
float *ptr = const_cast<float *>(output);
for (int i = 0; i < 8400; ++i, ptr += (kNumClasses + 5)) {
  const float objectness = ptr[4];
  if (objectness >= kObjectnessThresh) {
    const int label =
        std::max_element(ptr + 5, ptr + (kNumClasses + 5)) - (ptr + 5);
    const float confidence = ptr[5 + label] * objectness;
    if (confidence >= confidence_thresh) {
      const float bx = ptr[0];
      const float by = ptr[1];
      const float bw = ptr[2];
      const float bh = ptr[3];
      Object obj;
      obj.box.x = (bx - bw * 0.5f) / width_scale;
      obj.box.y = (by - bh * 0.5f) / height_scale;
      obj.box.width = bw / width_scale;
      obj.box.height = bh / height_scale;
      obj.label = label;
      obj.confidence = confidence;
      objs->push_back(std::move(obj));
    }
  }
}
Finally, non-maximum suppression must be applied to the model outputs to remove duplicate boxes; I used Soft-NMS for this.
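Unlike hard NMS, Soft-NMS does not discard overlapping boxes outright; it decays their scores based on the IoU with the currently selected box and only drops those whose score falls below a threshold. The article does not show its actual Soft-NMS code, so here is a minimal linear-decay sketch with illustrative names and a simplified Box struct:

```cpp
#include <algorithm>
#include <vector>

struct Box { float x, y, w, h, score; };

// Intersection-over-union of two axis-aligned boxes given as x, y, w, h,
// with (x, y) being the top-left corner.
float IoU(const Box& a, const Box& b) {
  const float x1 = std::max(a.x, b.x);
  const float y1 = std::max(a.y, b.y);
  const float x2 = std::min(a.x + a.w, b.x + b.w);
  const float y2 = std::min(a.y + a.h, b.y + b.h);
  const float inter = std::max(0.0f, x2 - x1) * std::max(0.0f, y2 - y1);
  const float uni = a.w * a.h + b.w * b.h - inter;
  return uni > 0.0f ? inter / uni : 0.0f;
}

// Linear Soft-NMS: repeatedly take the highest-scoring box, then decay the
// scores of remaining boxes by (1 - IoU) when the IoU exceeds iou_thresh;
// boxes whose score drops below score_thresh are discarded.
std::vector<Box> SoftNms(std::vector<Box> boxes, float iou_thresh,
                         float score_thresh) {
  std::vector<Box> kept;
  while (!boxes.empty()) {
    auto best = std::max_element(
        boxes.begin(), boxes.end(),
        [](const Box& a, const Box& b) { return a.score < b.score; });
    Box top = *best;
    boxes.erase(best);
    if (top.score < score_thresh) break;  // everything left is weaker
    kept.push_back(top);
    for (Box& b : boxes) {
      const float iou = IoU(top, b);
      if (iou > iou_thresh) b.score *= (1.0f - iou);
    }
  }
  return kept;
}
```

The original Soft-NMS paper also proposes a Gaussian decay variant, where the score is multiplied by exp(-iou²/σ) instead of the linear factor.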
Results
Several results from testing with the yolov6s.onnx model are shown below:



The inference time of each model on my GeForce GTX 1050 Ti graphics card is shown in the following table:
| Model | Input size | Inference time |
|---|---|---|
| yolov6n.onnx | 640x640 | 8 ms |
| yolov6t.onnx | 640x640 | 21 ms |
| yolov6s.onnx | 640x640 | 23 ms |
For comparison, the YOLOX timings are also listed:
| Model | Input size | Inference time |
|---|---|---|
| yolox_nano.onnx | 640x640 | 8 ms |
| yolox_tiny.onnx | 640x640 | 13 ms |
| yolox_s.onnx | 640x640 | 18 ms |
| yolox_m.onnx | 640x640 | 39 ms |
| yolox_l.onnx | 640x640 | 75 ms |
Summary
YOLOv6 improves on YOLOX in its backbone, neck, head, and training strategy, and the results are very good, but in my opinion it has not reached the level of "far exceeding" it. As someone who only uses what's free, I am happy to see effective and easy-to-deploy algorithms like YOLOX and YOLOv6 being open-sourced, and I sincerely hope more such work appears, haha…
Welcome to follow my official account 【DeepDriving】, where I share content on computer vision, machine learning, deep learning, autonomous driving, and other fields from time to time.
