[YOLOv5 6.0 | 6.1: Deploying TensorRT to TorchServe] Environment Setup | Model Conversion | Engine Model Deployment (with a Detailed Walkthrough of the Packaging Files)
2022-07-07 00:36:00 【Live like yourself】
I suddenly realized how few articles there are about deploying TensorRT, so I decided to share the practices and lessons that worked for me. Please share this post so more people can see it!!!
QQ: 1757093754
My environment:
- YOLOv5: 6.0 (the official repo is at 6.1 as of 2022-07-05)
- Python: 3.8 (Anaconda3 2021.05)
- CUDA: 11.4
- cuDNN: 8.2.2
- TensorRT: 8.2.2.1
- torch: 1.9.1+cu111
- torchvision: 0.10.1+cu111
- VS: 2019
- OpenCV: 4.5.0 (not needed if your YOLOv5 version is 6.0 or 6.1)
Note: if you run into problems along the way, be sure to double-check your environment versions!!!
Note: YOLOv5 versions 1-5 do not ship an encapsulated TensorRT conversion flow, so they need other tools for model conversion; 6.0 and 6.1, however, have TensorRT support built into export.py.
Contents
Version correspondence (if you have already finished this part, skip straight to the next section)
Step 1: Install CUDA and cuDNN
Step 2: Configure environment variables
Step 3: Install TensorRT into the Python environment (pip)
Model conversion (pt --> engine)
Command-line argument breakdown
The export.py command line
Building the TorchServe deployment files
Data loading (preprocessing) method
Run the packaging command to generate the mar file
Preface
- My operating system is Windows 10, but the same approach works on Linux.
- This walkthrough is not based on the tensorrtx tool (which is what model conversion requires for YOLOv5 versions 1-5).
- OpenCV and CMake are not used below, but download links are provided.
- Questions and discussion welcome: QQ | 1757093754
Version correspondence (if you have already finished this part, skip straight to the next section)
tensorrtx download (wang-xinyu's repo):
https://github.com/wang-xinyu/tensorrtx

Anaconda and Python version correspondence (Old package lists, Anaconda documentation):
https://docs.anaconda.com/anaconda/packages/oldpkglists/

Anaconda domestic download mirror (Tsinghua University open source mirror):
https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/

CMake download (not used this time; only needed for YOLOv5 1-5 deployment on Windows):
https://cmake.org/download/

OpenCV download (not used this time; only needed for YOLOv5 1-5 deployment on Windows):
https://opencv.org/releases/

OpenCV and VS correspondence:
vc6 = Visual Studio 6
vc7 = Visual Studio 2003
vc8 = Visual Studio 2005
vc9 = Visual Studio 2008
vc10 = Visual Studio 2010
vc11 = Visual Studio 2012
vc12 = Visual Studio 2013
vc14 = Visual Studio 2015
vc15 = Visual Studio 2017
vc16 = Visual Studio 2019
Note: I use VS2019 here, which supports not only vc16 but also vc15 and vc14; when you download OpenCV, the installer states which vc version it ships with.
CUDA version and driver correspondence (check your machine's driver version):
| CUDA Toolkit | Linux x86_64 Driver Version | Windows x86_64 Driver Version |
|---|---|---|
| CUDA 11.7 GA | >=515.43.04 | >=516.01 |
| CUDA 11.6 Update 2 | >=510.47.03 | >=511.65 |
| CUDA 11.6 Update 1 | >=510.47.03 | >=511.65 |
| CUDA 11.6 GA | >=510.39.01 | >=511.23 |
| CUDA 11.5 Update 2 | >=495.29.05 | >=496.13 |
| CUDA 11.5 Update 1 | >=495.29.05 | >=496.13 |
| CUDA 11.5 GA | >=495.29.05 | >=496.04 |
| CUDA 11.4 Update 4 | >=470.82.01 | >=472.50 |
| CUDA 11.4 Update 3 | >=470.82.01 | >=472.50 |
| CUDA 11.4 Update 2 | >=470.57.02 | >=471.41 |
| CUDA 11.4 Update 1 | >=470.57.02 | >=471.41 |
| CUDA 11.4.0 GA | >=470.42.01 | >=471.11 |
| CUDA 11.3.1 Update 1 | >=465.19.01 | >=465.89 |
| CUDA 11.3.0 GA | >=465.19.01 | >=465.89 |
| CUDA 11.2.2 Update 2 | >=460.32.03 | >=461.33 |
| CUDA 11.2.1 Update 1 | >=460.32.03 | >=461.09 |
| CUDA 11.2.0 GA | >=460.27.03 | >=460.82 |
| CUDA 11.1.1 Update 1 | >=455.32 | >=456.81 |
| CUDA 11.1 GA | >=455.23 | >=456.38 |
| CUDA 11.0.3 Update 1 | >=450.51.06 | >=451.82 |
| CUDA 11.0.2 GA | >=450.51.05 | >=451.48 |
| CUDA 11.0.1 RC | >=450.36.06 | >=451.22 |
| CUDA 10.2.89 | >=440.33 | >=441.22 |
| CUDA 10.1 (10.1.105 general release, and updates) | >=418.39 | >=418.96 |
| CUDA 10.0.130 | >=410.48 | >=411.31 |
| CUDA 9.2 (9.2.148 Update 1) | >=396.37 | >=398.26 |
| CUDA 9.2 (9.2.88) | >=396.26 | >=397.44 |
| CUDA 9.1 (9.1.85) | >=390.46 | >=391.29 |
| CUDA 9.0 (9.0.76) | >=384.81 | >=385.54 |
| CUDA 8.0 (8.0.61 GA2) | >=375.26 | >=376.51 |
| CUDA 8.0 (8.0.44) | >=367.48 | >=369.30 |
| CUDA 7.5 (7.5.16) | >=352.31 | >=353.66 |
| CUDA 7.0 (7.0.28) | >=346.46 | >=347.62 |
Note: these are minimum driver versions; any newer driver also works.
cuDNN and CUDA correspondence and download (cuDNN Archive, NVIDIA Developer):
https://developer.nvidia.com/rdp/cudnn-archive#a-collapse742-10

CUDA and TensorRT correspondence and download (NVIDIA TensorRT Download, NVIDIA Developer):
https://developer.nvidia.com/nvidia-tensorrt-download

Note: when downloading TensorRT, remember to tick the option shown here:
Click any release to see its stated version compatibility, then download accordingly!
Congratulations! The first step is done: you have successfully downloaded everything you need!
Environment Installation
- We have covered the required components and their version relationships; this section covers how to set up the environment.
Step 1: Install CUDA and cuDNN
CUDA installs by default under C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA
Leave this path alone; it makes later tweaks and lookups easier. Multiple CUDA versions can coexist without interfering (note the difference between your driver and CUDA: they are not the same thing, and the driver must never be installed more than once). Which CUDA is actually used depends on your environment variables.
After installing CUDA, install cuDNN. The method is a little unusual: copy the extracted cuDNN files into your CUDA installation directory, matching the version folder to the CUDA you installed (v11.4 here):
- Copy the files in cuda\bin to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\bin
- Copy the files in cuda\include to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\include
- Copy the files in cuda\lib to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\lib
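With the copy done, it is worth confirming that the GPU and driver are visible from Python. Below is a minimal sanity check, offered as a sketch only; note that pip-installed torch wheels (such as the 1.9.1+cu111 wheel listed above) bundle their own CUDA/cuDNN runtime, so this verifies the driver setup rather than the files you just copied:

import torch

# the CUDA version this torch wheel was built against (11.1 for a cu111 wheel)
print(torch.version.cuda)
# the cuDNN version bundled with torch, as an integer (e.g. 8005 for 8.0.5)
print(torch.backends.cudnn.version())
# must print True, otherwise the driver install has a problem
print(torch.cuda.is_available())
# your GPU model
print(torch.cuda.get_device_name(0))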
Step 2: Configure environment variables
Add the following directories to your system Path:
- C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\bin
- C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\libnvvp
- D:\TensorRT-8.2.2.1.Windows10.x86_64.cuda-11.4.cudnn8.2\TensorRT-8.2.2.1\include
- D:\TensorRT-8.2.2.1.Windows10.x86_64.cuda-11.4.cudnn8.2\TensorRT-8.2.2.1\lib
- D:\opencv_4.5.0\opencv\build\x64\vc15\bin
- D:\opencv_4.5.0\opencv\build\x64\vc15\lib
- D:\Cmake\bin
Note: if OpenCV and CMake are not installed, there is no need to configure them; just make sure the TensorRT, CUDA, and Anaconda entries are in place.
Step 3: Install TensorRT into the Python environment (pip)
Open the TensorRT package you downloaded and note the following 4 folders; we need to pip-install the whl files inside them into the Python environment.
pip install graphsurgeon\graphsurgeon-0.4.5-py2.py3-none-any.whl
pip install onnx_graphsurgeon\onnx_graphsurgeon-0.3.12-py2.py3-none-any.whl
pip install python\tensorrt-8.2.2.1-cp38-none-win_amd64.whl
pip install uff\uff-0.6.9-py2.py3-none-any.whl
Note: the python folder contains 4 whl files; my Python is 3.8, so only the cp38 wheel is needed.
Note: the steps above normally complete without errors.
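Before moving on to model conversion, a one-liner can confirm that the TensorRT wheel landed in the right Python environment (a minimal sketch; the expected version string matches this article's setup and may differ in yours):

import tensorrt

print(tensorrt.__version__)  # expect 8.2.2.1 with the wheel installed above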
Model conversion (pt --> engine)
YOLOv5 version 6 ships conversion paths for the mainstream model formats; here is the matrix of supported export targets:
If everything above went smoothly, we can convert the model directly with export.py:
Command-line argument breakdown:
- --data: the converted weights no longer carry the model class from yolo.py, so there is no names member variable anymore; we must point at the data file to load the class names!
- --weights: your trained pt weights (best.pt or an epoch checkpoint both work).
- --imgsz: fixes the input size used to simulate inference; I suggest using whatever resize you trained with. (Once the size is fixed, the generated weights only accept images of that size at prediction time.)
- --device: be sure to use the GPU!!! Be sure to use the GPU!!! Be sure to use the GPU!!! Said three times because it matters: export runs inference internally, and while it does not affect the generated model, even with a GPU mine ran for half an hour!!!
- --opset: your ONNX opset version. (If your setup matches mine, it's 12; otherwise, also 12 【doge】.) (Actually it doesn't matter much if you leave it alone, since the source now defaults to 13; but if you hit an opset version error, you need to modify the source as shown below.)
- --conf-thres: the confidence threshold for predictions; only results above it are shown.
- --include: just write engine and you're done.
Other parameters:
- --workspace: the maximum GPU-memory workspace (GB) during the conversion; the default is 4, and I set it to 7. (A larger workspace can shorten the conversion time.)
Note: if you get an opset version error, change the opset default to 12, as shown below. (opset version)
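For reference, the line to edit sits in export.py's parse_opt(); this is a sketch based on the 6.x sources, and the exact default or help text may differ in your copy:

# in export.py, inside parse_opt(): lower the default if TensorRT rejects opset 13
parser.add_argument('--opset', type=int, default=12, help='ONNX: opset version')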
The export.py command line:
python .\export.py --data .\data\voc_xf.yaml --weights .\best.pt --imgsz 640,640 --batch-size 1 --device 0 --workspace 7 --conf-thres 0.6 --include engine
Note:
- export goes through ONNX as an intermediate step, so an onnx model is generated first, and the engine is built from it.
- The conversion takes a long time, roughly half an hour with a 4 GB workspace (I used 7 GB and it took a little over 10 minutes).
- GPU memory usage keeps growing during the conversion; you can open a terminal in another process to watch it.
- Warnings may appear during the conversion; as long as the program has not exited, it is still converting. Do not interrupt it!!!
To check GPU memory usage:
nvidia-smi
The generated engine and onnx:
You can see that the generated engine is still quite small.
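Before wiring the engine into TorchServe, you can sanity-check it by loading it the same way the handler later will. This is a minimal sketch, assuming it is run from the yolov5 repo root and that best.engine and the data yaml are where the export command above left them:

import torch
from models.common import DetectMultiBackend  # from the yolov5 repo

device = torch.device('cuda:0')  # engine models must run on the GPU
model = DetectMultiBackend('best.engine', device=device, data='data/voc_xf.yaml')
im = torch.zeros(1, 3, 640, 640, device=device)  # the fixed input size baked into the engine
pred = model(im)  # runs TensorRT inference on the dummy input
print(model.names)  # the class names loaded from the data yaml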
Building the TorchServe deployment files
- The previous steps produced engine weights that speed up model prediction; now we deploy them on a server. This section describes how to prepare the files TorchServe needs for deployment.
Preface
A while ago I wrote a blog post documenting the installation and usage of TorchServe; you can refer to it (linked in the references at the end):
Writing handler.py
As that post explains, handler.py must implement our own model-loading method, data-loading (preprocessing) method, inference method, and postprocessing method. Here we write the handler for YOLOv5:
Model-loading method
For loading the model, we use the DetectMultiBackend class that YOLOv5's detect.py uses.
We define a custom model.py that inherits from it, which makes it convenient to call from the handler:
model.py:

from models.common import DetectMultiBackend

class YOLOV5ObjectDetector(DetectMultiBackend):
    def __init__(self, weights, device, data):
        super(YOLOV5ObjectDetector, self).__init__(weights=weights, device=device, data=data)
When inheriting, we only need to pass three parameters: the weight path, the device to predict on (GPU or CPU), and our dataset format file (voc_xf.yaml).
Next we implement the model-loading method in handler.py:
# These members live inside the handler class. Required imports:
# import importlib, os, torch
# from ts.utils.util import list_classes_from_module  (shipped with TorchServe)

names = []

def _load_pickled_model(self, model_dir, model_file, model_pt_path):
    """
    Loads the pickle file from the given model path.
    Args:
        model_dir (str): Points to the location of the model artefacts.
        model_file (.py): the file which contains the model class.
        model_pt_path (str): points to the location of the model pickle file.
    Raises:
        RuntimeError: It raises this error when the model.py file is missing.
        ValueError: Raises value error when there is more than one class in the label,
            since the mapping supports only one label per class.
    Returns:
        serialized model file: Returns the pickled pytorch model file
    """
    # Check that model.py exists
    model_def_path = os.path.join(model_dir, model_file)
    if not os.path.isfile(model_def_path):
        raise RuntimeError("Missing the model.py file")
    # Check that model.py defines exactly one class
    module = importlib.import_module(model_file.split(".")[0])
    model_class_definitions = list_classes_from_module(module)
    if len(model_class_definitions) != 1:
        raise ValueError(
            "Expected only one class as model definition. {}".format(model_class_definitions)
        )
    model_class = model_class_definitions[0]  # the YOLOV5ObjectDetector class
    self.device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    model = model_class(model_pt_path, self.device, 'voc_xf.yaml')
    self.names = model.names
    self.names[0] = 'xf_road'
    return model
Because DetectMultiBackend loads the dataset format file along with the model, we can assign its names member to the handler's own names member (which you must define yourself, as in the first line of the code block above).
Note:
- The second-to-last line of the method is specific to my own dataset; it renames the first category and is not required.
- names holds the predicted category names. As before, Chinese names also work, but you need to supply a Chinese ttf font.
Data loading ( Preprocessing ) Method
handler.py Medium preprocess Method realizes a preprocessing of the data to be detected , Parameters data It may be diverse , It depends on how you upload .
# Required imports: import base64, io; import cv2; import numpy as np; import torch
# from PIL import Image
# from utils.augmentations import letterbox  (from the yolov5 repo)

def preprocess(self, data):
    print("DEBUG--%d" % len(data))
    images = []
    for row in data:
        image = row.get("data") or row.get("body")
        if isinstance(image, str):
            # the image arrives as a base64 string
            image = base64.b64decode(image)
        elif isinstance(image, (bytearray, bytes)):
            # the image arrives as raw bytes
            image = Image.open(io.BytesIO(image))
            # convert the RGB PIL image to OpenCV's BGR layout
            image = cv2.cvtColor(np.asarray(image), cv2.COLOR_RGB2BGR)
        else:
            # the image arrives as a data list
            image = image.get('instances')[0]
            image = np.divide(torch.HalfTensor(image), 255)
        img0 = image  # keep the original image for postprocessing
        img = letterbox(image, 640, stride=32, auto=self.model.pt)[0]
        # Convert
        img = img.transpose((2, 0, 1))[::-1]  # HWC to CHW, BGR to RGB
        image = np.ascontiguousarray(img)
        image = torch.from_numpy(image).float().to(self.device)
        image /= 255
        if len(image.shape) == 3:
            image = image[None]  # add the batch-size dimension
        images.append([image, img0])
    # images = torch.stack(images).to(self.device)  # stack into one batch (unused)
    return images
Worth noting:
Because the converted TensorRT model has a fixed input size, we need to record the image size before and after preprocessing, so that postprocessing can map the predicted boxes back onto the original image.
img0 holds the original image.
image holds the preprocessed (letterboxed) image.
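To make the img0/img distinction concrete, here is a small sketch of what the letterbox step does to the shapes. It assumes a 1080x1920 frame; with the TensorRT backend, self.model.pt is False, so auto=False and the image is padded all the way to 640x640:

import numpy as np
from utils.augmentations import letterbox  # from the yolov5 repo

img0 = np.zeros((1080, 1920, 3), dtype=np.uint8)      # stand-in for the original image
img = letterbox(img0, 640, stride=32, auto=False)[0]  # resized and padded to (640, 640, 3)
print(img0.shape, img.shape)  # both shapes are needed later by scale_coords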
Inference method
The inference method in handler.py runs the model's forward pass on the preprocessed data.
We use the YOLOv5 model's own inference path:
def inference(self, data, *args, **kwargs):
    results = []
    for each_data in data:
        im, im0 = each_data
        pred = self.model(im, augment=False, visualize=False)
        results.append([each_data, pred])
    return results
Worth noting:
preprocess, inference, and postprocess are chained: the return value of each method becomes the argument of the next.
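For context, this is roughly how TorchServe's BaseHandler drives that chain (a simplified sketch of ts.torch_handler.base_handler, not the full implementation):

# simplified view of BaseHandler.handle()
def handle(self, data, context):
    data = self.preprocess(data)     # -> [[image, img0], ...]
    output = self.inference(data)    # -> [[each_data, pred], ...]
    return self.postprocess(output)  # -> the JSON-serializable response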
Postprocessing method
This is the last method we write in handler.py; it processes and formats the prediction results.
Because the project is confidential, I have hidden the important code:
# Required imports (from the yolov5 repo): non_max_suppression, scale_coords

def postprocess(self, data):
    all_result = []
    for each_data, pred in data:
        result = []
        # non-maximum suppression
        pred = non_max_suppression(pred, conf_thres=0.3, iou_thres=0.45, classes=None,
                                   agnostic=False, max_det=1000)[0]
        if pred is None:
            print('No target detected.')
            result.append({"classes": [], "scores": [], "boxes": []})
            return result
        else:
            # rescale the boxes from the letterboxed img size back to the im0 size
            object = []
            pred[:, :4] = scale_coords(each_data[0].shape[2:], pred[:, :4], each_data[1].shape).round()
            pred[:, :4] = pred[:, :4].round()
            boxes = pred[:, :4].detach().cpu().numpy()
            scores = pred[:, 4].detach().cpu().numpy()
            classes = pred[:, 5].detach().cpu().numpy().astype(np.int)  # np.int is deprecated on newer NumPy; plain int also works
            new_classes = [self.names[i] for i in classes]
            for i in range(len(classes)):
                object.append([new_classes[i], boxes[i][0], boxes[i][1], boxes[i][2], boxes[i][3], scores[i]])
            new_cars = object  # the important code is hidden here
            if new_cars:
                result.append({
                    "classes": [classes for classes, _, _, _, _, _ in new_cars],
                    "scores": [str(scores) for _, _, _, _, _, scores in new_cars],
                    "boxes": [str([b0, b1, b2, b3]) for _, b0, b1, b2, b3, _ in new_cars]
                })
                print('GYYDEBUG--RESULT:{}'.format(result))
                all_result.append(result)
                return all_result
            else:
                result.append({"classes": [], "scores": [], "boxes": []})
                return result
Finally
Completing handler.py means the prerequisite work for deployment is done.
Now put all the files together (including everything the dependencies need) and pack them into a folder: packets_trt.
Note: the steps in this article are for reference only; copy-pasting them directly will not work. This handler belongs to my own project, so if you really want to learn, please study how the code works. This code has been deployed successfully.
Deployment
- Once all your files are ready, you can officially start the deployment.
Copy your packets_trt folder onto the server you want to deploy on.
Note: here I moved the weight file out of the folder.
Run the packaging command to generate the mar file
Mind the relative paths of the files:
torch-model-archiver --model-name test_trt --version 1 --serialized-file best.engine --handler packets_trt/handler.py --model-file packets_trt/model.py --extra-file packets_trt -f
(I moved the generated mar into the model-store folder; create a model-store folder yourself.)
Start the TorchServe service:
torchserve --start --model-store model-store --models test=test_trt.mar --ts_config ./config.properties
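The original post does not show the config.properties passed via --ts_config. A minimal file consistent with the test command further below (which hits port 8085) might look like this; treat the values as assumptions rather than the author's exact file:

inference_address=http://0.0.0.0:8085
management_address=http://0.0.0.0:8081
model_store=model-store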
A normal deployment prints TensorRT information:
Note:
- [I] lines are normal TensorRT info logs.
- [W] lines are TensorRT warnings; they are not errors, so don't worry about them.
- [E] lines are errors; one of these means something in the process went wrong.
Testing with images
Here I test loading multiple images of different formats in a single request:
curl http://localhost:8085/predictions/test -T "{01.jpg,02.png}"
Logs generated by a successful test (please ignore the first few lines in the screenshot; I made a mistake while testing 【doge】):
Result of a successful test (note: nothing is detected here because of the code I hid; given the confidentiality of the project, thanks for understanding!!!)
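For reference, a non-empty answer from the postprocess method above would look roughly like this per image; the values here are made up purely for illustration:

{"classes": ["xf_road"], "scores": ["0.91"], "boxes": ["[12.0, 34.0, 240.0, 310.0]"]}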
Postscript
Note:
- The TensorRT environment used to generate the engine must match the TensorRT environment that loads the model at deployment time; it is best to do the conversion and the deployment in the same environment.
- It helps to open two terminals, one showing the deployment logs and one for testing, so you can easily check whether the model's inference path has problems.
- If the model does not load into TorchServe after deployment, check in turn: whether the generated mar file is correct, whether the handler is written correctly, and whether the environment is installed correctly.
- Writing this up was not easy; please give it a like ~~~【doge】
References:
- yolov5 deployment, part 7: TensorRT model inference acceleration (Christo3, CSDN): https://blog.csdn.net/weixin_41552975/article/details/114398669
- TorchServe deploys a PyTorch model (CSDN): https://blog.csdn.net/qq_41360255/article/details/116707586
- TorchServe, PyTorch/Serve master documentation: https://pytorch.org/serve/
- TensorRT SDK, NVIDIA Developer: https://developer.nvidia.com/tensorrt