
Deploying a grasp detection network with TensorRT (I)

2022-07-03 05:17:00 【Qianyu QY】

1. A brief introduction to TensorRT

TensorRT is a C++ inference framework that runs on NVIDIA GPUs. A model trained with PyTorch or TensorFlow can be converted to the TensorRT format and then run with the TensorRT inference engine, which speeds up the model on NVIDIA GPUs, typically by a factor of several to a few dozen.

Mainstream PyTorch deployment paths:

pytorch $\rightarrow$ ONNX $\rightarrow$ TensorRT
torch2trt
torch2trt_dynamic
TRTorch

2. Grasp detection deployment process

The grasp detection network used here comes from my earlier paper: High-performance Pixel-level Grasp Detection based on Adaptive Grasping and Grasp-aware Network. The method achieves 99.09% grasp detection accuracy and a 95.71% grasp success rate in real multi-object stacking scenes. An experimental demonstration video is on YouTube: https://www.youtube.com/watch?v=KUa3XlVwDsU. The paper can be downloaded at: https://www.techrxiv.org/articles/preprint/High-performance_Pixel-level_Grasp_Detection_based_on_Adaptive_Grasping_and_Grasp-aware_Network/14680455

Deploying with TensorRT requires installing dependencies such as PyTorch, TensorRT and ONNX. Installation guides are easy to find online, so only the versions I used are listed here:

Ubuntu:           16.04
TensorRT:         7.0.0
ONNX IR version:  0.0.4
Opset version:    10
Producer name:    pytorch 1.2.0
GPU:              TITAN Xp
CUDA:             10.0
Driver Version:   430.14

Deployment can be done in Python or C++; here I use C++.

2.1 Exporting the PyTorch network to an ONNX file

PyTorch provides a way to export a model to ONNX. The code is as follows:

import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Load the PyTorch model onto the chosen device
model = torch.load("model_path", map_location=device)
model.to(device)
model.eval()

# Dummy input with the network's input shape (batch, channels, height, width)
x = torch.randn((1, 3, 320, 320), device=device)

torch.onnx.export(model,
                  x,
                  "ckpt/sgdn.onnx",
                  verbose=True,
                  opset_version=10,
                  do_constant_folding=True,  # whether to apply constant-folding optimization
                  input_names=["input"],     # input name
                  output_names=["output_able", "output_angle", "output_width"])  # output names

One caveat when exporting: do not use bilinear interpolation for upsampling in the network, otherwise TensorRT will report an error during inference. Use torch.nn.UpsamplingNearest2d() instead. This issue is discussed at: https://github.com/NVIDIA/TensorRT/issues/284.
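
As an illustration of this substitution, here is a minimal sketch of an upsampling block that uses nn.UpsamplingNearest2d with a fixed output size. The module name UpBlock, the channel counts and the sizes are hypothetical and are not taken from the network in the paper.

```python
import torch
import torch.nn as nn


class UpBlock(nn.Module):
    """Hypothetical decoder block: nearest-neighbor upsampling, then a 3x3 conv."""

    def __init__(self, in_channels, out_channels, out_size):
        super().__init__()
        # Fixed (h, w) target size, used in place of bilinear interpolation.
        self.up = nn.UpsamplingNearest2d(size=out_size)
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(self.up(x))


block = UpBlock(in_channels=8, out_channels=4, out_size=(320, 320)).eval()
with torch.no_grad():
    y = block(torch.randn(1, 8, 160, 160))
print(tuple(y.shape))  # (1, 4, 320, 320)
```

Passing a fixed size rather than a scale factor keeps the output shape fully static, which matches the fixed $(1, 3, 320, 320)$ input used for the export.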

The ONNX file can be downloaded from my Google Drive:

https://drive.google.com/file/d/1AGyjRTWIw85ctwP6VsBDCmR0mE8NdRLu/view?usp=sharing

Use Python's onnx package to check that sgdn.onnx is valid:

import onnx

model_path = 'sgdn.onnx'
# Load the model and verify that it is a valid ONNX graph
onnx_model = onnx.load(model_path)
onnx.checker.check_model(onnx_model)

Use the Netron tool to view the network architecture and the network's input and output shapes. The online version of Netron is at:

https://netron.app/


2.2 Generating image data in txt format

Normally, images would be read with OpenCV or received by subscribing to them through ROS. For testing purposes, the image is converted to txt format here. Since the input size of the grasp detection network is $(batch, 3, 320, 320)$, first crop the central $(320, 320)$ region of the image, then save the pixel values to a txt file. Each row stores 320 values, for 320*3 rows in total: the first 320 rows are the B channel, followed by the G and R channels in that order.
The program can be downloaded from GitHub:

https://github.com/dexin-wang/tensorRT_SGDN/tree/main/create_txt

Run python3 create_txt.py to generate the txt file.
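
The file layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the create_txt.py from the repository: the function names center_crop and write_bgr_txt are hypothetical, the image is assumed to be a nested list indexed as image[row][col][channel] with channels in B, G, R order, and the space separator is an assumption that may differ from the actual script.

```python
def center_crop(image, size=320):
    """Return the central size x size region of an H x W x C nested list."""
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the crop size")
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def write_bgr_txt(image, path, size=320):
    """Write the cropped image as 3 * size rows of size values each.

    Rows 0..size-1 hold channel 0 (B), then channel 1 (G), then channel 2 (R).
    The space separator is an assumption, not taken from create_txt.py.
    """
    crop = center_crop(image, size)
    with open(path, "w") as f:
        for channel in range(3):
            for row in crop:
                f.write(" ".join(str(pixel[channel]) for pixel in row) + "\n")
```

With OpenCV, an input in this form could be obtained with cv2.imread(path).tolist(), since OpenCV loads color images in B, G, R order.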

2.3 TensorRT inference

Because this is still in the testing phase, my C++ program is adapted from the official TensorRT sample code. It can be downloaded from my GitHub:

https://github.com/dexin-wang/tensorRT_SGDN/tree/main/sampleOnnxSGDN

After installing TensorRT by following an online tutorial, put the sampleOnnxSGDN folder in /home/.../TensorRT-7.0.0.11/samples/, and on line 39 of the file /home/.../TensorRT-7.0.0.11/samples/Makefile, add sampleOnnxSGDN:

samples=... sampleOnnxSGDN ...

Then put the downloaded sgdn.onnx in the /home/.../TensorRT-7.0.0.11/samples/sampleOnnxSGDN/data/ folder. You may also need to modify the file paths used in the program.
After that, you can compile and run.

Compile:

cd /home/.../TensorRT-7.0.0.11/samples/sampleOnnxSGDN/
make

After compilation, two files are generated under /home/.../TensorRT-7.0.0.11/bin:

sample_onnx_sgdn
sample_onnx_sgdn_debug

Run:

cd /home/.../TensorRT-7.0.0.11/bin
./sample_onnx_sgdn

If everything goes well, you will see the following output:

[07/01/2021-10:14:08] [I] Building and running a GPU inference engine for Onnx MNIST
----------------------------------------------------------------
Input filename:   /home/wangdx/tensorRT/TensorRT-7.0.0.11/samples/sampleOnnxSGDN/data/sgdn.onnx
ONNX IR version:  0.0.4
Opset version:    10
Producer name:    pytorch
Producer version: 1.2
Domain:
Model version:    0
Doc string:
----------------------------------------------------------------
[07/01/2021-10:14:13] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increa
[07/01/2021-10:14:51] [I] [TRT] Detected 1 inputs and 3 output network tensors.
[07/01/2021-10:14:51] [W] [TRT] Current optimization profile is: 0. Please ensure there are no enqu
[07/01/2021-10:14:51] [I] Output:
[07/01/2021-10:14:51] [I] (row, col) = 233, 187
confidence = 0.996655
&&&& PASSED TensorRT.sample_onnx_sgdn # ./sample_onnx_sgdn

The result shows that the grasp point at position $(233, 187)$ has the highest confidence, 0.996655. Because the central $(320, 320)$ region of the image was cropped at the start, the predicted grasp point in the original image is at $(233 + 80, 187 + 160) = (313, 347)$. The code does not yet decode the grasp angle and grasp width; I will update and release the code later.
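
The coordinate conversion above can be written out as a short sketch. The offsets of 80 and 160 correspond to a centered 320 x 320 crop of a 480 x 640 image; that original size is inferred from the offsets rather than stated in this post, and the function name crop_to_original is hypothetical.

```python
def crop_to_original(row, col, orig_height, orig_width, crop_size=320):
    """Map a (row, col) point in a centered crop back to the original image."""
    top = (orig_height - crop_size) // 2   # rows removed above the crop
    left = (orig_width - crop_size) // 2   # columns removed left of the crop
    return row + top, col + left


# Assumed 480 x 640 original, which gives the offsets (80, 160) used above.
print(crop_to_original(233, 187, orig_height=480, orig_width=640))  # (313, 347)
```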

3. Summary of errors

Error 1:

When converting ONNX to TensorRT:

While parsing node number 360 [Resize]:
ERROR: ModelImporter.cpp:124 In function parseGraph:
[5] Assertion failed: ctx->tensors().count(inputName)

Solution: do not use bilinear interpolation; use nn.UpsamplingNearest2d((h,w)) instead.

Error 2:

[06/30/2021-17:22:19] [E] [TRT] Network has dynamic or shape inputs, but no optimization profile has been defined.
[06/30/2021-17:22:19] [E] [TRT] Network validation failed.
&&&& FAILED TensorRT.sample_onnx_sgdn # ./sample_onnx_sgdn

Solution: do not set dynamic_axes when exporting the PyTorch model to ONNX. To verify: in Netron, the network input shape should be $(1, 3, 320, 320)$, not $(batch\_size, 3, 320, 320)$.

Error 3:

Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[06/30/2021-17:52:05] [I] [TRT] Detected 1 inputs and 3 output network tensors.
[06/30/2021-17:52:06] [W] [TRT] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles 
Segmentation fault (core dumped)