Deploying a grasp detection network with TensorRT (I)
2022-07-03 05:17:00 【Qianyu QY】
1. A brief introduction to TensorRT
TensorRT is a C++ inference framework that runs on NVIDIA GPUs. Models trained with PyTorch or TensorFlow can be converted to the TensorRT format and then run by the TensorRT inference engine, which speeds up the model on NVIDIA GPUs, generally by a factor of several to several tens.
Mainstream PyTorch deployment paths:
- PyTorch → ONNX → TensorRT
- torch2trt
- torch2trt_dynamic
- TRTorch
2. Grasp detection deployment workflow
The grasp detection network used here comes from my earlier paper: High-performance Pixel-level Grasp Detection based on Adaptive Grasping and Grasp-aware Network. The method achieves 99.09% grasp detection accuracy, and a 95.71% grasp success rate in real multi-object stacking scenes. The demonstration video is on YouTube (https://www.youtube.com/watch?v=KUa3XlVwDsU), and the paper can be downloaded from https://www.techrxiv.org/articles/preprint/High-performance_Pixel-level_Grasp_Detection_based_on_Adaptive_Grasping_and_Grasp-aware_Network/14680455
Deploying with TensorRT requires installing dependencies such as PyTorch, TensorRT, and ONNX. Installation is covered in detail elsewhere online, so here I only list the versions I used:
Ubuntu: 16.04
TensorRT: 7.0.0
ONNX IR version: 0.0.4
Opset version: 10
Producer name: pytorch 1.2.0
GPU: TITAN Xp
CUDA: 10.0
Driver Version: 430.14
Deployment can be done in either Python or C++; here I use C++.
2.1 Exporting the PyTorch network to an ONNX file
PyTorch provides a method for exporting a model to ONNX. The code is as follows:
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torch.load("model_path", map_location=device)  # load the PyTorch model onto the same device as the input
model.eval()
x = torch.randn((1, 3, 320, 320))  # dummy input tensor with a fixed shape
x = x.to(device)
torch.onnx.export(model,
                  x,
                  "ckpt/sgdn.onnx",
                  verbose=True,
                  opset_version=10,
                  do_constant_folding=True,  # apply constant-folding optimization
                  input_names=["input"],  # input name
                  output_names=["output_able", "output_angle", "output_width"])  # output names
One caveat when exporting: do not use interpolation-based upsampling in the network, otherwise TensorRT will report an error. Use torch.nn.UpsamplingNearest2d() instead. This issue is discussed at https://github.com/NVIDIA/TensorRT/issues/284
The ONNX file can be downloaded from my Google Drive:
https://drive.google.com/file/d/1AGyjRTWIw85ctwP6VsBDCmR0mE8NdRLu/view?usp=sharing
Use the Python onnx package to check whether sgdn.onnx is valid. The code is as follows:
import onnx
model_path = 'sgdn.onnx'
# Verify the validity of the model
onnx_model = onnx.load(model_path)
onnx.checker.check_model(onnx_model)
Use the Netron tool to view the network architecture and the network's input and output shapes. The online version of Netron is at:
https://netron.app/
Here's a screenshot :
2.2 Generating image data in txt format
Normally, images are read with OpenCV or received by subscribing to an image topic in ROS. For testing purposes here, the image is converted to txt format instead. Since the input size of the grasp detection network is (batch, 3, 320, 320), first crop the central (320, 320) region of the image, then save the pixel values to a txt file. Each row stores 320 values, and there are 320*3 rows in total: the first 320 rows are the B channel, followed by the G and R channels in that order.
The program can be downloaded from GitHub:
https://github.com/dexin-wang/tensorRT_SGDN/tree/main/create_txt
Run python3 create_txt.py to generate the txt file.
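As a rough illustration of the txt layout described in this section, here is a minimal, dependency-free Python sketch. It is not the author's create_txt.py: the function names (center_crop, to_txt_rows, save_txt), the nested-list image representation, and the space-separated output are assumptions made for this example. In practice the image would typically come from OpenCV, which returns pixels in BGR order.

```python
def center_crop(image, size=320):
    """Crop the central size x size region of an H x W x 3 image (nested lists)."""
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the crop size")
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def to_txt_rows(image_bgr):
    """Return text rows channel by channel: all B rows, then G rows, then R rows."""
    rows = []
    for channel in range(3):  # 0 = B, 1 = G, 2 = R, assuming BGR pixel order
        for image_row in image_bgr:
            rows.append(" ".join(str(pixel[channel]) for pixel in image_row))
    return rows


def save_txt(image_bgr, path, size=320):
    """Crop the image and write size * 3 rows of size values each (hypothetical helper)."""
    rows = to_txt_rows(center_crop(image_bgr, size))
    with open(path, "w") as f:
        f.write("\n".join(rows) + "\n")
    return rows


# Synthetic 480 x 640 image: every pixel is (B, G, R) = (10, 20, 30).
demo_image = [[(10, 20, 30)] * 640 for _ in range(480)]
demo_rows = to_txt_rows(center_crop(demo_image))
print(len(demo_rows), len(demo_rows[0].split()))  # 960 320
```

Keeping the three channels as consecutive blocks of rows matches the (3, 320, 320) channel-first layout of the network input, so a reader of the file can fill the input buffer in row order without reshuffling.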
2.3 TensorRT inference
Because this is still at the testing stage, my C++ program is adapted from the official TensorRT sample code. It can be downloaded from my GitHub:
https://github.com/dexin-wang/tensorRT_SGDN/tree/main/sampleOnnxSGDN
After installing TensorRT by following an online tutorial, put the sampleOnnxSGDN folder from the link above into /home/.../TensorRT-7.0.0.11/samples/, and add sampleOnnxSGDN to line 39 of /home/.../TensorRT-7.0.0.11/samples/Makefile:
samples=... sampleOnnxSGDN ...
Then put the downloaded sgdn.onnx in the /home/.../TensorRT-7.0.0.11/samples/sampleOnnxSGDN/data/ folder. You may also need to modify the file paths used in the program.
You can now compile and run.
Compile:
cd /home/.../TensorRT-7.0.0.11/samples/sampleOnnxSGDN/
make
After compilation, two files are generated under /home/.../TensorRT-7.0.0.11/bin:
sample_onnx_sgdn
sample_onnx_sgdn_debug
Run:
cd /home/.../TensorRT-7.0.0.11/bin
./sample_onnx_sgdn
If everything goes well, you will see the following output:
[07/01/2021-10:14:08] [I] Building and running a GPU inference engine for Onnx MNIST
----------------------------------------------------------------
Input filename: /home/wangdx/tensorRT/TensorRT-7.0.0.11/samples/sampleOnnxSGDN/data/sgdn.onnx
ONNX IR version: 0.0.4
Opset version: 10
Producer name: pytorch
Producer version: 1.2
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
[07/01/2021-10:14:13] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increa
[07/01/2021-10:14:51] [I] [TRT] Detected 1 inputs and 3 output network tensors.
[07/01/2021-10:14:51] [W] [TRT] Current optimization profile is: 0. Please ensure there are no enqu
[07/01/2021-10:14:51] [I] Output:
[07/01/2021-10:14:51] [I] (row, col) = 233, 187
confidence = 0.996655
&&&& PASSED TensorRT.sample_onnx_sgdn # ./sample_onnx_sgdn
The result shows that the grasp point with the highest confidence is at (233, 187), with a confidence of 0.996655. Because the central (320, 320) region of the image was cropped at the start, the predicted grasp point in the original image is at (233+80, 187+160) = (313, 347). The code does not yet parse the grasp angle or grasp width; I will update and release it later.
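The coordinate arithmetic in this result can be sketched in Python. This is only an illustration, not the C++ sample code: the helper names are hypothetical, the confidence map is a plain nested list, and the 480 x 640 original image size is an assumption inferred from the offsets 80 and 160 used in the text.

```python
def argmax_2d(confidence):
    """Return (row, col, value) of the largest entry in a 2D nested list."""
    best = (0, 0, confidence[0][0])
    for r, row in enumerate(confidence):
        for c, value in enumerate(row):
            if value > best[2]:
                best = (r, c, value)
    return best


def crop_to_original(row, col, original_shape, crop_size=320):
    """Map a point in the central crop back to original image coordinates."""
    height, width = original_shape
    top = (height - crop_size) // 2   # rows removed above the crop
    left = (width - crop_size) // 2   # columns removed left of the crop
    return row + top, col + left


# Toy confidence map with a single peak at (233, 187).
confidence = [[0.0] * 320 for _ in range(320)]
confidence[233][187] = 0.996655

row, col, value = argmax_2d(confidence)
print(row, col, value)  # 233 187 0.996655
print(crop_to_original(row, col, original_shape=(480, 640)))  # (313, 347)
```

Computing the offsets from the original image size, rather than hard-coding 80 and 160, keeps the mapping correct if the camera resolution changes.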
3. Summary of errors
Error 1:
When converting ONNX to TensorRT:
While parsing node number 360 [Resize]:
ERROR: ModelImporter.cpp:124 In function parseGraph:
[5] Assertion failed: ctx->tensors().count(inputName)
Solution: do not use bilinear interpolation; use nn.UpsamplingNearest2d((h, w)) instead.
Error 2:
[06/30/2021-17:22:19] [E] [TRT] Network has dynamic or shape inputs, but no optimization profile has been defined.
[06/30/2021-17:22:19] [E] [TRT] Network validation failed.
&&&& FAILED TensorRT.sample_onnx_sgdn # ./sample_onnx_sgdn
Solution: do not set dynamic_axes when converting the PyTorch model to ONNX. To verify, check in Netron that the network input shape is (1, 3, 320, 320) rather than (batch_size, 3, 320, 320).
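The same check can also be scripted instead of done by eye in Netron. The sketch below is an assumption-laden illustration rather than part of the original workflow: the helper names are hypothetical, and first_input_dims assumes the first graph input is the image tensor. In ONNX, each dimension carries either a fixed dim_value or a symbolic dim_param, so a non-empty dim_param on the input indicates a dynamic axis.

```python
from types import SimpleNamespace


def describe_dims(dims):
    """Return each dimension as an int (fixed) or a str (symbolic or unknown)."""
    described = []
    for dim in dims:
        if dim.dim_param:          # symbolic name such as "batch_size"
            described.append(dim.dim_param)
        elif dim.dim_value > 0:    # fixed size
            described.append(dim.dim_value)
        else:                      # neither field is set
            described.append("?")
    return described


def has_dynamic_dims(dims):
    """True if any dimension is not a fixed integer."""
    return any(isinstance(d, str) for d in describe_dims(dims))


def first_input_dims(model_path):
    """Read the dims of the first graph input (assumed to be the image tensor)."""
    import onnx  # imported here so the helpers above run without onnx installed
    model = onnx.load(model_path)
    return model.graph.input[0].type.tensor_type.shape.dim


# Stand-ins for ONNX dimension objects, so the check can be tried without a model file.
static = [SimpleNamespace(dim_value=v, dim_param="") for v in (1, 3, 320, 320)]
dynamic = [SimpleNamespace(dim_value=0, dim_param="batch_size")] + static[1:]
print(describe_dims(static), has_dynamic_dims(static))    # [1, 3, 320, 320] False
print(describe_dims(dynamic), has_dynamic_dims(dynamic))  # ['batch_size', 3, 320, 320] True
```

With a real file, has_dynamic_dims(first_input_dims("sgdn.onnx")) should return False before the model is handed to TensorRT in the way described here.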
Error 3:
Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[06/30/2021-17:52:05] [I] [TRT] Detected 1 inputs and 3 output network tensors.
[06/30/2021-17:52:06] [W] [TRT] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
Segmentation fault (core dumped)
Solution: the crash was caused by an error when reading the binary file; read the txt file instead.
4. References
https://zhuanlan.zhihu.com/p/371239130
https://zhuanlan.zhihu.com/p/348301573