Deploying a grasp detection network with TensorRT (I)
2022-07-03 05:17:00 【Qianyu QY】
1. A brief introduction to TensorRT
TensorRT is a C++ inference framework that runs on NVIDIA GPUs. Models trained with PyTorch or TensorFlow can be converted to the TensorRT format and then run by the TensorRT inference engine, which speeds up the model on NVIDIA GPUs, typically by a factor of several to several dozen.
Mainstream PyTorch deployment paths:
- PyTorch → ONNX → TensorRT
- torch2trt
- torch2trt_dynamic
- TRTorch
2. Grasp detection deployment process
The grasp detection network used here comes from my earlier paper: High-performance Pixel-level Grasp Detection based on Adaptive Grasping and Grasp-aware Network. The method achieves 99.09% grasp detection accuracy and a 95.71% grasp success rate in real multi-object stacking scenes. The experimental demonstration video is on YouTube: https://www.youtube.com/watch?v=KUa3XlVwDsU. Paper download: https://www.techrxiv.org/articles/preprint/High-performance_Pixel-level_Grasp_Detection_based_on_Adaptive_Grasping_and_Grasp-aware_Network/14680455
Deploying with TensorRT requires installing dependencies such as PyTorch, TensorRT, and ONNX. Detailed installation instructions are easy to find online, so here I only list the versions I used:
ubuntu: 16.06
TensorRT: 7.0.0
ONNX IR version: 0.0.4
Opset version: 10
Producer name: pytorch 1.2.0
GPU: TITAN Xp
CUDA: 10.0
Driver Version: 430.14
Deployment can be done in either Python or C++; here I use C++.
2.1 Exporting the PyTorch network to an ONNX file
PyTorch provides a way to export a model to ONNX. The code is as follows:
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torch.load("model_path", map_location=device)  # load the PyTorch model
model.to(device)  # keep the model on the same device as the input
model.eval()
x = torch.randn((1, 3, 320, 320))  # dummy input tensor with the network's input shape
x = x.to(device)
torch.onnx.export(model,
                  x,
                  "ckpt/sgdn.onnx",
                  verbose=True,
                  opset_version=10,
                  do_constant_folding=True,  # apply constant-folding optimization
                  input_names=["input"],  # input name
                  output_names=["output_able", "output_angle", "output_width"])  # output names
One caveat when exporting: do not use bilinear interpolation upsampling in the network, otherwise TensorRT reports an error when parsing the model. Use torch.nn.UpsamplingNearest2d() instead. Discussion of this issue: https://github.com/NVIDIA/TensorRT/issues/284.
The ONNX file can be downloaded from my Google Drive:
https://drive.google.com/file/d/1AGyjRTWIw85ctwP6VsBDCmR0mE8NdRLu/view?usp=sharing
Use Python's onnx package to check whether sgdn.onnx is valid:
import onnx
model_path = 'sgdn.onnx'
# Verify the validity of the model
onnx_model = onnx.load(model_path)
onnx.checker.check_model(onnx_model)
Use the Netron tool to view the network architecture and the network's input and output shapes. The online version of Netron is at:
https://netron.app/
[Screenshot: the network viewed in Netron]
2.2 Generating image data in txt format
Normally, images are read with OpenCV or received by subscribing through ROS. For testing purposes, the image is converted to txt format here instead. Since the input size of the grasp detection network is $(batch, 3, 320, 320)$, first crop the central $(320, 320)$ region of the image, then save the pixel values to a txt file. Each row stores 320 values, for a total of 320*3 rows: the first 320 rows are the B channel, followed by the G and R channels in that order.
The program can be downloaded from GitHub:
https://github.com/dexin-wang/tensorRT_SGDN/tree/main/create_txt
Generate the txt file by running python3 create_txt.py.
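The txt layout described in this section can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's create_txt.py: the function names (center_crop, image_to_rows, save_txt, load_txt) are hypothetical, the image is assumed to already be a BGR array of shape (H, W, 3) such as OpenCV returns, and pixel values are written as raw integers with no normalization, which may differ from the real script.

```python
import os
import tempfile

import numpy as np


def center_crop(image, size=320):
    """Return the central size x size region and its (top, left) offset."""
    height, width = image.shape[:2]
    if height < size or width < size:
        raise ValueError("image is smaller than the crop size")
    top = (height - size) // 2
    left = (width - size) // 2
    return image[top:top + size, left:left + size], (top, left)


def image_to_rows(image_bgr, size=320):
    """Stack the B, G, R planes of the central crop into a (3*size, size) array."""
    crop, _ = center_crop(image_bgr, size)
    # HWC -> CHW, then fold the channel axis into the row axis.
    return crop.transpose(2, 0, 1).reshape(3 * size, size)


def save_txt(image_bgr, path, size=320):
    """Write one text row per image row: all B rows, then G rows, then R rows."""
    np.savetxt(path, image_to_rows(image_bgr, size), fmt="%d")


def load_txt(path, size=320):
    """Read the txt file back as a (1, 3, size, size) float32 array."""
    rows = np.loadtxt(path, dtype=np.float32)
    return rows.reshape(1, 3, size, size)


if __name__ == "__main__":
    # Synthetic 480x640 BGR image standing in for a real camera frame.
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
    path = os.path.join(tempfile.gettempdir(), "sgdn_input_example.txt")
    save_txt(image, path)
    print(load_txt(path).shape)
```

Keeping the crop offset as a return value of center_crop is a deliberate choice: the same offset is needed later to map a predicted grasp point back into the original image.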
2.3 TensorRT inference
Because this is still in the testing phase, my C++ program is adapted from TensorRT's official sample code. It can be downloaded from my GitHub:
https://github.com/dexin-wang/tensorRT_SGDN/tree/main/sampleOnnxSGDN
After installing TensorRT by following an online tutorial, put the sampleOnnxSGDN folder in /home/.../TensorRT-7.0.0.11/samples/, and add sampleOnnxSGDN to line 39 of the file /home/.../TensorRT-7.0.0.11/samples/Makefile:
samples=... sampleOnnxSGDN ...
Then put the downloaded sgdn.onnx in the /home/.../TensorRT-7.0.0.11/samples/sampleOnnxSGDN/data/ folder. You may also need to modify the file paths used in the program.
You can now compile and run the sample.
Compile:
cd /home/.../TensorRT-7.0.0.11/samples/sampleOnnxSGDN/
make
After compilation, two files are generated under /home/.../TensorRT-7.0.0.11/bin:
sample_onnx_sgdn
sample_onnx_sgdn_debug
Run:
cd /home/.../TensorRT-7.0.0.11/bin
./sample_onnx_sgdn
If everything goes well, you will see the following output:
[07/01/2021-10:14:08] [I] Building and running a GPU inference engine for Onnx MNIST
----------------------------------------------------------------
Input filename: /home/wangdx/tensorRT/TensorRT-7.0.0.11/samples/sampleOnnxSGDN/data/sgdn.onnx
ONNX IR version: 0.0.4
Opset version: 10
Producer name: pytorch
Producer version: 1.2
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
[07/01/2021-10:14:13] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increa
[07/01/2021-10:14:51] [I] [TRT] Detected 1 inputs and 3 output network tensors.
[07/01/2021-10:14:51] [W] [TRT] Current optimization profile is: 0. Please ensure there are no enqu
[07/01/2021-10:14:51] [I] Output:
[07/01/2021-10:14:51] [I] (row, col) = 233, 187
confidence = 0.996655
&&&& PASSED TensorRT.sample_onnx_sgdn # ./sample_onnx_sgdn
The result shows that the grasp point with the highest confidence is at position $(233, 187)$, with a confidence of 0.996655. Because the central $(320, 320)$ region of the image was cropped at the start, the predicted grasp point in the original image is at $(233+80, 187+160) = (313, 347)$. The code does not yet parse the grasp angle and grasp width; I will update and release it later.
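The offset arithmetic above can be written as a small helper. This is a sketch under stated assumptions: the function names are hypothetical, the confidence map is assumed to be a single (320, 320) array taken from the network's confidence output, and the 480x640 original image size is inferred from the offsets 80 and 160 rather than stated in the post.

```python
import numpy as np


def best_grasp_point(confidence_map):
    """Return (row, col, confidence) of the highest-confidence pixel."""
    row, col = np.unravel_index(np.argmax(confidence_map), confidence_map.shape)
    return int(row), int(col), float(confidence_map[row, col])


def crop_to_original(row, col, image_height, image_width, crop_size=320):
    """Map a pixel in the central crop back to original-image coordinates."""
    top = (image_height - crop_size) // 2
    left = (image_width - crop_size) // 2
    return row + top, col + left


if __name__ == "__main__":
    # Toy confidence map with a single peak at the position reported in the log.
    confidence = np.zeros((320, 320), dtype=np.float32)
    confidence[233, 187] = 0.996655
    row, col, score = best_grasp_point(confidence)
    # Assumed original image size: 480 rows by 640 columns.
    print(crop_to_original(row, col, image_height=480, image_width=640))
```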
3. Error summary
Error 1:
When converting ONNX to TensorRT:
While parsing node number 360 [Resize]:
ERROR: ModelImporter.cpp:124 In function parseGraph:
[5] Assertion failed: ctx->tensors().count(inputName)
Solution: do not use bilinear interpolation; use nn.UpsamplingNearest2d((h,w)) instead.
Error 2:
[06/30/2021-17:22:19] [E] [TRT] Network has dynamic or shape inputs, but no optimization profile has been defined.
[06/30/2021-17:22:19] [E] [TRT] Network validation failed.
&&&& FAILED TensorRT.sample_onnx_sgdn # ./sample_onnx_sgdn
Solution: do not set dynamic_axes when exporting from PyTorch to ONNX. To verify, check in Netron that the network input shape is $(1, 3, 320, 320)$ rather than $(batch\_size, 3, 320, 320)$.
Error 3:
Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[06/30/2021-17:52:05] [I] [TRT] Detected 1 inputs and 3 output network tensors.
[06/30/2021-17:52:06] [W] [TRT] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
Segmentation fault (core dumped)
Solution: the crash was caused by an error when reading the input from a binary file; read it from the txt file instead.
4. References
https://zhuanlan.zhihu.com/p/371239130
https://zhuanlan.zhihu.com/p/348301573