Deploying a grasp detection network with TensorRT (I)
2022-07-03 05:17:00 【Qianyu QY】
1. A brief introduction to TensorRT
TensorRT is a C++ inference framework that runs on NVIDIA GPUs. Models trained with PyTorch or TF can be converted to the TensorRT format and then run by the TensorRT inference engine, which speeds the model up on NVIDIA GPUs, generally by a factor of several to several dozen.
Mainstream PyTorch deployment paths:
- pytorch → ONNX → TensorRT
- torch2trt
- torch2trt_dynamic
- TRTorch
2. Grasp detection deployment process
The grasp detection network used here comes from my previous paper: High-performance Pixel-level Grasp Detection based on Adaptive Grasping and Grasp-aware Network. The method achieves 99.09% grasp detection accuracy, and a 95.71% grasp success rate in real multi-object stacking scenes. The experimental demonstration video is on YouTube: https://www.youtube.com/watch?v=KUa3XlVwDsU. The paper can be downloaded from: https://www.techrxiv.org/articles/preprint/High-performance_Pixel-level_Grasp_Detection_based_on_Adaptive_Grasping_and_Grasp-aware_Network/14680455
Deploying with TensorRT requires dependencies such as PyTorch, TensorRT and ONNX. Installation is covered in detail elsewhere online, so here I only list the versions I used:
Ubuntu: 16.04
TensorRT: 7.0.0
ONNX IR version: 0.0.4
Opset version: 10
Producer name: pytorch 1.2.0
GPU: TITAN Xp
CUDA: 10.0
Driver Version: 430.14
Deployment can be done in either Python or C++; here I use C++.
2.1 Exporting the PyTorch network to an ONNX file
PyTorch provides a method for exporting a model to ONNX. The code is as follows:
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torch.load("model_path", map_location=device)  # load the PyTorch model
model.to(device)  # keep the model on the same device as the input
model.eval()
x = torch.randn((1, 3, 320, 320))  # dummy input tensor with a fixed shape
x = x.to(device)
torch.onnx.export(model,
                  x,
                  "ckpt/sgdn.onnx",
                  verbose=True,
                  opset_version=10,
                  do_constant_folding=True,  # whether to perform constant folding optimization
                  input_names=["input"],  # input name
                  output_names=["output_able", "output_angle", "output_width"])  # output names
One caveat when exporting: do not use interpolation-based upsampling in the network, otherwise TensorRT will report an error during inference. Use torch.nn.UpsamplingNearest2d() instead. This issue is discussed at https://github.com/NVIDIA/TensorRT/issues/284.
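As a minimal sketch of the upsampling swap described above, the block below builds a small decoder block that upsamples with torch.nn.UpsamplingNearest2d instead of interpolation. The UpBlock class, its channel counts and the 160 → 320 sizes are hypothetical and chosen only for illustration; they are not taken from the SGDN network.

```python
import torch
import torch.nn as nn


class UpBlock(nn.Module):
    """Hypothetical decoder block: nearest-neighbour upsampling, then a 3x3 convolution."""

    def __init__(self, in_ch, out_ch, out_size):
        super().__init__()
        # Fixed output size (h, w), so the exported graph has static shapes.
        self.up = nn.UpsamplingNearest2d(size=out_size)
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(self.up(x))


block = UpBlock(in_ch=8, out_ch=4, out_size=(320, 320)).eval()
with torch.no_grad():
    y = block(torch.randn(1, 8, 160, 160))
print(tuple(y.shape))
```

Passing an explicit output size rather than a scale factor keeps the layer in line with the fixed (1, 3, 320, 320) input used for the export.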
The ONNX file can be downloaded from my Google Drive:
https://drive.google.com/file/d/1AGyjRTWIw85ctwP6VsBDCmR0mE8NdRLu/view?usp=sharing
Use Python's onnx package to check whether sgdn.onnx is valid:
import onnx
model_path = 'sgdn.onnx'
# Verify the validity of the model
onnx_model = onnx.load(model_path)
onnx.checker.check_model(onnx_model)
Use the netron tool to view the network architecture and the input and output shapes of the network. The online version of netron is at:
https://netron.app/
Here's a screenshot :
2.2 Generating image data in txt format
Normally the image would be read with OpenCV or received by subscribing to an image topic in ROS. For testing purposes, the image is converted to txt format here. Since the input size of the grasp detection network is (batch, 3, 320, 320), first crop the (320, 320) region in the middle of the image, then save the pixel values to a txt file. Each row stores 320 values, for 320*3 rows in total: the first 320 rows are the B channel, followed by the G and R channels in that order.
The program can be downloaded from GitHub:
https://github.com/dexin-wang/tensorRT_SGDN/tree/main/create_txt
Run python3 create_txt.py to generate the txt file.
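The txt layout described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of create_txt.py: the helper names center_crop and save_bgr_txt are hypothetical, a random array stands in for an image read with OpenCV, and the 480x640 image size is assumed because it matches the crop offsets of 80 and 160 used later in this post.

```python
import os
import tempfile

import numpy as np


def center_crop(img, size=320):
    """Crop a (size, size) window from the middle of an (H, W, C) image."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]


def save_bgr_txt(img_bgr, path, size=320):
    """Write 3*size rows of size values each: B rows first, then G, then R."""
    crop = center_crop(img_bgr, size)
    # (H, W, C) -> (C, H, W), then stack the three channel planes vertically.
    rows = crop.transpose(2, 0, 1).reshape(3 * size, size)
    np.savetxt(path, rows, fmt="%d")


# Stand-in for a BGR image read with OpenCV (assumed 480x640, random values).
img = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)
out_path = os.path.join(tempfile.gettempdir(), "sgdn_input.txt")
save_bgr_txt(img, out_path)
```

Reading the file back row by row gives the three channel planes in B, G, R order, which can be copied directly into a (1, 3, 320, 320) input buffer.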
2.3 TensorRT inference
Because this is still at the testing stage, my C++ program is adapted from the official TensorRT sample code. It can be downloaded from my GitHub:
https://github.com/dexin-wang/tensorRT_SGDN/tree/main/sampleOnnxSGDN
After installing TensorRT by following an online tutorial, put the sampleOnnxSGDN folder in /home/.../TensorRT-7.0.0.11/samples/, and on line 39 of the /home/.../TensorRT-7.0.0.11/samples/Makefile file, add sampleOnnxSGDN:
samples=... sampleOnnxSGDN ...
Then put the downloaded sgdn.onnx in the /home/.../TensorRT-7.0.0.11/samples/sampleOnnxSGDN/data/ folder. You may also need to modify the file paths used in the program.
After that, you can compile and run.
Compile:
cd /home/.../TensorRT-7.0.0.11/samples/sampleOnnxSGDN/
make
After compilation, two files are generated under /home/.../TensorRT-7.0.0.11/bin:
sample_onnx_sgdn
sample_onnx_sgdn_debug
Run:
cd /home/.../TensorRT-7.0.0.11/bin
./sample_onnx_sgdn
If everything goes well, you will see the following output:
[07/01/2021-10:14:08] [I] Building and running a GPU inference engine for Onnx MNIST
----------------------------------------------------------------
Input filename: /home/wangdx/tensorRT/TensorRT-7.0.0.11/samples/sampleOnnxSGDN/data/sgdn.onnx
ONNX IR version: 0.0.4
Opset version: 10
Producer name: pytorch
Producer version: 1.2
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
[07/01/2021-10:14:13] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increa
[07/01/2021-10:14:51] [I] [TRT] Detected 1 inputs and 3 output network tensors.
[07/01/2021-10:14:51] [W] [TRT] Current optimization profile is: 0. Please ensure there are no enqu
[07/01/2021-10:14:51] [I] Output:
[07/01/2021-10:14:51] [I] (row, col) = 233, 187
confidence = 0.996655
&&&& PASSED TensorRT.sample_onnx_sgdn # ./sample_onnx_sgdn
The result shows that the grasp point at position (233, 187) has the highest confidence, 0.996655. Because the (320, 320) region in the middle of the image was cropped at the beginning, the predicted grasp point in the original image is at (233+80, 187+160) = (313, 347). The code does not yet parse the grasp angle and grasp width; I will update and release it later.
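The coordinate conversion above can be sketched in a few lines. This is only an illustration: the function name best_grasp_point is hypothetical, the confidence map is a synthetic array with a single peak rather than real network output, and the crop offset (80, 160) is the one quoted in the text.

```python
import numpy as np


def best_grasp_point(confidence, crop_offset=(80, 160)):
    """Return the peak (row, col) in the crop, the same point in the original image, and its confidence."""
    row, col = np.unravel_index(np.argmax(confidence), confidence.shape)
    in_crop = (int(row), int(col))
    in_original = (in_crop[0] + crop_offset[0], in_crop[1] + crop_offset[1])
    return in_crop, in_original, float(confidence[row, col])


# Synthetic stand-in for the 320x320 grasp confidence output.
conf_map = np.zeros((320, 320), dtype=np.float32)
conf_map[233, 187] = 0.996655

in_crop, in_original, score = best_grasp_point(conf_map)
print(in_crop, in_original)  # (233, 187) (313, 347)
```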
3. Summary of errors
Error 1:
When converting ONNX to TensorRT:
While parsing node number 360 [Resize]:
ERROR: ModelImporter.cpp:124 In function parseGraph:
[5] Assertion failed: ctx->tensors().count(inputName)
Solution: do not use bilinear interpolation; use nn.UpsamplingNearest2d((h,w)) instead.
Error 2:
[06/30/2021-17:22:19] [E] [TRT] Network has dynamic or shape inputs, but no optimization profile has been defined.
[06/30/2021-17:22:19] [E] [TRT] Network validation failed.
&&&& FAILED TensorRT.sample_onnx_sgdn # ./sample_onnx_sgdn
Solution: do not set dynamic_axes when converting the PyTorch model to ONNX. To check that the export is correct, open the model in netron: the network input shape should be (1, 3, 320, 320), not (batch_size, 3, 320, 320).
Error 3:
Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[06/30/2021-17:52:05] [I] [TRT] Detected 1 inputs and 3 output network tensors.
[06/30/2021-17:52:06] [W] [TRT] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
Segmentation fault (core dumped)
Solution: the crash came from an error reading the binary input file; read the txt file instead.
4. References
https://zhuanlan.zhihu.com/p/371239130
https://zhuanlan.zhihu.com/p/348301573