Model deployment learning notes (I)

2022-06-12 07:29:00 【Nidiya is trying】

About deployment: inference can be deployed in several places. It may run in a cloud data center. Voice input on mobile phones is a common example: the audio is sent to the cloud, processed there, and the result is returned to the phone. It may also run on embedded devices such as smart cameras, unmanned aerial vehicles (UAVs), robots, or self-driving vehicles. An autonomous driving system may be an embedded device or a complete host computer. What embedded and autonomous driving deployments have in common is a strict requirement for real-time performance.
https://blog.csdn.net/intflojx/article/details/81778648

I. ONNX (Open Neural Network Exchange)

1. What it is: ONNX is an open ecosystem for deep learning development tools, released by Microsoft and Facebook. It defines an open, standard format, independent of environment and platform, for representing deep learning models.

2. Purpose: to act as a "translator"

① It connects different deep learning frameworks so that models can be converted between them.

② It stores model data in a common format, which lets different neural network frameworks interoperate.

3. Common conversion routes:

PyTorch -> ONNX -> TensorRT
PyTorch -> ONNX -> TVM
TensorFlow -> ONNX -> NCNN
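The first hop of the PyTorch routes above can be sketched roughly as follows. This is a minimal sketch, not the only way to export: the helper name `export_to_onnx` and the tensor names "input" and "output" are illustrative assumptions, and the exact behavior of `torch.onnx.export` depends on the installed PyTorch version.

```python
def export_to_onnx(model, example_input, onnx_path):
    """Export a trained PyTorch model to an ONNX file and return the path.

    `example_input` is a tensor with the shape the model expects; the
    exporter runs the model on it to record the computation graph.
    """
    # Imported lazily so this helper can be defined without PyTorch installed.
    import torch

    model.eval()  # switch layers such as dropout and batch norm to inference mode
    torch.onnx.export(
        model,
        example_input,
        onnx_path,
        input_names=["input"],    # assumed name, chosen for illustration
        output_names=["output"],  # assumed name, chosen for illustration
    )
    return onnx_path
```

A call might look like `export_to_onnx(model, torch.randn(1, 3, 224, 224), "model.onnx")`, where the input shape is whatever the trained network expects. The resulting file is what TensorRT, TVM, or an ONNX-to-NCNN converter would consume next.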

II. TensorFlow Serving: the online model inference solution provided by TensorFlow. It puts a trained model directly into production as a service and exposes interfaces for external callers.
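To make "interfaces for external calls" concrete, here is a minimal sketch of a client for the TensorFlow Serving REST predict endpoint, using only the Python standard library. The host, the model name `my_model`, and port 8501 (the usual REST port in the official Docker image) are assumptions for illustration; a real deployment may differ.

```python
import json
import urllib.request


def build_predict_request(host, model_name, instances, port=8501):
    """Build (but do not send) a POST request for the REST predict endpoint."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # The row-oriented request format wraps the inputs in an "instances" list.
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(host, model_name, instances, port=8501, timeout=5.0):
    """Send the request to a running server and return its "predictions" field."""
    request = build_predict_request(host, model_name, instances, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["predictions"]
```

With a server running locally, `predict("localhost", "my_model", [[1.0, 2.0]])` would return one prediction per input row. Building the request separately from sending it makes the URL and payload easy to inspect without a live server.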

III. NCNN and TensorRT: these efficiently migrate trained models from the cloud to heterogeneous edge AI chips for execution. [Edge and on-chip inference]

1. NCNN: an inference framework from Tencent for mobile deployment. It has no third-party dependencies, and on mobile phone CPUs it is reported to run faster than other open-source frameworks. Inference can also be run on a PC. Its performance optimizations target CPU deployment. Inference requires a network structure file (.param) and a weight file (.bin).
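As an illustration of the split between the two files, the text form of a .param file is a plain description of the network: a magic number on the first line, the layer and blob counts on the second, then one line per layer. The sketch below reads that header. The function name and the sample network are hypothetical, and the parser ignores the per-layer parameters entirely.

```python
NCNN_PARAM_MAGIC = "7767517"  # first line of a text-format .param file


def read_param_header(param_text):
    """Return (layer_count, blob_count, [(layer_type, layer_name), ...])."""
    lines = [line.strip() for line in param_text.splitlines() if line.strip()]
    if len(lines) < 2 or lines[0] != NCNN_PARAM_MAGIC:
        raise ValueError("not a text-format ncnn .param file")
    layer_count, blob_count = (int(value) for value in lines[1].split())
    layers = []
    for line in lines[2:]:
        fields = line.split()
        # Each layer line starts with the layer type and the layer name.
        layers.append((fields[0], fields[1]))
    return layer_count, blob_count, layers


# Hypothetical three-layer network, written by hand for illustration.
SAMPLE_PARAM = """
7767517
3 3
Input data 0 1 data
Convolution conv1 1 1 data conv1 0=16 1=3
ReLU relu1 1 1 conv1 relu1
"""
```

Calling `read_param_header(SAMPLE_PARAM)` yields the counts and the layer list. The numeric weights for layers such as `conv1` are not in this file; they live in the matching .bin file.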

2. TensorRT: NVIDIA's CUDA-based neural network inference library, written in C++. Models from other frameworks are converted into TensorRT, which then applies optimizations targeted at NVIDIA's own GPUs to accelerate deployment; the trained network is automatically optimized for runtime performance. It supports INT8 and FP16 quantization and runs only on GPU devices. TensorRT uses the .engine format, and there are two ways to convert files from other formats into it:

① Model trained in an open-source framework -> ONNX -> .engine

② Use TensorRT directly to convert the trained model into .engine (see tensorrt, an open-source project that collects model conversion and deployment examples)
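For route ①, the ONNX-to-.engine step is commonly done with the `trtexec` command-line tool that ships with TensorRT. The sketch below only assembles the command. The helper name and file names are assumptions, and running the command requires a machine with TensorRT and an NVIDIA GPU.

```python
def build_trtexec_command(onnx_path, engine_path, precision="fp32"):
    """Assemble a trtexec command that builds a serialized engine from ONNX.

    `precision` selects the reduced-precision mode mentioned above:
    "fp32" (default, no extra flag), "fp16", or "int8".
    """
    command = [
        "trtexec",
        f"--onnx={onnx_path}",          # input ONNX model
        f"--saveEngine={engine_path}",  # where to write the built engine
    ]
    if precision == "fp16":
        command.append("--fp16")
    elif precision == "int8":
        # INT8 generally needs calibration data to keep accuracy acceptable.
        command.append("--int8")
    elif precision != "fp32":
        raise ValueError(f"unsupported precision: {precision}")
    return command
```

The returned list can be passed to `subprocess.run(command, check=True)` on the target machine. Engines are tied to the GPU and TensorRT version they were built with, so the build is usually run on the deployment hardware itself.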

Copyright notice: this article was written by [Nidiya is trying]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/03/202203010557067643.html
