Model deployment learning notes (I)
2022-06-12 07:29:00 【Nidiya is trying】
About deployment: inference can be deployed in many places. It may run in a data center (cloud). For example, the voice input commonly found on mobile phones is still handled in the cloud: your voice is sent to the cloud, and the result is returned after the cloud has processed it. It may also run on the embedded side, for example in embedded cameras, unmanned aerial vehicles (UAVs), robots, or self-driving vehicles. A car's autopilot may of course be an embedded device, but it could also be a complete host computer. What embedded and autonomous-driving deployments have in common is a strict real-time requirement.
https://blog.csdn.net/intflojx/article/details/81778648
I. ONNX (Open Neural Network Exchange)
1. What it is: ONNX is an ecosystem of deep learning development tools released by Microsoft and Facebook. It defines an open, standard format, independent of environment and platform, for representing deep learning models.
2. Role: "translation"
① It connects different deep learning frameworks, so that models can be converted between them;
② It stores and exchanges model data in one common format, so that different neural network development frameworks can interoperate.
3. Common conversion routes:
- PyTorch -> ONNX -> TensorRT
- PyTorch -> ONNX -> TVM
- TensorFlow -> ONNX -> NCNN
II. TensorFlow Serving: the online model inference solution provided by TensorFlow. It puts a trained model online directly as a service: the model is deployed online and exposed through interfaces that external callers can use.
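As a rough illustration of what "an interface for external calls" can look like, the sketch below builds a request for TensorFlow Serving's REST predict endpoint using only the Python standard library. The host, the model name `my_model`, and the input values are assumptions made up for this example; port 8501 is the usual REST default but depends on how the server was started.

```python
import json
import urllib.request


def build_predict_request(host, model_name, instances, port=8501):
    """Build (but do not send) a TensorFlow Serving REST predict request."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # The REST API accepts a JSON body with an "instances" list (row format).
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical model name and input; replace with your own.
request = build_predict_request("localhost", "my_model", [[1.0, 2.0, 3.0]])
print(request.full_url)

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     predictions = json.loads(response.read())["predictions"]
```

Building the request separately from sending it keeps the example runnable without a server; in practice the commented lines would perform the actual call.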
III. NCNN and TensorRT: efficiently migrate trained algorithms from the cloud to heterogeneous edge AI chips for execution. [Edge and on-chip inference]
1. NCNN: an inference framework from Tencent for mobile deployment. It has no third-party dependencies, is reported to run faster on mobile phone CPUs than other open-source frameworks, and can also run inference on a PC. Its performance optimization targets CPU deployment. The network file and weight file needed for inference are .param and .bin, respectively.
2. TensorRT: NVIDIA's CUDA-based neural network inference library, written in C++. Models from other frameworks are converted into TensorRT, which then applies optimizations targeted at NVIDIA's own GPUs to accelerate deployment; the trained network is optimized automatically for runtime performance. It supports INT8 quantization and FP16 reduced precision, and runs only on GPU devices. TensorRT uses the .engine format, and there are two ways to convert files in other formats into it:
① Train the model in an open-source framework -> ONNX -> .engine
② Use tensorrt directly to convert the trained model into .engine (tensorrt: an open-source project that collects conversion and deployment projects)
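To make the INT8 quantization mentioned above more concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is not TensorRT's implementation: TensorRT chooses its scales through its own calibration process, whereas this example simply assumes the scale is derived from the largest absolute value. The function names and the sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # Assumption: scale from the max absolute value; avoid dividing by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the INT8 codes."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each value is stored as one signed byte plus a shared scale, and the round-trip error per value is at most half the scale. That is the trade-off behind INT8 inference: smaller, faster arithmetic in exchange for a bounded loss of precision.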