Model deployment learning notes (I)
2022-06-12 07:29:00 【Nidiya is trying】
About deployment: inference can be deployed in many places. It may run in a cloud data center: the voice input common on mobile phones, for example, is still processed in the cloud, so your speech is sent to the cloud and the result is returned after processing. It may also run on the embedded side, as in embedded cameras, unmanned aerial vehicles (UAVs), robots, or autonomous vehicles. A car's autonomous-driving computer may be an embedded device or a complete host. Embedded and autonomous-driving deployments are characterized by strict real-time requirements.
https://blog.csdn.net/intflojx/article/details/81778648
I. ONNX (Open Neural Network Exchange)
1. What it is: ONNX is an open ecosystem for deep learning tooling released by Microsoft and Facebook. It defines an open standard format, independent of environment and platform, for representing deep learning models.
2. Role: a "translator"
① It connects different deep learning frameworks, so that models can be converted between them;
② It stores model data in a single shared format, so that different neural network frameworks can interoperate.
3. Common conversion routes:
- PyTorch → ONNX → TensorRT
- PyTorch → ONNX → TVM
- TensorFlow → ONNX → NCNN
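One way to read the "translator" role is that ONNX acts as a hub between training frameworks and inference backends. The sketch below only illustrates that idea with plain Python; the names `TRAINING_FRAMEWORKS`, `INFERENCE_BACKENDS`, `route`, and `converter_counts` are hypothetical, and it treats every framework/backend pairing as possible, which goes beyond the three routes listed above.

```python
# Frameworks and backends named in the routes above.
TRAINING_FRAMEWORKS = ["PyTorch", "TensorFlow"]
INFERENCE_BACKENDS = ["TensorRT", "TVM", "NCNN"]


def route(source, target):
    """Return the conversion path from a training framework to a backend via ONNX."""
    if source not in TRAINING_FRAMEWORKS:
        raise ValueError(f"unknown training framework: {source}")
    if target not in INFERENCE_BACKENDS:
        raise ValueError(f"unknown inference backend: {target}")
    return [source, "ONNX", target]


def converter_counts(n_sources, n_targets):
    """Count converters needed without and with a shared intermediate format."""
    direct = n_sources * n_targets   # one converter per (framework, backend) pair
    via_hub = n_sources + n_targets  # one exporter per framework, one importer per backend
    return direct, via_hub


print(" -> ".join(route("PyTorch", "TensorRT")))
print(converter_counts(len(TRAINING_FRAMEWORKS), len(INFERENCE_BACKENDS)))
```

With only two frameworks and three backends the saving is small (6 direct converters versus 5 through the hub), but the gap widens as more frameworks and backends are added.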
II. TensorFlow Serving: the online model inference solution provided by TensorFlow. It puts a trained model online directly as a service and exposes interfaces for external callers.
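As a rough illustration of the "interfaces for external calls", TensorFlow Serving exposes a REST predict endpoint of the form `/v1/models/<name>:predict` that accepts a JSON body with an `instances` list. The sketch below builds such a request with the standard library only. The host `localhost`, the model name `my_model`, the input values, and the helper names are assumptions for illustration; port 8501 is the conventional REST port and may differ in a given deployment.

```python
import json
import urllib.request


def build_predict_request(host, model_name, instances, port=8501):
    """Build (but do not send) a POST request for the REST predict endpoint."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(request, timeout=5.0):
    """Send the request to a running server and return the 'predictions' field."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["predictions"]


# Hypothetical model name and input; no network call is made here.
req = build_predict_request("localhost", "my_model", [[1.0, 2.0, 3.0]])
print(req.full_url)
```

Calling `predict(req)` only works when a TensorFlow Serving instance is actually serving a model under that name; the shape of `instances` must match the model's input signature.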
III. NCNN and TensorRT: efficiently migrate a trained model from the cloud to heterogeneous edge AI chips for execution. [Edge and on-chip inference]
1. NCNN: Tencent's inference framework for mobile deployment. It has no third-party dependencies, and according to the project it runs faster on mobile-phone CPUs than other open-source frameworks; it can also run inference on a PC. Its performance optimizations target CPU deployment. The network structure file and weight file required for inference are .param and .bin, respectively.
2. TensorRT: NVIDIA's CUDA-based neural network inference library, written in C++. Models from other frameworks are converted into TensorRT, which then applies optimizations targeted at NVIDIA's own GPUs to accelerate deployment, automatically tuning the trained network for runtime performance. It supports INT8 quantization and FP16 reduced precision, and runs only on GPU devices. TensorRT uses the .engine format, and there are two ways to convert files in other formats into it:
① Model trained in an open-source framework → ONNX → .engine
② Convert the trained model to .engine directly with tensorrt (an open-source project that collects model conversion and deployment projects)
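Route ① is commonly carried out with `trtexec`, the command-line tool shipped with TensorRT, which takes `--onnx`, `--saveEngine`, and the precision flags `--fp16` and `--int8`. The sketch below only assembles that command in Python and does not run it. The file names `model.onnx` and `model.engine` and the helper `trtexec_command` are hypothetical, and a real INT8 build normally also needs calibration data, which is left out here.

```python
import shlex


def trtexec_command(onnx_path, engine_path, precision="fp32"):
    """Assemble a trtexec command that builds a .engine file from an ONNX model."""
    precision_flags = {
        "fp32": [],           # default precision, no extra flag
        "fp16": ["--fp16"],   # allow FP16 kernels
        "int8": ["--int8"],   # allow INT8 kernels (calibration not handled here)
    }
    if precision not in precision_flags:
        raise ValueError(f"unsupported precision: {precision}")
    return [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        *precision_flags[precision],
    ]


# Hypothetical file names; pass the list to subprocess.run on a machine with TensorRT installed.
cmd = trtexec_command("model.onnx", "model.engine", precision="fp16")
print(shlex.join(cmd))
```

Because an engine is built for the GPU and TensorRT version it was generated on, the command is usually run on the target device itself.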