Model deployment learning notes (I)
2022-06-12 07:29:00 【Nidiya is trying】
About deployment: inference can be deployed in many places. It may run in a data center (cloud data center): the voice input commonly found on mobile phones, for example, is still handled in the cloud, meaning your speech is sent to the cloud and the result is returned after processing. It may also run on embedded devices such as embedded cameras, unmanned aerial vehicles (UAVs), robots, or self-driving vehicles. A car's autonomous driving system may be an embedded device or a complete host computer. What embedded and autonomous-driving deployments have in common is a strict real-time requirement.
https://blog.csdn.net/intflojx/article/details/81778648
I. ONNX (Open Neural Network Exchange)
1. What it is: ONNX is an open ecosystem of deep learning development tools released by Microsoft and Facebook. It defines an open, standard format, independent of environment and platform, for representing deep learning models.
2. Purpose: "translation"
① Connect different deep learning frameworks so that models can be converted between them;
② Store and exchange model data in a single format so that different neural network frameworks can interoperate.
3. Common conversion routes:
- PyTorch -> ONNX -> TensorRT
- PyTorch -> ONNX -> TVM
- TensorFlow -> ONNX -> NCNN
II. TensorFlow Serving: the online model inference solution provided by TensorFlow. It puts a trained model directly online as a service and exposes suitable interfaces for external callers.
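To make "interfaces for external calls" concrete, here is a minimal sketch of a client for TensorFlow Serving's REST predict endpoint, using only the Python standard library. The host, the port 8501, the model name `my_model`, and the input values are assumptions for illustration and do not come from the original notes. The request is only built here; nothing is sent until `predict()` is called against a running server.

```python
import json
import urllib.request


def build_predict_request(host, model_name, instances, port=8501):
    """Build a POST request for the REST endpoint /v1/models/<name>:predict."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # Row format: each element of "instances" is one input example.
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(request, timeout=5.0):
    """Send the request and return the "predictions" field of the JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["predictions"]


# Hypothetical model name and input; adjust both to the model actually served.
request = build_predict_request("localhost", "my_model", [[1.0, 2.0, 3.0]])
print(request.full_url)
```

With a server running, `predict(request)` would return one prediction per input instance.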
III. NCNN and TensorRT: efficiently migrate trained algorithms from the cloud to heterogeneous edge AI chips for execution. [Edge / on-chip inference]
1. NCNN: an inference framework released by Tencent for mobile deployment. It has no third-party dependencies, is claimed to run faster on mobile-phone CPUs than other open-source frameworks, and can also run inference on a PC. Its performance is optimized for CPU deployment. The network file and weight file required for inference are .param and .bin, respectively.
2. TensorRT: NVIDIA's CUDA-based neural network inference library, written in C++. Models from other frameworks are converted into TensorRT, which then optimizes them specifically for NVIDIA's own GPUs to accelerate deployment; the trained network is automatically optimized for runtime performance. It supports INT8 quantization and FP16 precision, and it runs only on GPU devices. TensorRT uses the .engine format, and there are two ways to convert files in other formats into it:
① Model trained in an open-source framework -> ONNX -> .engine
② Use tensorrt directly to convert the trained model into .engine (tensorrt: an open-source project that collects conversion and deployment projects)
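The INT8 quantization mentioned above can be illustrated with a small arithmetic sketch. It uses symmetric per-tensor quantization with max-abs scaling, which is an assumption chosen for simplicity; it is not TensorRT's calibration procedure, and the function names and sample weights are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(v) for v in values)
    # The largest magnitude maps to 127; guard against an all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the integers and the scale."""
    return [q * scale for q in quantized]


# Hypothetical weights, for illustration only.
weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The rounding error per value is at most half of `scale`, which is why a tensor with a few very large outliers loses more precision under this simple scheme.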