How to deploy PyTorch Lightning models to production
2020-11-08 21:03:00 · Computer and AI
A complete guide to serving PyTorch Lightning models at scale.

Looking across the machine learning landscape, one of the major trends is the proliferation of projects focused on applying software engineering principles to machine learning. Cortex, for example, recreates the experience of deploying serverless functions, but for inference pipelines. Similarly, DVC implements modern version control and CI/CD pipelines, but for ML.
PyTorch Lightning has a similar philosophy, applied to training. The framework provides a Python wrapper for PyTorch that lets data scientists and engineers write clean, manageable, and performant training code.
As a team that builds an entire deployment platform, in part because we hate writing boilerplate, we're big fans of PyTorch Lightning. In that spirit, I've put together this guide to deploying PyTorch Lightning models to production. Along the way, we'll look at several options for exporting a PyTorch Lightning model for inclusion in an inference pipeline.
Every way to deploy a PyTorch Lightning model for inference
There are three ways to export a PyTorch Lightning model for serving:
- Save the model as a PyTorch checkpoint
- Convert the model to ONNX
- Export the model to Torchscript
We can serve all three with Cortex.
1. Package and deploy a PyTorch Lightning module directly
Starting with the simplest approach, let's deploy a PyTorch Lightning model without any conversion steps.
The PyTorch Lightning Trainer, a class that abstracts away boilerplate training code (think training and validation loops), has a built-in save_checkpoint() function that saves your model as a .ckpt file. To save your model as a checkpoint, just add the following code to your training script:
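Here's a minimal sketch of what that looks like ("model" and "train_loader" are placeholders for your own LightningModule and DataLoader, and the file name is illustrative):

```python
import pytorch_lightning as pl

# "model" and "train_loader" stand in for your own
# LightningModule and DataLoader
trainer = pl.Trainer(max_epochs=5)
trainer.fit(model, train_loader)

# Save the trained model as a .ckpt checkpoint
trainer.save_checkpoint("model.ckpt")
```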

Now, before we start serving this checkpoint, it's important to note that while I keep saying "PyTorch Lightning model", PyTorch Lightning is a wrapper around PyTorch; the project's README literally says "PyTorch Lightning is just organized PyTorch." The exported model is therefore a standard PyTorch model and can be served accordingly.
With a saved checkpoint, we can easily serve the model in Cortex. If you're not familiar with Cortex, you can get acquainted with it quickly here, but the short overview of the deployment process is:
- We write a prediction API for our model in Python
- We define our API's infrastructure and behavior in YAML
- We deploy the API with a command from the CLI
Our prediction API will use Cortex's Python Predictor class to define an __init__() function that initializes our API and loads the model, and a predict() function that serves predictions when queried:
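Here's a sketch of such a predictor (the module class, checkpoint path, and payload key are illustrative assumptions, not the article's original code):

```python
# predictor.py
import torch

# Hypothetical import: the LightningModule class used during training
from my_model import MyLightningModule


class PythonPredictor:
    def __init__(self, config):
        # Load the saved Lightning checkpoint and switch to eval mode
        self.model = MyLightningModule.load_from_checkpoint("model.ckpt")
        self.model.eval()

    def predict(self, payload):
        # Run a forward pass without tracking gradients; the payload
        # key "input" is a placeholder for your request format
        with torch.no_grad():
            inputs = torch.tensor(payload["input"])
            output = self.model(inputs)
        return output.tolist()
```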

It's that simple. We've repurposed some code from our training script, added a bit of inference logic, and that's it. One thing to note: if you upload your model to S3 (recommended), you'll need to add some logic for accessing it.
Next, we configure our infrastructure in YAML:
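Something along these lines (the API name and resource values are illustrative):

```yaml
# cortex.yaml: illustrative values, adjust to your own API
- name: my-classifier
  predictor:
    type: python
    path: predictor.py
  compute:
    cpu: 1
```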

Again, simple. We give our API a name, tell Cortex where our prediction API is located, and allocate some CPU.
Next, we deploy it:
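From the project directory, that's a single command:

```bash
$ cortex deploy
```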

Note that we can also deploy to a cluster, which Cortex spins up and manages:
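For example, targeting a cluster environment (the environment name "aws" here is an assumption; use whatever your cluster is configured under):

```bash
$ cortex deploy --env aws
```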

With either deployment, Cortex containerizes our API and exposes it as a web service. With cloud deployments, Cortex configures load balancing, autoscaling, monitoring, updates, and many other infrastructure features.
And that's it! We now have a live web API serving model predictions on request.
2. Export to ONNX and serve via ONNX Runtime
Now that we've deployed a plain PyTorch checkpoint, let's make things a bit more complex.
PyTorch Lightning recently added a convenient abstraction for exporting models to ONNX (previously, you could use PyTorch's built-in conversion functions, though they required more boilerplate). To export your model to ONNX, just add the following code to your training script:
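A sketch of the export call (the file name and input shape are placeholders you'd match to your own model):

```python
import torch

# input_sample must mimic the shape of the model's real input;
# (1, 64) is purely a placeholder
input_sample = torch.randn((1, 64))
model.to_onnx("model.onnx", input_sample, export_params=True)
```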

Note that your input sample should mimic the shape of your actual model input.
Once you've exported an ONNX model, you can serve it using Cortex's ONNX Predictor. The code basically looks the same, and the process is identical. For example, this is an ONNX prediction API:
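A sketch of that predictor (the payload key is again a placeholder):

```python
# predictor.py
import numpy as np


class ONNXPredictor:
    def __init__(self, onnx_client, config):
        # onnx_client is Cortex's handle to the ONNX Runtime container
        # serving the exported model
        self.client = onnx_client

    def predict(self, payload):
        # The payload key "input" and float32 dtype are assumptions
        inputs = np.array(payload["input"], dtype=np.float32)
        return self.client.predict(inputs)
```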

Basically the same. The only difference is that instead of initializing the model directly, we access it through onnx_client, an ONNX Runtime container Cortex launches to serve our model.
Our YAML also looks very similar:
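An illustrative version (the bucket path and monitoring block are assumptions):

```yaml
# cortex.yaml: illustrative values
- name: my-classifier
  predictor:
    type: onnx
    path: predictor.py
    model_path: s3://my-bucket/model.onnx
  monitoring:
    model_type: classification
  compute:
    cpu: 1
```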

I've added a monitoring flag here just to show how easy it is to configure, and there are some ONNX-specific fields, but otherwise it's the same YAML.
Finally, we deploy using the same $ cortex deploy command as before, and our ONNX API is live.
3. Serialize with Torchscript's JIT compiler
For our final deployment, we'll export our PyTorch Lightning model to Torchscript and serve it using PyTorch's JIT compiler. To export the model, just add this to your training script:
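A sketch of the export (the file name is a placeholder):

```python
# Serialize the LightningModule with Torchscript
script = model.to_torchscript(file_path="model.ts")
```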

The Python API for this is nearly identical to the vanilla PyTorch example:
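A sketch of that predictor (the payload key is a placeholder):

```python
# predictor.py
import torch


class PythonPredictor:
    def __init__(self, config):
        # Load the serialized Torchscript module; no model class
        # definition is needed at inference time
        self.model = torch.jit.load("model.ts")
        self.model.eval()

    def predict(self, payload):
        # The payload key "input" is a placeholder assumption
        with torch.no_grad():
            inputs = torch.tensor(payload["input"])
            output = self.model(inputs)
        return output.tolist()
```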

The YAML stays the same as before, and the CLI command is of course consistent. If we want, we can actually update our previous PyTorch API to use the new model by simply replacing our old predictor.py script with the new one and running $ cortex deploy again:
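That redeploy is the same one-line command:

```bash
$ cortex deploy
```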

Cortex automatically performs a rolling update here, in which the new API is spun up and then swapped in for the old one, avoiding any downtime between model updates.
And that's all there is to it. You now have a fully operational API for real-time inference, serving predictions from a Torchscript model.
So, which method should you use?
The obvious question is which method performs best. The truth is that there's no straightforward answer here, because it depends on your model.
For Transformer models like BERT and GPT-2, ONNX can offer incredible optimizations (we measured a 40x improvement in CPU throughput). For other models, Torchscript may perform better than vanilla PyTorch, though with some caveats, since not all models export cleanly to Torchscript.
Fortunately, deploying with any of these options is easy enough that you can test all three in parallel and see which performs best for your particular API.
