Meetup review: how do DevOps & MLOps solve the machine learning dilemma in enterprises?
2022-06-11 22:44:00 【Xingce open source community】
On June 5, the 「DevOps+MLOps Meetup」 hosted by the Xingce community was held online. The event was streamed simultaneously on four platforms (the 51CTO video channel, the CSDN live studio, the OSChina video channel and the JiHu GitLab video channel) and drew more than 5,000 viewers in total. Tan Zhongyi, initiator of the Xingce community, introduced the concepts of DevOps and MLOps and their similarities and differences. Liu Weifeng, architect at JiHu GitLab, shared how to use GitLab, a traditional code management tool and pipeline platform, to automate the development of machine learning models. Lu Mian, R&D lead of OpenMLDB at 4Paradigm, explained how OpenMLDB solves the online/offline consistency problem for features and speeds up the development and launch of machine learning applications.
This article summarizes the key points shared by the three speakers. Links to the video replays are at the end of the article. To get the slides, follow the official account "Xingce open source" and reply 0605.
Review of highlights
Part 1: DevOps + MLOps, it's all about efficiency (Tan Zhongyi)
Tan Zhongyi, initiator of the Xingce community, Deputy Secretary General of the Open Source Software Promotion Alliance and Vice Chairman of the OpenAtom Foundation TOC, focused on the origins and concepts of DevOps and MLOps and on the similarities and differences between the two.
What is DevOps?
The term DevOps has been popular in China for nearly ten years. Its goal is to deliver software faster without sacrificing quality. Under the traditional working model, Dev (the developers) hands the finished code to Ops (the operations team) to deploy. Dev cares about shipping features quickly, while Ops cares about the stability and availability of the system. Because these goals are not aligned, the conflict between Dev and Ops kept growing and formed a communication barrier, the "departmental wall". DevOps emerged as a new R&D model to break down that wall. Through CI + CD, development and operations are joined into the DevOps double loop, which breaks team boundaries and lets people work in a more efficient, streamlined and automated way. The approach has been widely adopted for more than ten years now, and various other forms of Ops and tooling have appeared alongside it, including MLOps, whose goal is to improve the efficiency of machine learning.
What is MLOps?
MLOps targets the machine learning field and aims to make it more efficient to bring machine learning into production. The roles involved include data scientists and software engineers. Its tasks are defining the scenario, collecting and preparing data, training and deploying the model, and continuously monitoring and updating it. These are the four parts of one complete pipeline, and every step of the life cycle needs faster iteration and faster feedback. Its scope is not just code: it also covers models and data. Put simply, MLOps comprises continuous integration, continuous deployment, continuous training and continuous monitoring of code, models and data, together with platform tools such as FeatureStore, ModelStore and ModelMonitoring.
DevOps vs MLOps
Differences: DevOps and MLOps differ in the objects they manage, in their processes and in what triggers them. A DevOps pipeline is triggered mainly by code changes. An MLOps pipeline is triggered not only by code changes but also by data changes and by model decay, that is, any degradation of model performance.
Similarities: both DevOps and MLOps exist to improve efficiency. In fact, whatever kind of Ops it is, its goal is to create value for end users. The basic ideas are the same, including automating as much as possible. The key practices are the same as well: systems thinking, fast feedback, and continuous learning and improvement. These are known as the Three Ways of DevOps, and they apply equally to MLOps.
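The difference in triggers can be illustrated with a minimal Python sketch. Everything here is hypothetical and chosen only for illustration: the `PipelineSignals` fields, the use of AUC as the monitored metric and the `max_auc_drop` threshold were not part of the talk.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PipelineSignals:
    """Hypothetical signals a pipeline scheduler might observe."""
    code_changed: bool    # e.g. a new commit landed
    data_changed: bool    # e.g. a new training-data snapshot arrived
    baseline_auc: float   # metric recorded when the model was deployed
    current_auc: float    # same metric measured on recent traffic


def devops_should_trigger(s: PipelineSignals) -> bool:
    # A classic DevOps pipeline reacts mainly to code changes.
    return s.code_changed


def mlops_trigger_reasons(s: PipelineSignals, max_auc_drop: float = 0.02) -> List[str]:
    # An MLOps pipeline also reacts to data changes and to model decay.
    reasons = []
    if s.code_changed:
        reasons.append("code")
    if s.data_changed:
        reasons.append("data")
    if s.baseline_auc - s.current_auc > max_auc_drop:
        reasons.append("model_decay")
    return reasons
```

With no code or data change but a metric that has dropped from 0.90 to 0.85, `devops_should_trigger` returns `False`, while `mlops_trigger_reasons` still reports `model_decay`, so the retraining pipeline would run.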
Part 2: Exploring MLOps applications in JiHu GitLab (Liu Weifeng)
Liu Weifeng, architect at JiHu GitLab, introduced MLOps in JiHu GitLab, along with the challenges and prospects of implementing MLOps with GitLab.
What is MLOps?
MLOps is DevOps for the machine learning era. Its main role is to connect the model-building team with the business and operations teams, and to establish a standardized process for model development, deployment and operations, so that enterprises can make better use of machine learning to drive business growth.
MLOps in JiHu GitLab
JiHu GitLab is a very mature product whose strength lies in DevOps. Because DevOps and MLOps are so similar, JiHu GitLab asked whether a DevOps platform could be used to solve some MLOps problems. On that basis GitLab is being repositioned: the aim is to make JiHu GitLab a good partner and tool for machine learning engineers and data scientists, and to give them a better user experience throughout the machine learning lifecycle (model creation, testing, deployment, monitoring and iteration). In the architecture diagram shown in the talk, the orange parts belong to GitLab and the white parts are provided by third-party platforms or tools. Code, hyperparameters, deployment and other steps can all be handled with GitLab. The whole process can be understood as a machine learning pipeline, and this pipeline is implemented with GitLab's built-in CI pipeline.
Current status
GitLab's current MLOps improvements are as follows:
- Improved support for ipynb files (> v14.5)
- MLFlow integration (in progress)
- JupyterLab plug-in (in progress)
- Exploration and practice with customers in the big data / machine learning field
JiHu GitLab's exploration
In the flow chart of the current implementation, each step in the middle can be seen as a job in a GitLab pipeline. Every job runs in GitLab's MLRunner, which can be thought of as an executor. Overall, DVC pulls the specified training dataset from S3, the model is trained, and the model is then evaluated. The results are computed and written into the MR (merge request), and an evaluation report is generated from this information. The data scientist reviews the report and decides what to change. If the result does not meet the requirements, the MR can simply be abandoned, or the training data, code or configuration can be changed again, which triggers the whole process once more. If it does meet the requirements, the code in the MR is merged into the main branch, and the generated image and the model are saved. The CD process on the far right is implemented with the GitOps workflow of JH GitLab KAS, which has the following characteristics:
- The deployment scripts are stored in a separate project
- The JH GitLab agent server monitors whether that content has been updated
- The agent server notifies agentk in k8s of configuration changes
- agentk updates the deployment environment as needed
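The pull-based GitOps flow in this list can be sketched as a small simulation. This is not GitLab's API: `AgentServer`, `ClusterAgent`, `fingerprint` and the manifest contents are hypothetical stand-ins for the agent server, agentk and the separate deployment project, and change detection by hashing is an assumption made only to keep the example self-contained.

```python
import hashlib
from typing import Dict, List, Optional


def fingerprint(manifests: Dict[str, str]) -> str:
    """Hash the deployment project's files so that updates can be detected."""
    h = hashlib.sha256()
    for path in sorted(manifests):
        h.update(path.encode())
        h.update(b"\0")
        h.update(manifests[path].encode())
        h.update(b"\0")
    return h.hexdigest()


class ClusterAgent:
    """Stand-in for agentk: applies configuration inside the cluster."""

    def __init__(self) -> None:
        self.applied: Dict[str, str] = {}

    def on_config_changed(self, manifests: Dict[str, str]) -> List[str]:
        # Only touch what differs from the currently applied state.
        changed = [p for p in sorted(manifests) if self.applied.get(p) != manifests[p]]
        removed = [p for p in sorted(self.applied) if p not in manifests]
        for p in changed:
            self.applied[p] = manifests[p]
        for p in removed:
            del self.applied[p]
        return changed + removed


class AgentServer:
    """Stand-in for the agent server: watches the deployment project."""

    def __init__(self, agent: ClusterAgent) -> None:
        self.agent = agent
        self.last_seen: Optional[str] = None

    def poll(self, manifests: Dict[str, str]) -> List[str]:
        current = fingerprint(manifests)
        if current == self.last_seen:
            return []  # nothing new, the agent is not notified
        self.last_seen = current
        return self.agent.on_config_changed(manifests)
```

Calling `poll` twice on an unchanged project notifies the agent only once. Editing a manifest, for example bumping a model image tag, causes the next `poll` to push exactly that file to the agent.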
Challenges and Prospects
Implementing MLOps with GitLab currently faces the following challenges:
- Users need to be familiar with GitLab CI, GitOps and related concepts.
- JiHu GitLab does not yet provide MLOps templates.
- Integration solutions with other mainstream ML frameworks and tools are lacking.
- Transferring and storing massive amounts of data is difficult.
GitLab will continue to explore and improve on these problems in the future.
Part 3: OpenMLDB, an open-source feature computing platform (Lu Mian)
Lu Mian, system architect at 4Paradigm and R&D lead of OpenMLDB, focused on how OpenMLDB solves the feature problems across the whole machine learning process (MLOps).
MLOps can be seen as the whole process from developing a machine learning application to putting it online. It includes offline development and online serving, and both cover the full path from data to feature computation to the model. Like DevOps, MLOps runs into many difficulties when it is put into practice. Two stand out: the first is online/offline consistency verification, and the second is real-time feature joining and aggregation. OpenMLDB addresses the FeatureOps part of MLOps, the part concerned with feature data. It provides feature engineering over time-series data for decision-making scenarios, can greatly improve the efficiency of taking machine learning online, meets production-grade online requirements such as real-time recommendation systems, and lowers the threshold for machine learning practitioners.
In terms of architecture, OpenMLDB exposes SQL as its only external development language: any data scientist who can write SQL can use it to develop feature computation scripts. Internally, OpenMLDB has two engines, a batch SQL engine and a real-time SQL engine. The batch SQL engine targets the offline development process. It is based on Spark with some source-level optimizations so that it handles feature computation better. For the real-time SQL part, the OpenMLDB team built a distributed in-memory time-series database from scratch, a SQL engine fully optimized for time-series data. In the middle of the architecture diagram sits the consistent execution plan generator. It takes the input SQL and automatically converts it into an offline execution plan and an online execution plan, which guarantees that the online and offline computation logic are consistent. This removes the manual online/offline consistency verification step, so that what is developed can go straight online. In summary, the process takes only three steps: 1. develop the feature script offline in SQL; 2. deploy it online with one click; 3. connect the real-time request data stream.
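The idea of deriving the offline and online computation from one definition can be illustrated with a toy Python sketch. This is not OpenMLDB's API or its SQL dialect: `WindowSum`, the `(user_id, timestamp, amount)` event layout and the 60-second window are hypothetical, and a single Python method stands in for the shared execution plan.

```python
from dataclasses import dataclass
from typing import List, Tuple

Event = Tuple[str, int, float]  # (user_id, timestamp_seconds, amount)


@dataclass(frozen=True)
class WindowSum:
    """One feature definition: per-user sum of amount over a trailing window."""
    window_seconds: int

    def _value(self, history: List[Event], row: Event) -> float:
        # Shared logic: the current row plus the user's earlier rows in the window.
        user, ts, amount = row
        lo = ts - self.window_seconds
        return amount + sum(a for u, t, a in history if u == user and lo <= t <= ts)

    def batch(self, events: List[Event]) -> List[float]:
        """Offline mode: one feature value per row, in timestamp order."""
        ordered = sorted(events, key=lambda e: e[1])
        return [self._value(ordered[:i], row) for i, row in enumerate(ordered)]

    def online(self, stored: List[Event], request: Event) -> float:
        """Online mode: feature value for one incoming request row."""
        return self._value(stored, request)
```

Because `batch` and `online` both call `_value`, the value computed for a row during offline training matches the value served when the same row arrives as a live request. Hand-writing the two paths separately is where consistency usually breaks.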
The new OpenMLDB release, v0.5.0, brings major upgrades in performance, cost and flexibility. The main upgrades are: 1. Performance: pre-aggregation significantly improves performance for long windows. 2. Cost: two storage engine options are provided, one memory-based and one based on external storage, which greatly reduces cost. 3. Ease of use: C/C++ UDFs are supported, including dynamic UDF registration, which makes it easier for users to extend the computation logic and broadens application coverage. In short, the new version improves online performance by orders of magnitude, offers a low-cost deployment option and widens the range of usage scenarios. The upcoming 0.6.0 release will focus more on ease of use and operability, including database status self-check and reporting tools, query debugging and tracing tools, database performance analysis and statistics report generation tools, a Flink connector, and integration of feature encoding and related algorithms.
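Why pre-aggregation helps long windows can be shown with a minimal sketch. It illustrates the general technique only and is not OpenMLDB's implementation: `PreAggregatedSum`, the fixed bucket size and the integer timestamps are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class PreAggregatedSum:
    """Keeps a partial sum per fixed-size time bucket next to the raw rows."""

    def __init__(self, bucket_size: int) -> None:
        self.bucket_size = bucket_size
        self.raw: Dict[int, List[Tuple[int, float]]] = defaultdict(list)
        self.partial: Dict[int, float] = defaultdict(float)

    def insert(self, ts: int, value: float) -> None:
        bucket = ts // self.bucket_size
        self.raw[bucket].append((ts, value))
        self.partial[bucket] += value  # maintained at write time

    def window_sum(self, start: int, end: int) -> float:
        """Sum of values with start <= ts <= end."""
        total = 0.0
        first, last = start // self.bucket_size, end // self.bucket_size
        for bucket in range(first, last + 1):
            lo = bucket * self.bucket_size
            hi = lo + self.bucket_size - 1
            if start <= lo and hi <= end:
                # Bucket lies fully inside the window: reuse the stored sum.
                total += self.partial.get(bucket, 0.0)
            else:
                # Edge bucket: fall back to scanning its raw rows.
                total += sum(v for ts, v in self.raw.get(bucket, []) if start <= ts <= end)
        return total
```

A long window then reads one stored number per fully covered bucket and scans raw rows only in the two edge buckets, so query cost grows with the number of buckets rather than the number of rows.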
Summary
DevOps and MLOps both emerged to make enterprises and their R&D more efficient, so that they can deliver more value to customers. MLOps grew out of DevOps and can play a very large role in data analysis and machine learning. As the digital transformation of enterprises enters its advanced stage, the intelligent stage, AI will play a key role in a large number of enterprises. MLOps will become a hot word and, over the next 10 years, an essential default term for bringing AI into production across industries. Finally, you are welcome to keep following DevOps and MLOps and to join us in discussing MLOps-related topics.
MLOps enthusiasts group: https://sourl.cn/g7LD44
Video review
DevOps & MLOps: it's all about efficiency (Tan Zhongyi)
https://www.bilibili.com/video/BV1NL4y1T7zd?spm_id_from=333.999.0.0
Exploring MLOps applications in JiHu GitLab (Liu Weifeng)
https://www.bilibili.com/video/BV1ZT411V7DT?spm_id_from=333.999.0.0
OpenMLDB applied to MLOps (Lu Mian)
https://www.bilibili.com/video/BV1zA4y1o7jY?spm_id_from=333.999.0.0
Live stream announcement | FeatureStore Meetup V3
FeatureStore is an important and relatively new concept in the MLOps field. Many domestic companies have their own implementations, cloud products and open source projects, yet enterprises still face many problems when actually building and applying one. This event brings together three speakers: Zeng Zhongming, intermediate algorithm engineer at Huawei Mall; Chen Dihao, OpenMLDB PMC at 4Paradigm; and Guo Yubo, head of the financial data application team at ZhongAn Insurance. They will discuss how to implement and apply a FeatureStore feature platform and exchange experience from building one.
Time: June 12 (Sunday), 14:00-17:00. Be there or be square.
Details: Live stream announcement | FeatureStore Meetup V3
To sign up: https://6684201514000.huodongxing.com/event/6653672280522