当前位置:网站首页>On fedlearner, the latest open source federated machine learning platform of byte

On fedlearner, the latest open source federated machine learning platform of byte

2020-11-10 07:37:00 osc_odp8kgup

Recently, federal machine learning is becoming more and more popular , Byte also officially publicized the open source federal machine learning platform Fedlearner. This time the headlines are open source Fedlearner With Huawei, which I have analyzed before 、 What's the difference between micro crowd's federated machine learning platform ? Mainly reflected in the following aspects :

  1. Commercialization :Fedlearner There's a lot of js、Html modular , It is also the first time that we can intuitively see what the federated machine learning platform looks like , If you make a product, you need to grow into something .

  2. Business diversification : Before Huawei 、 Weizhong pays more attention to the implementation of federal machine learning in risk control business . The headlines start to emphasize that federal learning is recommending 、 Advertising and other business landing , And it gives very clear data , In an education business sector, the effect of advertising is increased 209%

  3. Exportability : If the previous federal machine learning platform introduced more from the theoretical level , This time byte of Fedlearner It emphasizes the exportability , For example, in order to keep the environment consistency of both sides of Federated modeling , adopt K8S The deployment mode of rapid pull up and management cluster . This is for ToB Technical preparation for export services

Let's talk about Fedlearner Some work in these three areas .

Fedlearner Product work

 

image.png

Take recommendation advertising as an example , The advertiser and platform side of Federated machine learning platform should manage a set of model display service and model training service respectively .

image.png

 

Two sets of protocols are needed to guarantee the federated modeling of customers , One is data consistency . For example, in the context of vertical federal learning , The user clicks on an ad on the page , The platform and the advertiser will capture a part of the log respectively . How to ensure the consistency of the logs captured by these two parts in real time , And stitched together into training samples , Need a set of real-time data sample splicing protocol .

Another protocol is multi-party data security protocol . such as AB Two business parties ,A Yes 4 Billion users ,B Yes 3 Billion users , How to find... In some way A and B Cross users of , And don't let A and B Guess each other's data , Need to have a set of multi-party data security protocol .

Based on the above two sets of agreements , In the process of joint modeling by both parties , Use GRPC signal communication , utilize TensorFlow Do the gradient exchange of both sides for joint modeling .

image.png

Business diversity

 

  The biggest business scenario for federated machine learning is recommendation advertising , I predicted this in an article a year ago . Sure enough, the headline highlights the application of recommended scenarios . He mentioned that recommendation services are more suitable for neural network algorithms , Risk control business is suitable for tree algorithm . The author also agrees with this statement , Because risk control needs high interpretability , Tree algorithm naturally meets this requirement . The recommendation business does not require high interpretability of the model , The complexity of neural network algorithm can fully guarantee the accuracy of the recommended sorting algorithm .

Fedlearner The person in charge of the business gave a set of numbers to prove the effect of Federated machine learning in the recommendation business .

image.png

This array is still very convincing . In fact, for new technologies , Most of the time, the barriers we face are not technical problems , It's about proving business value , Need the first crab eater , In order to promote the landing of new technologies in the industry . Federal machine learning has a bright future in the recommendation advertising business .

Exportability

 Fedlearner It adopts a cloud native deployment scheme . Data stored in HDFS, use MySQL Storage system data . adopt Kubernetes Manage and pull up the task . Every Fedlearner The training tasks of the two sides need to pull up at the same time K8S Mission , adopt Master Unified management of nodes ,Worker Build communication .

This scheme fully considers the data warehouse compatibility of users who are currently doing recommendation business , Because most of the customers' warehouse system is still Hadoop ecology , The data is stored in HDFS. Simultaneous use K8S At the same time, it ensures the consistency of the computing engine environment of both sides of the joint modeling .

summary

With more and more manufacturers coming in , Federal machine learning is bound to be an inflection point in the product competition of machine learning platform .

 

Reference resources :

[1]https://www.jiqizhixin.com/articles/2020-11-03-9

[2]https://github.com/bytedance/fedlearner

 

 

版权声明
本文为[osc_odp8kgup]所创,转载请带上原文链接,感谢