Ant Group open-sources its trusted privacy computing framework "Argot" (隐语, SecretFlow): open and general-purpose
2022-07-05 18:40:00 【Zhiyuan community】
The data circulation industry has entered the era of encrypted-state computing, and a trusted privacy computing framework can meet the differing needs of many scenarios.

On July 4, Ant Group announced that it was officially open-sourcing its trusted privacy computing framework "Argot" (隐语, released in English as SecretFlow) to developers worldwide.
Argot is a trusted privacy computing framework that Ant Group developed in-house over six years, with security and openness as its core design principles. It covers almost all of today's mainstream privacy computing technologies.
According to Ant Group, Argot has built-in ciphertext computing virtual devices for MPC, TEE, homomorphic encryption, and other technologies, and provides several classes of federated learning algorithms together with differential privacy mechanisms. Through a layered design and out-of-the-box privacy-preserving data analysis and machine learning features, it lowers the technical barrier for application developers, helps bring privacy computing to AI, data analysis, and other fields, and addresses industry pain points such as privacy protection and data silos.
Having been applied in Ant Group's own large-scale businesses and in external financial and medical scenarios, Argot balances security and performance. At the launch event, Ant Group introduced many of its features.
What kind of open-source privacy computing framework do we need?
Privacy computing is an emerging interdisciplinary field that spans cryptography, machine learning, hardware, BI analysis, and more. It includes technical routes such as secure multi-party computation (MPC), federated learning (FL), trusted execution environments (TEE), trusted encrypted-state computing (TECC), homomorphic encryption, and differential privacy, each with its own specialized technology stack.
As a key technology for reconciling data security with data circulation, privacy computing lets data be analyzed and computed on without the data provider disclosing the raw data, so that data remains "usable but not visible" and "computable but not identifiable" as it circulates and is combined.
From the practical experience of the past few years, the industry has found that privacy computing has many technical directions, that different scenarios call for different technical solutions, and that the work requires cooperation among experts from many fields. For practitioners the learning curve is steep, and users without a privacy computing background find the technology hard to use.
In real development work, a privacy computing solution is often a combination of several technical routes, and the process involves a great deal of repetitive work. For example, a developer who wants federated learning builds on framework A; one who wants secure multi-party computation (MPC) builds on framework B; and one who wants trusted hardware must first become familiar with the architecture of the chosen hardware. Real business requirements, however, often need several technologies used together, which leads to tedious, repetitive development. Each technology is an innovation in its own right, but together they create the problem of technology silos.
Worse still, in a solution that spans several technical routes, introducing one new underlying technology affects all the work in the layers above and slows down technical iteration. Many things in the upper layers have to change, and users may have to go through the entire deployment again, which makes for a poor experience.
Current open-source privacy computing frameworks, such as TensorFlow Federated (TFF), FATE, FederatedScope, Rosetta, FedLearner, and Primihub, almost all target a single privacy computing route. These frameworks give some support to community research and industrial applications related to privacy computing. However, increasingly diverse application requirements in real scenarios, together with the limitations of each technology, pose new challenges for existing frameworks.
For example, Google, the technology giant that first proposed federated learning and that created TensorFlow, has recently increased its investment in JAX, a newer platform. The move prompted speculation in the industry that TensorFlow will gradually be replaced.
Google's response was:
"In recent years we have found that a single general-purpose framework often does not suit all scenarios; in particular, the needs of production and of cutting-edge research often conflict."
Solving the problems of open-source privacy computing frameworks
Ant Group's Argot responds to this state of the industry and opens a path toward general-purpose privacy computing.
Wang Lei, head of the Argot framework and general manager of Ant Group's Privacy Intelligent Computing Department, said that Ant Group began working on Argot in 2016 as a purely technology-driven, forward-looking investment, an experiment incubated inside the company.
Argot's technology evolved from matrix transformation to trusted execution environments (TEE), and then to secure multi-party computation, federated learning, and more. Through internal and external applications, its performance has grown to support large-scale datasets. It has also been deployed at scale in finance, healthcare, and other fields, supporting cross-institution data flows for Shanghai Pudong Development Bank and the medical-insurance DRG (Diagnosis Related Group) reform at a Grade III hospital in Zhejiang. It has received the "Xinghe Case" award from the China Academy of Information and Communications Technology, the CCF Science and Technology Award (Science and Technology Progress Excellence Award), and recognition as a "Typical Practice Case of Data Security" from the China Cyberspace Security Association, and it was selected by the Ministry of Industry and Information Technology for the 2021 list of pilot demonstration projects for big data industry development.
After six years of accumulation, with a comprehensive technical system and mature deployment experience in place, what advantages does the newly open-sourced Argot offer?
Argot's design goal is to make it easy for data scientists and machine learning developers to use privacy computing for data analysis and machine learning modeling without knowing the underlying technical details. Its overall architecture is divided into five layers, from bottom to top:

The bottom layer is the resource management layer, which has two responsibilities. First, for business delivery teams, it hides the differences between the underlying infrastructure of different institutions, reducing deployment and operations costs. Second, by managing the resources of different institutions in a unified way, it addresses high availability and stability once the business scales up.
Above it is the plaintext/ciphertext computing device and primitive layer. It provides a unified programmable device abstraction: privacy computing technologies such as secure multi-party computation (MPC), homomorphic encryption (HE), and trusted hardware (TEE) are abstracted as ciphertext devices, and each party's local computation is abstracted as a plaintext device. It also provides basic algorithms that do not fit the device abstraction, such as differential privacy (DP) and secure aggregation. With this loosely coupled design, new ciphertext computing technologies can be integrated into the framework as they emerge.
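The device abstraction described above can be illustrated with a small sketch. This is not Argot's API: the class names `PlaintextDevice` and `SecretSharedDevice`, the modulus, and the party names are all hypothetical, and simple additive secret sharing stands in for a real MPC protocol. A single process plays every role here, whereas in a real deployment each party would hold only its own share.

```python
import secrets

MODULUS = 2**61 - 1  # illustrative modulus for share arithmetic (assumption)


class PlaintextDevice:
    """Hypothetical plaintext device: one party's local computation in the clear."""

    def __init__(self, owner):
        self.owner = owner

    def run(self, fn, *args):
        return fn(*args)


class SecretSharedDevice:
    """Hypothetical ciphertext device: values exist only as additive shares."""

    def __init__(self, parties):
        self.parties = list(parties)

    def share(self, value):
        # Split value into one random-looking share per party; shares sum to value.
        shares = [secrets.randbelow(MODULUS) for _ in self.parties[:-1]]
        shares.append((value - sum(shares)) % MODULUS)
        return shares

    def add(self, a_shares, b_shares):
        # Each party adds the two shares it holds; no party sees the inputs.
        return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]

    def reveal(self, shares):
        # Recombining all shares recovers the result.
        return sum(shares) % MODULUS


alice = PlaintextDevice("alice")
bob = PlaintextDevice("bob")
mpc = SecretSharedDevice(["alice", "bob"])

alice_total = alice.run(sum, [120, 30])  # local, plaintext computation
bob_total = bob.run(sum, [50, 25])       # local, plaintext computation
joint = mpc.add(mpc.share(alice_total), mpc.share(bob_total))
result = mpc.reveal(joint)
```

The point of the sketch is the interface shape: local steps run on a plaintext device, joint steps run on a ciphertext device, and swapping in a different ciphertext backend would not change the calling code.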
Next is the plaintext/ciphertext hybrid scheduling layer. It offers the layers above an interface for programming with plaintext and ciphertext together, and it provides a unified device scheduling abstraction. Upper-layer algorithms are described as a directed acyclic graph in which nodes represent computation on a device and edges represent data flowing between devices; this is the logical computation graph. The distributed framework then splits the logical computation graph further and schedules it onto physical nodes. In this respect Argot borrows from mainstream deep learning frameworks, which represent a neural network as a computation graph of operators on devices and tensors flowing between them.
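As a rough illustration of the logical computation graph, the sketch below tags each node with a device and executes the nodes in dependency order using the standard library's `graphlib`. The graph contents, the device labels, and the `schedule` function are hypothetical. A real scheduler would dispatch independent nodes in parallel and move data between devices, and the node labeled `mpc` here simply computes in the clear.

```python
from graphlib import TopologicalSorter

# Hypothetical logical graph: node name -> (device label, operation, input nodes)
graph = {
    "load_a": ("alice", lambda: 3, []),
    "load_b": ("bob", lambda: 4, []),
    "sum": ("mpc", lambda x, y: x + y, ["load_a", "load_b"]),
    "scale": ("alice", lambda s: s * 10, ["sum"]),
}


def schedule(graph):
    """Run every node after its inputs and record which device it ran on."""
    deps = {name: set(inputs) for name, (_, _, inputs) in graph.items()}
    values, trace = {}, []
    for name in TopologicalSorter(deps).static_order():
        device, op, inputs = graph[name]
        values[name] = op(*(values[i] for i in inputs))
        trace.append((name, device))
    return values, trace


values, trace = schedule(graph)
```

Each edge that crosses device labels (for example `load_a` on `alice` feeding `sum` on `mpc`) marks a point where a real framework would have to convert data between plaintext and ciphertext form.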

Next is the AI & BI privacy algorithm layer. This layer hides the details of privacy computing technology while keeping its concepts, in order to lower the barrier to developing privacy computing algorithms and improve development efficiency. Developers who need to build privacy computing algorithms can design specialized ones for their own scenarios and business characteristics, striking the balance between security, computational performance, and computational accuracy that their business requires. At this layer Argot also provides some general-purpose algorithms, such as MPC-based LR, XGB, and NN, federated learning algorithms, and SQL capabilities.
The top layer is the user interface layer. Argot's goal is not to be an end-to-end product but to let different businesses gain comprehensive privacy computing capabilities by integrating it quickly. Argot therefore provides a thin layer of product APIs at the top, plus some atomized front-end and back-end SDKs, to reduce the cost of integrating it into a business.
Argot's most visible advantage is that it integrates today's mainstream privacy computing technologies and lets them be assembled flexibly to fit each scenario. Under this framework developers have a range of choices: they can experiment and iterate in their own field through Argot, verifying technology faster and at lower cost, and the verified technology can then be used by developers working in other technical directions. Wang Lei sees Argot as more of a developers' platform that brings together developers with different specialties, which is in keeping with the spirit of open source.
In detail, the highlights of this first open-source release of Argot are the modules lit up in the figure.

- MPC device. Supports most of the NumPy API and automatic differentiation, provides LR and NN demos, supports a high-precision fixed-point fitting algorithm based on Padé approximation, and supports the ABY3 and Cheetah protocols. Users can write MPC-based AI algorithms in a conventional programming style without knowing the MPC protocols;
- HE device. Supports the Paillier homomorphic encryption algorithm and exposes a NumPy programming interface (API) to upper layers, through which users can perform matrix addition or ciphertext matrix multiplication. Data can also be transferred between the HE device and the MPC ciphertext device;
- Differential privacy security primitives. Implements several differential privacy noise mechanisms, a secure noise generator, and a privacy cost calculator;
- Plaintext/ciphertext mixed programming. Supports a centralized programming model in which @device marks up a computation graph that mixes plaintext and ciphertext devices, with parallel, asynchronous task scheduling based on the computation graph;
- Data preprocessing. Provides standardization, discretization, and binning for horizontally partitioned scenarios, and correlation coefficient matrices and WOE binning for vertically partitioned scenarios. Connects seamlessly to existing dataframes and offers a usage experience consistent with sklearn;
- AI & BI privacy algorithms: secure multi-party computation. Provides the XGBoost algorithm, adds the HESS-LR algorithm, and strengthens the privacy protection of split learning by combining it with differential privacy;
- AI & BI privacy algorithms: federated learning. Provides federated model construction and gradient aggregation under several security modes, including SecureAggregation, MPC Aggregation, and PlaintextAggregation. Users only need to supply the list of participants and the aggregation method when building the model; from data loading through preprocessing to model training, the experience is then almost the same as conventional plaintext programming.
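To give a feel for what a secure aggregation mode does in federated learning, here is a minimal sketch using pairwise additive masks. It is an illustration under stated assumptions, not Argot's SecureAggregation implementation: the function names, the modulus, and the integer-encoded updates are hypothetical, and real protocols derive the masks through key agreement between parties and handle participants that drop out.

```python
import secrets

MODULUS = 2**32  # illustrative modulus; updates are assumed to be integer-encoded


def pairwise_masks(parties):
    """For each pair of parties, one adds a random mask and the other subtracts it."""
    masks = {p: 0 for p in parties}
    for i, p in enumerate(parties):
        for q in parties[i + 1:]:
            r = secrets.randbelow(MODULUS)
            masks[p] = (masks[p] + r) % MODULUS
            masks[q] = (masks[q] - r) % MODULUS
    return masks


def secure_sum(updates):
    """Return the masked per-party values and their aggregate."""
    parties = list(updates)
    masks = pairwise_masks(parties)
    # The aggregator only ever sees the masked values.
    masked = {p: (updates[p] + masks[p]) % MODULUS for p in parties}
    # The masks cancel in the sum, leaving the true aggregate.
    total = sum(masked.values()) % MODULUS
    return masked, total


updates = {"alice": 17, "bob": 5, "carol": 20}
masked, total = secure_sum(updates)
```

Because every mask is added once and subtracted once, the aggregate equals the plain sum of the updates, while each individual masked value looks random to the aggregator.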
In short, the main points are:
- For algorithm and model development: the programming capabilities Argot provides make it quick and easy to migrate more algorithms and models and to strengthen their privacy protection.
- For co-building the underlying security: results from underlying cryptography and security research can be embedded into Argot to improve the capability, performance, and security of the ciphertext devices and carried into real business applications.
- Argot will continue to be updated in later open-source releases, gradually lighting up more modules.