Big news! Ant Group open-sources the trusted privacy computing framework "Argot": flexible assembly of mainstream technologies and a developer-friendly layered design
2022-07-06 17:51:00 【CSDN information】
On July 4, Ant Group announced that it was officially open-sourcing its trusted privacy computing framework "Argot" (published as SecretFlow) to developers worldwide under the Apache-2.0 license, with the code hosted on both GitHub and Gitee. Thanks to an extensible architecture, "Argot" uses a single general framework to support mainstream privacy computing technologies including MPC, TEE, FL, HE, and DP. These technologies can be flexibly combined to provide different solutions for different application scenarios.
Six years of technical accumulation: "Argot" tackles the problems of applying privacy computing
In 2016, "Argot" was born inside Ant as an "experimental project". It took its first step with matrix transformation technology, moved on to trusted execution environments (TEE), and then to secure multi-party computation (MPC), federated learning (FL), and other techniques. Along the way it steadily broadened its technical scope and accumulated deployment experience in real application scenarios in finance, healthcare, and other fields.
Although the theory of privacy computing has been developing for more than 40 years, at the application level the industry still has several obstacles to overcome:
Privacy computing technology has many branches, and each scenario has its own best-suited technical solution;
Privacy computing has a steep learning curve, and users without a privacy computing background find it hard to use;
Privacy computing spans many fields and requires experts from those fields to work together.
Privacy computing is still a relatively new interdisciplinary field. It draws on cryptography, machine learning, databases, trusted hardware, and other areas, and includes technical routes such as secure multi-party computation (MPC), federated learning (FL), trusted execution environments (TEE), and trusted-environment-based cryptographic computing (TECC). With so many specialized technology stacks involved, it is not easy to get everything right while guaranteeing security. The design goal of "Argot" is to let data scientists and machine learning developers use privacy computing for data analysis and machine learning modeling with very little effort, without needing to understand the underlying technical details.
So how can the framework meet the different needs of developers at different levels?
To achieve this goal, Argot provides a device abstraction layer: privacy computing technologies such as secure multi-party computation (MPC), homomorphic encryption (HE), and trusted execution environments (TEE) are abstracted as ciphertext devices, and single-party computation is abstracted as plaintext devices.
Based on this abstraction, data analysis and machine learning workflows can be represented as a computation graph in which each node is a computation on a device and each edge is a data flow between devices. Data flowing between devices of different types is converted between protocols automatically. In this respect Argot borrows from mainstream deep learning frameworks, which represent a neural network as a computation graph made up of operators on devices and tensor flows between devices.
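A minimal sketch of the device idea described above, under loud assumptions: a toy two-share additive secret-sharing scheme stands in for a real MPC protocol, and the class names (PlainDevice, SharingDevice, PlainValue, SharedValue) are hypothetical illustrations, not Argot's API. It only shows how a value crossing from a plaintext device to a ciphertext device can be converted automatically.

```python
import secrets

MODULUS = 2 ** 64  # ring size of the toy secret-sharing scheme (an assumption)


class PlainValue:
    """A value held in the clear by a single party."""

    def __init__(self, owner, value):
        self.owner = owner
        self.value = value


class SharedValue:
    """A value split into two additive shares; one share alone reveals nothing."""

    def __init__(self, shares):
        self.shares = shares


class PlainDevice:
    """Hypothetical plaintext device: local computation by one party."""

    def __init__(self, owner):
        self.owner = owner

    def run(self, fn, *args):
        return PlainValue(self.owner, fn(*[a.value for a in args]))


class SharingDevice:
    """Hypothetical ciphertext device: computes on additive shares mod MODULUS."""

    def to_shares(self, plain):
        # The "protocol conversion" on an edge that leaves a plaintext device.
        r = secrets.randbelow(MODULUS)
        return SharedValue([r, (plain.value - r) % MODULUS])

    def _as_shared(self, x):
        return x if isinstance(x, SharedValue) else self.to_shares(x)

    def add(self, a, b):
        a, b = self._as_shared(a), self._as_shared(b)  # convert inputs on entry
        return SharedValue([(x + y) % MODULUS for x, y in zip(a.shares, b.shares)])

    def reveal(self, shared):
        return sum(shared.shares) % MODULUS


alice, bob, mpc = PlainDevice("alice"), PlainDevice("bob"), SharingDevice()
x = alice.run(lambda: 20)   # node on Alice's plaintext device
y = bob.run(lambda: 22)     # node on Bob's plaintext device
total = mpc.add(x, y)       # node on the ciphertext device; edges convert x and y
result = mpc.reveal(total)
```

The caller never handles shares directly; conversion happens on the edge between devices, which is the property the paragraph above describes.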
Mapping this process onto the framework's layers, "Argot" is designed as follows, from bottom to top:
Resource management layer: It has two main responsibilities. First, for business delivery teams, it hides the differences in the underlying infrastructure of different institutions, reducing deployment and operations costs. Second, by scheduling and managing the resources of different institutions in a unified way, it addresses the scale and high-availability requirements of production scenarios.
Plaintext/ciphertext device and primitive layer: It provides a unified programmable device abstraction. Privacy computing technologies such as secure multi-party computation (MPC), homomorphic encryption (HE), and trusted hardware (TEE) are abstracted as ciphertext devices, and single-party local computation is abstracted as plaintext devices. It also provides some basic algorithms that do not fit the device abstraction, such as differential privacy (DP) and secure aggregation. Because the design is loosely coupled, new encrypted computation technologies that appear in the future can be integrated into the framework.
Plaintext/ciphertext hybrid scheduling layer: This layer gives the upper layers an interface for programming with plaintext and ciphertext together, and it also provides a unified device scheduling abstraction. The upper-level algorithm is described as a directed acyclic graph, the logical computation graph, in which each node is a computation on a device and each edge is a data flow between devices. The distributed framework then splits the logical computation graph further and schedules it onto physical nodes.
AI & BI privacy algorithm layer: This layer hides the technical details of privacy computing while preserving its core concepts, in order to lower the barrier to developing privacy computing algorithms and improve development efficiency. Developers who need to build privacy computing algorithms can design specialized algorithms for their own scenarios and business characteristics, balancing security, computational performance, and computational accuracy. At this layer Argot also provides some general algorithmic capabilities, such as MPC-based LR/XGB/NN, federated learning algorithms, and SQL capabilities.
User interface layer: Argot's goal is not to build an end-to-end product but to let different businesses gain comprehensive privacy computing capabilities by integrating Argot quickly. At the top it therefore provides a thin product API layer, plus some atomic front-end and back-end SDKs, to reduce the cost of integrating Argot into a business.
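The hybrid scheduling layer above describes an algorithm as a directed acyclic graph that is then split and scheduled. As a rough illustration only, the sketch below uses Python's standard graphlib module to group a small hand-written graph into stages that could run in parallel and to list the edges that cross devices. The graph contents, the device labels, and the helper names (schedule, cross_device_edges) are assumptions made for this example and say nothing about how Argot's own scheduler is implemented.

```python
from graphlib import TopologicalSorter

# Hypothetical logical graph: node name -> (device label, upstream node names).
graph = {
    "load_a": ("alice", []),
    "load_b": ("bob", []),
    "joint_sum": ("mpc", ["load_a", "load_b"]),
    "report": ("alice", ["joint_sum"]),
}


def schedule(graph):
    """Group nodes into stages; nodes in the same stage have no mutual dependency."""
    sorter = TopologicalSorter({name: deps for name, (_, deps) in graph.items()})
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append([(name, graph[name][0]) for name in ready])
        sorter.done(*ready)
    return stages


def cross_device_edges(graph):
    """Edges whose endpoints sit on different devices, i.e. where data must be converted."""
    return [
        (dep, name)
        for name, (device, deps) in graph.items()
        for dep in deps
        if graph[dep][0] != device
    ]


stages = schedule(graph)
conversions = cross_device_edges(graph)
```

Here the two load nodes form the first stage because neither depends on the other, and every edge into or out of the "mpc" node is a point where a protocol conversion would be needed.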
With openness at its core, "Argot" is committed to the best possible developer experience
Summing up Argot's layered structure, the framework consistently revolves around the core idea of openness. Through abstractions at different levels, it offers a good development experience to different types of developers:
The device layer has well-defined device and protocol interfaces and supports plugging in more devices and protocols. This is friendly to developers with backgrounds in cryptography, trusted hardware, hardware acceleration, and similar areas, and helps expand the types and functions of encrypted computation while continuously improving the security and performance of the protocols.
The algorithm layer provides a flexible programming interface for machine learning and is friendly to algorithm developers, who can define their own algorithms in the same way as with traditional machine learning frameworks.
So which modules has Argot opened in its first open-source version, and what functions are supported?
Figure: Open-source modules of the Argot framework V0.6
MPC device
Supports most NumPy APIs and automatic differentiation, provides LR and NN demos, supports a high-precision Padé fixed-point approximation algorithm, and supports the ABY3 and Cheetah protocols. Users can program in the traditional way and develop AI algorithms on top of MPC protocols without knowing the protocols themselves.
HE device
Supports the Paillier homomorphic encryption algorithm and exposes a NumPy-style programming interface (API) to the upper layers, so users can perform matrix addition or ciphertext matrix multiplication through NumPy-style calls. Data can also be transferred between the HE device and the MPC ciphertext device.
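To make the HE device entry more concrete, here is a toy Paillier sketch in pure Python. It is not Argot's implementation: the primes are two small Mersenne primes chosen only so the example runs instantly, the function names are hypothetical, and a real deployment would use keys of 2048 bits or more. Paillier is additively homomorphic, so the sketch shows ciphertext addition and multiplication of a ciphertext by a plaintext scalar, which is enough for a dot product between an encrypted vector and a plaintext vector.

```python
import math
import secrets

# Toy key material (assumption): two Mersenne primes, far too small for real use.
P, Q = 2 ** 31 - 1, 2 ** 61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                               # inverse of LAM modulo N


def encrypt(m):
    """Encrypt integer m with generator g = N + 1 and fresh randomness r."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return ((1 + (m % N) * N) * pow(r, N, N2)) % N2


def decrypt(c):
    u = pow(c, LAM, N2)
    return ((u - 1) // N) * MU % N


def add(c1, c2):
    """Ciphertext of m1 + m2: multiply the two ciphertexts."""
    return (c1 * c2) % N2


def mul_plain(c, k):
    """Ciphertext of k * m for a plaintext scalar k: raise the ciphertext to k."""
    return pow(c, k % N, N2)


def encrypted_dot(enc_vec, plain_vec):
    """Dot product of an encrypted vector with a plaintext vector."""
    acc = encrypt(0)
    for c, w in zip(enc_vec, plain_vec):
        acc = add(acc, mul_plain(c, w))
    return acc


enc_sum = add(encrypt(20), encrypt(22))
enc_dot = encrypted_dot([encrypt(v) for v in (1, 2, 3)], (4, 5, 6))
```

A matrix product between a ciphertext matrix and a plaintext matrix is a collection of such dot products, which is why an additive scheme can back a NumPy-style interface for these operations.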
Differential privacy security primitives
Implements several differential privacy noise mechanisms, a secure noise generator, and a privacy cost calculator.
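As a hedged illustration of what a noise mechanism and a privacy cost calculator do, the sketch below implements the standard Laplace mechanism for a counting query together with a very simple budget tracker based on basic sequential composition. The names (laplace_noise, PrivacyBudget, private_count) are hypothetical and are not Argot's primitives, and the floating-point sampling here is naive; production mechanisms need more careful noise generation.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


class PrivacyBudget:
    """Tracks spent epsilon under basic sequential composition (epsilons add up)."""

    def __init__(self, total_epsilon):
        self.total = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon):
        if self.spent + epsilon > self.total + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon


def private_count(values, predicate, epsilon, budget, rng=None):
    """Release a noisy count; a counting query has sensitivity 1."""
    rng = rng or random.SystemRandom()  # OS-level randomness by default
    budget.spend(epsilon)
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


budget = PrivacyBudget(total_epsilon=1.0)
noisy_even_count = private_count(range(100), lambda v: v % 2 == 0, 0.5, budget)
```

Each released statistic consumes part of the budget, and once the total is reached further queries are refused; more refined accountants give tighter totals than this simple sum.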
Plaintext/ciphertext hybrid programming
Supports a centralized programming model: @device annotations mark up a computation graph that mixes plaintext and ciphertext devices, and tasks are scheduled in parallel and asynchronously based on that graph.
Data preprocessing
Provides standardization, discretization, and binning for horizontally partitioned scenarios, and correlation coefficient matrices and WOE binning for vertically partitioned scenarios. It connects seamlessly with existing dataframes and offers a usage experience consistent with sklearn.
AI & BI privacy algorithms: secure multi-party computation
Provides the XGBoost algorithm, adds the HESS-LR algorithm, and combines differential privacy to strengthen the privacy protection of split learning.
AI & BI privacy algorithms: federated learning
Provides federated learning model construction with gradient aggregation in several security modes, including SecureAggregation and MPC Aggregation. When building a model, users only need to supply the list of participants and the aggregation method; the subsequent experience, from data reading and preprocessing through model training, is almost the same as traditional plaintext programming.
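The secure aggregation mode mentioned above can be illustrated with the classic pairwise-masking idea: every pair of participants shares a random mask that one adds and the other subtracts, so the masks cancel in the sum while each individual update stays hidden from the aggregator. This is a simplified sketch, not Argot's SecureAggregation code. The function names are hypothetical, the masks are generated in one place for brevity (in a real protocol each pair would derive them from a shared key), and updates are assumed to be already quantized to integers.

```python
import secrets

MODULUS = 2 ** 32  # arithmetic is done modulo this value (an assumption)


def mask_updates(updates):
    """updates: dict of party name -> list of integer gradients of equal length."""
    parties = sorted(updates)
    masked = {p: [v % MODULUS for v in updates[p]] for p in parties}
    for i, a in enumerate(parties):
        for b in parties[i + 1:]:
            for k in range(len(updates[a])):
                m = secrets.randbelow(MODULUS)               # mask shared by the pair (a, b)
                masked[a][k] = (masked[a][k] + m) % MODULUS  # a adds the mask
                masked[b][k] = (masked[b][k] - m) % MODULUS  # b subtracts it
    return masked


def aggregate(masked):
    """The aggregator only ever sees masked vectors; pairwise masks cancel in the sum."""
    length = len(next(iter(masked.values())))
    return [sum(vec[k] for vec in masked.values()) % MODULUS for k in range(length)]


updates = {
    "alice": [1, 2, 3],
    "bob": [10, 20, 30],
    "carol": [100, 200, 300],
}
masked = mask_updates(updates)
total = aggregate(masked)
```

From the caller's point of view the inputs are just the participants and their updates, which mirrors the description above of supplying a participant list and an aggregation method.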
In short, the main benefits are as follows:
For algorithm and model development: with the programming capabilities Argot provides, more algorithms and models can be migrated easily and quickly and gain privacy protection.
For joint work on underlying security: research results in cryptography and security can be embedded into Argot to improve the capability, performance, and security of the ciphertext devices and to carry that research into real business applications.
According to the Argot open-source launch event, "Argot" will continue to be updated in subsequent open-source releases, gradually lighting up more modules.
Reaching out to developers: breaking through technical barriers with a "unique skill"
Back to the practical problem. There are many privacy computing frameworks on the market, such as TFE, CrypTen, and MP-SPDZ. Whether they are built on an AI framework (TFE/CrypTen) or start from secure computation (SPDZ), they have certain limitations. The former are often hard to deploy and hard to optimize specifically for security. The latter often require writing a toy AI framework on top, which makes the learning cost high.
Among the "unique skills" that "Argot" has accumulated, the ciphertext computing device SPU is one of the highlights of its innovative research and development.
SPU is short for SecretFlow Processing Unit. It serves as the ciphertext computing unit of the Argot platform and provides secure computation services for Argot:
In recent years, encrypted computation (MPC/HE) has made great progress in computing power, but that power still struggles to match the demands of AI algorithms. Federated learning, for example, implements only some sub-steps of an algorithm with secure computation, giving up some local security in exchange for higher performance. When computing power cannot keep up with the algorithm, Argot's answer is "plaintext/ciphertext hybrid" computation, which balances security and performance.
Argot provides a very flexible plaintext/ciphertext hybrid programming paradigm. It restricts neither the plaintext engine nor the ciphertext engine: developers can work in the framework they already know, then mark one part of the program to run on the plaintext engine and the other part to run on SPU, as the figure in the original article illustrates.
[Note] The MPC Device in the figure is implemented by SPU.
By contrast, from the perspective of security and performance, none of TFE, CrypTen, or SPDZ can easily strike such a balance.
In addition, SPU's deployment mode is transparent: without changing a single line of code, existing models can run securely and correctly in any of the above deployment scenarios. Moreover (compared with privacy computing frameworks built on AI platforms), the SPU runtime is very lightweight and does not need a Python runtime, so it is easy to deploy and integrate.
An AI developer needs no security background to apply an existing model securely to multi-party data.
A security developer needs no AI background: implementing only the basic operators of secure computation is enough to support a variety of front-end frameworks. Deployment and operations are also straightforward, so developers can trade off security against performance and find the best plan for production.
SPU decouples the AI front end from the MPC back end, so any security protocol added to SPU transparently supports a variety of front ends. Some teams have already built on the Argot framework in this way. For example, Alibaba Security's Gemini Lab has contributed part of the Cheetah protocol to Argot, together with further optimizations.
Another highlight: "Cheetah", described as currently the fastest two-party secure computation protocol in the industry, has been contributed to Argot, enabling deep collaboration.
At present, privacy computing demand in the industry is dominated by two-party computation: Alice (the data consumer) wants to use data from Bob (the data source) to enhance her business, but Bob does not want to hand over his data directly. Implementing secure two-party computation (2PC) efficiently has therefore become the key to solving this problem. To address it, Alibaba Security's Gemini Lab developed the Cheetah secure two-party computation framework, which breaks through several underlying bottlenecks of 2PC and greatly improves overall two-party computation performance. It is more than 5 times faster than the previous best result, Microsoft's CrypTFlow2 (CCS '20), and has been accepted by the USENIX Security Symposium, one of the four top international security conferences.
Beyond the content published in the paper, Cheetah's implementation in "Argot" includes further optimizations (the public code supports secret sharing over 30 to 40 bits, whereas the Cheetah implementation in Argot supports larger secret sharing such as 64 bits) and some algorithms not disclosed in the paper. Most importantly, this implementation is invisible to Argot's upper-level business logic: existing Argot logic code does not need to change to adapt to it.
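To see why the width of the secret-sharing ring matters, consider fixed-point numbers shared additively over a ring of 2^k elements. The sketch below is not Cheetah and not Argot code; it is a generic two-party additive sharing example with hypothetical helper names and an assumed 20 fractional bits. It shows that a value which round-trips correctly in a 64-bit ring silently wraps around in a 32-bit ring.

```python
import secrets


def share(value, bits):
    """Split an integer into two additive shares modulo 2**bits."""
    mod = 1 << bits
    r = secrets.randbelow(mod)
    return r, (value - r) % mod


def reconstruct(s0, s1, bits):
    """Recombine the shares and interpret the result as a signed integer."""
    mod = 1 << bits
    v = (s0 + s1) % mod
    return v - mod if v >= mod // 2 else v


def encode(x, frac_bits):
    """Fixed-point encoding: keep `frac_bits` binary digits after the point."""
    return round(x * (1 << frac_bits))


def decode(v, frac_bits):
    return v / (1 << frac_bits)


def roundtrip(x, bits, frac_bits=20):
    s0, s1 = share(encode(x, frac_bits), bits)
    return decode(reconstruct(s0, s1, bits), frac_bits)


wide = roundtrip(5000.25, bits=64)    # fits comfortably in a 64-bit ring
narrow = roundtrip(5000.25, bits=32)  # 5000.25 * 2**20 exceeds 2**32 and wraps
```

With 20 fractional bits, a 32-bit ring can only hold magnitudes below 2048, so 5000.25 comes back as 904.25 (off by 4096), while the 64-bit ring returns the exact value. Intermediate results of multiplications need even more headroom, which is one reason a wider ring is useful for machine learning workloads.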
Future plans for the "Argot" open source community
Argot's logical device abstraction gives algorithm developers great flexibility: they can combine devices freely like building blocks and customize the computation on each device to build their own privacy computing algorithms. "Argot" is currently open-sourced under Apache-2.0 and can be downloaded and used freely. More modules and functions will gradually be opened to developers in the code base, and the developer documentation already provides examples of privacy-preserving algorithm development, such as an image classification task based on federated learning, which developers can download, run, and try out.
Beyond the technology itself and the enhanced programmability and scalability of the framework, the "Argot" open source community has also been officially established. Around this community, Ant Group and Argot will work with developers and researchers in several ways to build a privacy computing ecosystem together:
First, popularizing privacy computing technology through articles, videos, and other content across multiple channels, and strengthening communication with developers through open discussion;
Second, working with university research institutions on "online teaching" that combines industrial and teaching perspectives, creating more varied exchange activities for developers, and building up systematic privacy computing learning materials that are shared publicly to help developers grow;
In addition, at the Argot open-source launch event, Ant Group announced that it would jointly set up the "CCF-Ant Privacy Computing Special Research Fund" with the China Computer Federation (CCF). The fund will incubate and support privacy computing researchers, openly solicit and select innovative and valuable topics, and support their in-depth development and frontier research in privacy computing.
Get started right away and explore more interesting uses of Argot:
Code:
https://github.com/secretflow
https://gitee.com/secretflow
Documentation:
SecretFlow: https://secretflow.readthedocs.io
SPU: https://spu.readthedocs.io