Privacy Computing Overview
2022-08-04 23:38:00 【Du Baokun】
1 Background
Time flies. It has been ten months since I started writing my public account. Looking back just now, I have published 84 original articles in those ten months, organized into five columns that follow my own areas of experience: privacy computing, machine learning frameworks, machine learning algorithms, high-performance computing, and mathematics. Friends who know me are aware that I started out as a pure engineer working on search and recommendation architecture, then moved into the algorithm field (covering both machine learning frameworks and algorithms), and later, because a project required it, transferred to lead federated learning at JD.com and began working on privacy computing. It has been a bumpy road. I rarely had a few days to relax: if I was not studying, I was on my way to study. There is no helping it, since I am someone who likes to tinker. A senior technical (high-T) rank requires the ability to expand horizontally and to transfer skills across technologies, but almost every area I expanded into was new to me, and I hold myself to an expert standard in each of them, so it has been hard. Still, I enjoy tinkering and am curious about new things, so I pushed through, and the results so far show that I have done reasonably well and have nothing to be ashamed of. When I started the public account I considered writing about a single field, but that felt like a waste, so I wrote about all of them. A big advantage of horizontal expansion is that you can handle large projects that span several fields with ease. Federated learning is a typical emerging, interdisciplinary field, and broad experience helps a great deal with overall technology selection, technical planning, protocol design, algorithm design, distributed architecture, and performance optimization, although the difficulty really is considerable.
I am a late bloomer in many things. I have worked in the Internet industry for ten years and long had the idea of writing articles to share what I know, but never acted on it. Last October, for various reasons, something clicked and I started writing. So far I have written 27 articles on privacy computing, which cover most aspects of the field, and more will follow. When I first started on federated learning I was also confused, and my understanding of much of the cryptography was not deep enough, but a stubborn character kept me moving forward until I made breakthroughs in both theory and deployment. Along the way, articles that many bloggers had published online helped me greatly and got me to where I am now. That is how the idea of writing my own public account took root: to share some of my experience in privacy computing and to make a modest contribution to the industry as a whole. If readers benefit from it, I will be truly honored. My ability is limited, so if I have written anything wrong, please point it out so that we can improve together.
Thinking back to when I started the public account, I set a certain standard for each article, so the first ones did not come easily. At one point I could not keep going and thought about giving up. My job is busy and leaves little spare time, so the challenge was considerable. Fortunately I persisted. As the saying goes, "Everything is hard at the beginning, and smooth afterwards." After writing a dozen or so articles I gradually found my rhythm and fell in love with writing and sharing, and it no longer felt like a chore. These days, if I go a week without writing an article, I feel uncomfortable all over, haha.
I am a stubborn person, and I write the same way. Before each article I do plenty of research and try to explain things clearly and thoroughly. At the very least, that is the direction I am working toward.
Thank you to the many readers for your encouragement and support, and to the many friends for your trust. Technology has no borders, and I hope we can all advance it together.
Once there are many articles, working out how to read them becomes a problem of its own, so today I wrote this article as a reading guide to the "privacy computing" articles. The privacy computing column is now fairly comprehensive and covers everything except TEE (trusted execution environments). I have worked in privacy computing and federated learning for several years, and I built the JD Retail federated learning platform from 0 to 1 and took it into production for the business, so the whole column emphasizes combining theory with practice.
2 The future of privacy computing
2.1 Government laws and regulations
Governments and a number of organizations are now clearly aware of how serious the issue of private data is, and are regulating it through policy and law. Some of the major measures and cases on data privacy from recent years are listed below:
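GDPR is short for the (European) General Data Protection Regulation. It was adopted by the European Parliament and the Council of the European Union in April 2016 and has been enforced since May 2018.
Policy makers in the European Union and the United States emphasized that privacy technology was a shared priority for 2021.
In July 2021, according to reports, the Uniform Law Commission (ULC) voted to approve the Uniform Personal Data Protection Act (UPDPA). The UPDPA is a model data privacy act intended to give US states a template that they can introduce in their own legislatures and eventually enact as binding law. After final revisions, the UPDPA was to be submitted to state legislatures by January 2022.
The Data Security Law of the People's Republic of China (the "Data Security Law") went through three readings and was passed on June 10, 2021 by the 29th session of the Standing Committee of the 13th National People's Congress. Compared with the second-review draft, 1 article was deleted and 3 were added. The official text has 7 chapters and 55 articles and took effect on September 1, 2021.
On July 10, 2021, the Cyberspace Administration of China published the Cybersecurity Review Measures (Revised Draft for Comment) for public consultation. Article 6 states that "operators holding the personal information of more than 1 million users that seek to list abroad must apply to the Cybersecurity Review Office for a cybersecurity review." This shows how determined regulators are on privacy protection and data security.
After three rounds of deliberation, the 30th session of the Standing Committee of the 13th National People's Congress passed the Personal Information Protection Law of the People's Republic of China, which took effect on November 1, 2021. It establishes principles for protecting personal information, regulates processing activities to safeguard rights and interests, prohibits "big data price discrimination" against existing customers and regulates automated decision-making, strictly protects sensitive personal information, and grants individuals full rights.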
2.2 How the giants are positioning themselves
Given the importance of data privacy, the major Internet companies are stepping up their investment in privacy computing. As data privacy laws and regulations grow stricter, the Internet industry's "collect, transmit, store, compute" model faces a serious challenge, and cross-domain data transmission carries even greater risk. To stand out on the next track, avoid falling behind, and keep a first-mover advantage, the giants are all accelerating their moves into privacy computing.
Facebook uses privacy-enhancing technologies (PETs) to protect privacy in advertising, applying MPC, federated learning, and differential privacy to secure each stage. It is also exploring homomorphic encryption alongside end-to-end encryption, so that "operations on encrypted data replace operations on plaintext data and achieve the same results", as a way to address data privacy.
Google uses local differential privacy to collect behavioral statistics from more than 14 million Chrome users every day. Google also launched FLoC, presented as a disruptive technology aimed at protecting user privacy, which is in essence also a federated learning technology.
In 2020, with iOS 14, Apple required any app that wants to use the device identifier to ask for the user's authorization on first use. The user can choose "Allow Tracking" or "Ask App Not to Track". Users who want to give advertisers more data in exchange for more precisely targeted ads must explicitly allow it, which further protects user privacy.
On June 24, 2021, Microsoft officially launched the Windows 11 operating system and at the same time published the minimum hardware requirements for Windows, namely WIT (Wintel Trust). These require TPM trusted computing hardware and software, so devices without a TPM cannot run Windows 11.
In 2020, JD.com (advertising department) cooperated with ByteDance on federated learning in the marketing field. The federated learning platform was successfully deployed, models were built on the business data, and the business went live with significant results.
Alibaba DAMO Academy released its top ten technology trends for 2022, and whole-domain privacy computing made the list. Alibaba Cloud, Alimama, and Ant Group are all investing heavily in privacy computing.
Baidu Research released its forecast of the top ten technology trends for 2022, which states that "privacy computing technology has attracted great attention and will become the breakthrough point for releasing the value of data and the infrastructure for building trust."
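To make the local differential privacy mentioned above more concrete, here is a minimal sketch of classic randomized response, the textbook building block behind that kind of collection. It is not Google's actual mechanism. The function names, the privacy parameter `epsilon`, and the 30% population are all assumptions chosen for illustration.

```python
import math
import random
from typing import List


def randomize_bit(true_bit: int, epsilon: float, rng: random.Random) -> int:
    """Report the true bit with probability e^eps / (1 + e^eps), otherwise flip it."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return true_bit if rng.random() < p_truth else 1 - true_bit


def estimate_fraction(reports: List[int], epsilon: float) -> float:
    """Debias the observed share of 1s to estimate the true share."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # E[observed] = f * p + (1 - f) * (1 - p), solved here for f.
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    true_bits = [1] * 30_000 + [0] * 70_000  # hypothetical population, 30% ones
    reports = [randomize_bit(b, epsilon=1.0, rng=rng) for b in true_bits]
    print(round(estimate_fraction(reports, epsilon=1.0), 3))
```

Each user perturbs their own answer before it leaves the device, so the collector never sees a trustworthy individual value, yet the aggregate estimate stays close to the true share when there are many users. A smaller `epsilon` gives stronger privacy and a noisier estimate.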
So, from both the policy level and the moves of the industry giants, the future importance of the privacy computing industry is clear, and the outlook is bright. There is really no need to worry too much about the prospects of privacy computing.
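3 An overview of privacy computing
Privacy computing is essentially a family of information technologies that solve data service problems such as data circulation and data application while protecting data privacy. It allows data to be computed on, analyzed, and modeled while ensuring that the data provider does not leak the original data, and it covers the whole life cycle of data flow: generation, collection, storage, computation, application, and destruction. Put more simply, it lets data circulate or be shared freely while keeping the data secure, which eliminates data silos, releases more of the value of data, improves productivity, and drives industrial innovation.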
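3.1 History of privacy computing
The term "privacy computing" was formally proposed in 《隐私计算研究范畴及发展趋势》 (Research Scope and Development Trends of Privacy Computing), published in 2016, which defines it as: "A computing theory and methodology for protecting private information over its whole life cycle; a computable model and axiomatic system for privacy measurement, the cost of privacy leakage, privacy protection, and the complexity of privacy analysis when the ownership, management, and use rights of private information are separated."
As shown in the figure above, the concept of privacy computing first emerged around 1995, when the European Union introduced the Data Protection Directive. As new laws, regulations, and industry technologies appeared in the years that followed, the basic privacy component technologies (homomorphic encryption, secret sharing, garbled circuits, and so on) gave rise to privacy computing tracks such as secure multi-party computation, TEE (trusted execution environments), and federated learning. Federated learning in particular, because it balances privacy and performance, has been deployed in many scenarios through multi-party joint modeling and has generated great value.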
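The multi-party joint modeling behind federated learning can be sketched with a toy version of federated averaging. This is a minimal illustration under stated assumptions, not the protocol of any production platform: the one-parameter model `y = w * x`, the function names, and the learning rate are hypothetical, and the parties exchange model weights in the clear. Real systems typically add protections such as homomorphic encryption or secure aggregation.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair held privately by one party


def local_update(w: float, data: List[Sample], lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient steps for the 1-D model y = w * x on one party's own data."""
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(weights: List[float], sizes: List[int]) -> float:
    """Combine local models, weighting each by its number of samples."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(weights, sizes)) / total


def train(parties: List[List[Sample]], rounds: int = 50) -> float:
    w = 0.0  # global model: the only thing that travels between parties
    for _ in range(rounds):
        local_weights = [local_update(w, data) for data in parties]
        w = federated_average(local_weights, [len(d) for d in parties])
    return w
```

For example, with two parties whose samples all follow `y = 3 * x`, `train([[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)]])` converges toward 3.0 even though neither party ever sees the other's samples. This sketch is the horizontal setting, where parties hold different samples with the same features.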
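3.2 Technical routes of privacy computing
The guiding ideas of privacy computing include "data is usable but not visible; the data stays put and the model moves", "data is usable but not visible; data is controllable and measurable", and "share the value of data rather than the data itself". It is a highly interdisciplinary field that touches many areas, such as cryptography, mathematics, big data, real-time computing, high-performance computing, distributed systems, traditional machine learning frameworks and algorithms, network security, computer architecture, deep learning frameworks and algorithms, and the basic privacy computing techniques (differential privacy, secret sharing, garbled circuits, oblivious transfer, and so on). The technology as a whole is very complex and brings together many technical fields.
It places very high demands on the all-round ability of practitioners. Mastering many of these areas would of course be ideal, but judging from interviews it is rarely achievable. (Still, try to go deep while also branching out; the hardest things are the most worthwhile.) So make sure you work intensively in a few areas, and aim to understand, become familiar with, or even master the rest.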
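Based on the main technical characteristics of the privacy computing products currently on the market, the field can be summarized as three major directions and five major bases.
Three directions:
Direction 1: secure multi-party computation
Direction 2: TEE, the hardware-based trusted execution environment
Direction 3: federated learning
Five bases:
Base 1: basic privacy computing components, including homomorphic encryption, secret sharing, oblivious transfer, garbled circuits, and so on
Base 2: traditional security mechanisms, including network security, host security, and cracking and anti-cracking (horizontal federated learning requires on-device computation, so security protection is needed)
Base 3: machine learning capability, covering the algorithms and frameworks of traditional machine learning and deep learning
Base 4: engineering architecture, including distributed systems, high concurrency, big data, and real-time computing
Base 5: mathematics and cryptography, that is, knowledge from traditional disciplines such as mathematics, statistical learning, and cryptography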
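As a concrete taste of base 1, secret sharing can be illustrated with a minimal additive scheme. This is a sketch under stated assumptions rather than a production protocol: the modulus `PRIME`, the function names, and the three-party setup are chosen for illustration, and the scheme only protects against parties that follow the protocol and do not pool all their shares.

```python
import secrets
from typing import List

PRIME = 2**61 - 1  # modulus assumed for this sketch; all arithmetic is mod PRIME


def share(secret: int, n_parties: int) -> List[int]:
    """Split `secret` into n additive shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares


def reconstruct(shares: List[int]) -> int:
    """Recover the secret; this needs every share."""
    return sum(shares) % PRIME


def add_shares(a: List[int], b: List[int]) -> List[int]:
    """Each party adds its two shares locally; no party sees either input."""
    return [(x + y) % PRIME for x, y in zip(a, b)]
```

With `a = share(25, 3)` and `b = share(17, 3)`, calling `reconstruct(add_shares(a, b))` returns 42. Any subset of fewer than all three shares is uniformly random, which is why the sum can be computed jointly without revealing 25 or 17 to any single party.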
3.3 Talent in privacy computing
Privacy computing is an emerging field with many hard problems still to solve, and only those with a sufficient store of knowledge can shine in it. There are two main schools in the field: the cryptography school and the machine learning school. The cryptography school relies on cryptographic knowledge and theory, explores in combination with engineering practice, and applies its work mainly to secure multi-party computation. The machine learning school relies mainly on traditional machine learning and deep learning, explores in combination with the relevant cryptographic theory and distributed parallel computing schemes, and applies its work mainly to federated learning.
Below are some development suggestions for each of the two schools.
Machine learning practitioners (federated learning direction):
Platform direction: machine learning practitioners are advised to master the relevant cryptography (base 1, base 2, and base 5) and to understand the underlying principles of the algorithms, so that encryption and privacy protection can be built in at the implementation level.
Algorithm direction: do federated learning modeling on top of the federated learning platform. If the work is only business modeling, it is in essence not very different from the work of an algorithm engineer in search, advertising, or recommendation scenarios.
Cryptography practitioners: if you do not come from the federated learning direction, you basically need to learn ML techniques, and those who are able can also expand into base 4.
3.4 Development paths in privacy computing
The five base technologies are used in all three directions of privacy computing and are the cornerstone of the privacy computing system. It is hard for most people to cover every area, so I recommend mastering the one or two areas you are best at and gradually becoming familiar with the others. Technologies are interconnected, and many approaches and ideas can be reused.
The description above shows that privacy computing is a highly interdisciplinary field, and reaching the top of it is very hard. Under the current way of working almost everyone is a specialist and there are very few generalists. Specialists and generalists each have their advantages.
The value of a generalist in an interdisciplinary field is that, while exploring, they can weigh things more comprehensively and more realistically, design the most feasible and most elegant solution, and combine the many components of privacy computing organically so that they deliver the greatest value. The difficulty lies in going deep in every field.
The advantage of a specialist is that, within limited time, they can deepen and strengthen their work in one particular field and make a breakthrough at a specific point. The difficulty is that interdisciplinary work becomes troublesome and they may not be able to design the overall solution well, at which point the collective wisdom of several interdisciplinary people is needed.
But I strongly disagree with the idea that a generalist cannot also be deep. A generalist can go deep in every field. That depends on the individual, and on time plus diligence. I have worked in the Internet industry for more than ten years across multiple fields, so what I ask of myself is to be a collection of specialists in various fields, which is to say a generalist.
Steady, unglamorous hard work is what lets you go further and more securely, so I suggest that nobody set limits on themselves. While mastering one field, gradually take on the others as well. The people who lead privacy computing at a company in particular need to strengthen both the depth and the breadth of their knowledge, so that they can design more feasible, more elegant, and more efficient solutions and advance the development of privacy computing.
In fact, whichever direction you choose, you must keep refining your craft. Be willing to endure hardship, stay down to earth, and cross each hill one solid step at a time. When you look back at the end, the road you traveled will all seem level. Let us encourage one another!
5 Guide to the public account articles
The public account has accumulated many articles, which are organized according to the "five bases, three directions" of privacy computing. Below is a navigation summary of those articles to make them easier to read, in the hope of speeding up the development of the privacy computing industry together and making a small contribution to the field.
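Base 1: Basic privacy computing components
Base 2: Traditional security mechanisms
Base 3: Machine learning capability
Machine learning algorithms
Machine Learning in Plain Language series (1): Basic concepts
Machine Learning in Plain Language (2): The perceptron
Machine Learning in Plain Language (3): Linear regression
Machine Learning in Plain Language (4): Logistic regression
Machine Learning in Plain Language (5): Gradient descent
Machine Learning in Plain Language: Optimization methods, Newton's method
Machine Learning in Plain Language: Convolutional neural networks (CNN)
Machine Learning in Plain Language: Deep neural networks (RNN)
Machine Learning in Plain Language: Long short-term memory networks (LSTM)
Machine Learning in Plain Language: Recurrent neural networks, from RNN and LSTM to GRU
Machine Learning in Plain Language: The Encoder-Decoder framework
Machine Learning in Plain Language: Attention
Machine Learning in Plain Language: Self Attention
Machine Learning in Plain Language: Transformer
Overview of graph neural networks
Machine learning frameworks
Deep learning frameworks: Exploring distributed training of very large models (1)
Deep learning framework TensorFlow series (1): Setting up the development environment
Deep learning framework TensorFlow series: OP development
Deep learning framework TensorFlow series: Basic concepts
Deep learning framework TensorFlow series: Dataflow graphs
Deep learning framework TensorFlow series (3): Basic concepts, the Tensor as the framework's data carrier
Deep learning framework series (3): Tensor-related operations
Deep learning framework TensorFlow series: The single-machine programming framework
Deep learning framework TensorFlow series (5): Optimizers (1)
Base 4: Engineering architecture
Base 5: Mathematics and cryptography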
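Secure multi-party computation
Federated learning: The PSI algorithm for private sample alignment (private set intersection as used in federated learning and secure multi-party computation)
Secure multi-party computation: Untraceable queries
Federated learning
Vertical federated learning
Federated learning: The PSI algorithm for private sample alignment (private set intersection as used in federated learning and secure multi-party computation)
Federated learning: Linear regression
Federated learning: Secure tree models, SecureBoost series (1): Decision Tree
Federated learning: Secure tree models, SecureBoost series (2): Ensemble learning
Federated learning: Secure tree models, SecureBoost series (3): XGBoost
Federated learning: Secure tree models, SecureBoost series (4): Finale
Vertical federated learning: Privacy protection techniques for neural network models
Horizontal federated learning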
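6 Extras
About the author: Du Baokun is a practitioner in the privacy computing industry who led a team to build JD.com's federated learning solution 9N-FL from 0 to 1, and who led both the federated learning framework and the federated "good start" (开门红) business.
Framework level: delivered an industrial-grade federated learning solution for e-commerce marketing that supports very large scale, including PSI private alignment over very large sample sets and many models such as secure tree models and neural network models.
Business level: took the "good start" business into production on the business side, opened up a new growth point, and produced significant business and economic benefits.
I like learning new things and enjoy digging into technology. Because I think about and decide on technical planning across the full pipeline, I have worked in many areas, from engineering architecture and big data to machine learning algorithms and algorithm frameworks. Readers who love technology are welcome to get in touch by email: [email protected]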
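7 Guide to the public account
I have been writing blog posts for a long time. Because I have worked in many technical areas, the posts cover high concurrency and high performance, distributed systems, traditional machine learning algorithms and frameworks, deep learning algorithms and frameworks, cryptographic security, privacy computing, federated learning, big data, and more. I have led several large projects, including federated learning for retail, have given many talks in the community, and have kept writing original posts, several of which have been read more than ten thousand times. On the public account 「秃顶的码农」 you can read continuously by topic, and I have ordered the posts within each topic along a learning path. The topic is the red-highlighted tag at the bottom of a post in the public account; click it to see the other posts under that topic, as in the figure below. The topics are: 1. privacy computing, 2. federated learning, 3. machine learning frameworks, 4. machine learning algorithms, 5. high-performance computing, 6. advertising algorithms, 7. a programmer's life. The Zhihu account works the same way: just follow the column.
All conditioned phenomena are like a dream, an illusion, a bubble, a shadow, like dew and like lightning; thus should they be contemplated.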