On the confirmation of original data assets
2020-11-08 11:26:00 【osc_vnopm】
Once a market for data elements develops, large volumes of data assets will naturally form. At the macro level, the flow of value through data elements and the process by which data assets form are shown in the figure below. Before data assets can be entered on the balance sheet, the main problems to solve are rights confirmation, pricing, trading, and measurement. As the relevant policies and supporting laws and regulations are gradually implemented, research on these topics has become increasingly active.
This article offers a preliminary analysis of rights confirmation for data assets. The method is to construct a simple data element market, establish some core concepts and an analytical framework, and then apply them to the key issues of rights confirmation, proposing some solutions and identifying problems that need further study.
Imagine a simple data element market made up of a primary market and a secondary market. There are two sellers, each holding an original data set from an independent data source. After simple processing, each data set becomes a data asset and enters the primary market for trading. Buyers enter the market with different strategies. Buyer C simply buys data assets for its own consumption. Buyer D, after purchasing, processes the two data assets to produce a new data set, which eventually becomes a data asset and enters the secondary market for trading. The market is shown in the diagram below:
The primary and secondary markets are separated mainly because confirming ownership of data assets generated from an original data set is quite different from confirming ownership of data assets formed by processing other data assets. The primary market therefore trades data assets generated from original data sets, while the secondary market trades data assets produced by processing primary-market data assets.
Producing a data asset involves a series of processes and paths: which data source the original records (Records) are obtained from, which transport and communication network carries the records to the recording/storage facility, and the cleaning, labeling, synthesis, and other processing steps that finally yield a deliverable asset. For simplicity, this article calls this the data asset production chain.
The characteristics of the data asset production chain are determined by the “5V1P” characteristics of data: volume (Volume), velocity (Velocity), variety (Variety), variability (Variability), veracity (Veracity), and provenance (Provenance). In general, delivery of a data asset is not a one-off event but a continuous, dynamic process.
Once a data asset enters the market, unless it is consumed once and for all, the seller can no longer control how it is used. To make better use of the data, subsequent buyers will inevitably need to know more about its initial source and the history of how it has been processed.
As the diagram above shows, once a data asset enters the market it goes through an iterative cycle of processing, reprocessing, generation of new data assets, and re-entry into the market. To preserve the value of each data asset throughout circulation, measures must be taken to guarantee the integrity, consistency, and accuracy (hereinafter the “three properties”) of its production chain. Otherwise, buyers have no assurance of the asset's value.
The market will therefore demand that data asset owners have not only static control but also dynamic control over the production chain, that is, the ability to dominate and decide “the purpose, object, means, methods, and results of production activities”. For reasons of space, this article discusses only the primary market, i.e. rights confirmation for data assets generated from original data sets.
I. Rights confirmation for data assets in the primary market
Data assets in the primary market are generated from original data sets. Their production chain runs from the data source, through the data set, to the data asset. In general, although a data asset and a data set are concepts at two different levels, the data in the data asset is a subset of the data in the data set. Usually, the owner of the data set is also the owner of the data asset.
The data source is a node (device) in an IoT or sensor network. It encodes what the sensor/IoT device “perceives” (for simplicity, denoted as a function f) as a string of data bytes, which is recorded and stored on a medium. The data recorded and stored along this same path forms a data set.
1. Data birthplace
The data in a data set is raw data. It does not exist naturally; it is generated. To support later rights confirmation, the place where raw data is first recorded and stored is called the data birthplace (DBP: Data Birth Place). It can be identified by a DBP-ID, a combination of the device information, geographic information, and network address information of the equipment that first records/stores the data. Every data set thus has a corresponding DBP-ID. The birthplace is a very important piece of evidence for confirming ownership of a data set and, as the later analysis shows, a key foundation for building the whole data market.
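As a rough illustration of how a DBP-ID might be derived from those three components, here is a minimal Python sketch. The article does not specify an encoding, so the field names, the use of SHA-256, and the "DBP-" prefix are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BirthPlace:
    """Where raw data is first recorded/stored (field names are illustrative)."""
    device_id: str        # recording/storage device information
    geo_location: str     # geographic information, e.g. "lat,lon"
    network_address: str  # network address information


def dbp_id(place: BirthPlace) -> str:
    """Derive a stable identifier by hashing the three components together."""
    # Canonical JSON (sorted keys, fixed separators) makes the hash reproducible.
    canonical = json.dumps(asdict(place), sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "DBP-" + digest[:16]  # assumed format: prefix + truncated hash


place = BirthPlace("storage-node-01", "39.9042,116.4074", "10.0.0.12")
print(dbp_id(place))
```

Because the identifier is a pure function of the three inputs, changing any one of them (for example, moving storage to a different device) yields a different DBP-ID.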
2. Data birth certificate
To prove that a data set was generated by a given data source and function f, a data birth certificate (DBC: Data Birth Certification) can be issued. This is an important measure for ensuring consistency in data set generation, because if the data source or the function f changes, the data is no longer what it was.
A data birth certificate verifies that the generation path of the original data set is consistent and unchanged, i.e. it certifies that all data in the data set was generated by the same data source and the same function f.
The data birth certificate is issued by a third party. The certification body (Issuer) may be centralized or may be a consortium. In theory, whenever the data set produces a new batch of raw data, a DBC should be requested from the certification body for that batch.
For simplicity, this article assumes that the data source and function do not change during the data set's whole life cycle, so a single DBC certification is sufficient. Every data set thus has at least one corresponding DBC. The process is shown in the diagram below:
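The idea of a certificate that binds a batch of data to its source and function can be sketched as follows. This is only an illustration: the article does not define the certificate format, and a real issuer would use public-key signatures rather than the shared-secret HMAC used here as a stand-in. `ISSUER_KEY`, `issue_dbc`, and `verify_dbc` are hypothetical names.

```python
import hashlib
import hmac
import json

# Hypothetical secret held by the certification body (Issuer).
ISSUER_KEY = b"demo-issuer-secret"


def _claim(source_id: str, function_id: str, batch: bytes) -> bytes:
    """Canonical statement: this batch came from this source via this function."""
    claim = {
        "source": source_id,
        "function": function_id,
        "batch_sha256": hashlib.sha256(batch).hexdigest(),
    }
    return json.dumps(claim, sort_keys=True).encode("utf-8")


def issue_dbc(source_id: str, function_id: str, batch: bytes) -> str:
    """Issuer side: produce a certificate tag over the claim."""
    return hmac.new(ISSUER_KEY, _claim(source_id, function_id, batch),
                    hashlib.sha256).hexdigest()


def verify_dbc(dbc: str, source_id: str, function_id: str, batch: bytes) -> bool:
    """Check that the certificate matches the stated source, function, and data."""
    expected = issue_dbc(source_id, function_id, batch)
    return hmac.compare_digest(dbc, expected)


cert = issue_dbc("sensor-001", "f", b"batch-1")
print(verify_dbc(cert, "sensor-001", "f", b"batch-1"))
print(verify_dbc(cert, "sensor-001", "g", b"batch-1"))
```

Verification fails as soon as the source, the function, or the data itself differs from what was certified, which is the consistency property the certificate is meant to protect.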
3. Production chain status
From the above, a data set has a corresponding DBP-ID and DBC, so the state of such a production chain can be described as follows:
{
  datasetID: xxxxx
  dataset name:
  data birth place: DBP-ID
  data birth certificationID: xxxxxxxx
  data source:
    sensor device ID: xxxxxxxxx
    sensor function: f
  timestamp: xx-xx-xx
}
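The state description above could be represented in code roughly as follows. This is a sketch only: the Python field names are adapted from the description, nesting the sensor device ID and sensor function under the data source is an assumption about the intended structure, and all values are placeholders.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DataSource:
    sensor_device_id: str
    sensor_function: str  # the perception function, written f in the text


@dataclass
class ProductionChainState:
    dataset_id: str
    dataset_name: str
    data_birth_place: str             # the DBP-ID
    data_birth_certification_id: str  # the DBC identifier
    data_source: DataSource
    timestamp: str


state = ProductionChainState(
    dataset_id="dataset-0001",           # placeholder
    dataset_name="example-dataset",      # placeholder
    data_birth_place="DBP-ID",
    data_birth_certification_id="dbc-0001",
    data_source=DataSource(sensor_device_id="sensor-001", sensor_function="f"),
    timestamp="2020-11-08T00:00:00Z",    # placeholder
)

# asdict() converts nested dataclasses too, so the record serializes directly.
print(json.dumps(asdict(state), indent=2))
```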
4. Ownership confirmation
If A owns the data source and the DBP equipment, A can be identified as the owner of the data set and the data asset. According to the earlier analysis, however, this does not complete rights confirmation. As owner, A must also prove to the market that every time new data is generated it still comes from the same data source and function, and must promise and guarantee the integrity, consistency, and accuracy of the data asset's production chain. Otherwise there is no proof that A's control as owner is effective, and A cannot be confirmed as the owner.
Confirming ownership of a data asset in fact requires effectively verifying the owner's ability to control the purpose, object, means, methods, and results of production, that is, the integrity, consistency, and accuracy of the production chain. How can this be achieved?
Returning to the simple market above: A cannot prove its ownership on its own and needs some supporting infrastructure to do so. To illustrate, the author constructs a simple rights-confirmation infrastructure (see the schematic diagram below).
First, a secure trusted execution environment (TEE) is used between the data source and the birthplace, and a zero-knowledge proof (ZKP) at the data source proves that the data written to the data set 1) all comes from that data source and 2) is being recorded and stored for the first time. This establishes a consistent correspondence between the data set and its DBP-ID.
Second, throughout the life cycle of the data set, a birth certificate (DBC) is requested whenever new data is generated, so that every block of the data set has a DBC. When data from the data set is mapped into the data asset, its DBC is mapped along with it.
Finally, the real-time status information of the data asset's production chain is written to a blockchain.
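To make the last step more concrete, the sketch below records production chain states in a hash-linked, append-only log. It is a single-process stand-in for a blockchain, with no consensus or networking, meant only to show why earlier state records cannot be altered unnoticed. The class and field names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def block_hash(prev_hash: str, state: dict) -> str:
    """Hash a state record together with the hash of the record before it."""
    body = json.dumps({"prev": prev_hash, "state": state}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class StateLedger:
    """Append-only log of production chain states, linked by hashes."""

    def __init__(self) -> None:
        self.blocks = []

    def append(self, state: dict) -> str:
        prev = self.blocks[-1]["hash"] if self.blocks else GENESIS
        h = block_hash(prev, state)
        self.blocks.append({"prev": prev, "state": state, "hash": h})
        return h

    def is_intact(self) -> bool:
        """Recompute every link; any edited record breaks the chain."""
        prev = GENESIS
        for block in self.blocks:
            if block["prev"] != prev or block["hash"] != block_hash(prev, block["state"]):
                return False
            prev = block["hash"]
        return True


ledger = StateLedger()
ledger.append({"dataset_id": "dataset-0001", "dbc": "dbc-0001", "dbp": "DBP-ID"})
ledger.append({"dataset_id": "dataset-0001", "dbc": "dbc-0002", "dbp": "DBP-ID"})
print(ledger.is_intact())
```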
With these three pieces of infrastructure in place, confirming at any given moment that the data asset is owned by A requires only the following checks:
1) The DBCs of all data blocks in the data asset agree;
2) All data blocks come from the same data birthplace (DBP);
3) The production chain status is consistent;
4) The data source device and the DBP equipment and software are owned by A.
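The four checks can be sketched as a single function. The data model is an assumption made for illustration: each block carries the source and function certified by its DBC, its DBP-ID, and a hash of the recorded production chain state, while equipment ownership comes from a hypothetical registry passed in as a dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataBlock:
    dbc_source: str    # data source certified by this block's DBC
    dbc_function: str  # function f certified by this block's DBC
    dbp_id: str        # birthplace of this block
    state_hash: str    # production chain state recorded with this block


def confirm_ownership(blocks, ledger_state_hash, equipment_owner, claimant) -> bool:
    """Return True only if all four checks pass for the claimant."""
    if not blocks:
        return False
    first = blocks[0]
    # 1) The DBCs of all data blocks agree (same source, same function).
    dbc_agree = all((b.dbc_source, b.dbc_function) ==
                    (first.dbc_source, first.dbc_function) for b in blocks)
    # 2) All data blocks come from the same data birthplace.
    same_dbp = all(b.dbp_id == first.dbp_id for b in blocks)
    # 3) The production chain status matches what the ledger recorded.
    state_ok = all(b.state_hash == ledger_state_hash for b in blocks)
    # 4) The claimant owns both the data source device and the DBP equipment.
    owns = (equipment_owner.get(first.dbc_source) == claimant
            and equipment_owner.get(first.dbp_id) == claimant)
    return dbc_agree and same_dbp and state_ok and owns


blocks = [DataBlock("sensor-001", "f", "DBP-1", "abc"),
          DataBlock("sensor-001", "f", "DBP-1", "abc")]
owners = {"sensor-001": "A", "DBP-1": "A"}
print(confirm_ownership(blocks, "abc", owners, "A"))
```

Comparing every block against one ledger state hash reflects the article's simplifying assumption that the data source and function never change; with per-batch certificates the check would compare each block against its own recorded state.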
The simplified discussion above mainly serves to establish the core concepts and analytical framework. Next, we apply them to a brief discussion of rights confirmation for raw data assets generated by application services.
II. Raw data assets generated by application services
Raw data assets generated by an application service (hereinafter “App”) are those whose original data set is born inside an App. The data source being perceived is a subject with civil rights, collectively referred to here as the user (User).
An App can be thought of as a collection of service components. For simplicity, we assume the perception function covers only the behavioral data generated when a User uses the different services of the App. The App is provided by a service provider (SP). Users' behavioral data forms a data set, which in turn forms a data asset. The formal representation and schematic diagram of its production chain are as follows:
Confirming rights to such a data asset mainly requires examining the service agreement (Service agreement) between users and the service provider. In this scenario, the raw data asset can be understood as generated by users and the service provider in accordance with that agreement, and rights should be confirmed as the agreement stipulates. The service provider should therefore have these agreements notarized and disclose them to buyers of the data asset.
The production chain state description therefore needs an added notarization status. Because the specific terms of each user's service agreement may differ, notarization must be maintained dynamically. For efficiency, this kind of notarization would mostly rely on verifiable unilateral privacy-preserving computation; the traditional third-party notarization model is not feasible. On the basis of the framework above, we construct a rights-confirmation schematic as shown in the figure, which allows rights to be confirmed effectively.
Conclusion
The author believes that ownership confirmation for original data assets is the cornerstone of the whole data element market. If property rights in data assets generated from raw data cannot be clearly defined in the primary market, then once the data is in circulation, subsequent rights confirmation becomes complicated, inefficient, and chaotic, and the market eventually becomes unsustainable. It is therefore necessary to build a primary data element market with clear property rights and effective operation, and an efficient rights-confirmation infrastructure that straightens out property relations at the source.
At the same time, the 5V1P characteristics of data determine the importance of the integrity, consistency, and accuracy of the data asset production chain. The core of ownership confirmation is therefore establishing that the owner controls and decides “the purpose, object, means, methods, and results of production activities”. Achieving this cannot rest on theory and the legal system alone; it must rely on supporting infrastructure.
Editor: Wang Jing
Proofreader: Lin Yilin
Copyright notice
This article was created by [osc_vnopm]. Please include a link to the original when reposting. Thank you.