On the confirmation of original data assets
2020-11-08 11:26:00 【osc_vnopm】
As the data element market develops, a large volume of data assets will naturally form. At the macro level, the process by which the value of data elements flows and data assets are formed is shown below. Before data assets can be recognized on the balance sheet, the main problems to solve are rights confirmation, pricing, trading and measurement. As the relevant policies and supporting laws and regulations are gradually implemented, research on these questions has become increasingly active.
This paper offers a preliminary analysis of rights confirmation for data assets. The method is to construct a simple data element market and use it to establish some core concepts and an analytical framework. These concepts and the framework are then applied to key issues in confirming rights to data assets, yielding some proposed solutions and some problems that need further study.
Imagine a simple data element market made up of a primary market and a secondary market. Two sellers respectively own the original datasets of two independent data sources. After simple processing, each dataset forms a data asset, and these data assets enter the primary market to be traded. Buyers enter the market with different strategies. Buyer C simply purchases a data asset for its own consumption. Buyer D purchases both data assets, processes them to output a new dataset, and eventually forms a new data asset that enters the secondary market to be traded. The market is illustrated as follows:
[Figure: schematic of the simple data element market]
The primary and secondary markets are set up separately mainly because confirming the ownership of data assets generated from an original dataset is quite different from confirming the ownership of data assets formed by further processing. Therefore, the primary market trades data assets generated from original datasets, while the secondary market trades data assets produced by processing primary-market data assets.
Generating a data asset requires a series of processes and paths: which data source the original records (Records) are obtained from, which transport and communication network transfers the records to the recording/storage facilities, and the cleaning, labeling, synthesis and other processing that eventually yield a deliverable asset. For simplicity, this paper calls this the data asset production chain.
The characteristics of the data asset production chain are determined by the "5V1P" characteristics of data. "5V1P" refers to the volume (Volume), velocity (Velocity), variety (Variety), variability (Variability), veracity (Veracity) and provenance (Provenance) of data. In general, delivering a data asset is not a one-off event but a continuous, dynamic process.
Once data assets enter the market, and unless they are consumed once and for all, how they are used is beyond the seller's control. To make better use of the data, subsequent buyers will inevitably need to understand its initial source and how it has been processed over time.
From the diagram above it can be seen that, once data assets enter the market, they go through an iterative process of processing, reprocessing, generating new data assets and re-entering the market. To maintain the value of each data asset throughout circulation, the necessary measures must be taken to guarantee the integrity, consistency and accuracy of its production chain (hereinafter the "three properties"). Otherwise, the value of the data asset cannot be guaranteed to buyers.
Thus the market will inevitably require that data asset owners exercise not only static control but also dynamic control over the production chain; that is, they must be able to dominate and decide "the purpose, object, methods, means and results of production activities". For reasons of space, this paper discusses only the primary market, namely the problem of confirming rights to data assets generated from original datasets.
I. Rights confirmation of data assets in the primary market
Data assets in the primary market are generated from original datasets. Their production chain runs from the data source, through the original dataset, to the data asset.
In general, although a data asset and a dataset are concepts at two different levels, the data in a data asset is a subset of the data in the dataset.
Usually, the owner of the dataset is also the owner of the data asset.
A data source is a node (device) in an IoT or sensor network. It encodes what the sensor/IoT device "perceives" (for simplicity, denoted as a function f) as a string of data bytes, which is recorded and stored on a medium. The data recorded and stored along this same path form a dataset.
1. Data birthplace
The data in a dataset is raw data; it does not exist naturally but is generated. To support the subsequent confirmation of rights, the place where the original data is first recorded and stored is called the data birthplace (DBP: Data Birth Place). It can be identified by a combination of the device information, geographic information and network address information of the device that first records/stores the data (DBP-ID). That is, every dataset has a corresponding DBP-ID. The birthplace is a very important piece of evidence for confirming ownership of a dataset and, as the later analysis shows, it is also a critical foundation for building the entire data market.
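As a rough illustration of how a DBP-ID could combine device, geographic and network address information, the following minimal sketch hashes the three fields into one identifier. The field names, the `make_dbp_id` helper, the "DBP-" prefix and the choice of SHA-256 are all assumptions made for illustration; the article does not prescribe an encoding.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BirthPlace:
    """Where raw data is first recorded and stored (hypothetical fields)."""
    device_id: str        # recording/storage device information
    geo_location: str     # geographic information, e.g. "lat,lon"
    network_address: str  # network address information


def make_dbp_id(place: BirthPlace) -> str:
    """Derive a deterministic DBP-ID from the three components."""
    # Canonical JSON (sorted keys) so the same place always hashes the same way.
    canonical = json.dumps(asdict(place), sort_keys=True, separators=(",", ":"))
    return "DBP-" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


place = BirthPlace("recorder-001", "31.23,121.47", "10.0.0.5")
dbp_id = make_dbp_id(place)
```

Because the identifier is derived from the place itself, two datasets first recorded on the same device, location and address would share a DBP-ID, while changing any one component yields a different identifier.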
2. Data birth certificate
To prove that a dataset was generated by a given data source and function f, a data birth certificate (DBC: Data Birth Certification) can be issued. This is an important measure for ensuring consistency in the dataset generation process, because if the data source or the function f changes, the data is no longer what it used to be.
A data birth certificate verifies the consistency and invariance of the generation path of the original dataset; that is, it certifies that all the data in the dataset is generated by the same data source and the same function f.
The data birth certificate is issued by a third party. The certification body that issues it (Issuer) can be centralized or can be an alliance. In theory, whenever the dataset produces a new batch of raw data, a DBC should be applied for from the certification body for that batch.
For simplicity, this paper assumes that the data source and function do not change over the dataset's entire lifecycle, so a single DBC certification is enough. Thus every dataset has at least one corresponding DBC. The process diagram is as follows:
[Figure: data birth certificate issuance process]
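The idea of a certificate that binds a batch of data to its data source and function f can be sketched as follows. This is only an illustration under stated assumptions: the issuer key, the `issue_dbc` and `verify_dbc` helpers and the certificate fields are hypothetical, and an HMAC stands in for the public-key signature a real certification body would more plausibly use.

```python
import hashlib
import hmac

ISSUER_KEY = b"demo-issuer-secret"  # hypothetical key held by the certification body


def fingerprint(source_id: str, function_code: str) -> str:
    """Fingerprint the generation path: data source plus sensing function f."""
    return hashlib.sha256(f"{source_id}|{function_code}".encode("utf-8")).hexdigest()


def issue_dbc(source_id: str, function_code: str, batch: bytes) -> dict:
    """Issue a certificate binding a data batch to its generation path."""
    path = fingerprint(source_id, function_code)
    batch_hash = hashlib.sha256(batch).hexdigest()
    message = f"{path}|{batch_hash}".encode("utf-8")
    tag = hmac.new(ISSUER_KEY, message, hashlib.sha256).hexdigest()
    return {"path": path, "batch_hash": batch_hash, "tag": tag}


def verify_dbc(cert: dict, source_id: str, function_code: str, batch: bytes) -> bool:
    """Recompute the certificate and compare tags in constant time."""
    expected = issue_dbc(source_id, function_code, batch)
    return hmac.compare_digest(expected["tag"], cert["tag"])


cert = issue_dbc("sensor-42", "f_v1", b"21.5,21.7,21.6")
```

If the data source, the function or the batch contents differ from what was certified, `verify_dbc` returns False, which mirrors the point that data from a changed source or function is no longer the same data.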
3. Production chain status
From the above, a dataset has a corresponding DBP-ID and DBC, so the state of its production chain can be described as follows:
{
  datasetID: xxxxx
  dataset name:
  data birth place: DBP-ID
  data birth certificationID: xxxxxxxx
  data source:
    sensor device ID: xxxxxxxxx
    sensor function: f
  timestamp: xx-xx-xx
}
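The description above could be held in a typed record, for example as in the minimal sketch below. The Python field names are adapted from the description, while the sample values and the `to_canonical_json` helper are hypothetical.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DataSource:
    sensor_device_id: str
    sensor_function: str  # name of the sensing function f


@dataclass
class ProductionChainState:
    dataset_id: str
    dataset_name: str
    data_birth_place: str             # DBP-ID
    data_birth_certification_id: str  # DBC identifier
    data_source: DataSource
    timestamp: str


def to_canonical_json(state: ProductionChainState) -> str:
    """Serialise with sorted keys so equal states give identical text."""
    return json.dumps(asdict(state), sort_keys=True)


state = ProductionChainState(
    dataset_id="ds-0001",
    dataset_name="example-dataset",
    data_birth_place="DBP-example",
    data_birth_certification_id="DBC-example",
    data_source=DataSource(sensor_device_id="sensor-42", sensor_function="f"),
    timestamp="2020-11-08",
)
encoded = to_canonical_json(state)
```

A canonical serialisation matters here because the state is later compared and recorded; two parties describing the same state should produce byte-identical text.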
4. Ownership confirmation
If A is the owner of the data source and of the device at the DBP, the owner of the dataset and of the data asset can be determined to be A. According to the earlier analysis, however, this does not yet complete the confirmation of the owner's rights. As the owner, A must also prove to the market that, every time new data is generated, it still comes from the same data source and function, and must promise and guarantee the "three properties" (integrity, consistency and accuracy) of the data asset's production chain. Otherwise there is no proof that A's control as owner is effective, and A cannot be determined to be the owner.
Confirming ownership of a data asset in fact requires effectively verifying the owner's ability to control the purpose, object, methods, means and results of production, that is, the "three properties" of the production chain. How, then, can this rights confirmation goal be achieved?
Returning to the simple market above: A cannot prove its ownership of the data asset on its own; some infrastructure support is needed. To illustrate, the author constructs a simple rights confirmation infrastructure (the schematic diagram is as follows).
[Figure: schematic of the rights confirmation infrastructure]
First, a secure trusted execution environment (TEE) is used between the data source and the birthplace, and zero-knowledge proofs (ZKP) are used at the data source to prove that the data written to the dataset 1) all comes from that data source and 2) is being recorded and stored for the first time. This establishes a consistent correspondence between the dataset and its DBP-ID.
Second, throughout the dataset's lifecycle, a birth certificate (DBC) is applied for whenever new data is generated, so every block of the dataset has a DBC. When the data of the dataset is mapped to the data asset, its DBC is mapped along with it.
Last, the real-time status information of the data asset's production chain is written to a blockchain.
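The third step, recording production chain status so that later tampering is detectable, can be illustrated with a hash-chained append-only log. This is a deliberately small stand-in and not a real blockchain: there is no consensus, networking or signing, and the `StateLedger` class and the state fields are hypothetical.

```python
import hashlib
import json


class StateLedger:
    """Append-only, hash-chained log (a stand-in for a real blockchain)."""

    GENESIS = "0" * 64

    def __init__(self):
        self.blocks = []

    @staticmethod
    def _digest(prev_hash, state):
        # Hash the previous block hash together with the canonical state text.
        payload = json.dumps(state, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, state):
        prev_hash = self.blocks[-1]["hash"] if self.blocks else self.GENESIS
        block = {
            "prev_hash": prev_hash,
            "state": state,
            "hash": self._digest(prev_hash, state),
        }
        self.blocks.append(block)
        return block

    def is_intact(self):
        """Re-derive every hash; any edited state breaks the chain."""
        prev_hash = self.GENESIS
        for block in self.blocks:
            if block["prev_hash"] != prev_hash:
                return False
            if block["hash"] != self._digest(prev_hash, block["state"]):
                return False
            prev_hash = block["hash"]
        return True


ledger = StateLedger()
ledger.append({"dataset_id": "ds-0001", "dbp_id": "DBP-1", "dbc_id": "DBC-1", "batch": 1})
ledger.append({"dataset_id": "ds-0001", "dbp_id": "DBP-1", "dbc_id": "DBC-1", "batch": 2})
```

Each entry commits to the one before it, so rewriting an earlier production chain state changes its hash and invalidates every later entry.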
With these three pieces of infrastructure in place, confirming that the ownership of the data asset belongs to A requires only the following checks:
1) The DBCs of all data blocks of the data asset are consistent;
2) The data blocks all come from the same data birthplace (DBP);
3) The production chain status is consistent;
4) The data source device and the DBP equipment and software are owned by A.
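The four checks above can be sketched as a single function. This is an illustrative reading rather than the author's specification: the `Block` fields, the interpretation of "consistent DBCs" as all blocks certifying the same generation path, and the `confirm_ownership` signature are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Block:
    dbc_path: str     # generation path (source and function f) certified by the block's DBC
    dbp_id: str       # birthplace where the block was first recorded
    chain_state: str  # digest of the production chain state for this block


def confirm_ownership(blocks, recorded_state, device_owners, claimant):
    """Return True only if all four conditions from the text hold."""
    if not blocks or not device_owners:
        return False
    # 1) The DBCs of all data blocks agree on the generation path.
    same_dbc = len({b.dbc_path for b in blocks}) == 1
    # 2) All data blocks come from the same birthplace.
    same_dbp = len({b.dbp_id for b in blocks}) == 1
    # 3) Every block matches the recorded production chain state.
    state_ok = all(b.chain_state == recorded_state for b in blocks)
    # 4) The data source device and the DBP equipment belong to the claimant.
    owner_ok = all(owner == claimant for owner in device_owners.values())
    return same_dbc and same_dbp and state_ok and owner_ok


blocks = [Block("path-1", "DBP-1", "state-1"), Block("path-1", "DBP-1", "state-1")]
owners = {"data_source_device": "A", "dbp_device": "A"}
result = confirm_ownership(blocks, "state-1", owners, "A")
```

In this sketch a single mismatching block, or a single device owned by someone else, is enough to reject the claim, which reflects the requirement that the owner control the whole production chain rather than part of it.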
The simplified discussion above mainly serves to establish the basic core concepts and analytical framework. Next, we apply them to briefly discuss rights confirmation for original data assets generated by application services.
II. Raw data assets generated by application services
Raw data assets generated by an application service (hereinafter "App") are those whose original dataset is born in an App. Here the data source senses subjects that hold civil rights, collectively referred to as users (User).
An App can be regarded as a collection of service components. For simplicity, we assume the perception function covers only the behavioral data that a User generates by using the different services of the App.
The App is provided by a service provider (SP). The users' behavioral data forms a dataset, which in turn forms a data asset. The formal representation and schematic diagram of its production chain are as follows:
[Figure: production chain of data assets generated by an application service]
Confirming rights to this data asset mainly requires examining the service agreement (Service agreement) between users and the service provider. In this scenario, the raw data asset can be understood simply as generated by users and the service provider in accordance with the agreement, and rights should be confirmed as the agreement stipulates. The service provider should therefore have these agreements notarized and disclose them to buyers of the data assets.
Therefore, a notarization status needs to be added to the state description of the production chain. Because the specific terms of each user's service agreement may differ, notarization has to handle a dynamic situation. For efficiency, this kind of notarization is mostly handled with verifiable unilateral privacy computation; the traditional third-party notarization model is not feasible. Building on the framework above, a schematic of rights confirmation is shown in the figure, under which rights can be confirmed effectively.
[Figure: schematic of rights confirmation for data assets generated by an application service]
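One way to picture a notarization status that tracks many differing user agreements is a digest over per-agreement hashes, recomputed whenever an agreement changes. The sketch below is a plain hash commitment and not the verifiable privacy computation mentioned in the text; the `agreement_digest` and `notarization_status` helpers and the status fields are hypothetical.

```python
import hashlib


def agreement_digest(user_id: str, terms: str) -> str:
    """Hash one user's agreement so its terms need not be disclosed in the status."""
    return hashlib.sha256(f"{user_id}|{terms}".encode("utf-8")).hexdigest()


def notarization_status(agreements: dict) -> dict:
    """Summarise all user agreements as one digest for the production chain state."""
    # Sorting makes the result independent of the order agreements were added.
    digests = sorted(agreement_digest(u, t) for u, t in agreements.items())
    root = hashlib.sha256("".join(digests).encode("utf-8")).hexdigest()
    return {"agreement_count": len(digests), "agreements_root": root}


agreements = {"user-1": "terms v1", "user-2": "terms v1"}
status = notarization_status(agreements)

# A user accepts revised terms, so the status must be refreshed.
updated = notarization_status({**agreements, "user-2": "terms v2"})
```

The status field changes whenever any single agreement changes, which gives a buyer a cheap way to detect that the agreements underlying a data asset are not the ones originally disclosed.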
Conclusion
The author believes that ownership confirmation of original data assets is the cornerstone of the whole data element market. If the property rights of data assets generated from raw data cannot be clearly defined in the primary market, then once the data is in circulation, subsequent rights confirmation will become very complicated, inefficient and chaotic, and the market will eventually fall into unsustainable operation. It is therefore necessary to build a primary market for data elements with clear property rights and effective operation, and to build an efficient rights confirmation infrastructure that straightens out property rights at the source.
At the same time, the 5V1P characteristics of data determine the importance of the "three properties" (integrity, consistency and accuracy) of the data asset production chain. The core of ownership confirmation is therefore to establish that the owner controls and decides "the purpose, object, methods, means and results of production activities". Achieving that goal cannot rely on refining theory and the legal system alone; it must also rely on supporting infrastructure.
References
This article draws on the following big data sources.
Zhangjialin, 《Data Is Valuable: Research on Data Asset Pricing》, 2019.
Viktor Mayer-Schönberger, "the father of big data".
Central government documents on data elements and data element market construction, and the series of data-related laws and regulations.
NIST, 《Big Data Reference Architecture》.
Editor: Wang Jing
Proofreader: Lin Yilin
Copyright notice
This article was created by [osc_vnopm]. Please include a link to the original when reposting. Thank you.