
Data asset management: how to manage the data assets of an enterprise?

2022-06-09 11:17:00 Big data V


Reading guide: Today's topic is the sorting and inventory of data assets.

Author: Shi Xiufeng

Source: Talking about data (ID: learning-bigdata)


For enterprises, taking stock of assets is nothing new.

All enterprises count their assets and inventory, in whole or in part, on a regular or ad hoc basis, so that they know the quantity and value of what they hold at the end of a period and can improve and strengthen management accordingly. An asset count establishes what assets exist and how they are used, so the enterprise can plan production and operations sensibly, make full use of its property and materials, speed up capital turnover, and improve asset efficiency. The “assets” here are the enterprise's fixed assets.

Data is a “special asset”, and sooner or later it will be included in the enterprise's balance sheet. Data assets need to be counted too. Only by planning data resources as a whole, sorting them out comprehensively, and “taking stock of what you have” can data better serve the enterprise's business applications.

01 Why count data assets?

“Digitalization” has become one of the hottest topics in modern society, and data is the foundation for achieving it. As enterprises push digitalization forward, the first problem they run into is having “no data to use” and “data that cannot be used”.

“No data to use” does not mean the enterprise has no data. On the contrary, some large enterprises run dozens or even hundreds of application systems, and these systems have accumulated a great deal of data. But because these data resources have never been planned as a whole or sorted out comprehensively, the enterprise does not know what data it has, how much there is, or where it is stored, and so ends up with “no data to use”.

“Data that cannot be used” arises because data is scattered across application systems without uniform data standards. The systems cannot exchange data well, which creates information silos; data quality varies from system to system and standards differ; sensitive data is not handled effectively; and so on. These problems leave the enterprise with data it cannot use in its digitalization.

A data asset inventory is one of the main ways to solve these problems. Counting the data the enterprise owns helps answer the following questions:

  1. What data does the enterprise have? Focus on data classification.

  2. How much data does the enterprise have? Focus on the existing stock of data and its growth.

  3. Where is the enterprise's data stored? Focus on how data is stored and accessed.

  4. Who manages the enterprise's data? Focus on the department and person responsible for the data.

  5. Which data is important, and which is sensitive? Focus on how data is classified by importance and sensitivity, and on sharing conditions and scope.

02 Where does a data asset inventory start?

Enterprise data is scattered across heterogeneous systems, and even across the computers of business staff. Data structures, data types, storage forms, sensitivity levels, and importance all vary. Taken together it looks like a tangled ball of string, and sorting it out is not easy.


Cut, and it will not sever; sort, and it tangles again

This is not the sorrow of parting

It is the enterprise's inconsistent, inaccurate, incomplete, disordered, scattered, and tangled data

……

A data asset inventory begins with defining a sound inventory plan!

1. Define the scope of the inventory

The scope of a data inventory is generally defined from three perspectives:

  1. Organizational scope: which organizations and departments the inventory covers, for example group headquarters only, or the group plus its subsidiaries.

  2. Business scope: which business data is counted, for example procurement, marketing, or HR.

  3. System scope: which application systems' data is counted, for example the SCM, CRM, and HR systems.

2. Identify who performs the inventory

Who leads the data inventory, who cooperates, and who audits it? How many people are needed, how long will it take, and do they take part part-time or full-time? These questions need to be settled in the inventory plan and agreed with the people involved.

3. Define the contents of the inventory

The inventory should be driven by business needs, which determine what has to be established, such as:

  • Data classification: procurement, marketing, production, finance, personnel, etc.

  • Data structure: structured data, semi-structured data, unstructured data, etc.

  • Data type: basic data, transaction data, statistical data, time series data, etc.

  • Data storage: SQL database, file storage, streaming data, etc.

  • Sensitivity level: core, important, general, etc.

  • Sharing type: not shared, conditionally shared, unconditionally shared, etc.

  • Opening type: not open, conditionally open, unconditionally open, etc.

  • Data stock: number of records, storage volume, etc.
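The inventory contents listed above can be captured as one record per data object. The following is only a minimal sketch in Python: the class and field names (`InventoryRecord`, `Sensitivity`, `Access`, and so on) and the sample values are illustrative assumptions, not part of any standard or of the original article.

```python
from dataclasses import dataclass
from enum import Enum


class Structure(Enum):
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi-structured"
    UNSTRUCTURED = "unstructured"


class Sensitivity(Enum):
    CORE = "core"
    IMPORTANT = "important"
    GENERAL = "general"


class Access(Enum):
    """Used for both the sharing type and the opening type."""
    NONE = "none"
    CONDITIONAL = "conditional"
    UNCONDITIONAL = "unconditional"


@dataclass
class InventoryRecord:
    """One row of a data inventory template (hypothetical field names)."""
    name: str
    category: str             # e.g. procurement, marketing, finance
    structure: Structure
    data_type: str            # basic, transaction, statistical, time series
    storage: str              # e.g. SQL database, file storage, stream
    sensitivity: Sensitivity
    sharing: Access           # sharing inside the enterprise
    opening: Access           # opening to external parties
    row_count: int            # data stock: number of records
    size_bytes: int           # data stock: storage volume


# Made-up sample values, for illustration only.
record = InventoryRecord(
    name="supplier",
    category="procurement",
    structure=Structure.STRUCTURED,
    data_type="basic",
    storage="SQL database",
    sensitivity=Sensitivity.IMPORTANT,
    sharing=Access.CONDITIONAL,
    opening=Access.NONE,
    row_count=12_000,
    size_bytes=3_500_000,
)
print(record.name, record.sensitivity.value, record.sharing.value)
```

Using enumerations for the graded attributes keeps the allowed values explicit, which makes the template easier to fill in consistently across departments.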

4. Define the inventory plan

A data inventory should proceed step by step according to plan: when it starts, when it ends, and when the results are released all need to be clearly defined.

Once these four questions are settled, your data inventory can begin!


03 Who should carry out the data asset inventory?

As we all know, an enterprise's fixed asset count is generally led by the finance department, with the departments that manage and use the fixed assets cooperating in counting and checking, to ensure that the books match reality.

Data is a special asset whose main characteristics are that ownership is hard to establish, it is intangible, and it can be copied. This makes a data asset inventory harder. Who leads, who cooperates, and who audits? If this is not clear, the inventory will be difficult to move forward!

The principle of a data asset inventory is “whoever produces it is responsible”, “whoever uses it is responsible”, and “whoever manages it is responsible”. Generally speaking, the business departments produce the data and are also its main users, while the IT department is often responsible for managing it.

Ideally, the inventory should be led by the business departments, because they know their own data best; many textbooks say so. In practice, however, most data inventories we see are still led by the IT department.

It is true that the business is closer to the data and knows it better. But business departments are often familiar only with the part they are responsible for and lack an overall view. An inventory led by a business unit therefore easily ends up like the blind men and the elephant, and the many problems that arise along the way make it inefficient.

So in my view, an enterprise data asset inventory needs someone with an overall view to coordinate it: to lay out the principles, framework, and blueprint of the inventory, define its contents, and develop the inventory template. The business departments that produce or use the data then sort it out and complete the inventory. The coordinator can be the IT department, the data management department, or an external data expert.

04 Basic methods of data asset inventory

There are two basic methods for a data asset inventory: top-down sorting and bottom-up counting. Both help produce the enterprise's data asset list, or data asset catalog. Used together, they form the two sides of a data asset inventory.

f85fddacd191e6c6cf8a7753dba80fff.png

1. Top-down sorting

Top-down sorting approaches data from a business perspective. It analyzes the enterprise's policy documents, functional structure, business processes, and business forms, decomposes them layer by layer, and produces the three-level catalog of data assets together with their business attributes and related management attributes.

The three-level catalog is the classification of data assets: the sorting and decomposition of enterprise data assets from a business perspective, for example data domain - data topic - data subtopic - data object. (Note: the three-level catalog is not limited to three levels, but it is generally recommended to keep it within five.)
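A catalog hierarchy of this kind can be represented as a nested dictionary keyed by level name. The sketch below is one possible representation, not a prescribed one; the function names (`add_path`, `leaves`) and the sample domain and topic names are assumptions made for illustration. The depth limit of five follows the recommendation in the note above.

```python
MAX_DEPTH = 5  # the article suggests keeping the hierarchy within five levels


def add_path(catalog: dict, path: list[str]) -> None:
    """Insert a path such as domain -> topic -> subtopic -> object into a nested dict."""
    if not path:
        raise ValueError("path must not be empty")
    if len(path) > MAX_DEPTH:
        raise ValueError(f"path has {len(path)} levels; limit is {MAX_DEPTH}")
    node = catalog
    for level in path:
        node = node.setdefault(level, {})


def leaves(catalog: dict, prefix: tuple = ()) -> list[tuple]:
    """Return the full path of every leaf (data object) in the catalog."""
    result = []
    for name, child in catalog.items():
        path = prefix + (name,)
        if child:
            result.extend(leaves(child, path))
        else:
            result.append(path)
    return result


# Hypothetical sample entries.
catalog: dict = {}
add_path(catalog, ["Procurement", "Supplier management", "Supplier profile", "supplier"])
add_path(catalog, ["Procurement", "Purchasing", "Purchase orders", "purchase_order"])
for path in leaves(catalog):
    print(" / ".join(path))
```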

Business attributes are the business metadata used to describe data assets. As shown in the figure above, common business attributes include classification attributes such as data domain and data topic, as well as data object, business definition, business rules, and sensitivity level.

Management attributes are the metadata that describes how data assets are managed, maintained, and used. As shown in the figure above, common management attributes include managing department, person in charge, contact information, update frequency, last update time, and data sharing conditions. (Note: management attributes may not be fully established from the business perspective alone; they need to be supplemented during the technical inventory.)

2. Bottom-up counting

The other side of a data asset inventory works from a technical perspective: from IT system to database table to data structure, moving from the bottom up and progressively defining the system information items (technical attributes) related to data assets.

Technical attributes are the technical metadata used to describe data assets. As shown in the figure above, common technical attributes include source system, database table, field type, field format, value range, storage, and data lineage.

Finally, associate the data items in the catalog produced from the business perspective with the system information items counted from the technical perspective, and establish a mapping between the two. This forms a complete data resource catalog. The catalog lets users search for data from multiple perspectives (business or IT) and ensures that every data item in it can be found in a real IT system.
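Part of this technical metadata can be read directly from a database's own catalog. As a rough sketch, assuming the source system happens to be a SQLite database (a real enterprise system would use its own information schema), the standard-library `sqlite3` module can list tables, fields, field types, and row counts. The function name `collect_technical_metadata`, the `supplier` table, and the system label `SCM` are illustrative assumptions.

```python
import sqlite3


def collect_technical_metadata(conn: sqlite3.Connection, source_system: str) -> list[dict]:
    """Return one entry per column: source system, table, field, field type, row count."""
    entries = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Table names come from the database's own catalog, so quoting them suffices here.
        row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
        for _cid, column, col_type, _notnull, _default, _pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            entries.append({
                "source_system": source_system,
                "table": table,
                "field": column,
                "field_type": col_type,
                "row_count": row_count,
            })
    return entries


# Build a small in-memory database to stand in for a real source system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supplier (id INTEGER PRIMARY KEY, name TEXT, credit_rating TEXT)")
conn.executemany(
    "INSERT INTO supplier (name, credit_rating) VALUES (?, ?)",
    [("Acme", "A"), ("Globex", "B")],
)
metadata = collect_technical_metadata(conn, "SCM")
for entry in metadata:
    print(entry)
```

Attributes such as value range or data lineage cannot be read this way and would still need profiling or documentation.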
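The mapping step described above can be checked mechanically: every business item should resolve to a field that really exists, and fields that no business item points to are worth a second look. The following is a minimal sketch with made-up items; the names `business_items`, `technical_items`, `mapping`, and `check_mapping` are assumptions for illustration.

```python
# Business catalog items from the top-down sorting (hypothetical names).
business_items = {
    "Supplier name": {"domain": "Procurement", "owner": "Purchasing dept"},
    "Supplier credit rating": {"domain": "Procurement", "owner": "Purchasing dept"},
    "Supplier risk score": {"domain": "Procurement", "owner": "Risk dept"},
}

# Technical items from the bottom-up counting: (system, table, field) triples.
technical_items = {
    ("SCM", "supplier", "id"),
    ("SCM", "supplier", "name"),
    ("SCM", "supplier", "credit_rating"),
}

# Mapping maintained during the inventory; the last entry is deliberately wrong.
mapping = {
    "Supplier name": ("SCM", "supplier", "name"),
    "Supplier credit rating": ("SCM", "supplier", "credit_rating"),
    "Supplier risk score": ("SCM", "supplier", "risk_score"),
}


def check_mapping(business: dict, technical: set, links: dict) -> tuple[list, list]:
    """Return (business items with no real field, real fields not in the catalog)."""
    unresolved = sorted(name for name in business if links.get(name) not in technical)
    uncatalogued = sorted(technical - set(links.values()))
    return unresolved, uncatalogued


unresolved, uncatalogued = check_mapping(business_items, technical_items, mapping)
print("Business items without a real field:", unresolved)
print("Fields not referenced by the catalog:", uncatalogued)
```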

05 The basic process of a data asset inventory

The sorting and inventory of enterprise data assets can generally be divided into the following five steps, as shown in the figure below:

ab74b653c32d467384fd32bad247cb05.png

1. Make an inventory plan

This stage determines the scope, goals, contents, personnel, and schedule of the inventory (explained in detail above, so not repeated here).

2. Develop the inventory template

Based on the inventory contents, this stage develops the data sorting template and defines the standard items for data assets. It also covers internal training and communication about the inventory, so that the people involved understand its scope, goals, and contents, and learn how to use the data asset sorting template.

2ad34d7db0554df2e5eb6b6b4fefa023.png

▲ Image source: Zhihu, author Tan Xing

3. Data asset inventory

On the one hand, data resources are sorted and planned from a business perspective, including interpreting policy documents, sorting out processes and forms, and identifying key data; this defines the data classification system and the business attributes of data assets. On the other hand, system data is checked from a technical perspective, including profiling system data, data structures, data stock, data growth, and storage methods; this defines the technical attributes of data assets.

4. Review of inventory results

Review the sorted data asset list, the core data model, the data distribution map, and other deliverables, collect opinions on them, and revise the deliverables and fix the problems according to the feedback.
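Part of such a review can be automated before the human review begins, for example by flagging records whose attributes were left empty. The sketch below shows one way to do this; the list of required fields in `REQUIRED_FIELDS`, the function name `review_records`, and the sample records are assumptions, and a real review would cover far more than completeness.

```python
# Attributes every record must have filled in before release (an assumed list).
REQUIRED_FIELDS = ("name", "domain", "owner", "sensitivity", "source_table")


def review_records(records: list[dict]) -> list[tuple[str, list[str]]]:
    """Return (record label, missing fields) for every incomplete record."""
    issues = []
    for index, record in enumerate(records):
        missing = [field for field in REQUIRED_FIELDS if not record.get(field)]
        if missing:
            label = record.get("name") or f"record #{index}"
            issues.append((label, missing))
    return issues


# Made-up sample records; the second one has no owner and no source table.
records = [
    {"name": "supplier", "domain": "Procurement", "owner": "Purchasing dept",
     "sensitivity": "important", "source_table": "SCM.supplier"},
    {"name": "purchase_order", "domain": "Procurement", "owner": "",
     "sensitivity": "general", "source_table": None},
]
issues = review_records(records)
for label, missing in issues:
    print(f"{label}: missing {', '.join(missing)}")
```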

5. Release and application

Releasing the inventory results does not just mean publishing the data asset list by email or similar means. Instead, a professional data asset management platform should be built to put the data asset catalog into operation, provide data assets in the form of “services”, and enable data sharing within the enterprise and data opening to external parties.

06 Data catalog vs. data asset catalog

An important result of a data asset inventory is the “data asset catalog”. What is the difference between a data catalog and a data asset catalog?

In essence, both a data catalog and a data asset catalog act as a “dictionary”: both are for locating data, interpreting data, and helping users find data quickly. This is what the two have in common.

In project practice, a data catalog is produced by a metadata management tool that collects metadata from the relevant data sources (business system databases, data warehouses, data lakes, etc.). Because what is collected directly is mostly technical metadata such as database table structures, data flows, ETL scripts, and database operation logs, a data catalog takes some technical background to understand, and it is aimed at technical staff such as ETL engineers, BI engineers, and development engineers.

The data asset catalog differs in the following ways:

First, the data asset catalog is a systematic plan of data resources from a business perspective, aimed at the data needs of stakeholders. The definition of business attributes, the division of data domains, the construction of the classification and grading system, and the design of data sharing and opening all start from business use, resulting in a data category structure that business staff can understand. Business participation is therefore key throughout the sorting and cataloging process, and it is what guarantees that the catalog is used and adopted.

Second, the data asset catalog needs to confirm rights and responsibilities for each cataloged data resource: clarify who has the right to manage and to use each data asset, and determine its sharing conditions and scope.

Third, the data asset catalog manages data assets, that is, data that is used frequently and can bring value to the business. How is this achieved? The catalog needs a tagging (labeling) function that labels data objects by their characteristics, meaning, data quality, frequency of use, usage scenarios, and users. Labels can be applied manually; a more advanced approach uses machine learning and trained models to classify and label data automatically.

Finally, the data asset catalog also needs a metadata tool to collect and manage technical metadata, and a data relationship mapping that maps catalog entries to physical databases, tables, and fields, so that the desired data can be found from multiple perspectives.
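Between manual labeling and machine learning sits a simple middle ground: rule-based tagging from attributes the catalog already holds. The sketch below illustrates that idea only; it is not the machine-learning approach mentioned above, and the thresholds, tag names, and asset fields (`monthly_queries`, `quality_score`) are assumptions chosen for the example.

```python
def tag_asset(asset: dict, hot_threshold: int = 100,
              quality_threshold: float = 0.95) -> set[str]:
    """Derive tags from usage frequency, data quality, sensitivity, and scenarios."""
    tags = set()
    if asset["monthly_queries"] >= hot_threshold:
        tags.add("frequently-used")
    if asset["quality_score"] >= quality_threshold:
        tags.add("high-quality")
    if asset["sensitivity"] == "core":
        tags.add("restricted")
    # One tag per usage scenario recorded for the asset.
    tags.update(f"scenario:{name}" for name in asset.get("scenarios", []))
    return tags


# Made-up sample asset.
asset = {
    "name": "supplier",
    "monthly_queries": 340,
    "quality_score": 0.97,
    "sensitivity": "important",
    "scenarios": ["supplier evaluation"],
}
tags = tag_asset(asset)
print(sorted(tags))
```

Rules like these are easy to explain to business staff, which matters when the tags decide who may see or share a data asset.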


Original site

Copyright notice
This article was created by [Big data V]. Please include the original link when reposting. Thanks.
https://yzsam.com/2022/160/202206091028240902.html