Data warehouse (1): what is a data warehouse, and what are its characteristics
2022-06-26 08:55:00 【Zhang Fei's pig】
Original article: What is a data warehouse, and what are its characteristics
A data warehouse (Data Warehouse in English, abbreviated DW or DWH) is a strategic collection of data that supports decision-making at every level of an enterprise. It is a single data store created for analytical reporting and decision support. For enterprises that need business intelligence, it provides guidance for improving business processes and for monitoring and controlling time, cost, and quality. This article introduces the data development techniques involved in a data warehouse, what a data warehouse is for, and what its characteristics are.
To use a simple metaphor: a data warehouse can be thought of as a warehouse for goods. The data is the goods in the warehouse, and the data warehouse developers are its administrators. A data warehouse is therefore a way of managing data well, so that once data has been placed in the warehouse, consumers such as BI and AI can use it more easily and get more value from it. Clearly, finding something among goods that are stacked tidily and according to rules is more efficient than searching through goods that have never been sorted.
A data warehouse is a structured data environment that serves as the data source for decision support systems (DSS) and online analytical applications. It studies and solves the problem of obtaining information from databases. Its characteristics are that it is subject-oriented, integrated, stable (non-volatile), and time-variant.
The data warehouse was proposed in 1990 by Bill Inmon, the father of the data warehouse. Its main function is to take the large volume of data that an organization's online transaction processing (OLTP) systems have accumulated over the years and, using the data storage architecture specific to data warehouse theory, analyze and organize it systematically. This makes analytical methods such as online analytical processing (OLAP) and data mining (Data Mining) practical, which in turn supports the creation of decision support systems (DSS) and executive information systems (EIS). Decision-makers can then quickly and effectively extract valuable information from a large amount of data, make decisions, respond rapidly to changes in the external environment, and build business intelligence (BI).
The definition that Bill Inmon gave in his book "Building the Data Warehouse", published in 1991, is widely accepted: a data warehouse (Data Warehouse) is a subject-oriented (Subject Oriented), integrated (Integrated), relatively stable (Non-Volatile) collection of data that reflects historical change (Time Variant) and is used to support management decisions (Decision Making Support).
Characteristics of a data warehouse:
- A data warehouse is subject-oriented. An operational database organizes its data around transaction processing tasks, whereas the data in a data warehouse is organized by subject area. A subject is a key aspect that users care about when they use the data warehouse to make decisions, and one subject usually relates to several operational information systems.
- A data warehouse is integrated. Its data comes from scattered operational data: the required data is extracted from the original sources, processed and integrated, and enters the data warehouse only after it has been unified. In other words, the data is extracted from the originally scattered databases, cleaned, and then systematically processed, summarized, and organized. Inconsistencies in the source data must be eliminated so that the information in the data warehouse is consistent, global information about the whole enterprise.
- A data warehouse is not updatable. Its data is mainly used for enterprise decision analysis, and the operations involved are mainly queries. Once data enters the data warehouse, it is generally retained for a long time. A data warehouse therefore sees a large number of queries but few modifications or deletions, and usually only needs to be loaded and refreshed periodically.
- A data warehouse changes over time. Its data usually contains historical information: the system records the enterprise's information from some point in the past (for example, the time when the data warehouse was first put into use) through each stage up to the present. With this information, the enterprise's development and future trends can be analyzed and forecast quantitatively. A traditional relational database system, by contrast, is better suited to processing formatted data and to meeting the needs of business transaction processing. In the data warehouse, stable data is kept in a read-only format and, once stored, does not change over time.
- Summarized. Operational data is mapped into a format that can be used for decision-making.
- Large capacity. Time-series data sets are usually very large.
- Denormalized. DW data can be, and often is, redundant.
- Metadata. The data that describes the data is stored as well.
- Data sources. Data comes from internal and external non-integrated operational systems.
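The integrated, non-updatable, and time-variant characteristics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source systems, their field names, the gender and currency codings, and the `WarehouseRow` type are all hypothetical and do not come from any particular product.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical rows from two operational systems that encode the same
# facts differently (field names and codings are assumptions).
CRM_ROWS = [{"cust_id": "C1", "gender": "M", "revenue_yuan": 1200.0}]
BILLING_ROWS = [{"customer": "C2", "sex": 0, "revenue_fen": 85000}]


@dataclass(frozen=True)  # frozen: a loaded row is never modified in place
class WarehouseRow:
    customer_id: str
    gender: str          # unified coding: "male" / "female"
    revenue_yuan: float  # unified unit
    load_date: date      # time-variant: every row carries its snapshot date


def integrate(load_date: date) -> list:
    """Extract from both sources and resolve coding and unit differences."""
    rows = []
    for r in CRM_ROWS:
        gender = {"M": "male", "F": "female"}[r["gender"]]
        rows.append(WarehouseRow(r["cust_id"], gender, r["revenue_yuan"], load_date))
    for r in BILLING_ROWS:
        gender = {1: "male", 0: "female"}[r["sex"]]
        rows.append(WarehouseRow(r["customer"], gender, r["revenue_fen"] / 100, load_date))
    return rows


warehouse = []  # stands in for the warehouse table


def load(load_date: date) -> None:
    """Periodic load: append a new snapshot, never update or delete old rows."""
    warehouse.extend(integrate(load_date))


load(date(2022, 6, 25))
load(date(2022, 6, 26))

# Query-heavy usage: total revenue per snapshot date across both sources.
revenue_by_day = {}
for row in warehouse:
    revenue_by_day[row.load_date] = revenue_by_day.get(row.load_date, 0.0) + row.revenue_yuan
print(revenue_by_day)
```

Because each load appends a dated snapshot instead of overwriting earlier rows, history accumulates and can be queried later, at the cost of the redundancy and storage growth the list mentions.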
A data warehouse is built once many databases already exist, in order to mine the data resources further and to support decision-making. It is not simply a "large database". The purpose of building a data warehouse solution is to serve as the basis for front-end query and analysis. Because of its high redundancy, it also needs a large amount of storage.
In practice, in order to better serve data applications, that is, to support data analysis and the efficient development of data reports, a data warehouse usually also has the following characteristics:
- Efficient enough.
The analysis data of a data warehouse is generally produced on daily, weekly, monthly, quarterly, and yearly cycles. Data on a daily cycle has the highest efficiency requirement: customers must be able to see the analysis of yesterday's data within 24 hours, or even within 12 hours. Some enterprises generate a very large amount of data every day, and a poorly designed data warehouse often runs into problems and can deliver data only after a delay of 1-3 days, which is clearly unacceptable.
- Data quality.
The information that a data warehouse provides must be accurate. However, the data warehouse process is usually divided into several steps, including data cleaning, loading, querying, and presentation, and a complex architecture has even more layers. Dirty data in a source or insufficiently rigorous code can then distort the data. A customer who sees wrong information may make a wrong decision based on it, which causes losses rather than benefits.
- Extensibility.
Some large data warehouse systems have a complex architecture because they are designed to scale over the next 3-5 years, so that the system can keep running stably without money having to be spent on rebuilding it. This is mainly reflected in sound data modeling and in additional intermediate layers in the data warehouse solution, which give massive data flows enough buffering that the system does not stop working when the data volume grows considerably.
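The efficiency and data quality requirements above can be expressed as simple checks in a load job. The sketch below is a minimal illustration, not a definitive implementation: the 24-hour limit is taken from the text, while the function names `is_fresh` and `split_clean_dirty`, the `order_id` and `amount` fields, and the validation rules are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Limit from the text: yesterday's data should be available within 24 hours.
MAX_DELAY = timedelta(hours=24)


def is_fresh(data_day_end: datetime, ready_at: datetime, max_delay: timedelta = MAX_DELAY) -> bool:
    """True if the analysis was ready within max_delay of the end of the data day."""
    return ready_at - data_day_end <= max_delay


def split_clean_dirty(rows):
    """Separate rows that pass basic checks from dirty rows (hypothetical rules)."""
    clean, dirty = [], []
    seen_ids = set()
    for row in rows:
        ok = (
            row.get("order_id") is not None              # key must be present
            and row["order_id"] not in seen_ids          # no duplicate keys
            and isinstance(row.get("amount"), (int, float))
            and row["amount"] >= 0                       # no negative amounts
        )
        if ok:
            seen_ids.add(row["order_id"])
            clean.append(row)
        else:
            dirty.append(row)
    return clean, dirty


rows = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 1, "amount": 10.0},    # duplicate key
    {"order_id": 2, "amount": -5.0},    # negative amount
    {"order_id": None, "amount": 3.0},  # missing key
    {"order_id": 3, "amount": 7.5},
]
clean, dirty = split_clean_dirty(rows)
print(len(clean), "clean rows,", len(dirty), "dirty rows")

day_end = datetime(2022, 6, 26, 0, 0)
print(is_fresh(day_end, datetime(2022, 6, 26, 9, 30)))  # ready the next morning
print(is_fresh(day_end, datetime(2022, 6, 28, 0, 0)))   # ready two days late
```

Keeping the dirty rows in a separate list, rather than dropping them silently, makes it possible to report how much source data was rejected before a customer sees the result.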
As the introduction above shows, data warehouse technology can awaken the data that an enterprise has accumulated over many years. It not only helps the enterprise manage this massive data but also mines the data's potential value, and it has thus become one of the highlights of the operation and maintenance systems of telecom enterprises.
In a broad sense, a decision support system based on a data warehouse consists of three parts: data warehouse technology, online analytical processing technology, and data mining technology. Data warehouse technology is the core of the system. Later articles in this series will focus on data warehouse technology, introduce the main techniques of a modern data warehouse and the main steps of data processing, and discuss how to use these techniques in a telecom operation and maintenance system to support operations and maintenance.
- Subject-oriented
An operational database organizes its data around transaction processing tasks, and the business systems are separated from one another. The data in a data warehouse, by contrast, is organized by subject area. Subject orientation is the counterpart of the application orientation of traditional databases. A subject is an abstract concept: it is the integration, categorization, and analysis of the data in enterprise information systems at a higher level of abstraction. Each subject corresponds to a macro-level field of analysis. The data warehouse excludes data that is useless for decision-making and provides a concise view of a specific subject.
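As a rough illustration of subject orientation, the sketch below builds a "customer" subject from two separate business systems. Everything in it is hypothetical: the order and support systems, their field names, and the function `build_customer_subject` are assumptions chosen for the example, not part of the original article.

```python
from collections import defaultdict

# Hypothetical extracts from two separate, application-oriented systems.
ORDERS_SYSTEM = [
    {"customer_id": "C1", "order_total": 120.0, "warehouse_shelf": "A-3"},
    {"customer_id": "C1", "order_total": 80.0, "warehouse_shelf": "B-1"},
    {"customer_id": "C2", "order_total": 45.0, "warehouse_shelf": "A-7"},
]
SUPPORT_SYSTEM = [
    {"customer_id": "C1", "ticket_id": 9001, "agent_login": "u17"},
]


def build_customer_subject(orders, tickets):
    """Organize data from several systems around one subject: the customer.

    Fields that only matter for transaction processing (shelf location,
    agent login) are left out because they are not useful for decisions.
    """
    subject = defaultdict(lambda: {"order_count": 0, "total_spent": 0.0, "ticket_count": 0})
    for order in orders:
        entry = subject[order["customer_id"]]
        entry["order_count"] += 1
        entry["total_spent"] += order["order_total"]
    for ticket in tickets:
        subject[ticket["customer_id"]]["ticket_count"] += 1
    return dict(subject)


customer_subject = build_customer_subject(ORDERS_SYSTEM, SUPPORT_SYSTEM)
for customer_id, summary in sorted(customer_subject.items()):
    print(customer_id, summary)
```

The result is keyed by the subject rather than by the source application, which is what lets an analyst look at one customer across systems without knowing how each system stores its records.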