About the two architectures of ETL (ETL architecture and ELT architecture)
2022-08-04 17:32:00 【Microservice mall technology sharing】
ETL, short for Extract-Transform-Load, describes the process of extracting data from a source, transforming it, and loading it into a destination. The term is most commonly used in the context of data warehouses, but it is not limited to them.

ETL is an important part of building a data warehouse. The required data is extracted from the data sources, cleaned, and finally loaded into the data warehouse according to a predefined data warehouse model.

The transformation step of ETL mainly covers the following:
- Null value processing: null field values can be detected and then either loaded as they are or replaced with data that carries another meaning; records can also be routed to different target databases depending on whether the field is null.
- Data format normalization: field format constraints can be defined, and the load format can be customized for time, numeric, and character data from the source.
- Data splitting: fields can be decomposed according to business requirements. For example, the calling number 861082585313-8148 can be split into an area code and a telephone number.
- Data correctness verification: Lookup and split functions can be used to verify data. For example, once the calling number 861082585313-8148 has been split into area code and telephone number, Lookup can return the calling area recorded by the calling gateway or switch, and that value is used to verify the data.
- Data replacement: invalid and missing data can be replaced according to business rules.
- Lookup: finds missing data. Lookup implements subqueries and returns missing fields obtained by other means, ensuring that fields are complete.
- Primary and foreign key constraints for the ETL process: invalid records that have no matching dependency can be replaced or exported to an error data file, ensuring that only records with a unique primary key are loaded.
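Several of the transformations above can be sketched in a few lines of Python. This is only an illustration, not the behavior of any particular ETL tool: the function names, the `AREA_LOOKUP` table, and the assumed number layout (country code `86`, a two-digit area code, the subscriber number, and an optional `-extension`) are all hypothetical.

```python
from typing import Optional

# Hypothetical lookup table (area code -> region), standing in for the
# calling area recorded by a gateway or switch.
AREA_LOOKUP = {"10": "Beijing", "21": "Shanghai"}


def fill_null(value: Optional[str], default: str = "UNKNOWN") -> str:
    """Null value processing: replace None or blank strings with a default."""
    return default if value is None or value.strip() == "" else value


def split_calling_number(raw: str) -> dict:
    """Data splitting, under the assumed layout:
    '86' + 2-digit area code + subscriber number, optionally '-extension'."""
    main, _, extension = raw.partition("-")
    if not main.startswith("86") or not main.isdigit():
        raise ValueError(f"unexpected calling number format: {raw!r}")
    return {
        "country_code": main[:2],
        "area_code": main[2:4],
        "phone_number": main[4:],
        "extension": extension or None,
    }


def verify_area(parts: dict, recorded_region: str) -> bool:
    """Data verification: compare the looked-up region with the recorded one."""
    return AREA_LOOKUP.get(parts["area_code"]) == recorded_region


parts = split_calling_number("861082585313-8148")
print(parts)
print(verify_area(parts, "Beijing"))
```

In a real pipeline the lookup table would come from a reference source rather than a hard-coded dictionary, and numbers with area codes of other lengths would need a more careful parser.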
Advantages of the ETL architecture:
- ETL runs on a separate hardware server, so it takes load off the database system.
- Compared with the ELT architecture, ETL can implement more complex data transformation logic.
- ETL is independent of how the underlying database stores its data.
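As a rough sketch of the ETL flow, the example below does all transformation in Python, outside the database, and only then loads the clean rows. The sample rows, the `orders` table, and the `known_customers` set are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3


def extract():
    # Stand-in for reading from a source system (hypothetical rows).
    return [
        {"id": 1, "customer_id": 100, "amount": "12.50"},
        {"id": 2, "customer_id": 999, "amount": "7.00"},  # unknown customer
        {"id": 1, "customer_id": 100, "amount": "3.00"},  # duplicate primary key
        {"id": 3, "customer_id": 101, "amount": None},    # missing amount
    ]


def transform(rows, known_customers):
    """Runs outside the database: enforce key rules and replace missing values."""
    clean, rejected, seen_ids = [], [], set()
    for row in rows:
        # Rows with no matching customer or a repeated id go to the reject list.
        if row["customer_id"] not in known_customers or row["id"] in seen_ids:
            rejected.append(row)
            continue
        seen_ids.add(row["id"])
        amount = float(row["amount"]) if row["amount"] is not None else 0.0
        clean.append((row["id"], row["customer_id"], amount))
    return clean, rejected


def load(conn, rows):
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
clean, rejected = transform(extract(), known_customers={100, 101})
load(conn, clean)
loaded = conn.execute(
    "SELECT id, customer_id, amount FROM orders ORDER BY id"
).fetchall()
print(loaded)
print(len(rejected), "rows rejected")
```

Because the database only ever sees rows that have already been cleaned, the transformation cost falls on whichever machine runs the Python process, which is the load-sharing point made in the list above.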
ELT
In the ELT architecture, the ELT tool is only responsible for providing a graphical interface for designing business rules. All data processing takes place in the source and target databases, and the ELT tool coordinates the relevant database systems to run the corresponding applications and process the data. Processing can run either on the source database side or on the target data warehouse side, depending mainly on the system's architecture design and the properties of the data. When the process needs to be made more efficient, this can be done by tuning the relevant database or by changing the server that performs the processing. Database vendors generally push this architecture hard; Oracle and Teradata, for example, both strongly promote ELT.
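For contrast, here is a minimal ELT-style sketch: the raw data is loaded into a staging table first, and the transformation is then a SQL statement executed by the database engine itself. The table and column names are hypothetical, an in-memory SQLite database stands in for the target warehouse, and the number layout is the same assumed one as before (`86`, a two-digit area code, the subscriber number, an optional `-extension`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Load: raw values go into a staging table unchanged.
conn.execute("CREATE TABLE staging_calls (raw_number TEXT, duration_s TEXT)")
conn.executemany(
    "INSERT INTO staging_calls VALUES (?, ?)",
    [("861082585313-8148", "60"), ("862155551234", None)],
)

# Transform: splitting and null replacement run inside the database engine.
conn.execute(
    """
    CREATE TABLE calls AS
    SELECT
        substr(raw_number, 3, 2) AS area_code,
        CASE WHEN instr(raw_number, '-') > 0
             THEN substr(raw_number, 5, instr(raw_number, '-') - 5)
             ELSE substr(raw_number, 5)
        END AS phone_number,
        COALESCE(CAST(duration_s AS INTEGER), 0) AS duration_s
    FROM staging_calls
    """
)

rows = conn.execute(
    "SELECT area_code, phone_number, duration_s FROM calls ORDER BY area_code"
).fetchall()
print(rows)
```

Here the Python process only issues statements; the data never leaves the database, so making it faster is a matter of tuning the database rather than the client.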

Advantages of the ELT architecture:
- ELT achieves scalability mainly through the database engine. In particular, when data processing runs at night, the engine's resources can be fully used.
- ELT keeps all data in the database at all times and avoids loading and exporting it, which preserves efficiency and makes the system easier to monitor.
- ELT can optimize parallel processing according to how the data is distributed, and can optimize disk I/O by using the database's built-in capabilities.
- The scalability of ELT depends on the scalability of the database engine and its hardware server.
- Tuning the relevant database can usually make the ETL process 3 to 4 times more efficient without much difficulty.