Hudi for the Data Lake (1): Introduction to Hudi
2022-07-06 00:01:00 【Electro optic flicker】
0. Links to related articles
Big data fundamentals: article index
1. What is Hudi?
Apache Hudi (pronounced "hoodie") is a next-generation streaming data lake platform. It brings core warehouse and database functionality directly to the data lake. Hudi provides tables, transactions, efficient upserts and deletes, advanced indexes, streaming ingestion services, data clustering and compaction optimizations, and concurrency control, all while keeping the data in open source file formats.
Apache Hudi is not only suited to streaming workloads; it also lets you build efficient incremental batch pipelines. Companies including Uber, Amazon, ByteDance, and Robinhood use Hudi to transform their production data lakes.
Apache Hudi can be used on any cloud storage platform. Its advanced performance optimizations speed up analytical workloads on any popular query engine, including Apache Spark, Flink, Presto, Trino, and Hive.
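As a rough illustration of how an upsert is usually configured when writing through Spark, the sketch below builds the write options as a plain dictionary. The option keys are Hudi's Spark datasource options; the helper name `hudi_upsert_options` and the table and column names in the usage note are hypothetical, and the write itself needs a Spark session with the Hudi bundle on the classpath, so it appears only as a comment.

```python
from typing import Dict


def hudi_upsert_options(
    table_name: str,
    record_key: str,
    precombine_field: str,
    partition_field: str,
) -> Dict[str, str]:
    """Build Spark datasource options for a Hudi upsert (illustrative helper)."""
    return {
        "hoodie.table.name": table_name,
        # Column that uniquely identifies a record.
        "hoodie.datasource.write.recordkey.field": record_key,
        # Column used to pick the latest record when keys collide.
        "hoodie.datasource.write.precombine.field": precombine_field,
        # Column used to partition the table on storage.
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.operation": "upsert",
    }


# Hypothetical table and column names, for illustration only.
options = hudi_upsert_options("trips", "trip_id", "updated_at", "city")

# With a Spark DataFrame `df` and a storage path `base_path` (both assumed):
# df.write.format("hudi").options(**options).mode("append").save(base_path)
```

Keeping the options in one dictionary makes it easy to reuse the same keys for inserts or deletes by changing only the operation value.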
2. Hudi's position in big data
Hudi brings stream processing to big data, delivering fresh data while being an order of magnitude more efficient than traditional batch processing.
3. Hudi features
- Fast upserts with pluggable indexing
- Atomic data operations with rollback support
- Snapshot isolation between writers and queries
- Savepoints for data recovery
- File size and layout management driven by statistics
- Asynchronous compaction of row and columnar data
- A timeline that tracks metadata lineage
- Clustering to optimize the data layout
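To make the upsert, atomic commit, and rollback ideas from the list above more concrete, here is a toy in-memory model. It is not Hudi's API and does not reflect how Hudi stores data; `ToyTable`, its methods, and the `id` and `ts` field names are all assumptions made up for this sketch. A dictionary keyed by record key stands in for the index, and a list of pre-commit snapshots stands in for the timeline.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

Row = Dict[str, Any]


@dataclass
class ToyTable:
    """Toy model of upsert and rollback; not Hudi's storage or API."""

    records: Dict[str, Row] = field(default_factory=dict)
    # Snapshots taken before each commit, oldest first.
    timeline: List[Dict[str, Row]] = field(default_factory=list)

    def upsert(self, batch: List[Row], key: str = "id", ordering: str = "ts") -> None:
        """Insert new keys and update existing ones, then commit in one step."""
        snapshot = {k: dict(v) for k, v in self.records.items()}
        staged = {k: dict(v) for k, v in self.records.items()}
        for row in batch:
            current = staged.get(row[key])
            # Keep the row with the larger ordering value when keys collide.
            if current is None or row[ordering] >= current[ordering]:
                staged[row[key]] = dict(row)
        # Readers see either the old state or the new one, never a partial batch.
        self.timeline.append(snapshot)
        self.records = staged

    def rollback(self) -> None:
        """Undo the most recent commit, if there is one."""
        if self.timeline:
            self.records = self.timeline.pop()


table = ToyTable()
table.upsert([{"id": "a", "ts": 1, "v": 10}, {"id": "b", "ts": 1, "v": 20}])
table.upsert([{"id": "a", "ts": 2, "v": 11}, {"id": "c", "ts": 1, "v": 30}])
after_second_commit = {k: row["v"] for k, row in table.records.items()}

table.rollback()
after_rollback = {k: row["v"] for k, row in table.records.items()}
```

The second batch updates key `a` and inserts key `c` in a single commit; rolling back restores the state left by the first commit.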
4. Hudi release history
GitHub tags page: Tags · apache/hudi · GitHub
Downloads and feature notes for each historical version: Download | Apache Hudi
Note: this Hudi blog series is a set of study notes based on the official Hudi website, with some personal interpretation added. Please excuse any shortcomings.
Note: links to other related articles (including posts on Hudi and other big data topics) are collected here -> Big data fundamentals: article index