Hudi for the Data Lake (1): Introduction to Hudi
2022-07-06 00:01:00 【Electro optic flicker】
Contents
0. Links to related articles
1. What is Hudi?
2. Hudi's position in big data
3. Features of Hudi
4. Hudi release dates by version
0. Links to related articles
Big data fundamentals: a summary of articles
1. What is Hudi?
Apache Hudi (pronounced "hoodie") is a next-generation streaming data lake platform. It brings core warehouse and database functionality directly to the data lake. Hudi provides tables, transactions, efficient upserts and deletes, advanced indexes, streaming ingestion services, data clustering and compaction optimizations, and concurrency control, all while keeping the data in open source file formats.
Apache Hudi is not only suited to streaming workloads; it also lets you build efficient incremental batch pipelines. Companies including Uber, Amazon, ByteDance, and Robinhood use Hudi to transform their production data lakes.
Apache Hudi can be used easily on any cloud storage platform. Its advanced performance optimizations speed up analytical workloads on any popular query engine, including Apache Spark, Flink, Presto, Trino, and Hive.
2. Hudi's position in big data
Hudi brings stream processing to big data, delivering fresh data while being an order of magnitude more efficient than traditional batch processing.

3. Features of Hudi
- Fast upserts with pluggable indexing
- Atomic data operations with rollback support
- Snapshot isolation between writers and queries
- Savepoints for user data recovery
- File size and layout management driven by statistics
- Asynchronous compaction of row-based and columnar data
- A timeline that tracks metadata lineage
- Clustering to optimize the data layout
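To make the upsert, timeline, and rollback features above more concrete, here is a minimal toy model in plain Python. It is only a conceptual sketch and not Hudi's API or implementation: the `ToyTable` class, the `id` and `ts` field names, and the conflict rule (the row with the larger `ts` wins) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy in-memory keyed table with a commit timeline (hypothetical, not Hudi's API)."""
    records: dict = field(default_factory=dict)   # record key -> row
    timeline: list = field(default_factory=list)  # one snapshot of `records` per commit

    def upsert(self, batch, key="id", ordering="ts"):
        """Insert new keys, update existing ones, then record a commit."""
        staged = dict(self.records)  # work on a copy so the live table changes only at the end
        for row in batch:
            current = staged.get(row[key])
            # Assumed conflict rule: the row with the larger ordering value wins.
            if current is None or row[ordering] >= current[ordering]:
                staged[row[key]] = row
        self.records = staged
        self.timeline.append(dict(staged))
        return len(self.timeline)  # commit number, starting at 1

    def rollback(self):
        """Undo the latest commit by restoring the previous snapshot."""
        self.timeline.pop()
        self.records = dict(self.timeline[-1]) if self.timeline else {}


table = ToyTable()
table.upsert([{"id": 1, "ts": 10, "city": "SF"}, {"id": 2, "ts": 10, "city": "NY"}])
table.upsert([{"id": 2, "ts": 20, "city": "LA"}, {"id": 3, "ts": 20, "city": "TX"}])
print(sorted(table.records))     # [1, 2, 3]
print(table.records[2]["city"])  # LA

table.rollback()                 # drop the second commit
print(table.records[2]["city"])  # NY
```

The second batch updates key 2 and inserts key 3 in one commit, and the rollback restores the table to its state after the first commit. A real Hudi table persists files and timeline metadata on storage rather than keeping snapshots in memory.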
4. Hudi release dates by version
GitHub tags page: Tags · apache/hudi · GitHub

Download links and feature descriptions for each historical Hudi version: Download | Apache Hudi

Note: This Hudi blog series is a set of study notes based on the official Hudi website, with some personal understanding added. Please bear with any shortcomings.
Note: Links to other related articles (including posts on Hudi and other big data topics) are here -> Big data fundamentals: a summary of articles