Hudi for Data Lakes (1): Introduction to Hudi
2022-07-06 00:01:00 【Electro optic flicker】
0. Links to related articles
Big data fundamentals: article summary
1. What is Hudi?
Apache Hudi (pronounced "hoodie") is a next-generation streaming data lake platform. It brings core warehouse and database functionality directly to the data lake. Hudi provides tables, transactions, efficient upserts and deletes, advanced indexes, streaming ingestion services, data clustering and compaction optimizations, and concurrency control, all while keeping the data in open source file formats.
Apache Hudi is not only suited to streaming workloads; it also lets you build efficient incremental batch pipelines. Companies including Uber, Amazon, ByteDance, and Robinhood use Hudi to transform their production data lakes.
Apache Hudi can be used on any cloud storage platform. Its advanced performance optimizations speed up analytical workloads on any popular query engine, including Apache Spark, Flink, Presto, Trino, and Hive.
2. Hudi's place in big data
Hudi brings stream processing to big data, providing fresh data while being an order of magnitude more efficient than traditional batch processing.
3. Hudi features
- Fast upserts with a pluggable index
- Atomic data operations with rollback support
- Snapshot isolation between writers and queries
- Savepoints for data recovery
- File size management, with layout driven by statistics
- Asynchronous compaction of row and columnar data
- Timeline metadata to track lineage
- Data layout optimization through clustering
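The upsert, rollback, and snapshot items above can be pictured with a small in-memory sketch. This is a toy model and not Hudi's API: `ToyTable`, `upsert`, `rollback`, and `snapshot` are hypothetical names chosen for illustration, a plain dictionary stands in for the index, and a list of undo entries stands in for the timeline. Real Hudi keeps data in files on storage and is far more involved.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]
_MISSING = object()  # marks "this key did not exist before the commit"


@dataclass
class ToyTable:
    """Toy in-memory upsert table with a commit timeline (hypothetical, not Hudi's API)."""

    key_field: str
    # Record key -> current record; a stand-in for an index lookup.
    index: Dict[Any, Record] = field(default_factory=dict)
    # One undo entry per commit: previous value of every key the commit touched.
    timeline: List[Dict[Any, Any]] = field(default_factory=list)

    def upsert(self, records: Iterable[Record]) -> int:
        """Insert new keys and update existing ones as a single commit."""
        # Stage the whole batch first so a bad record fails before any change is applied.
        staged = {rec[self.key_field]: dict(rec) for rec in records}
        undo = {key: self.index.get(key, _MISSING) for key in staged}
        self.index.update(staged)
        self.timeline.append(undo)
        return len(self.timeline)  # number of commits so far

    def rollback(self) -> None:
        """Undo the most recent commit using its undo entry."""
        undo = self.timeline.pop()
        for key, old in undo.items():
            if old is _MISSING:
                del self.index[key]
            else:
                self.index[key] = old

    def snapshot(self) -> List[Record]:
        """Return copies of all records, so later writes do not change what a reader holds."""
        return sorted((dict(r) for r in self.index.values()),
                      key=lambda r: r[self.key_field])


if __name__ == "__main__":
    table = ToyTable(key_field="id")
    table.upsert([{"id": 1, "city": "SF"}, {"id": 2, "city": "NYC"}])
    table.upsert([{"id": 2, "city": "LA"}, {"id": 3, "city": "SEA"}])  # updates 2, inserts 3
    print(table.snapshot())
    table.rollback()  # back to the state after the first commit
    print(table.snapshot())
```

The second `upsert` updates the record with `id` 2 and inserts `id` 3 in one commit, and `rollback` restores the table to the state after the first commit. Keeping a per-commit undo entry is only one simple way to model rollback; it is an assumption of this sketch, not a description of how Hudi implements it.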
4. Hudi release history
GitHub tags page: Tags · apache/hudi · GitHub
Downloads and feature notes for each historical Hudi release: Download | Apache Hudi
Note: the posts in this Hudi series are study notes written while working through the official Hudi website, with some personal interpretation added. Please bear with any shortcomings.
Note: links to other related articles (including posts on Hudi and other big data topics) are here -> Big data fundamentals: article summary