Data Lake (20): Current Limitations of Flink's Iceberg Integration, and Iceberg Compared with Hudi
2022-07-27 03:16:00 【Lanson】
Current limitations of Flink's Iceberg integration, and Iceberg compared with Hudi
I. Current limitations of Flink's Iceberg integration
- Flink SQL cannot currently query an Iceberg table's metadata; this has to be done through the Java API.
- Flink does not support creating Iceberg tables with hidden partitions.
- Flink does not support Iceberg tables with watermarks.
- Flink does not support adding, dropping, or renaming columns.
- Flink's support for the Iceberg connector is still incomplete.
II. Iceberg compared with Hudi
Iceberg and Hudi are both data lake technologies. Judging by community activity, Iceberg shows signs of overtaking Hudi. The two have the following in common:
- Both are ways of organizing data on top of underlying storage formats.
- Both provide ACID guarantees, with some degree of transaction and concurrent-execution support.
- Both provide row-level data modification.
- Both provide some schema evolution capability, for example adding, modifying, and dropping columns.
- Both support data compaction to deal with small files.
- Both support time travel queries against snapshot data.
- Both support batch and real-time reads and writes.
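The snapshot and time travel behavior listed above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions: `ToyTable`, `ToySnapshot`, `commit`, and `read` are hypothetical names invented for illustration, not part of the Iceberg or Hudi APIs, and each snapshot here stores the full table contents rather than file references.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToySnapshot:
    snapshot_id: int
    committed_at: int   # commit time, e.g. epoch milliseconds
    rows: tuple         # full table contents as of this commit (toy simplification)


@dataclass
class ToyTable:
    snapshots: list = field(default_factory=list)

    def commit(self, committed_at, rows):
        """Append a new immutable snapshot; older snapshots stay readable."""
        if self.snapshots and committed_at <= self.snapshots[-1].committed_at:
            raise ValueError("commit times must be strictly increasing")
        snap = ToySnapshot(len(self.snapshots) + 1, committed_at, tuple(rows))
        self.snapshots.append(snap)
        return snap

    def read(self, as_of=None):
        """Read the latest snapshot, or the latest one committed at or before as_of."""
        if not self.snapshots:
            return ()
        if as_of is None:
            return self.snapshots[-1].rows
        times = [s.committed_at for s in self.snapshots]
        idx = bisect_right(times, as_of)
        if idx == 0:
            raise LookupError("no snapshot exists at or before the requested time")
        return self.snapshots[idx - 1].rows


table = ToyTable()
table.commit(100, [("a", 1)])
table.commit(200, [("a", 1), ("b", 2)])
table.commit(300, [("a", 10), ("b", 2)])   # row-level update of key "a"

current = table.read()              # state after the latest commit
earlier = table.read(as_of=250)     # state as it was before the update
```

Because commits only append snapshots and never overwrite them, a reader that pins an older snapshot keeps seeing a consistent view while new commits arrive.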
The differences between Iceberg and Hudi are as follows:
- Iceberg supports the Parquet, Avro, and ORC data formats; Hudi supports Parquet and Avro.
- The two store and query data differently.
Iceberg supports only one table storage mode, a storage structure made up of metadata files, manifest files, and data files. A query first locates the metadata file, filters the metadata to find the manifest files of the relevant snapshot, and then locates the corresponding data files. Hudi supports two table storage modes, Copy On Write (merge at write time) and Merge On Read (merge at read time), and a query reads the corresponding snapshot data directly.
- For small-file compaction, Iceberg only supports triggering the merge manually through its API, whereas Hudi can merge small files automatically according to configuration.
- When integrating with Spark, Iceberg currently has the better Spark SQL support, while Spark's integration with Hudi relies more on the Spark DataFrame API.
- Regarding schema, Iceberg's schema is decoupled from the computing engine and does not depend on any engine, whereas Hudi's schema depends on the computing engine's schema.
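The Iceberg lookup chain described above (metadata file, then the snapshot's manifest files, then data files) can be sketched as a toy model. All class and function names below are hypothetical and chosen for illustration only. The sketch also leaves out the manifest-list layer that real Iceberg places between a snapshot and its manifests, and it assumes a single partition value per data file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    path: str
    partition: str          # hypothetical single partition value, e.g. a date


@dataclass(frozen=True)
class Manifest:
    partitions: frozenset   # partition values covered by this manifest
    data_files: tuple


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    manifests: tuple


@dataclass
class TableMetadata:
    current_snapshot_id: int
    snapshots: dict         # snapshot_id -> Snapshot


def plan_files(metadata, partition, snapshot_id=None):
    """Return the data file paths a query on `partition` would need to read."""
    # Step 1: the metadata file tells us which snapshot to read.
    sid = metadata.current_snapshot_id if snapshot_id is None else snapshot_id
    snapshot = metadata.snapshots[sid]
    paths = []
    # Step 2: skip manifests whose summary shows they cannot match.
    for manifest in snapshot.manifests:
        if partition not in manifest.partitions:
            continue
        # Step 3: within a surviving manifest, keep only matching data files.
        for data_file in manifest.data_files:
            if data_file.partition == partition:
                paths.append(data_file.path)
    return paths


f1 = DataFile("data/dt=2022-07-25/a.parquet", "2022-07-25")
f2 = DataFile("data/dt=2022-07-26/b.parquet", "2022-07-26")
f3 = DataFile("data/dt=2022-07-26/c.parquet", "2022-07-26")

m1 = Manifest(frozenset({"2022-07-25", "2022-07-26"}), (f1, f2))
m2 = Manifest(frozenset({"2022-07-26"}), (f3,))

metadata = TableMetadata(
    current_snapshot_id=2,
    snapshots={1: Snapshot(1, (m1,)), 2: Snapshot(2, (m1, m2))},
)

latest = plan_files(metadata, "2022-07-26")
older = plan_files(metadata, "2022-07-26", snapshot_id=1)
```

The point of the layering is that a query can discard whole manifests from metadata alone, without listing directories or opening data files that cannot match.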
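To make the small-file merging point more concrete, the following is a rough sketch of how a compaction job might decide which files to rewrite together. It uses a generic first-fit-decreasing grouping, which is an assumption made for illustration and not the planning algorithm of Iceberg or Hudi. The function name `plan_compaction` and its parameters are hypothetical.

```python
def plan_compaction(file_sizes, target_size):
    """Group files smaller than target_size into merge groups of at most target_size.

    file_sizes maps a file name to its size in bytes.
    """
    # Only files below the target size are candidates for merging.
    small = sorted(
        ((name, size) for name, size in file_sizes.items() if size < target_size),
        key=lambda item: item[1],
        reverse=True,
    )
    groups = []   # each entry is [total_size, [file names]]
    for name, size in small:
        # First fit: put the file into the first group that still has room.
        for group in groups:
            if group[0] + size <= target_size:
                group[0] += size
                group[1].append(name)
                break
        else:
            groups.append([size, [name]])
    # Rewriting a group that holds a single file would gain nothing.
    return [names for _, names in groups if len(names) > 1]


sizes = {"f1": 10, "f2": 60, "f3": 50, "f4": 130, "f5": 40}
plan = plan_compaction(sizes, target_size=100)
```

In the manual model the article attributes to Iceberg, a user would trigger a job like this explicitly; in the configuration-driven model attributed to Hudi, similar logic would run as part of the write path once size thresholds are configured.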