Flink SQL Hudi in Action
2022-07-29 01:21:00 【hyunbar】

1、 Overview
Official website: https://hudi.apache.org
Gitee: https://gitee.com/apache/Hudi
1.1 Architecture

1.2 Features
- Upserts and deletes with fast, pluggable indexing
- Incremental queries; record-level change streams
- Transactions, rollbacks, concurrency control
- SQL reads/writes from Spark, Presto, Trino, Hive & more
- Automatic file sizing, data clustering, compaction, cleaning
- Streaming ingestion; built-in CDC sources & tools
- Built-in metadata tracking for scalable storage access
- Backwards-compatible schema evolution and enforcement
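As a sketch of the incremental-query / change-stream feature, the Hudi Flink connector can read a table as a continuous stream of commits instead of a one-shot scan. The option names below follow the Hudi 0.10 Flink connector used in this article and may differ in other versions; the path assumes the `student` table created in the test section later on.

```sql
-- Sketch: streaming (incremental) read of a Hudi table from Flink SQL.
-- Option names as in the Hudi 0.10 Flink connector; verify against your version.
CREATE TABLE student_stream(
  uuid VARCHAR(20) PRIMARY KEY NOT ENFORCED,
  name VARCHAR(10),
  age INT,
  ts TIMESTAMP(3),
  `partition` VARCHAR(20)
)
PARTITIONED BY (`partition`)
WITH (
  'connector' = 'hudi',
  'path' = 'hdfs:///flink/hudi/student',
  'table.type' = 'MERGE_ON_READ',
  'read.streaming.enabled' = 'true',          -- poll for new commits instead of a one-shot scan
  'read.streaming.start-commit' = 'earliest', -- consume change records from the first commit
  'read.streaming.check-interval' = '4'       -- seconds between checks for new commits
);

-- Emits record-level changes as new commits arrive
SELECT * FROM student_stream;
```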

2、 Starting Flink
Modify the configuration (conf/flink-conf.yaml):
taskmanager.numberOfTaskSlots: 4
Configure the HADOOP environment variable:
export HADOOP_CLASSPATH=`hadoop classpath`
2.1 Start the local cluster (Standalone)
[email protected]:/flink-1.14.4$ ./bin/start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host bigdata100.
Starting taskexecutor daemon on host bigdata100.
[email protected]:/flink-1.14.4$ jps
18226 Jps
15333 NameNode
16038 ResourceManager
16649 JobHistoryServer
17900 StandaloneSessionClusterEntrypoint
15756 SecondaryNameNode
16381 NodeManager
15534 DataNode
Start the sql-client:
[email protected]:/bigdata/module/flink-1.14.4$ ./bin/sql-client.sh embedded -j /home/duo/hudi-flink-bundle_2.11-0.10.1.jar
2.2 Starting a yarn-session cluster
Since the data is stored on Hadoop (HDFS), the cluster can also be started in YARN mode.
[email protected]:/bigdata/module/flink-1.14.4$ ./bin/yarn-session.sh -nm duo -d
[email protected]:/bigdata/module/flink-1.14.4$ jps
15333 NameNode
16038 ResourceManager
25191 YarnSessionClusterEntrypoint
16649 JobHistoryServer
25290 Jps
15756 SecondaryNameNode
16381 NodeManager
15534 DataNode
[email protected]:/bigdata/module/flink-1.14.4$ ./bin/sql-client.sh embedded -s yarn-session -j /home/duo/hudi-flink-bundle_2.11-0.10.1.jar
3、 Testing
3.1 SQL statements
-- Create the table
CREATE TABLE student(
  uuid VARCHAR(20) PRIMARY KEY NOT ENFORCED,
  name VARCHAR(10),
  age INT,
  ts TIMESTAMP(3),
  `partition` VARCHAR(20)
)
PARTITIONED BY (`partition`)
WITH (
  'connector' = 'hudi',
  'path' = 'hdfs:///flink/hudi/student',
  'table.type' = 'MERGE_ON_READ'
);

-- Insert data
INSERT INTO student VALUES
  ('id1','Danny',23,TIMESTAMP '2022-07-01 12:12:12','par1'),
  ('id2','Stephen',33,TIMESTAMP '2022-07-01 12:12:02','par1'),
  ('id3','Julian',53,TIMESTAMP '2022-07-01 12:12:03','par2'),
  ('id4','Fabian',31,TIMESTAMP '2022-07-01 12:12:04','par2'),
  ('id5','Sophia',18,TIMESTAMP '2022-07-01 12:12:05','par3'),
  ('id6','Emma',20,TIMESTAMP '2022-07-01 12:12:06','par3'),
  ('id7','Bob',44,TIMESTAMP '2022-07-01 12:12:07','par4'),
  ('id8','Han',56,TIMESTAMP '2022-07-01 12:12:08','par4');

-- Update the row with key = 'id1'
INSERT INTO student VALUES ('id1','Danny',27,TIMESTAMP '1970-01-01 00:00:01','par1');

SELECT * FROM student;
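After running the statements above, the record-level upsert can be verified by re-querying the updated key (a minimal sketch; per the Hudi Flink quick start, Hudi merges the two versions of the record and the row for id1 should now carry age 27):

```sql
-- Re-read only the upserted key; Hudi resolves both writes for 'id1'
-- and returns the latest merged record.
SELECT uuid, name, age, ts FROM student WHERE uuid = 'id1';
```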
3.2 Create the table and insert data
Flink SQL> set execution.result-mode=tableau;
[INFO] Session property has been set.

Flink SQL> CREATE TABLE student(
>   uuid VARCHAR(20) PRIMARY KEY NOT ENFORCED,
>   name VARCHAR(10),
>   age INT,
>   ts TIMESTAMP(3),
>   `partition` VARCHAR(20)
> )
> PARTITIONED BY (`partition`)
> WITH (
>   'connector' = 'hudi',
>   'path' = 'hdfs:///flink/hudi/student',
>   'table.type' = 'MERGE_ON_READ'
> );
[INFO] Execute statement succeed.

Flink SQL> INSERT INTO student VALUES
> ('id1','Danny',23,TIMESTAMP '2022-07-01 12:12:12','par1'),
> ('id2','Stephen',33,TIMESTAMP '2022-07-01 12:12:02','par1'),
> ('id3','Julian',53,TIMESTAMP '2022-07-01 12:12:03','par2'),
> ('id4','Fabian',31,TIMESTAMP '2022-07-01 12:12:04','par2'),
> ('id5','Sophia',18,TIMESTAMP '2022-07-01 12:12:05','par3'),
> ('id6','Emma',20,TIMESTAMP '2022-07-01 12:12:06','par3'),
> ('id7','Bob',44,TIMESTAMP '2022-07-01 12:12:07','par4'),
> ('id8','Han',56,TIMESTAMP '2022-07-01 12:12:08','par4');
[INFO] Submitting SQL update statement to the cluster...
[INFO] SQL update statement has been successfully submitted to the cluster:
Job ID: 2b4962486c1fbcff9e6354ab17801ae1

Flink SQL> SELECT * FROM student;
3.3 Query data

3.4 Viewing the YARN job

3.5 Viewing the HDFS files

3.6 Viewing the job in the Flink Web UI

