当前位置:网站首页>[Flink] flinksql and table programming cases
[Flink] flinksql and table programming cases
2022-06-29 06:21:00 【Roll a few you WOW】
「 This is my participation 2022 For the first time, the third challenge is 28 God , Check out the activity details :2022 For the first time, it's a challenge 」
One 、 summary
Flink SQL & TableBackground and principle- The concept of dynamic table
- Commonly used
SQLAnd built-in functions
Why do you need relational API ?
FlinkadoptTable API&SQLRealize batch flow unification .
Table APIandSQLAt the top , yesFlinkProvide advancedAPIoperation .Flink SQLyesFlinkReal time calculation is a simplified calculation model , A set of standards designed to lower the threshold for users to use real-time computingSQLSemantic development language .
rely on :
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.11</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-api-java-bridge_2.11</artifactId>
<version>1.11.1</version>
</dependency>
<!-- need scala api Then add -->
<!--<dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-table-api-scala-bridge_2.11</artifactId> <version>1.11.1</version> </dependency>-->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-planner_${scala.binary.version}</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-planner-blink_2.11</artifactId>
<version>1.11.1</version>
</dependency>
Copy code Two 、 principle
In the offline computing scenario
HiveAlmost half of the offline data processing . Its bottom pairSQLThe parsing of the usesApache Calcite
Flink The same SQL Parsing 、 Optimization and execution teach Calcite.
Apache Calcite perform SQL The main steps of query are as follows :
- take
SQLParse into an unchecked abstract syntax tree (AST,Abstract Syntax Tree). Abstract syntax trees are language independent forms , This is similar to the first step of cross compilation . Validate: verificationAST, Main verificationSQLIs the statement legal , The result after verification isRelNodeTrees .Optimize: OptimizeRelNodeTree and generate a physical execution plan .Execute: Translate the physical execution plan into platform specific execution code , Such asFlinkOfDataStreamApplications for .
Pictured :
Whether it's a batch query
SQLOr streaming querySQL, Will go through the corresponding converterParserTransform into a node treeSQLNode tree, Then generate a logical execution planLogical Plan, The logical execution plan generates a physical execution plan that can be executed after optimization , handDataSetperhapsDataStreamOfAPITo carry out .
A complete Flink Table & SQL Job Also by Source、Transformation、Sink constitute :
3、 ... and 、 Dynamic table
With the traditional table SQL Compared with ,Flink Table & SQL When processing stream data, it will always be in dynamic data changes , So there is a concept of dynamic table .
The query of dynamic table is the same as that of static table , however , When querying dynamic tables ,SQL Can do continuous query , Will not terminate .
Take a chestnut ,Flink The program accepts a Kafka Stream as input ,Kafka Is the purchase record of the user :
First ,Kafka The message will be continuously parsed into a growing dynamic table , We execute on the dynamic table SQL New dynamic tables will be generated as result tables .
stay DataStream On the implementation SQL Inquire about , The process is as follows :
DataStreamConvert to dynamic table- Define persistent queries on dynamic tables
- Convert the real-time results of continuous queries into dynamic tables
- Convert the dynamic table representing the query results into
DataStream, So that applications canDataStream APIFurther transform the query results .
Four 、Flink Table & SQL Operators and built-in functions
A simple example :
SELECT * FROM Table;
SELECT name,age FROM Table;
SELECT name,age FROM Table where name LIKE '% Xiao Ming %';
SELECT * FROM Table WHERE age = 20;
SELECT name, age
FROM Table
WHERE name IN (SELECT name FROM Table2)
SELECT name,age FROM Table where name LIKE '% Xiao Ming %';
SELECT * FROM Table WHERE age = 20;
SELECT name, age
FROM Table
WHERE name IN (SELECT name FROM Table2)
SELECT *
FROM User LEFT JOIN Product ON User.name = Product.buyer
SELECT *
FROM User RIGHT JOIN Product ON User.name = Product.buyer
SELECT *
FROM User FULL OUTER JOIN Product ON User.name = Product.buyer
Copy code window
According to the different partition of window data , at present Apache Flink There are the following 3 Kind of :
Scroll the window , Window data has a fixed size , The data in the window is not superimposed ;
The sliding window , Window data has a fixed size , And there are generation intervals ;
Session window , Window data has no fixed size , It is divided according to the parameters passed in by the user , Window data is not superimposed ;
1) Scroll the window
Scrolling windows are characterized by : There's a fixed size 、 The data in the window does not overlap , As shown in the figure below :
The syntax is as follows :
SELECT
[gk],
[TUMBLE_START(timeCol, size)],
[TUMBLE_END(timeCol, size)],
agg1(col1),
...
aggn(colN)
FROM Tab1
GROUP BY [gk], TUMBLE(timeCol, size)
Copy code for example , It is necessary to calculate the daily order quantity of each user :
SELECT user, TUMBLE_START(timeLine, INTERVAL '1' DAY) as winStart, SUM(amount)
FROM Orders
GROUP BY TUMBLE(timeLine, INTERVAL '1' DAY), user;
Copy code among ,
TUMBLE_STARTandTUMBLE_ENDRepresents the start time and end time of the window ,TUMBLE (timeLine, INTERVAL '1' DAY)MediumtimeLineRepresents the column where the time field is located ,INTERVAL '1' DAYIndicates that the time interval is one day .
2) The sliding window
Sliding windows have a fixed size , Unlike scrolling windows, sliding windows can be accessed through slide Parameter controls how often sliding windows are created . It should be noted that , Data overlap may occur in multiple sliding windows , The specific semantics are as follows :
The syntax of a sliding window is compared with that of a scrolling window , Only one more slide Parameters :
SELECT
[gk],
[HOP_START(timeCol, slide, size)] ,
[HOP_END(timeCol, slide, size)],
agg1(col1),
...
aggN(colN)
FROM Tab1
GROUP BY [gk], HOP(timeCol, slide, size)
Copy code for example , We have to calculate the past every hour 24 The sales volume of each product in an hour :
SELECT product, SUM(amount)
FROM Orders
GROUP BY HOP(rowtime, INTERVAL '1' HOUR, INTERVAL '1' DAY), product
Copy code In the above case
INTERVAL '1' HOURRepresents the time interval of sliding window generation .
3) Session window
The session window defines an inactive time , If no event or message occurs within the specified time interval , The session window closes .
The syntax of the session window is as follows :
SELECT
[gk],
SESSION_START(timeCol, gap) AS winStart,
SESSION_END(timeCol, gap) AS winEnd,
agg1(col1),
...
aggn(colN)
FROM Tab1
GROUP BY [gk], SESSION(timeCol, gap)
Copy code give an example , We need to calculate the past of each user 1 Order volume in hours :
SELECT user, SESSION_START(rowtime, INTERVAL '1' HOUR) AS sStart, SESSION_ROWTIME(rowtime, INTERVAL '1' HOUR) AS sEnd, SUM(amount)
FROM Orders
GROUP BY SESSION(rowtime, INTERVAL '1' HOUR), user
Copy code 边栏推荐
- JIRA basic usage sharing
- Est - ce que l'ouverture d'un compte de titres est sécurisée? Y a - t - il un danger?
- Fault: display Storport driver out of date in component health
- Why is there a packaging type?
- Mongodb basic knowledge summary
- Skills of writing test cases efficiently
- Convert data frame with date column to timeseries
- The simple problem of leetcode is to divide an array into three parts equal to sum
- Manual (functional) test 1
- Loosely matched jest A value in tohavebeencalledwith - loose match one value in jest toHaveBeenCalledWith
猜你喜欢

2022 recommended REITs Industry Research Report investment strategy industry development prospect market analysis (the attachment is a link to the online disk, and the report is continuously updated)

5,10,15,20-tetra (3,5-dimethoxyphenyl) porphyrin ((tdmpp) H2) /2-nitro-5,10,15,20-tetra (3,5-dimethoxyphenyl) porphyrin copper (no2tdmpp) Cu) supplied by Qiyue

JIRA basic usage sharing

Ti Click: quickly set up tidb online laboratory through browser | ti- team interview can be conducted immediately

Establishing the development environment of esp8266

Love that can't be met -- what is the intimate relationship maintained by video chat

Longest substring between two identical characters of leetcode simple question
![ASP. Net core 6 framework unveiling example demonstration [03]:dapr initial experience](/img/fd/4c24e10fc91a7ce7e709a0874ba675.jpg)
ASP. Net core 6 framework unveiling example demonstration [03]:dapr initial experience

2022.02.14 - 239. A single element in an ordered array

The generation of leetcode simple questions each character is an odd number of strings
随机推荐
Segment in Lucene
Love that can't be met -- what is the intimate relationship maintained by video chat
Summary of redis basic knowledge points
Manual (functional) test 1
Jenkins operation Chapter 6 mail server sending build results
The most complete machine learning model training process
Principle of screen printing adjustment of EDA (cadence and AD) software
Ghost in the Log4Shell
[C language series] - initial C language (4)
2022.02.14 - 239. A single element in an ordered array
What has urbanization brought to our mental health and behavior?
Design of leetcode simple problem goal parser
Longest substring between two identical characters of leetcode simple question
Mongodb basic knowledge summary
QT (x): packaging and deployment
The generation of leetcode simple questions each character is an odd number of strings
2022 recommended precious metal industry research report industry development prospect market analysis white paper (the attachment is a link to the online disk, and the report is continuously updated)
Purple red solid meso tetra (o-alkoxyphenyl) porphyrin cobalt (meso-t (2-rop) PCO) / tetra (n, n-diphenyl-p-amino) phenyl porphyrin (tdpatph2)
What should I learn before learning programming?
Mongodb paging method