HQL statement execution process
2022-08-05 05:25:00 【value growth】
HQL statement execution process:
- Syntax parsing: Antlr defines the SQL grammar rules; lexical and syntax analysis turn the SQL text into an abstract syntax tree (AST).
- Semantic analysis: traverse the AST and abstract it into QueryBlocks, the basic units of a query.
- Generate the logical execution plan: traverse the QueryBlocks and translate them into an operator tree (OperatorTree).
- Optimize the logical execution plan: the logical optimizer transforms the OperatorTree, merging unnecessary ReduceSinkOperators to reduce the amount of shuffled data.
- Generate the physical execution plan: traverse the OperatorTree and translate it into MapReduce tasks.
- Optimize the physical execution plan: the physical optimizer transforms the MapReduce tasks to produce the final execution plan.
How is HQL translated into MR jobs?
Hive uses Antlr to implement syntax parsing. Following the SQL grammar rules defined with Antlr, it performs lexical and syntax analysis of the SQL statement and converts it into an abstract syntax tree (AST).
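To make "SQL text in, tree out" concrete, here is a minimal sketch in Python. It is a hand-written toy parser, not Antlr and not Hive's grammar: the `parse` function, the supported `SELECT ... FROM ... [WHERE ...]` shape, and the node labels are all assumptions made for illustration.

```python
import re

def parse(sql):
    """Toy parser: turn 'SELECT cols FROM table [WHERE cond]' into a nested tuple."""
    # Lexical analysis: split into words and single punctuation characters.
    tokens = re.findall(r"\w+|[^\w\s]", sql)
    pos = 0

    def expect(word):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos].upper() != word:
            raise SyntaxError(f"expected {word} at token {pos}")
        pos += 1

    def take_until(stop_words):
        nonlocal pos
        start = pos
        while pos < len(tokens) and tokens[pos].upper() not in stop_words:
            pos += 1
        return tokens[start:pos]

    # Syntax analysis: apply the one grammar rule this toy knows.
    expect("SELECT")
    columns = [t for t in take_until({"FROM"}) if t != ","]
    expect("FROM")
    table = take_until({"WHERE"})
    tree = ("QUERY", ("FROM", *table), ("SELECT", *columns))
    if pos < len(tokens):
        expect("WHERE")
        tree += (("WHERE", *take_until(set())),)
    return tree

print(parse("SELECT user_id, age FROM page_views WHERE age > 18"))
```

The real parser is generated from a grammar file and covers the full language; the only point here is that the output of this stage is a tree that later stages can traverse.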
Traverse the AST to generate QueryBlocks. A QueryBlock is the most basic unit of a SQL query and has three parts: the input source, the computation, and the output.
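The three parts of a QueryBlock can be sketched as a small data structure. This is an illustrative Python toy, not Hive's actual QB class: `ToyQueryBlock`, `build_query_block`, the sample AST, and its table names are hypothetical, and the `TOK_*` labels only imitate the naming style of Hive's AST nodes.

```python
from dataclasses import dataclass, field

# Toy AST as nested tuples: (label, arg, arg, ...). Not Hive's real AST shape.
AST = (
    "TOK_QUERY",
    ("TOK_FROM", "page_views"),
    ("TOK_SELECT", "user_id", "count(1)"),
    ("TOK_WHERE", "dt = '2022-08-05'"),
    ("TOK_GROUPBY", "user_id"),
    ("TOK_DESTINATION", "daily_counts"),
)

@dataclass
class ToyQueryBlock:
    sources: list = field(default_factory=list)      # input source
    computation: dict = field(default_factory=dict)  # computation (select, where, group by)
    destination: str = ""                            # output

def build_query_block(ast):
    """Walk the children of the query node and sort them into the three parts."""
    label, *children = ast
    if label != "TOK_QUERY":
        raise ValueError("expected a query node")
    qb = ToyQueryBlock()
    for child_label, *args in children:
        if child_label == "TOK_FROM":
            qb.sources.extend(args)
        elif child_label == "TOK_DESTINATION":
            qb.destination = args[0]
        else:
            qb.computation[child_label] = args
    return qb

qb = build_query_block(AST)
print(qb)
```

A nested subquery would become its own block that serves as the input source of the outer block, which this flat toy does not model.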
Traverse the QueryBlocks to generate the OperatorTree. The MapReduce tasks that Hive finally generates consist of operator trees in both the Map phase and the Reduce phase. An Operator performs a single specific operation in the Map phase or the Reduce phase. The OperatorTree is generated by traversing the syntax attributes stored in the QB and QBParseInfo objects produced in the previous step.
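As a rough illustration of this step, the sketch below turns a query block into a chain of operators. It is a simplified Python toy under stated assumptions: the `Op` class, `build_operator_tree`, and the dictionary standing in for a query block are hypothetical, and the operator order is simplified compared with the plans Hive really produces. The operator kinds borrow the names of Hive's TableScan, Filter, GroupBy, ReduceSink, Select, and FileSink operators.

```python
class Op:
    """One node of the toy operator tree: a single operation plus its children."""
    def __init__(self, kind, detail=""):
        self.kind = kind
        self.detail = detail
        self.children = []

    def then(self, child):
        """Attach a child operator and return it, so calls can be chained."""
        self.children.append(child)
        return child

def build_operator_tree(qb):
    root = Op("TableScan", qb["source"])
    tail = root
    if qb.get("where"):
        tail = tail.then(Op("Filter", qb["where"]))
    if qb.get("group_by"):
        tail = tail.then(Op("GroupBy", "partial aggregation"))
        tail = tail.then(Op("ReduceSink", "key=" + qb["group_by"]))
        tail = tail.then(Op("GroupBy", "final aggregation"))
    tail = tail.then(Op("Select", ", ".join(qb["select"])))
    tail.then(Op("FileSink", qb["destination"]))
    return root

def kinds(root):
    """Depth-first list of operator kinds, parents before children."""
    out, stack = [], [root]
    while stack:
        node = stack.pop()
        out.append(node.kind)
        stack.extend(reversed(node.children))
    return out

tree = build_operator_tree({
    "source": "page_views",
    "where": "dt = '2022-08-05'",
    "group_by": "user_id",
    "select": ["user_id", "count(1)"],
    "destination": "daily_counts",
})
print(" -> ".join(kinds(tree)))
```

Each node does exactly one thing, which matches the description of an Operator as a single specific operation.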
Optimize the OperatorTree. Most logical optimizers reduce the number of MapReduce jobs and the amount of shuffled data by transforming the OperatorTree and merging operators.
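The idea of merging operators to save a shuffle can be shown on a flat list of operators. This is a deliberately simplified Python sketch, not Hive's optimizer: `merge_redundant_reduce_sinks` and the plan encoding are hypothetical, and it assumes that no operator between two ReduceSinks changes the key columns.

```python
def merge_redundant_reduce_sinks(plan):
    """Drop a ReduceSink whose key equals the key the rows are already shuffled by.

    `plan` is a list of (operator_kind, argument) pairs in execution order.
    Assumption of this toy: operators between two ReduceSinks keep the key intact.
    """
    merged = []
    current_key = None  # key of the most recent shuffle, if any
    for kind, arg in plan:
        if kind == "ReduceSink":
            if arg == current_key:
                continue  # rows are already partitioned by this key
            current_key = arg
        merged.append((kind, arg))
    return merged

def count_shuffles(plan):
    return sum(1 for kind, _ in plan if kind == "ReduceSink")

plan = [
    ("TableScan", "page_views"),
    ("ReduceSink", "user_id"),
    ("GroupBy", "user_id"),
    ("ReduceSink", "user_id"),   # same key again: a second shuffle adds nothing
    ("GroupBy", "user_id"),
    ("FileSink", "daily_counts"),
]

optimized = merge_redundant_reduce_sinks(plan)
print(count_shuffles(plan), "->", count_shuffles(optimized))
```

Since each ReduceSink implies a shuffle, removing one in this toy removes one round of data movement.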
Generate MapReduce jobs from the OperatorTree. Traverse the OperatorTree and translate it into MR tasks:
- Generate a MoveTask for the output table.
- Traverse depth-first downward from one of the root nodes of the OperatorTree.
- A ReduceSinkOperator marks the Map/Reduce boundary and the boundary between multiple jobs.
- Traverse the other root nodes; on encountering a JoinOperator, merge the MapReduceTasks.
- Generate a StatTask to update the metadata.
- Sever the operator links between the Map and Reduce sides.
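The role of the ReduceSinkOperator as a boundary can be sketched on a linear chain of operators. This Python toy is an assumption-laden simplification, not Hive's task compiler: `split_into_jobs` is hypothetical, it handles only a single chain (no joins, no multiple roots), and the `FileSink(tmp)` / `TableScan(tmp)` entries stand in for the intermediate file passed between consecutive jobs.

```python
def split_into_jobs(chain):
    """Cut a linear operator chain into MapReduce jobs at ReduceSink boundaries."""
    jobs = []
    map_ops, reduce_ops = [], []
    in_reduce = False
    for op in chain:
        if op == "ReduceSink":
            if in_reduce:
                # A second shuffle cannot run inside the same job: write an
                # intermediate result, close this job, and start the next one.
                reduce_ops.append("FileSink(tmp)")
                jobs.append({"map": map_ops, "reduce": reduce_ops})
                map_ops, reduce_ops = ["TableScan(tmp)"], []
            map_ops.append(op)  # the ReduceSink is the last operator on the map side
            in_reduce = True
        elif in_reduce:
            reduce_ops.append(op)
        else:
            map_ops.append(op)
    jobs.append({"map": map_ops, "reduce": reduce_ops})
    return jobs

chain = ["TableScan", "Filter", "GroupBy", "ReduceSink",
         "GroupBy", "ReduceSink", "Select", "FileSink"]

for number, job in enumerate(split_into_jobs(chain), start=1):
    print(f"job {number}: map={job['map']} reduce={job['reduce']}")
```

Storing the map side and the reduce side as separate lists mirrors the last bullet: once the jobs are cut, the operators on the two sides are no longer linked directly.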
Optimize the tasks. The physical optimizer transforms the MR tasks to produce the final execution tasks.
[HIVE] sqlStatement converted into mapreduce - Mr.Ming2 - Blog Park