An analysis of lazyagg query rewrite optimization in the data warehouse
2022-06-25 17:58:00 【Huawei Cloud Developer Alliance】
Abstract: This article explains the Lazy Agg query rewrite optimization and the Lazy Agg rewrite rule provided by GaussDB(DWS).
This article is shared from the Huawei Cloud community post "GaussDB(DWS) lazyagg query rewrite optimization analysis [Gauss is not a mathematician this time]", author: OreoreO.
An aggregation operation groups query results by the values of one or more columns, with rows that share equal values forming one group. Aggregation is a common operation and is widely used by customers in the financial sector. For example, the following statement:
SELECT a, count(a) FROM t1 GROUP BY a; -- group by a and count how many times each value occurs
I. Lazy Agg rewrite rule
With large data volumes, an aggregation may spill to disk, and its execution time then becomes a performance bottleneck that makes the whole query very inefficient. For example:
SELECT t2.b, sum(cc) FROM (SELECT b, sum(c) AS cc FROM t1 GROUP BY b) AS s, t2 WHERE s.b=t2.b GROUP BY t2.b;
The subquery groups by t1.b and sums t1.c. The outer query also aggregates, summing the subquery's aggregated column cc. For such statements, when the subquery's aggregation is expensive, a query rewrite rule can eliminate the subquery's aggregation and let the outer query's aggregate function do all of the aggregation work. Eliminating the subquery's aggregation may increase the number of rows the subquery returns. However, when t1.b has many distinct values, the subquery's aggregation does not reduce the row count much relative to the base table in the first place, so removing it does not significantly increase the work done by the outer JOIN. The statement can therefore be rewritten as:
SELECT t2.b, sum(cc) FROM (SELECT b, c AS cc FROM t1) AS s, t2 WHERE s.b=t2.b GROUP BY t2.b;
This rewrite rule is called Lazy Agg. It suits scenarios where the base table is large and the grouping column has many distinct values. If the column instead has few distinct values (many duplicate rows per value), eliminating the aggregation makes the row count surge after the Join, and Join performance suffers. In that case the Agg should be pushed down below the Join, so that aggregating early reduces the number of rows in the Join result. That rewrite rule is called Eager Agg.
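The equivalence of the two statements can be checked on a toy dataset. The following is a minimal sketch that uses SQLite through Python's standard sqlite3 module rather than GaussDB(DWS); the table contents are made up for illustration, and an ORDER BY is added only to make the comparison deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for the warehouse (an assumption:
# the article targets GaussDB(DWS), but this sum-over-sum equivalence is plain SQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (b INTEGER, c INTEGER);
    CREATE TABLE t2 (b INTEGER);
    INSERT INTO t1 VALUES (1, 10), (1, 20), (2, 5), (3, 7), (3, 8);
    INSERT INTO t2 VALUES (1), (2), (4);
""")

# Original form: the subquery aggregates before the join.
original = """
    SELECT t2.b, sum(cc)
    FROM (SELECT b, sum(c) AS cc FROM t1 GROUP BY b) AS s, t2
    WHERE s.b = t2.b GROUP BY t2.b ORDER BY t2.b
"""

# Lazy Agg form: the subquery aggregation is removed and the outer sum does all the work.
rewritten = """
    SELECT t2.b, sum(cc)
    FROM (SELECT b, c AS cc FROM t1) AS s, t2
    WHERE s.b = t2.b GROUP BY t2.b ORDER BY t2.b
"""

original_rows = conn.execute(original).fetchall()
rewritten_rows = conn.execute(rewritten).fetchall()
print(original_rows == rewritten_rows)
```

Both queries return the same rows here because a sum of per-group sums equals the sum over the underlying rows. The rewritten form simply feeds five rows into the join instead of three.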
II. GaussDB(DWS) lazyagg optimization
To lower the difficulty of tuning and make the product easier to use, GaussDB(DWS) provides the lazyagg query rewrite optimization rule. It is enabled by setting the GUC parameter rewrite_rule so that it includes 'lazyagg'. Once lazyagg is enabled, the aggregation in the subquery is optimized away in scenarios that meet the conditions. The original plan is as follows:

The plan after the lazyagg rewrite is as follows:

Compared with the original plan, the lazyagg rewrite eliminates the subquery's aggregation, namely operator 7 (Subquery Scan) and operator 8 (HashAggregate).
III. lazyagg optimization specifications
- The subquery can be a single aggregate query or a set operation whose branches contain aggregate subqueries. Among set operations only UNION ALL is supported, and aggregation can be eliminated in a subset of the branch subqueries. The subquery must be one of the tables in a JOIN (not in the TargetList, WHERE clause, and so on).
- When every Agg argument column of the outer query is contained in the Agg function columns of one of its subqueries, the aggregation of that subquery can be eliminated.
- All aggregate function types that still give correct results after the subquery's aggregation is eliminated are supported. The following table shows which aggregate function types give correct results:

4. Scenario constraints
Beyond the supported scenarios above, a query is not rewritten in scenarios that could produce incorrect results, including but not limited to:
- Agg function types for which elimination is not supported.
- The subquery contains other conditions or operators that would make the result wrong after the rewrite, for example HAVING, window agg, LIMIT, OFFSET, AP function, distinct, recursive, etc.
- The outer Agg argument column, GROUP BY column, or JOIN column contains a volatile function, such as random or timeofday.
- There are other expressions or function operations outside the subquery's Agg function or inside the outer query's Agg function, for example a subquery Agg function column of sum+1 or max+max(d), or an outer query Agg function column of sum(cc+1).
- The outer query's JOIN column, GROUP BY column, or other conditions contain the subquery's Agg function column.
- The subquery is on the inner side of a LEFT JOIN or RIGHT JOIN, or in a FULL JOIN, and the subquery's Agg function is count while the outer query's Agg function is sum.
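The last constraint can be illustrated with a small counterexample. This is a sketch under stated assumptions: it runs on SQLite via Python's sqlite3 module instead of GaussDB(DWS), the data is made up, and the "naive" query is my guess at what an unchecked rewrite would look like (inner count removed, outer sum replaced by count); it is not taken from the product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (b INTEGER, c INTEGER);
    CREATE TABLE t2 (b INTEGER);
    INSERT INTO t1 VALUES (1, 10), (1, 20);
    INSERT INTO t2 VALUES (1), (2);  -- b = 2 has no match in t1
""")

# Original: the subquery counts, the outer query sums, and the subquery
# sits on the inner (nullable) side of a LEFT JOIN.
original = """
    SELECT t2.b, sum(s.cnt)
    FROM t2 LEFT JOIN (SELECT b, count(c) AS cnt FROM t1 GROUP BY b) AS s
      ON s.b = t2.b
    GROUP BY t2.b ORDER BY t2.b
"""

# Hypothetical naive rewrite: drop the inner count and count in the outer query.
naive = """
    SELECT t2.b, count(s.c)
    FROM t2 LEFT JOIN (SELECT b, c FROM t1) AS s ON s.b = t2.b
    GROUP BY t2.b ORDER BY t2.b
"""

original_rows = conn.execute(original).fetchall()
naive_rows = conn.execute(naive).fetchall()
print(original_rows)
print(naive_rows)
```

For the unmatched key, the original query sums a single NULL and yields NULL, while the naive form counts zero non-NULL values and yields 0. The two results differ, which is why this combination is excluded from the rewrite.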
IV. Conclusion
This article should give readers a clear picture of the scenarios where the Lazy Agg rewrite applies and of how lazyagg is provided in GaussDB(DWS). We hope it encourages readers to look further into GaussDB(DWS) and to take an active part in its performance tuning.