Processing of the Limit operator in Presto

2022-06-25 21:41:00 Wangfeihuo

1. Preface

This article explores how Presto handles the Limit operator. Using a simple query against a Hive data source, select * from testlimit limit 2, as the example, it discusses how limit 2 is applied to TableScan in Presto. From it we can learn how TableScan behaves once the Limit operator has been pushed down toward the TableScan of the Hive data source.

2. Plan execution tree with the Limit operator

In Presto, rule-based optimization (RBO) pushes the Limit operator down to a position as close to the data source as possible. After optimization, the SQL above produces the following final sequence of operators:

[Figure: operator sequence of the optimized plan]

The figure makes it look as if TableScan reads all of the data, Limit then filters it, and the result is passed downstream. Presto does not actually work this way, because doing so would make TableScan consume considerable resources and slow the query down.

Presto uses a pipelined execution model. That is, each time TableScan scans one Page of data, it immediately sends it to the downstream Limit operator; once the Limit operator has processed it, the result is also sent downstream immediately. Meanwhile, TableScan continues scanning the next Page, and the cycle repeats.
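The page-at-a-time flow described above can be illustrated with a minimal Python sketch. This is not Presto code: the functions table_scan, limit, and run_query are hypothetical names, a Page is modeled as a plain list of rows, and Python generators stand in for the pipeline. The log records the order of events to show that scanning and limiting interleave rather than running one after the other.

```python
from typing import Iterator, List, Tuple

Page = List[int]  # assumption: a page is modeled as a small batch of rows


def table_scan(pages: List[Page], log: List[str]) -> Iterator[Page]:
    """Yield one page at a time instead of materializing the whole table."""
    for i, page in enumerate(pages):
        log.append(f"scan page {i}")
        yield page


def limit(source: Iterator[Page], n: int, log: List[str]) -> Iterator[Page]:
    """Forward rows downstream page by page until n rows have been emitted."""
    remaining = n
    if remaining <= 0:
        return
    for page in source:
        out = page[:remaining]
        remaining -= len(out)
        log.append(f"limit emits {len(out)} rows")
        yield out
        if remaining == 0:
            # Stop pulling from the scan; later pages are never read.
            return


def run_query(pages: List[Page], n: int) -> Tuple[List[int], List[str]]:
    """Wire scan -> limit together and collect the rows plus the event log."""
    log: List[str] = []
    rows = [row for page in limit(table_scan(pages, log), n, log) for row in page]
    return rows, log
```

Running run_query([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 5) in this sketch alternates "scan" and "limit" entries in the log, and the third page is never scanned.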

3. How TableScan handles the Limit operator

The following describes in detail how TableScan behaves in Presto after the Limit above has been pushed down to it.

As the figure above also shows, TableScan does not scan all of the data before handing it to the Limit operator. Instead, it scans one Page at a time; once the number of rows scanned reaches or exceeds the Limit count, the Limit operator notifies TableScan to stop scanning, thereby reducing the amount of data scanned.
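The stop signal described above can be sketched as a small driver loop in Python. This is a simplified illustration, not Presto's implementation: the classes TableScanOperator and LimitOperator, their methods, and the drive function are hypothetical names chosen for this example. The pages_scanned counter makes the saving visible.

```python
from typing import List, Optional

Page = List[int]  # assumption: a page is modeled as a small batch of rows


class TableScanOperator:
    """Hypothetical scan operator that hands out one page per call."""

    def __init__(self, pages: List[Page]) -> None:
        self._pages = list(pages)
        self._next = 0
        self._closed = False
        self.pages_scanned = 0

    def is_finished(self) -> bool:
        return self._closed or self._next >= len(self._pages)

    def get_output(self) -> Optional[Page]:
        if self.is_finished():
            return None
        page = self._pages[self._next]
        self._next += 1
        self.pages_scanned += 1
        return page

    def close(self) -> None:
        # Called once downstream needs no more data.
        self._closed = True


class LimitOperator:
    """Hypothetical limit operator that tracks how many rows are still needed."""

    def __init__(self, limit: int) -> None:
        self._remaining = limit

    def is_finished(self) -> bool:
        return self._remaining <= 0

    def process(self, page: Page) -> Page:
        out = page[: self._remaining]
        self._remaining -= len(out)
        return out


def drive(scan: TableScanOperator, limit: LimitOperator) -> List[int]:
    """Move pages from scan to limit until the limit is satisfied."""
    result: List[int] = []
    while not limit.is_finished() and not scan.is_finished():
        page = scan.get_output()
        result.extend(limit.process(page))
    scan.close()  # the limit is satisfied or the data ran out: stop scanning
    return result
```

With four two-row pages and a limit of 2, as in the limit 2 query from this article, the sketch returns two rows after scanning a single page; the remaining three pages are never touched.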

Copyright notice
This article was written by [Wangfeihuo]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/176/202206251855547893.html