当前位置:网站首页>Postgresql source code (64) Query execution - data structure and execution process before submodule Executor (2) execution
Postgresql source code (64) Query execution - data structure and execution process before submodule Executor (2) execution
2022-08-03 19:04:00 【mingjie】
上一篇说明了执行的框架,本篇深入分析执行细节.测试用例不变,还是分析之前的case.
0 总结
- 下图中的planstate有四类:控制节点、扫描节点、连接节点、物化节点
- 扫描节点公共父类:Scan
- 连接节点公共父类:Join
- Plan的子节点通过Plan的lefttree和righttree指针连接,构成计划树
- 执行时,Planstate用于记录各节点的执行状态,estate中的es_tupleTable在节点间传递元组.
- Excutor输入QueryDesc,包含plannedstmt树形结构;执行前用plannedstmt初始化节点状态树planstate,同时初始化全局状态信息estate.然后执行planstate根节点的函数指针,进入根节点业务处理函数(例如nestloop),pull模型向下层取数据拉动整个计划树的执行.
1 ExecutorRun执行前数据结构
执行计划:
- teach_course和teacher走hash连接,生成outer表(驱动表)
- course表做inner表
- 循环嵌套连接:course.no是连接键,位于course表的第一列.
explain select t.name, c.name, stu_num
from course as c, teach_course as tc, teacher as t
where c.no = tc.cno and tc.tno = t.no and c.name = 'Database System' and t.name = 'Jennifer';
QUERY PLAN
-----------------------------------------------------------------------------------
Nested Loop (cost=24.10..74.58 rows=1 width=68)
-> Hash Join (cost=23.95..62.61 rows=61 width=40)
Hash Cond: (tc.tno = t.no)
-> Seq Scan on teach_course tc (cost=0.00..30.40 rows=2040 width=12)
-> Hash (cost=23.88..23.88 rows=6 width=36)
-> Seq Scan on teacher t (cost=0.00..23.88 rows=6 width=36)
Filter: ((name)::text = 'Jennifer'::text)
-> Index Scan using course_pkey on course c (cost=0.15..0.20 rows=1 width=36)
Index Cond: (no = tc.cno)
Filter: ((name)::text = 'Database System'::text)
执行前的数据结构:
2 ExecutorRun执行过程
测试SQL
select t.name, c.name, stu_num
from course as c, teach_course as tc, teacher as t
where c.no = tc.cno and tc.tno = t.no and c.name = 'Database System' and t.name = 'Jennifer';
调试起点:b ExecutorRun
我们再看下这张图:
2.1 ExecNestLoop
nestloop需要从外表(outer表)中(驱动表)顺序扫描拿一条,在从内表(inner表)中找这条能连上的.具体在这个执行计划中:
- 从hashjoin的结果中按顺序那一条(outer表)
- 用这一条去indexscan找能连上的(去inner表上索引扫描)
- 返回一条结果
执行过程
- 用Outerplan从驱动表里面拿一条
- 用这一条去innerplan里面找一条能连上的
- 返回
ExecNestLoop
...
outerPlan = outerPlanState(node) // left node
innerPlan = innerPlanState(node) // right node
// loop until we return a qualifying join tuple
for (;;)
// 从outer表的计划上找到一条
outerTupleSlot = ExecProcNode(outerPlan)
// 完了去inner表里面拿一条(要能连得上的一条)
innerTupleSlot = ExecProcNode(innerPlan)
ExecIndexScan
for (;;)
// 【重要】注意这个函数,ExecScanFetch没有具体业务逻辑,这是个框架函数
// ExecScanFetch用accessMtd拿到元组,在用recheckMtd检查元组是否符合要求
// 这里用的是两个通用函数 IndexNext 和 IndexRecheck
slot = ExecScanFetch(node, accessMtd, recheckMtd)
// 进入bt堆栈
IndexNext
2.2 ExecHashJoin
ExecHashJoin的逻辑是一个小型switch状态机,通过流转状态走完业务逻辑,读起来比较顺畅.
hashjoin会seqscan扫左表,同时把右表创建成一个哈希表(会带着过滤条件,并不是把所有元组都建到哈希表里面)
- 从左表中拿一条
- 用这一条去哈希表里面查询,如果能连上就返回一条
执行过程:
- 创建右节点的哈希表
- 从左节点拿一个元组
- 去哈希表中匹配
- 匹配上返回,匹配不上goto 2
ExecHashJoinImpl
ExecHashTableCreate
// 拿一条左表中的数据
ExecHashJoinOuterGetTuple
// T_SeqScanState
slot = ExecProcNode(outerNode)
// 找到能连上的tuple所在的bucket
ExecHashGetBucketAndBatch
ExecHashGetSkewBucket
// 查找bucket
ExecScanHashBucket
// 找到连接元组
2.3 ExecSeqScan
seqscan为什么那么费CPU又效率低. 答:无等待无IO连续下面几件事情:
- 找buffer(没在需要IO上来)
- 在buffer找到合适的位置,用tuple->t_data指上去
- 拼tupleslot
- 返回
- 继续循环
上面5步不涉及IO、没有任何会sleep的逻辑,基本就是连续的函数调用、变量赋值,所以持续做会把单核打满.效率低是应为没条件一行一行全部都要扫一遍,慢是必然的.
SeqNext
scandesc = table_beginscan
table_scan_getnextslot
slot->tts_tableOid = RelationGetRelid(sscan->rs_rd)
heap_getnextslot
heapgettup_pagemode
heapgetpage
BufferGetPage
tuple->t_data = (HeapTupleHeader) PageGetItem((Page) dp, lpp)
边栏推荐
- 高数---级数
- 【夜莺监控方案】08-监控msyql集群(prometheuse+n9e+mysqld_exporter)
- Zhong Hua, senior architect of Ali: China-Taiwan strategic thinking and architecture practice; including internal implementation manual
- WEB 渗透之RCE
- 力扣解法汇总899-有序队列
- warnings.warn(“Title is more than 31 characters. Some applications may not be able to read the file
- fatal error: jni.h: No such file or directory
- pytest接口自动化测试框架 | 基于Pytest的Web UI自动化测试框架介绍
- 分享即时通讯开发之WebSocket:概念、原理、易错常识、动手实践
- MySQL【变量、流程控制与游标】
猜你喜欢
Zhong Hua, senior architect of Ali: China-Taiwan strategic thinking and architecture practice; including internal implementation manual
vulnhub pyexp: 1
MySQL如何一劳永逸的永久支持输入中文
MySQL——增删改查进阶
Don't look down upon the WebSocket!Long connection, stateful, two-way, full-duplex king is Fried
图像超分——Real-ESRGAN快速上手
学弟:我适不适合转行做软件测试?
如何理解即时通讯开发移动网络的“弱”和“慢”
Bytes to beat three sides take offer: network + GC + + IO + redis + JVM red-black tree + data structure, to help you quickly into the giant!!!!!
首届MogDB征文活动开启啦!
随机推荐
阿里资深架构师钟华曰:中台战略思想与架构实战;含内部实施手册
Postgresql快照优化Globalvis新体系分析(性能大幅增强)
盲僧发现了华点——教你如何使用API接口获取数据
力扣刷题之爬楼梯(7/30)
多线程和并发编程(四)
使用安全浏览器将网页保存为pdf的方法步骤
X86函数调用模型分析
Chrome浏览器开发新截图工具,安全浏览器截图方法
Radondb mysql installation problems
Bytes to beat three sides take offer: network + GC + + IO + redis + JVM red-black tree + data structure, to help you quickly into the giant!!!!!
Arduino实验三:继电器实验
阿里巴巴政委体系-第九章、阿里政委启示录
OneNote 教程,如何在 OneNote 中设置页面格式?
PHP base notes - NO. 1
Cobalt Strike (CS) 逆向初探
flex布局
201709-3 CCF jason查询 (满分题解)
实时渲染器不止lumion,Chaos Vantage你值得一试
PHP Basic Notes-NO.2
梅科尔工作室-14天华为培训七