当前位置:网站首页>Postgresql源码(57)HOT更新为什么性能差距那么大?
Postgresql源码(57)HOT更新为什么性能差距那么大?
2022-07-01 22:57:00 【mingjie73】
相关
《Postgresql源码(52)bitmapset分析RelationGetIndexAttrBitmap》
《Postgresql源码(57)HOT更新为什么性能差距那么大?》
背景
PostgreSQL多版本实现中,堆页面是多版本,索引页面是单版本。如果更新一条堆页面,新元组其实是直接append到表上的,旧元组标记不可见。
同时索引也需要增加一条,但索引这一条是可以优化掉的,方法就是新旧表元组组成链表,用旧索引指向旧元组,再从旧元组指向新元组。
这就是PG的HOT更新机制,在8.3版本后引入这个优化,对多版本下的写放大问题有了较大程度的优化。
1 HOT与非HOT更新性能差距
结果
- 测试环境是kvm虚拟机16C,磁盘性能很差。(不严谨测试)
- 结果差距大的原因主要是因为两种测试场景IO都打满了,IO影响占比升高造成了较大的差距
- IO都打满时,HOT都在更新表页面,NOHOT还要更新很多索引页面。
- 后面分析下什么时候会走HOT?(2.2的第六步)
- 满足三个条件:
- BUFFER不变
- hot_attrs_checked:PAGE没满
- modified_attrs:有索引列被修改了
- 满足三个条件:
TPS | |
---|---|
HOT | 6035.041159 |
NOHOT | 2447.563057 |
IO状况截取
nohot
transaction type: ./test_nohot.sql
scaling factor: 1
query mode: prepared
number of clients: 32
number of threads: 32
duration: 120 s
number of transactions actually processed: 293701
latency average = 13.073 ms
latency stddev = 31.841 ms
initial connection time = 5.961 ms
tps = 2447.563057 (without initial connection time)
statement latencies in milliseconds:
0.001 \set id random(1,10000000)
13.064 update t0 set n1 = random()*100 where id = :id;
hot
transaction type: ./test_hot.sql
scaling factor: 1
query mode: prepared
number of clients: 32
number of threads: 32
duration: 120 s
number of transactions actually processed: 724192
latency average = 5.302 ms
latency stddev = 14.019 ms
initial connection time = 6.242 ms
tps = 6035.041159 (without initial connection time)
statement latencies in milliseconds:
0.001 \set id random(1,10000000)
5.321 update t0 set n2 = random()*100 where id = :id;
测试CASE
drop table t0;
create table t0(id int primary key, n1 int, n2 int);
create index idx_t0_id_n1 on t0(id, n1);
insert into t0 select generate_series(1,10000000);
vi test_nohot.sql
\set id random(1,10000000)
update t0 set n1 = random()*100 where id = :id;
pgbench -M prepared -n -r -P 1 -f ./test_nohot.sql -c 32 -j 32 -T 120
vi test_hot.sql
\set id random(1,10000000)
update t0 set n2 = random()*100 where id = :id;
pgbench -M prepared -n -r -P 1 -f ./test_hot.sql -c 32 -j 32 -T 120
2 update流程
从ExecUpdate函数进入
- 更新前执行:触发器、外键触发器、FDW等
- 检查新元组约束ExecConstraints
- 执行更新table_tuple_update
- 插入新索引元组ExecInsertIndexTuples(HOT更新不需要更新索引)
ExecUpdate
/* 更新前触发器、外键触发器、FDW等 */
/* BEFORE ROW UPDATE Triggers */
/* INSTEAD OF ROW UPDATE Triggers */
/* update in foreign table: let the FDW do it */
...
ExecConstraints
table_tuple_update
/* 可能走HOT不需要新索引项 */
ExecInsertIndexTuples
/* AFTER ROW UPDATE Triggers */
/* Process RETURNING if present */
...
return
2.1 update流程细节(HOT)
再展开上面流程
- 更新前:在ExecutorState内存中重新组装元组ExecMaterializeSlot
- 更新前:拿到所有相关索引ExecOpenIndices
- 更新前:执行:触发器、外键触发器、FDW等
- 检查:新元组约束ExecConstraints
- 执行更新:table_tuple_update
- 返回是否需要更新索引,如果是HOT则不需要更新索引
- heap_update执行具体更新流程(下面继续展开分析)
- 插入新索引元组ExecInsertIndexTuples(HOT不需要更新索引)
ExecUpdate
ExecMaterializeSlot
tts_buffer_heap_materialize
// 切换到ExecutorState内存上下文
// 在新内存池中把元组拼出来
heap_form_tuple
/* 更新前触发器、外键触发器、FDW等 */
...
/* 拿到相关索引oid list */
ExecOpenIndices
/* BEFORE ROW UPDATE Triggers */
/* INSTEAD OF ROW UPDATE Triggers */
/* update in foreign table: let the FDW do it */
...
ExecConstraints
table_tuple_update
heapam_tuple_update
// 执行更新
heap_update
// 是否需要更新索引?HOT?
*update_indexes = result == TM_Ok && !HeapTupleIsHeapOnly(tuple)
/* 可能走HOT不需要新索引项 */
if (... && update_indexes)
ExecInsertIndexTuples
// 对每个相关索引都需要构造新的index tuple
for (i = 0; i < numIndices; i++)
FormIndexDatum
index_insert
// checkUnique
/* AFTER ROW UPDATE Triggers */
/* Process RETURNING if present */
...
return
2.2 heap_update函数执行分析(HOT)
在展开上面流程中的heap_update函数:
执行:
drop table t0;
create table t0(id int primary key, n1 int, n2 int);
create index idx_t0_id_n1 on t0(id, n1);
insert into t0 select generate_series(1,2), generate_series(1,2)+100, 888;
update t0 set n2 = 0 where id = 2;
第一步:bitmap配置,找到所有索引列的位置
这里的位图是什么参考这一篇:《Postgresql源码(52)bitmapset分析RelationGetIndexAttrBitmap》
生成三个位图记录索引位置:hot_attrs、key_attrs、id_attrs
n1 id
\/
||
hot_attrs = 1100000000 :哪些列上有索引?
key_attrs = 100000000 :哪些列上有唯一索引?
id_attrs = 100000000 :复制标识
第二步:如果PAGE没满,可能走HOT,构造bitmap interesting_attrs
配置:hot_attrs_checked=true
n1 id
\/
||
interesting_attrs = 1100000000
第三步:锁页面
第四步:哪些索引列被修改了?
HeapDetermineColumnsInfo函数构造modified_attrs位图,这里没有索引列被修改
modified_attrs = 0000000000
这个位图记录的是有哪些索引列被修改了,注意是索引列
第五步:旧元组是否可见?
HeapTupleSatisfiesUpdate(MVCC)
第六步:决定是否走HOT
满足三个条件:
- BUFFER不变
- hot_attrs_checked:上面第二步:PAGE没满
- modified_attrs:上面第四步:有索引列被修改了
if (newbuf == buffer)
if (hot_attrs_checked && !bms_overlap(modified_attrs, hot_attrs))
use_hot_update = true;
第七步:开始HOT更新
- 一、加标志位:
- 老元组:HEAP_HOT_UPDATED、HEAP_ONLY_TUPLE
- 新元组:HEAP_ONLY_TUPLE
- 二、配置旧元组的头
- 三、插入新元组
if (use_hot_update)
HeapTupleSetHotUpdated(&oldtup)
(tup)->t_infomask2 |= HEAP_HOT_UPDATED
HeapTupleSetHeapOnly(heaptup)
(tup)->t_infomask2 |= HEAP_ONLY_TUPLE
HeapTupleSetHeapOnly(newtup)
(tup)->t_infomask2 |= HEAP_ONLY_TUPLE
第八步:记录XLOG
log_heap_update
边栏推荐
- CADD课程学习(3)-- 靶点药物相互作用
- Why is PHP called hypertext preprocessor
- Know --matplotlib
- Yoga27 multidimensional all-in-one computer with excellent appearance and high-end configuration
- MT manager test skiing Adventure
- [micro service sentinel] @sentinelresource details
- Understanding threads
- Jielizhi Bluetooth headset quality control and production skills [chapter]
- Redis~02 缓存:更新数据时如何保证MySQL和Redis中的数据一致性?
- 实在RPA:银行数字化,业务流程自动化“一小步”,贷款审核效率“一大步”
猜你喜欢
微信个人小商店一键开通助手小程序开发
Zhao Fuquan: to ensure supply in the short term, we should build a safe, efficient and resilient supply chain in the long term
物联网技术应用属于什么专业分类
STM32F030F4驱动TIM1637数码管芯片
"35 years old, the boss of the company, with a monthly salary of 20000, give away takeout": the times abandoned you, not even saying goodbye
Matplotlib常用图表
Paramètres communs de matplotlib
Aaai22 | structural tagging and interaction modeling: a "slim" network for graph classification
为什么PHP叫超文本预处理器
Use 3DMAX to make a chess piece
随机推荐
2022 R1 fast opening pressure vessel operation test questions and answers
认识线程
"35 years old, the boss of the company, with a monthly salary of 20000, give away takeout": the times abandoned you, not even saying goodbye
URL 介绍
Matplotlib常用图表
What professional classification does the application of Internet of things technology belong to
AirServer最新Win64位个人版投屏软件
win 10 mstsc连接 RemoteApp
ShanDong Multi-University Training #3
微服务服务稳定性治理
什么是马赛克?
通过Go语言创建CA与签发证书
微信个人小商店一键开通助手小程序开发
物联网应用技术专业是属于什么类
Zhongang Mining: it has inherent advantages to develop the characteristic chemical industry dominated by fluorine chemical industry
Detailed explanation of twenty common software testing methods (the most complete in History)
Create Ca and issue certificate through go language
Matplotlib common charts
Commemorate becoming the first dayus200 tripartite demo contributor
The online beggar function of Japanese shopping websites