PostgreSQL Source Code (58): Dissecting Tuple Assembly in heap_form_tuple
2022-07-01 22:57:00 【mingjie73】
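Version: 14
Related:
"PostgreSQL Source Code (51): Variable-Length Type Implementation (varlena.c)"
"PostgreSQL Source Code (56): Extensible Type Analysis, ExpandedObject/ExpandedRecord"
2 Background
- A tuple in PG can be represented in two formats: the expanded format (convenient for computation) and the flat format (convenient for storage).
- The earlier article "PostgreSQL Source Code (56): Extensible Type Analysis, ExpandedObject/ExpandedRecord" covered the expanded format of tuples.
- This article covers the more general flat format, HeapTupleData.
- The expanded and flat formats can be converted into each other (via the flatten_into function pointer; see PostgreSQL Source Code (56)).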
typedef struct HeapTupleData
{
    uint32          t_len;      /* length of *t_data */
    ItemPointerData t_self;     /* SelfItemPointer */
    Oid             t_tableOid; /* table the tuple came from */
    HeapTupleHeader t_data;     /* -> tuple header and data */
} HeapTupleData;
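- t_len records the length of *t_data. It is a plain uint32 length field, not a varlena header. The 4B varlena-style header that follows PG's internal convention (see "PostgreSQL Source Code (51): Variable-Length Type Implementation (varlena.c)") sits at the start of HeapTupleHeaderData, in the datum_len_ field of t_choice.t_datum; heap_form_tuple fills it in so that the tuple can also be used as a composite Datum.
3 The HeapTuple constructor: heap_form_tuple
A HeapTuple is assembled in heap_form_tuple, which the rest of this article analyzes in detail.
The example inserts one row with five columns: one fixed-length column (i1 int) and four varlena columns (varchar, numeric, char(2), text). Note that char(n) is stored as a variable-length type, which is why c2 is handled as a varlena in the walkthrough that follows.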
drop table t21;
create table t21(i1 int, v10 varchar(10), n1 numeric, c2 char(2), t1 text);
insert into t21 values (1, 'mylen=7', 5.5, '22', 'hi12345');
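3.1 heap_form_tuple arguments
Signature of the constructor heap_form_tuple:
HeapTuple
heap_form_tuple(TupleDesc tupleDescriptor, Datum *values, bool *isnull)
The arguments are a tuple descriptor, a values array, and an isnull array. Each element of the values array holds either the value itself (for pass-by-value types such as int) or a pointer to the datum's data.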
(gdb) p *tupleDescriptor
$9 = {natts = 5, tdtypeid = 2249, tdtypmod = -1, tdrefcount = -1, constr = 0x0, attrs = 0x199ce90}
(gdb) p values[0]
$11 = 1 : the int value itself
(gdb) p values[1]
$12 = 27157600 : pointer to the datum's data
(gdb) p values[2]
$13 = 27153160 : pointer to the datum's data
(gdb) p values[3]
$14 = 27158432 : pointer to the datum's data
(gdb) p values[4]
$15 = 27154592 : pointer to the datum's data
(gdb) p isnull[0]
$17 = false
(gdb) p isnull[1]
$18 = false
(gdb) p isnull[2]
$19 = false
(gdb) p isnull[3]
$20 = false
(gdb) p isnull[4]
$21 = false
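3.2 heap_form_tuple execution flow
- Note: hoff is the offset from the start of HeapTupleHeaderData to the start of the user data.
- Note: tuple->t_data points to the HeapTupleHeaderData, which sits HEAPTUPLESIZE bytes after the start of HeapTupleData.
- The memory layout is:
HeapTupleData + HeapTupleHeaderData + data
heap_form_tuple
...
len = offsetof(HeapTupleHeaderData, t_bits) : header size, len = 23; t_bits is a flexible array member (no column is NULL here, so no null bitmap is added)
hoff = len = MAXALIGN(len); : after alignment, hoff = len = 24
data_len = heap_compute_data_size(...) : length needed for the data (see 3.3), data_len = 30 bytes in total
len += data_len; : len = 24 + 30 = 54
tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len) : allocate HeapTupleData + HeapTupleHeaderData + 30 bytes of data as one chunk
tuple->t_data = td = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE)
: t_data points just past HeapTupleData, at the HeapTupleHeaderData
...
// set the remaining tuple fields
...
heap_fill_tuple : copy in the data according to each column's type, see 3.4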
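The length arithmetic in this flow can be reproduced with a small sketch. It assumes an 8-byte maximum alignment (typical of 64-bit builds) and a tuple without NULL columns, so the null bitmap is ignored. The `sketch_*` names are hypothetical helpers written for this example, not PostgreSQL functions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed maximum alignment; 8 bytes on common 64-bit builds. */
#define SKETCH_MAXALIGN_OF 8

/* Round len up to the next multiple of SKETCH_MAXALIGN_OF (a power of two). */
size_t sketch_maxalign(size_t len)
{
    return (len + (SKETCH_MAXALIGN_OF - 1)) & ~((size_t) (SKETCH_MAXALIGN_OF - 1));
}

/*
 * Bytes covered by t_data: the aligned header (hoff) followed by user data.
 * header_len plays the role of offsetof(HeapTupleHeaderData, t_bits).
 */
size_t sketch_tuple_len(size_t header_len, size_t data_len)
{
    size_t hoff = sketch_maxalign(header_len);

    assert(hoff >= header_len);
    return hoff + data_len;
}

/* Size of the single allocation: aligned HeapTupleData + header + data. */
size_t sketch_alloc_size(size_t heaptupledata_size, size_t header_len,
                         size_t data_len)
{
    return sketch_maxalign(heaptupledata_size) +
           sketch_tuple_len(header_len, data_len);
}
```

With the numbers from the walkthrough, `sketch_maxalign(23)` gives 24 and `sketch_tuple_len(23, 30)` gives 54. If sizeof(HeapTupleData) is 24 bytes, which is an assumption about a 64-bit build, the whole allocation comes to 78 bytes.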
3.3 heap_compute_data_size
heap_compute_data_size computes the length of the data area. The same SQL is used as the example:
drop table t21;
create table t21(i1 int, v10 varchar(10), n1 numeric, c2 char(2), t1 text);
insert into t21 values (1, 'mylen=7', 5.5, '22', 'hi12345');
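The function handles each column separately, taking one of three branches:
3.3.1 Conditions for entering the three branches
Branch one: atti->attlen == -1, atti->attstorage != 'p', the value currently has an uncompressed 4B header, and it is short enough to be converted to a 1B header
Branch two: atti->attlen == -1, the value has a 1B_E header, and its tag marks an expanded object (VARTAG_EXPANDED_RO or VARTAG_EXPANDED_RW)
Branch three: everything else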
if (ATT_IS_PACKABLE(atti) &&
    VARATT_CAN_MAKE_SHORT(DatumGetPointer(val)))
{
    /*
     * we're anticipating converting to a short varlena header, so
     * adjust length and don't count any alignment
     */
    data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val));
}
else if (atti->attlen == -1 &&
         VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(val)))
{
    /*
     * we want to flatten the expanded value so that the constructed
     * tuple doesn't depend on it
     */
    data_length = att_align_nominal(data_length, atti->attalign);
    data_length += EOH_get_flat_size(DatumGetEOHP(val));
}
else
{
    data_length = att_align_datum(data_length, atti->attalign,
                                  atti->attlen, val);
    data_length = att_addlength_datum(data_length, atti->attlen,
                                      val);
}
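The test behind branch one boils down to simple size arithmetic, sketched below. The helpers only mirror the size part of VARATT_CAN_MAKE_SHORT and VARATT_CONVERTED_SHORT_SIZE; they do not inspect header bits, and the `sketch_*` and `SKETCH_*` names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_VARHDRSZ        4    /* size of a 4B varlena header */
#define SKETCH_VARHDRSZ_SHORT  1    /* size of a 1B varlena header */
#define SKETCH_SHORT_MAX       0x7F /* largest total size a 1B header encodes */

/* Total size the value would have after swapping its 4B header for a 1B one. */
size_t sketch_converted_short_size(size_t size_with_4b_header)
{
    assert(size_with_4b_header >= SKETCH_VARHDRSZ);
    return size_with_4b_header - SKETCH_VARHDRSZ + SKETCH_VARHDRSZ_SHORT;
}

/* True when a 4B-header value is small enough to fit under a 1B header. */
bool sketch_can_make_short(size_t size_with_4b_header)
{
    return sketch_converted_short_size(size_with_4b_header) <= SKETCH_SHORT_MAX;
}
```

For example, 'mylen=7' occupies 11 bytes with a 4B header (4 + 7), so `sketch_converted_short_size(11)` yields 8, which is the length added for the varchar column.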
For the five columns of test data:
int: branch three (length 4)
(gdb) p atti->attlen
$30 = 4
(gdb) p atti->attstorage
$31 = 112 'p'
Computation
// Step 1: align data_length = 0; it is still 0 after alignment
data_length = att_align_datum(data_length, atti->attalign,
                              atti->attlen, val);
// Step 2: add atti->attlen, giving data_length = 4
data_length = att_addlength_datum(data_length, atti->attlen,
                                  val);
The length increases by 4
varchar: branch one (length 8)
(gdb) p atti->attlen
$38 = -1
(gdb) p atti->attstorage
$39 = 120 'x'
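Computation
// The value fits under a 1B header; the 4B header will be converted to 1B later, so the length is counted with a 1B header here
data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val))
The length increases by 8: a 1B header plus the 7 bytes of 'mylen=7'
numeric: branch one (length 7)
char: branch one (length 3)
1B header plus 2 bytes of data, 3 bytes in total
text: branch one (length 8)
1B header plus 7 bytes of data, 8 bytes in total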
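To check that the per-column lengths add up to the 30 bytes reported in 3.2, here is a simplified model of the loop. It covers only the short-header branch and the align-then-add branch; the expanded-object branch is deliberately left out. `SketchColumn` and the `sketch_*` functions are hypothetical, and the 4B-header sizes in the usage note are inferred from the lengths stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified column description for this sketch. */
typedef struct SketchColumn
{
    int    attlen;    /* > 0: fixed length, -1: varlena */
    size_t align;     /* alignment in bytes (1, 2, 4 or 8) */
    bool   packable;  /* varlena whose storage is not 'p' */
    size_t size_4b;   /* varlena only: total size with its 4B header */
} SketchColumn;

/* Round off up to the next multiple of align. */
size_t sketch_align_up(size_t off, size_t align)
{
    return (off + align - 1) / align * align;
}

size_t sketch_compute_data_size(const SketchColumn *cols, int ncols)
{
    size_t data_length = 0;

    for (int i = 0; i < ncols; i++)
    {
        const SketchColumn *c = &cols[i];

        if (c->attlen == -1 && c->packable && c->size_4b - 4 + 1 <= 0x7F)
        {
            /* branch one: counted with a 1B header, no alignment padding */
            data_length += c->size_4b - 4 + 1;
        }
        else if (c->attlen == -1)
        {
            /* branch three, varlena keeping its 4B header: align, then add */
            data_length = sketch_align_up(data_length, c->align);
            data_length += c->size_4b;
        }
        else
        {
            /* branch three, fixed length: align, then add attlen */
            assert(c->attlen > 0);
            data_length = sketch_align_up(data_length, c->align);
            data_length += (size_t) c->attlen;
        }
    }
    return data_length;
}
```

Feeding it the five test columns, described as {4, 4, false, 0}, {-1, 4, true, 11}, {-1, 4, true, 10}, {-1, 4, true, 6} and {-1, 4, true, 11}, gives 4 + 8 + 7 + 3 + 8 = 30.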
3.4 heap_fill_tuple
heap_fill_tuple calls fill_val on each column to write its data:
heap_fill_tuple
    for (i = 0; i < numberOfAttributes; i++)
        fill_val(...)
fill_val has more branches. Every column goes through one of the following four top-level branches:
if (att->attbyval)
{
    /* pass-by-value */
    data = (char *) att_align_nominal(data, att->attalign);
    store_att_byval(data, datum, att->attlen);
    data_length = att->attlen;
}
else if (att->attlen == -1)
{
    /* varlena */
    Pointer val = DatumGetPointer(datum);
    *infomask |= HEAP_HASVARWIDTH;
    if (VARATT_IS_EXTERNAL(val))
    {
        if (VARATT_IS_EXTERNAL_EXPANDED(val))
        {
            /*
             * we want to flatten the expanded value so that the
             * constructed tuple doesn't depend on it
             */
            ExpandedObjectHeader *eoh = DatumGetEOHP(datum);
            data = (char *) att_align_nominal(data,
                                              att->attalign);
            data_length = EOH_get_flat_size(eoh);
            EOH_flatten_into(eoh, data, data_length);
        }
        else
        {
            *infomask |= HEAP_HASEXTERNAL;
            /* no alignment, since it's short by definition */
            data_length = VARSIZE_EXTERNAL(val);
            memcpy(data, val, data_length);
        }
    }
    else if (VARATT_IS_SHORT(val))
    {
        /* no alignment for short varlenas */
        data_length = VARSIZE_SHORT(val);
        memcpy(data, val, data_length);
    }
    else if (VARLENA_ATT_IS_PACKABLE(att) &&
             VARATT_CAN_MAKE_SHORT(val))
    {
        /* convert to short varlena -- no alignment */
        data_length = VARATT_CONVERTED_SHORT_SIZE(val);
        SET_VARSIZE_SHORT(data, data_length);
        memcpy(data + 1, VARDATA(val), data_length - 1);
    }
    else
    {
        /* full 4-byte header varlena */
        data = (char *) att_align_nominal(data,
                                          att->attalign);
        data_length = VARSIZE(val);
        memcpy(data, val, data_length);
    }
}
else if (att->attlen == -2)
{
    /* cstring ... never needs alignment */
    *infomask |= HEAP_HASVARWIDTH;
    Assert(att->attalign == TYPALIGN_CHAR);
    data_length = strlen(DatumGetCString(datum)) + 1;
    memcpy(data, DatumGetPointer(datum), data_length);
}
else
{
    /* fixed-length pass-by-reference */
    data = (char *) att_align_nominal(data, att->attalign);
    Assert(att->attlen > 0);
    data_length = att->attlen;
    memcpy(data, DatumGetPointer(datum), data_length);
}
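Branches:
- att->attbyval == true: the value is passed directly, so it is simply stored
- att->attlen == -1: a varlena type, handled according to its header kind (4B, 1B, or 1B_E)
- att->attlen == -2: a cstring, copied directly
- Otherwise: copied directly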
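The top-level dispatch depends on just two attribute properties, attbyval and attlen. A minimal sketch of that decision follows; the enum and function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum SketchFillBranch
{
    SKETCH_BYVAL,        /* attbyval: store the value itself */
    SKETCH_VARLENA,      /* attlen == -1: handled by header kind */
    SKETCH_CSTRING,      /* attlen == -2: copy up to and including '\0' */
    SKETCH_FIXED_BYREF   /* fixed-length pass-by-reference: copy attlen bytes */
} SketchFillBranch;

/* Pick the top-level branch in the same order as the if/else chain above. */
SketchFillBranch sketch_fill_branch(bool attbyval, int attlen)
{
    if (attbyval)
        return SKETCH_BYVAL;
    if (attlen == -1)
        return SKETCH_VARLENA;
    if (attlen == -2)
        return SKETCH_CSTRING;
    assert(attlen > 0);
    return SKETCH_FIXED_BYREF;
}
```

For the test table, `sketch_fill_branch(true, 4)` describes the int column and `sketch_fill_branch(false, -1)` describes the other four.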
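For the five columns of test data:
int: branch one, store by value
A pass-by-value datum carries the value in the Datum itself, so it is written directly with store_att_byval
varchar: branch two, the 4B header is converted to 1B and the data is then copied
The value is small enough that it does not need a 4B header, so it is stored with a 1B header
numeric: branch two, the 4B header is converted to 1B and the data is then copied
The value is small enough that it does not need a 4B header, so it is stored with a 1B header
char: branch two, the 4B header is converted to 1B and the data is then copied
The value is small enough that it does not need a 4B header, so it is stored with a 1B header
text: branch two, the 4B header is converted to 1B and the data is then copied
The value is small enough that it does not need a 4B header, so it is stored with a 1B header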
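The 4B-to-1B conversion that all four varlena columns go through can be sketched on plain byte buffers. The header encoding below (total size shifted left by 2 for a 4B header, total size shifted left by 1 with the low bit set for a 1B header) follows the little-endian layout; the bytes are written explicitly so the sketch behaves the same on any host. The `sketch_*` functions are hypothetical and are not the PostgreSQL macros themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build a 4B-header value in dst; header = total size << 2, little-endian. */
size_t sketch_make_4b(uint8_t *dst, const void *payload, size_t payload_len)
{
    uint32_t header = (uint32_t) (payload_len + 4) << 2;

    dst[0] = (uint8_t) (header & 0xFF);
    dst[1] = (uint8_t) ((header >> 8) & 0xFF);
    dst[2] = (uint8_t) ((header >> 16) & 0xFF);
    dst[3] = (uint8_t) ((header >> 24) & 0xFF);
    memcpy(dst + 4, payload, payload_len);
    return payload_len + 4;
}

/* Copy a 4B-header value from src to dst under a 1B header; returns new size. */
size_t sketch_pack_short(uint8_t *dst, const uint8_t *src)
{
    uint32_t header = (uint32_t) src[0] | ((uint32_t) src[1] << 8) |
                      ((uint32_t) src[2] << 16) | ((uint32_t) src[3] << 24);
    size_t   size_4b = header >> 2;
    size_t   size_1b = size_4b - 4 + 1;

    assert((src[0] & 0x03) == 0x00);          /* uncompressed 4B header */
    assert(size_4b >= 4 && size_1b <= 0x7F);  /* must fit under a 1B header */

    dst[0] = (uint8_t) ((size_1b << 1) | 0x01);  /* 1B header, low bit set */
    memcpy(dst + 1, src + 4, size_1b - 1);       /* payload only */
    return size_1b;
}
```

Packing 'hi12345' this way turns an 11-byte value into an 8-byte one, matching the length computed for the text column in 3.3.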