PostgreSQL Source Code (58): Dissecting Tuple Assembly in heap_form_tuple
2022-07-01 22:57:00 【mingjie73】
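Version: 14
Related:
"PostgreSQL Source Code (51): Variable-Length Type Implementation (varlena.c)"
"PostgreSQL Source Code (56): Analysis of Expandable Types ExpandedObject/ExpandedRecord"
2 Background
- In PG a tuple can be represented in two formats: the expanded format (convenient for computation) and the flattened format (convenient for storage)
- The earlier post "PostgreSQL Source Code (56): Analysis of Expandable Types ExpandedObject/ExpandedRecord" described the expanded format of tuples
- This post covers the more general flattened format of tuples, HeapTupleData
- The expanded and flattened formats can be converted into each other (via the flatten_into function pointer; see PostgreSQL Source Code (56))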
typedef struct HeapTupleData
{
    uint32          t_len;      /* length of *t_data */
    ItemPointerData t_self;     /* SelfItemPointer */
    Oid             t_tableOid; /* table the tuple came from */
    HeapTupleHeader t_data;     /* -> tuple header and data */
} HeapTupleData;
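- Judging by t_len, this is evidently a length-prefixed, variable-length structure in the style of a 4B-header varlena (see "PostgreSQL Source Code (51): Variable-Length Type Implementation (varlena.c)"); variable-length types that use a 4B header follow PG's internal convention.
3 heap_form_tuple, the constructor of HeapTuple
The HeapTuple structure is assembled in heap_form_tuple, which the rest of this post analyzes in detail:
The example inserts one row with five columns: one fixed-length column (int) and four variable-length (varlena) columns. numeric and char(2) are stored as varlena too; as 3.3 shows, they take the same branch as varchar and text.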
drop table t21;
create table t21(i1 int, v10 varchar(10), n1 numeric, c2 char(2), t1 text);
insert into t21 values (1, 'mylen=7', 5.5, '22', 'hi12345');
3.1 Arguments of heap_form_tuple
The constructor heap_form_tuple:
HeapTuple
heap_form_tuple(TupleDesc tupleDescriptor, Datum *values, bool *isnull)
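Note that the arguments are a tuple descriptor, a values array, and an isnull array. Each entry of the values array holds either the value itself (for example an int) or a pointer to the datum's data.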
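To make the "value or pointer" distinction concrete, here is a minimal standalone sketch. It does not use PostgreSQL headers: `Datum` is modelled as a pointer-sized integer, and every `sketch_` name is hypothetical. In real PG the pointer for a varchar column refers to a varlena structure, not to a plain C string as assumed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for PostgreSQL's Datum: a pointer-sized integer. */
typedef uintptr_t Datum;

/* Pass-by-value: the integer itself travels inside the Datum. */
static Datum sketch_int32_get_datum(int32_t v) { return (Datum) v; }
static int32_t sketch_datum_get_int32(Datum d) { return (int32_t) d; }

/* Pass-by-reference: the Datum only carries the address of the data. */
static Datum sketch_pointer_get_datum(const void *p) { return (Datum) p; }
static void *sketch_datum_get_pointer(Datum d) { return (void *) d; }

/*
 * Fill a two-column values/isnull pair the way a caller of
 * heap_form_tuple would: column 0 by value, column 1 by reference.
 */
static void sketch_fill_args(Datum *values, bool *isnull,
                             int32_t i1, const char *v10)
{
    values[0] = sketch_int32_get_datum(i1);
    isnull[0] = false;
    values[1] = sketch_pointer_get_datum(v10);
    isnull[1] = false;
}
```

Reading the slots back with `sketch_datum_get_int32(values[0])` and `sketch_datum_get_pointer(values[1])` mirrors what the gdb session prints: a small integer for the int column and an address for the others.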
(gdb) p *tupleDescriptor
$9 = {natts = 5, tdtypeid = 2249, tdtypmod = -1, tdrefcount = -1, constr = 0x0, attrs = 0x199ce90}
(gdb) p values[0]
$11 = 1 : the int value itself
(gdb) p values[1]
$12 = 27157600 : pointer to the datum's data
(gdb) p values[2]
$13 = 27153160 : pointer to the datum's data
(gdb) p values[3]
$14 = 27158432 : pointer to the datum's data
(gdb) p values[4]
$15 = 27154592 : pointer to the datum's data
(gdb) p isnull[0]
$17 = false
(gdb) p isnull[1]
$18 = false
(gdb) p isnull[2]
$19 = false
(gdb) p isnull[3]
$20 = false
(gdb) p isnull[4]
$21 = false
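3.2 Execution flow of heap_form_tuple
- Note: hoff is the offset from the start of HeapTupleHeaderData to the start of the column data
- Note: tuple->t_data points HEAPTUPLESIZE bytes past the start of HeapTupleData, which is where the HeapTupleHeaderData header begins
- The memory layout is:
HeapTupleData + HeapTupleHeaderData + data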
heap_form_tuple
...
len = offsetof(HeapTupleHeaderData, t_bits) : size of the fixed header, len = 23; t_bits is a flexible array member
hoff = len = MAXALIGN(len); : after alignment, hoff = len = 24
data_len = heap_compute_data_size(...) : length needed for the data (see 3.3), data_len = 30 bytes in total
len += data_len; : len = 24 + 30 = 54
tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len) : allocate HeapTupleData + HeapTupleHeaderData + 30 bytes of data
tuple->t_data = td = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE)
: t_data points just past HeapTupleData, at the start of the HeapTupleHeaderData header
...
// set the fields of tuple
...
heap_fill_tuple : append the data according to each column's type, see 3.4
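The length arithmetic above can be replayed in a few lines of standalone C. This is only a sketch under stated assumptions: 8-byte maximum alignment, a 23-byte fixed header as reported above, and a 24-byte HeapTupleData (typical for a 64-bit build). The `null_bitmap_bytes` parameter stands for the null bitmap that the real function adds when a column is NULL; it is 0 for this row. All `SKETCH_`/`sketch_` names are hypothetical.

```c
#include <stddef.h>

#define SKETCH_MAXIMUM_ALIGNOF 8  /* assumption: typical 64-bit build */

/* Values taken from the walkthrough; the second one is an assumption. */
enum
{
    SKETCH_HEADER_FIXED = 23,     /* offsetof(HeapTupleHeaderData, t_bits) */
    SKETCH_HEAPTUPLEDATA = 24     /* sizeof(HeapTupleData) on 64-bit */
};

typedef struct
{
    size_t hoff;   /* offset from the tuple header to the column data */
    size_t len;    /* header plus data, the value stored in t_len */
    size_t alloc;  /* bytes requested from the allocator */
} SketchTupleSizes;

/* Round len up to the next multiple of SKETCH_MAXIMUM_ALIGNOF. */
static size_t sketch_maxalign(size_t len)
{
    size_t mask = SKETCH_MAXIMUM_ALIGNOF - 1;

    return (len + mask) & ~mask;
}

/* Replay the size computation for a given bitmap size and data length. */
static SketchTupleSizes sketch_tuple_sizes(size_t null_bitmap_bytes,
                                           size_t data_len)
{
    SketchTupleSizes s;

    s.hoff = sketch_maxalign(SKETCH_HEADER_FIXED + null_bitmap_bytes);
    s.len = s.hoff + data_len;
    s.alloc = sketch_maxalign(SKETCH_HEAPTUPLEDATA) + s.len;
    return s;
}
```

For the example row, `sketch_tuple_sizes(0, 30)` gives hoff = 24 and len = 54, matching the numbers above; under the stated assumptions the allocation is 24 + 54 = 78 bytes.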
3.3 heap_compute_data_size
heap_compute_data_size computes the data length. Take the following SQL as an example:
drop table t21;
create table t21(i1 int, v10 varchar(10), n1 numeric, c2 char(2), t1 text);
insert into t21 values (1, 'mylen=7', 5.5, '22', 'hi12345');
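The function handles each column separately; the main logic takes one of three branches:
3.3.1 Entry conditions of the three branches
Branch one: atti->attlen == -1, atti->attstorage != 'p', the value currently has an uncompressed 4B header, and the data is short enough to fit a 1B header
Branch two: atti->attlen == -1, the value currently has a 1B_E header, and its tag is an expanded type (VARTAG_EXPANDED_RO or VARTAG_EXPANDED_RW)
Branch three: everything else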
if (ATT_IS_PACKABLE(atti) &&
    VARATT_CAN_MAKE_SHORT(DatumGetPointer(val)))
{
    /*
     * we're anticipating converting to a short varlena header, so
     * adjust length and don't count any alignment
     */
    data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val));
}
else if (atti->attlen == -1 &&
         VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(val)))
{
    /*
     * we want to flatten the expanded value so that the constructed
     * tuple doesn't depend on it
     */
    data_length = att_align_nominal(data_length, atti->attalign);
    data_length += EOH_get_flat_size(DatumGetEOHP(val));
}
else
{
    data_length = att_align_datum(data_length, atti->attalign,
                                  atti->attlen, val);
    data_length = att_addlength_datum(data_length, atti->attlen,
                                      val);
}
For the five columns of test data:
int: takes branch three (length 4)
(gdb) p atti->attlen
$30 = 4
(gdb) p atti->attstorage
$31 = 112 'p'
Calculation:
// Step 1: align data_length = 0; it is still 0 after alignment
data_length = att_align_datum(data_length, atti->attalign,
                              atti->attlen, val);
// Step 2: add the length atti->attlen, giving data_length = 4
data_length = att_addlength_datum(data_length, atti->attlen,
                                  val);
The length grows by 4.
varchar: takes branch one (length 8)
(gdb) p atti->attlen
$38 = -1
(gdb) p atti->attstorage
$39 = 120 'x'
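Calculation:
// A 1B header is enough here; the 4B header is converted to a 1B header later, so count the length with a 1B header
data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val));
The length grows by 8: a 1B header plus the 7 bytes of 'mylen=7'.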
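The short-header arithmetic can be sketched as follows. This is a simplified model and not the PG macros themselves: it works on a plain total size instead of a varlena pointer, and it assumes the value already carries an uncompressed 4B header (the real VARATT_CAN_MAKE_SHORT also checks that). The `SKETCH_`/`sketch_` names are hypothetical; the constants mirror VARHDRSZ, VARHDRSZ_SHORT and VARATT_SHORT_MAX.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_VARHDRSZ       4     /* size of a 4-byte varlena header */
#define SKETCH_VARHDRSZ_SHORT 1     /* size of a 1-byte varlena header */
#define SKETCH_SHORT_MAX      0x7F  /* largest total size a 1B header encodes */

/*
 * total_4b is the full size of a 4B-header value, header included.
 * The result is the size the same value occupies with a 1B header.
 */
static size_t sketch_converted_short_size(size_t total_4b)
{
    return total_4b - SKETCH_VARHDRSZ + SKETCH_VARHDRSZ_SHORT;
}

/* The conversion is possible only if the short form fits in 7 bits. */
static bool sketch_can_make_short(size_t total_4b)
{
    return sketch_converted_short_size(total_4b) <= SKETCH_SHORT_MAX;
}
```

For 'mylen=7' the 4B form is 4 + 7 = 11 bytes, so `sketch_converted_short_size(11)` is 8, the amount added to data_length above. A payload of 127 bytes or more no longer fits and keeps its 4B header.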
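numeric: takes branch one (length 7)
char: takes branch one (length 3)
A 1B header plus 2 bytes of data, 3 bytes in total
text: takes branch one (length 8)
A 1B header plus 7 bytes of data, 8 bytes in total
Summing the five columns gives 4 + 8 + 7 + 3 + 8 = 30, which matches data_len in 3.2.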
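The per-column totals can be checked with a small model of the accumulation loop. It is a sketch, not heap_compute_data_size itself: each column is described only by its alignment, the bytes it occupies, and whether it is stored with a 1B header (short values skip alignment). The struct and function names are hypothetical, and the 4-byte alignment for the varlena columns is an assumption based on the usual 'i' alignment of these types.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct
{
    size_t attalign;      /* required alignment in bytes */
    size_t stored_size;   /* bytes occupied in the tuple, header included */
    bool   short_header;  /* stored with a 1B header, so never aligned */
} SketchColumn;

/* Round off up to the next multiple of align. */
static size_t sketch_align_up(size_t off, size_t align)
{
    return (off + align - 1) / align * align;
}

/* Accumulate the data length column by column, padding where needed. */
static size_t sketch_compute_data_size(const SketchColumn *cols, int ncols)
{
    size_t data_length = 0;

    for (int i = 0; i < ncols; i++)
    {
        if (!cols[i].short_header)
            data_length = sketch_align_up(data_length, cols[i].attalign);
        data_length += cols[i].stored_size;
    }
    return data_length;
}
```

With the five columns described as {4, 4, false}, {4, 8, true}, {4, 7, true}, {4, 3, true}, {4, 8, true}, the function returns 30. If the last column kept a 4B header (11 bytes), it would first be padded from offset 22 to 24, giving 35, which shows why the short form saves more than just three header bytes.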
3.4 heap_fill_tuple
heap_fill_tuple calls fill_val for each column to write its data:
heap_fill_tuple
    for (i = 0; i < numberOfAttributes; i++)
        fill_val(...)
fill_val has more branches; every column goes through one of the four branches below:
if (att->attbyval)
{
    /* pass-by-value */
    data = (char *) att_align_nominal(data, att->attalign);
    store_att_byval(data, datum, att->attlen);
    data_length = att->attlen;
}
else if (att->attlen == -1)
{
    /* varlena */
    Pointer val = DatumGetPointer(datum);

    *infomask |= HEAP_HASVARWIDTH;
    if (VARATT_IS_EXTERNAL(val))
    {
        if (VARATT_IS_EXTERNAL_EXPANDED(val))
        {
            /*
             * we want to flatten the expanded value so that the
             * constructed tuple doesn't depend on it
             */
            ExpandedObjectHeader *eoh = DatumGetEOHP(datum);

            data = (char *) att_align_nominal(data,
                                              att->attalign);
            data_length = EOH_get_flat_size(eoh);
            EOH_flatten_into(eoh, data, data_length);
        }
        else
        {
            *infomask |= HEAP_HASEXTERNAL;
            /* no alignment, since it's short by definition */
            data_length = VARSIZE_EXTERNAL(val);
            memcpy(data, val, data_length);
        }
    }
    else if (VARATT_IS_SHORT(val))
    {
        /* no alignment for short varlenas */
        data_length = VARSIZE_SHORT(val);
        memcpy(data, val, data_length);
    }
    else if (VARLENA_ATT_IS_PACKABLE(att) &&
             VARATT_CAN_MAKE_SHORT(val))
    {
        /* convert to short varlena -- no alignment */
        data_length = VARATT_CONVERTED_SHORT_SIZE(val);
        SET_VARSIZE_SHORT(data, data_length);
        memcpy(data + 1, VARDATA(val), data_length - 1);
    }
    else
    {
        /* full 4-byte header varlena */
        data = (char *) att_align_nominal(data,
                                          att->attalign);
        data_length = VARSIZE(val);
        memcpy(data, val, data_length);
    }
}
else if (att->attlen == -2)
{
    /* cstring ... never needs alignment */
    *infomask |= HEAP_HASVARWIDTH;
    Assert(att->attalign == TYPALIGN_CHAR);
    data_length = strlen(DatumGetCString(datum)) + 1;
    memcpy(data, DatumGetPointer(datum), data_length);
}
else
{
    /* fixed-length pass-by-reference */
    data = (char *) att_align_nominal(data, att->attalign);
    Assert(att->attlen > 0);
    data_length = att->attlen;
    memcpy(data, DatumGetPointer(datum), data_length);
}
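Branches:
- att->attbyval == true: the value is passed by value, so it is simply stored directly
- att->attlen == -1: a varlena type with a variable-length header, handled separately for the 4B, 1B, and 1B_E cases
- att->attlen == -2: a cstring, copied directly
- otherwise: copied directly
For the five columns of test data:
int: takes branch one, a value copy
A pass-by-value datum carries the value in the Datum itself, so it is stored with a direct assignment
varchar: takes branch two, the 4B header is converted to 1B and the data is memory-copied
The data is small enough that a 4B header is unnecessary; it is converted to a 1B header and then copied
numeric: takes branch two, the 4B header is converted to 1B and the data is memory-copied
The data is small enough that a 4B header is unnecessary; it is converted to a 1B header and then copied
char: takes branch two, the 4B header is converted to 1B and the data is memory-copied
The data is small enough that a 4B header is unnecessary; it is converted to a 1B header and then copied
text: takes branch two, the 4B header is converted to 1B and the data is memory-copied
The data is small enough that a 4B header is unnecessary; it is converted to a 1B header and then copied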
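The 4B-to-1B conversion used by the four varlena columns can be sketched in isolation. The header encoding here is deliberately simplified and is an assumption of this sketch: the 4B header is a native uint32 holding the total length and the 1B header is a single byte holding the total length, whereas real PG packs flag bits into both headers. All `SKETCH_`/`sketch_` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HDR4 4  /* bytes in the simplified 4B header */
#define SKETCH_HDR1 1  /* bytes in the simplified 1B header */

/* Build a 4B-header value in buf; returns its total size. */
static size_t sketch_make_4b(unsigned char *buf, const char *payload, size_t n)
{
    uint32_t total = (uint32_t) (n + SKETCH_HDR4);

    memcpy(buf, &total, SKETCH_HDR4);
    memcpy(buf + SKETCH_HDR4, payload, n);
    return total;
}

/*
 * Write val into data with a 1B header, like the "convert to short
 * varlena" branch: new header first, then only the payload bytes.
 * Returns the bytes written, or 0 if the value is too long to convert.
 */
static size_t sketch_fill_short(unsigned char *data, const unsigned char *val)
{
    uint32_t total;
    size_t   short_len;

    memcpy(&total, val, SKETCH_HDR4);
    short_len = total - SKETCH_HDR4 + SKETCH_HDR1;
    if (short_len > 0x7F)
        return 0;

    data[0] = (unsigned char) short_len;
    memcpy(data + SKETCH_HDR1, val + SKETCH_HDR4, short_len - SKETCH_HDR1);
    return short_len;
}
```

Building 'mylen=7' with `sketch_make_4b` yields 11 bytes, and `sketch_fill_short` writes 8: one header byte followed by the seven payload bytes. No alignment padding is inserted before the short value, which is why the columns pack tightly.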