PostgreSQL Guide: Insider Exploration (Chapter 7: Heap Only Tuple and Index-Only Scans) - Notes
2022-06-13 04:46:00 【Shallow as the breeze CYF】
Chapter 7. Heap Only Tuple and Index-Only Scans
Contents
- [Chapter 7. Heap Only Tuple and Index-Only Scans](https://pg-internal.vonng.com/#/ch7)
This chapter introduces two features related to index scans: heap only tuple (HOT) and index-only scans.
7.1 Heap Only Tuple (HOT)
- With HOT, when a row is updated the new row version can be stored in the same table page as the old one, which makes efficient use of the pages of both the index and the table.
- HOT also reduces the need for vacuum processing.
First, Section 7.1.1 describes how a row is updated without HOT, to clarify the problem that HOT solves.
Next, Section 7.1.2 describes what HOT does.
7.1.1 Row Update Without HOT
Assume that the table tbl has two columns, id and data, and that id is the primary key of tbl.
testdb=# \d tbl
                Table "public.tbl"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           | not null |
 data   | text    |           |          |
Indexes:
    "tbl_pkey" PRIMARY KEY, btree (id)
The table tbl has 1000 tuples. The last tuple, whose id is 1000, is stored in the fifth page of the table. It is referenced by the corresponding index tuple, whose key is 1000 and whose tid is (5,1), as shown in Fig. 7.1(a).
Fig. 7.1 Row update without HOT

Consider how the last tuple is updated without HOT.
testdb=# UPDATE tbl SET data = 'B' WHERE id = 1000;
In this case, PostgreSQL inserts not only a new table tuple but also a new index tuple into the index page, as shown in Fig. 7.1(b). Inserting index tuples consumes space in the index pages, and both inserting and vacuuming index tuples are expensive. The purpose of HOT is to reduce this cost.
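The update without HOT can be pictured with a small toy model. This is only a sketch, not PostgreSQL code: the `Table` class, its `heap` and `index` dictionaries, and the new tuple's tid `(5, 2)` are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    # heap: maps a tid (page number, line pointer number) to a row version
    heap: dict = field(default_factory=dict)
    # index: maps a key to the tids of every tuple version carrying that key
    index: dict = field(default_factory=dict)

    def insert(self, tid, key, data):
        self.heap[tid] = (key, data)
        self.index.setdefault(key, []).append(tid)

    def update_without_hot(self, key, new_tid, data):
        """Write a new tuple version and a new index tuple pointing at it."""
        self.heap[new_tid] = (key, data)
        self.index[key].append(new_tid)  # the extra index tuple that HOT avoids


tbl = Table()
tbl.insert((5, 1), 1000, "A")              # index tuple: key 1000, tid (5,1)
tbl.update_without_hot(1000, (5, 2), "B")  # hypothetical tid for the new version
print(tbl.index[1000])  # [(5, 1), (5, 2)]
```

After the update the index holds two entries for the same key, one per tuple version, and both must eventually be cleaned up.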
7.1.2 How HOT Works
When a row is updated with HOT and **the updated tuple is stored in the same table page as the old tuple, PostgreSQL does not insert a corresponding index tuple. Instead, it sets the HEAP_HOT_UPDATED bit on the old tuple and the HEAP_ONLY_TUPLE bit on the new tuple; both bits are stored in the t_infomask2 field of the tuple**, as shown in Figs. 7.2 and 7.3.
Fig. 7.2 Row update with HOT


For example, in this case Tuple_1 is marked HEAP_HOT_UPDATED and Tuple_2 is marked HEAP_ONLY_TUPLE.
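As a rough illustration of how the two bits sit in t_infomask2, here is a minimal sketch. The bit values are the ones defined in PostgreSQL's `src/include/access/htup_details.h`; the helper functions and the starting value of 2 (the attribute count, held in the low 11 bits) are assumptions made for this example.

```python
# Bit values as defined in PostgreSQL's src/include/access/htup_details.h.
HEAP_NATTS_MASK = 0x07FF   # low 11 bits: number of attributes
HEAP_HOT_UPDATED = 0x4000  # set on the old tuple version
HEAP_ONLY_TUPLE = 0x8000   # set on the new tuple version (no index tuple)


def is_hot_updated(t_infomask2: int) -> bool:
    return bool(t_infomask2 & HEAP_HOT_UPDATED)


def is_heap_only(t_infomask2: int) -> bool:
    return bool(t_infomask2 & HEAP_ONLY_TUPLE)


# Hypothetical starting values: tbl has two columns, so natts = 2.
tuple_1 = 2
tuple_2 = 2

tuple_1 |= HEAP_HOT_UPDATED  # old version: a newer version exists on this page
tuple_2 |= HEAP_ONLY_TUPLE   # new version: reachable only through the old one

print(hex(tuple_1), hex(tuple_2))  # 0x4002 0x8002
```

Setting the flags leaves the attribute count in the low bits untouched, which is why both pieces of information can share one field.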
In addition, the HEAP_HOT_UPDATED and HEAP_ONLY_TUPLE bits are used by the **pruning and defragmentation** processes described below.
The following describes how PostgreSQL reaches a HOT-updated tuple through an index scan just after the tuple has been updated with HOT, as shown in Fig. 7.4(a).
Fig. 7.4 Pruning of line pointers

- Find the index tuple that points to the target tuple.
- Access line pointer [1], which the index tuple points to.
- Read Tuple_1.
- Read Tuple_2 through the t_ctid field of Tuple_1.
In this case, PostgreSQL reads two tuples, Tuple_1 and Tuple_2, and decides which one is visible using the concurrency control mechanism described in Chapter 5. A problem arises, however, if the **dead tuples** in the table page have been removed. For example, in Fig. 7.4(a), if Tuple_1 is removed because it is a dead tuple, Tuple_2 can no longer be reached from the index.
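The four steps above can be sketched as a walk along the update chain. This is a toy model, not PostgreSQL's implementation: `HeapTuple`, the boolean `visible` flag (a stand-in for the Chapter 5 visibility check), and the use of `None` to end the chain are simplifications assumed for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeapTuple:
    name: str
    visible: bool          # stand-in for the real visibility check
    hot_updated: bool      # models the HEAP_HOT_UPDATED bit
    t_ctid: Optional[int]  # line pointer number of the next version, same page


def fetch_via_index(page: dict, lp: Optional[int]) -> Optional[str]:
    """Start at the line pointer the index tuple references and walk the chain."""
    while lp is not None:
        tup = page[lp]
        if tup.visible:
            return tup.name
        # Only a HOT-updated tuple has a successor on the same page.
        lp = tup.t_ctid if tup.hot_updated else None
    return None


# Page 5 right after the HOT update: line pointers [1] and [2].
page_5 = {
    1: HeapTuple("Tuple_1", visible=False, hot_updated=True, t_ctid=2),
    2: HeapTuple("Tuple_2", visible=True, hot_updated=False, t_ctid=None),
}
print(fetch_via_index(page_5, 1))  # Tuple_2
```

If entry 1 were simply deleted from `page_5`, the lookup from the index would fail, which is the problem described in the paragraph above.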
To solve this problem, PostgreSQL redirects the line pointer at an appropriate time: the line pointer that points to the old tuple is redirected to the line pointer that points to the new tuple. In PostgreSQL this process is called pruning. Fig. 7.4(b) illustrates how PostgreSQL accesses the updated tuple after pruning.
- Find the index tuple.
- Access line pointer [1], which the index tuple points to.
- Follow the redirected line pointer [1] to line pointer [2].
- Read Tuple_2 through line pointer [2].
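The lookup after pruning can be sketched in the same spirit. The `("normal", ...)` and `("redirect", ...)` pairs below are an assumed toy representation of line pointer states, not PostgreSQL's on-disk format.

```python
def prune(line_pointers: dict, old_lp: int, new_lp: int) -> None:
    """Turn the old tuple's line pointer into a redirect to the new one."""
    line_pointers[old_lp] = ("redirect", new_lp)


def resolve(line_pointers: dict, lp: int) -> str:
    """Follow redirects until a line pointer that holds a tuple is reached."""
    kind, target = line_pointers[lp]
    while kind == "redirect":
        kind, target = line_pointers[target]
    return target


# Before pruning, each line pointer refers to its own tuple.
line_pointers = {1: ("normal", "Tuple_1"), 2: ("normal", "Tuple_2")}

prune(line_pointers, old_lp=1, new_lp=2)

# The index tuple still points to line pointer [1]; the redirect leads to [2].
print(resolve(line_pointers, 1))  # Tuple_2
```

The index is never touched: only the line pointer inside the table page changes, so the existing index tuple keeps working.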
Pruning can take place whenever an SQL command such as SELECT, UPDATE, INSERT, or DELETE is executed.
When pruning, PostgreSQL also removes dead tuples if possible, at an appropriate time. In PostgreSQL this operation is called defragmentation. Fig. 7.5 depicts defragmentation under HOT.
Fig. 7.5 Defragmentation of dead tuples

Note that because defragmentation does not involve removing index tuples, it costs much less than regular vacuum processing.
Thus HOT reduces the space consumed by both indexes and tables. It also reduces the number of index tuples that updates must insert and the number of tuples that vacuum processing must handle, which improves performance.
Cases where HOT is not available
To make clear how HOT works, here are the cases in which HOT is not available.
- When the updated tuple is stored in another page, that is, **not in the same table page** as the old tuple, an index tuple pointing to the new tuple is also inserted into the index page, as shown in Fig. 7.6(a).
- When the key value of an index tuple is updated, a new index tuple is inserted into the index page, as shown in Fig. 7.6(b).
Fig. 7.6 Cases where HOT is not available
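The two cases can be summarised as a simple predicate. This is a deliberately simplified sketch: the function name, its parameters, and the byte sizes are hypothetical, and the real decision in PostgreSQL involves more detail than these two checks.

```python
def can_use_hot(free_space_on_page: int, new_tuple_size: int,
                changed_columns: set, indexed_columns: set) -> bool:
    """Return True when neither of the two cases above applies."""
    # Case (a): the new version must fit in the page that holds the old one.
    fits_on_same_page = new_tuple_size <= free_space_on_page
    # Case (b): no column that belongs to an index key may change.
    index_key_unchanged = not (changed_columns & indexed_columns)
    return fits_on_same_page and index_key_unchanged


indexed = {"id"}  # tbl_pkey covers only the id column

print(can_use_hot(200, 64, {"data"}, indexed))  # True
print(can_use_hot(10, 64, {"data"}, indexed))   # False
print(can_use_hot(200, 64, {"id"}, indexed))    # False
```

The first call models `UPDATE tbl SET data = 'B' WHERE id = 1000` with room left on the page; the other two correspond to Fig. 7.6(a) and Fig. 7.6(b).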
The pg_stat_all_tables view provides statistics for each table. You can also refer to this extension.
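One statistic worth deriving from that view is the share of updates that were HOT updates. The sketch below assumes the counters have already been fetched; `n_tup_upd` and `n_tup_hot_upd` are real columns of pg_stat_all_tables, while the helper function and the sample numbers are made up for illustration.

```python
# A query one might run to obtain the counters for the example table.
QUERY = (
    "SELECT relname, n_tup_upd, n_tup_hot_upd "
    "FROM pg_stat_all_tables WHERE relname = 'tbl';"
)


def hot_update_ratio(n_tup_upd: int, n_tup_hot_upd: int) -> float:
    """Fraction of row updates that were HOT updates (0.0 if no updates yet)."""
    if n_tup_upd == 0:
        return 0.0
    return n_tup_hot_upd / n_tup_upd


# Hypothetical counter values, not measurements.
print(hot_update_ratio(200, 150))  # 0.75
```

A low ratio on a heavily updated table suggests that updates often fall into one of the cases where HOT is not available.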
7.2 Index-Only Scans
**When all the target columns of a SELECT statement** are included in the index key, an index-only scan (also called index-only access) uses the key values in the index directly in order to reduce I/O cost. This technique is provided by all commercial relational databases, such as DB2 and Oracle. PostgreSQL introduced it in version 9.2.
Next, using a specific example, we describe how index-only scans work in PostgreSQL.
First, the assumptions of the example:
- Table definition: we have a table tbl, defined as follows:
testdb=# \d tbl
      Table "public.tbl"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
 name   | text    |
 data   | text    |
Indexes:
    "tbl_idx" btree (id, name)
- Index: the table tbl has an index tbl_idx composed of two columns, id and name.
- Tuples: some tuples have been inserted into tbl. Tuple_18, with id=18 and name='Queen', is stored in page 0. Tuple_19, with id=19 and name='Boston', is stored in page 1.
- Visibility: **all tuples in page 0 are always visible; the tuples in page 1 are not always visible. Note that the visibility of each page is stored in the corresponding visibility map (VM)**; for a description of the visibility map, see Section 6.2.
Let's look at how PostgreSQL reads tuples when the following SELECT statement is executed.
testdb=# SELECT id, name FROM tbl WHERE id BETWEEN 18 and 19;
id | name
----+--------
18 | Queen
19 | Boston
(2 rows)
The query reads two columns of the table, id and name, and the index tbl_idx is composed of exactly these columns. At first glance, then, an index scan would seem to make access to the table pages unnecessary, because the index tuples already contain the required data. **In principle, however, PostgreSQL has to check the visibility of the tuples**, and index tuples carry no transaction information about the heap tuples, such as t_xmin and t_xmax (see Chapter 5). PostgreSQL therefore has to access the table data to check the visibility of the data in the index tuples, which is like putting the cart before the horse.
To resolve this dilemma, PostgreSQL uses the visibility map of the target table. If all tuples stored in a page are visible, PostgreSQL uses the key of the index tuple and does not access the table page that the index tuple points to in order to check visibility. Otherwise, PostgreSQL reads the table tuple that the index tuple points to and checks its visibility, which is the ordinary process.
In this example, **Tuple_18 need not be accessed, because page 0, which stores Tuple_18, is marked as visible, so every tuple in page 0, including Tuple_18, is visible. In contrast, Tuple_19 has to be accessed to apply the concurrency control check, because page 1 is not marked as visible.**
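The decision made for each index tuple can be sketched as follows. This is a toy model under stated assumptions: the function, the dictionary used as a visibility map, the line pointer numbers in the tids, and the `heap_visible` callback (a stand-in for the real visibility check) are all hypothetical.

```python
def index_only_scan(index_entries, visibility_map, heap_visible):
    """
    index_entries:  list of (key_columns, (page, line_pointer)) from the index
    visibility_map: page number -> True if the page is marked as visible
    heap_visible:   callable(tid) -> bool, stand-in for the heap visibility check
    """
    rows, heap_fetches = [], 0
    for key, tid in index_entries:
        page = tid[0]
        if visibility_map.get(page, False):
            rows.append(key)    # use the index key, skip the table page
        else:
            heap_fetches += 1   # must visit the table page
            if heap_visible(tid):
                rows.append(key)
    return rows, heap_fetches


# Tuple_18 lives in page 0, Tuple_19 in page 1 (line pointer numbers assumed).
entries = [((18, "Queen"), (0, 1)), ((19, "Boston"), (1, 1))]
vm = {0: True, 1: False}

rows, heap_fetches = index_only_scan(entries, vm, lambda tid: True)
print(rows, heap_fetches)  # [(18, 'Queen'), (19, 'Boston')] 1
```

Only one of the two rows costs a table access here; the more pages the visibility map marks as visible, the closer the scan gets to reading the index alone.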
Fig. 7.7 How index-only scans work

