当前位置:网站首页>PostgreSQL guide -- inside exploration Chapter 1 database clusters, databases and data tables
PostgreSQL guide -- inside exploration Chapter 1 database clusters, databases and data tables
2022-07-24 04:02:00 【Fisherman database notes】
This article refers to books 《postgresql guide – Inside exploration 》 Of
Chapter 1:Database Cluster, Databases and Tables
The online content of this book can be referred to :The Internals of PostgreSQL
This article is reproduced from : Database cluster 、 Databases and tables
Logical structure of database cluster
Database cluster (database cluster) It's a set of databases (database) Of aggregate , By a PostgreSQL Server management . Database clusters are different from highly available database clusters , It doesn't mean “ A set of database servers ”, One PostgreSQL The server will only run on a single machine and manage a single database cluster .
Physical structure of database cluster
Database cluster It is essentially a file directory , That is, the basic directory . perform initdb The basic directory will be created under the specified directory , So as to initialize a new database cluster .
PostgreSQL The tablespace in corresponds to a directory containing data outside the base directory .
The main files and subdirectories are as follows :
| file | describe |
|---|---|
| PG_VERSION | contain PostgreSQL The major version number |
| pg_hba.conf | control PostgreSQL Client side authentication |
| pg_ident.conf | control PostgreSQL user name mapping |
| postgresql.conf | Save the configuration parameters related to the database |
| postgresql.auto.conf | Storage use ALTER SYSTEM Modified configuration parameters (version 9.4 or later) |
| postmaster.opts | Record the command line options that the server last started , for example :/home/postgres/pgsql/bin/postgres “-D” “/pgdata/postgres” “-c” |
| base/ | The subdirectories corresponding to each database are stored here |
| global/ | Table of database cluster category ( for example pg_database) And pg_control |
| pg_commit_ts/ | Timestamp data of transaction submission (Version 9.5 or later) |
| pg_clog/ | Transaction commit status data (Version 9.6 or earlier). In version 10.0 Was renamed to pg_xact |
| pg_dynshmem/ | Files used in the dynamic shared memory subsystem (Version 9.4 or later) |
| pg_logical/ | Logically decoded status data (Version 9.4 or later) |
| pg_multixact/ | Multi transaction status data |
| pg_notify/ | LISTEN/NOTIFY Status data |
| pg_repslot/ | Copy slot data (Version 9.4 or later) |
| pg_serial/ | Information about committed serializable transactions (version 9.1 or later) |
| pg_snapshots/ | Export snapshot (version 9.2 or later).PostgreSQL function pg_export_snapshot Snapshot information files created in this subdirectory |
| pg_stat/ | Permanent file of statistical subsystem |
| pg_stat_tmp/ | Temporary files of statistical subsystem |
| pg_subtrans/ | Sub transaction status data |
| pg_tblspc/ | Symbolic links to tablespaces |
| pg_twophase/ | The state file of a two-phase transaction |
| pg_xlog/ | WAL Segment file (Version 9.6 or earlier), It's in version 10.0 Was renamed to pg_wal |
A database with base A subdirectory under the subdirectory corresponds to .
SELECT
datname,
oid
FROM
pg_database
ORDER BY
oid;
datname | oid
-----------+-------
template1 | 1
template0 | 13268
postgres | 13269
(3 rows)
[[email protected]:base]$ ls -l
total 12
drwx------ 2 postgres dba 4096 May 15 08:40 1
drwx------ 2 postgres dba 4096 May 15 08:40 13268
drwx------ 2 postgres dba 4096 May 19 08:23 13269
Layout of tables and index related files
Each less than 1GB The table or index of is stored as a single file in the corresponding database directory . These data files consist of variables relfilenode management .
SELECT
relname,
oid,
relfilenode
FROM
pg_class
WHERE
relname = 't_test';
relname | oid | relfilenode
---------+-------+-------------
t_test | 16447 | 16447
(1 row)
[[email protected]:13269]$ ls -l |grep 16447
-rw------- 1 postgres dba 0 May 19 10:10 16447
relfilenode Usually and oid Agreement . however ,relfilenode Will be ordered ( for example TRUNCATE、REINDEX、CLUSTER、VACUUM FULL) Changed .
TRUNCATE t_test;
SELECT
relname,
oid,
relfilenode
FROM
pg_class
WHERE
relname = 't_test';
relname | oid | relfilenode
---------+-------+-------------
t_test | 16447 | 16450
(1 row)
SELECT
pg_relation_filepath('16447');
pg_relation_filepath
----------------------
base/13269/16450
(1 row)
SELECT
pg_relation_filepath('t_test');
pg_relation_filepath
----------------------
base/13269/16450
(1 row)
When the file size of the table and index exceeds 1GB when ,PostgreSQL Will create and use a named relfilenode.1 The new document of , And so on , There will be relfilenode.2.
INSERT INTO t_test (
id
)
SELECT
generate_series(1, 100000000);
SELECT
pg_size_pretty(pg_table_size('t_test'));
pg_size_pretty
----------------
3458 MB
(1 row)
[[email protected]:13269]$ ls -l |grep 16450
-rw------- 1 postgres dba 1073741824 May 19 10:27 16450
-rw------- 1 postgres dba 1073741824 May 19 10:27 16450.1
-rw------- 1 postgres dba 1073741824 May 19 10:27 16450.2
-rw------- 1 postgres dba 403554304 May 19 10:27 16450.3
-rw------- 1 postgres dba 909312 May 19 10:27 16450_fsm
Compiling PostgreSQL when , You can use the configuration options –with-segsize Change the maximum file size of tables and indexes .
SELECT
name,
setting,
unit,
short_desc
FROM
pg_settings
WHERE
name like 'seg%size%';
name | setting | unit | short_desc
--------------+---------+------+------------------------------------------
segment_size | 131072 | 8kB | Shows the number of pages per disk file.
(1 row)
SHOW segment_size ;
segment_size
--------------
1GB
(1 row)
Each table has two files associated with it ,_fsm and _vm The free space information and visibility information on each page of the table file are stored respectively . Index has no visibility mapping file , Only free space mapping files .
CREATE TABLE t_test2 (
id int
);
SELECT
pg_relation_filepath('t_test2');
pg_relation_filepath
----------------------
base/13269/16454
(1 row)
# View in another window
[[email protected]:13269]$ ls -l |grep 16454
-rw------- 1 postgres dba 0 May 19 10:41 16454
INSERT INTO t_test2 (
id
)
SELECT
generate_series(1, 100000);
# View in another window
[[email protected]:13269]$ ls -l |grep 16454
-rw------- 1 postgres dba 3629056 May 19 10:41 16454
-rw------- 1 postgres dba 24576 May 19 10:41 16454_fsm
UPDATE t_test2
SET
id = id + 1;
CHECKPOINT;
# View in another window
[[email protected]:13269]$ ls -l |grep 16454
-rw------- 1 postgres dba 7249920 May 19 10:42 16454
-rw------- 1 postgres dba 24576 May 19 10:42 16454_fsm
-rw------- 1 postgres dba 8192 May 19 10:42 16454_vm
PostgreSQL The tablespace in is Additional data areas outside the base directory .
stay /home/postgres/tblspc Create a table space in new_tblspc, Its oid by 16458, A name such as “PG_ The major version number _ Catalog version number ” A subdirectory .
CREATE TABLESPACE new_tblspc LOCATION '/home/postgres/tblspc';
SELECT
*,
oid
FROM
pg_tablespace
WHERE
spcname = 'new_tblspc';
oid | spcname | spcowner | spcacl | spcoptions
-------+------------+----------+--------+------------
16458 | new_tblspc | 10 | |
(1 row)
[[email protected]:tblspc]$ ls -l /home/postgres/tblspc/
total 4
drwx------ 2 postgres dba 4096 May 19 14:34 PG_9.6_201608131
The tablespace directory passes pg_tblspc Soft link addressing in subdirectories , Link name and tablespace oid identical .
[[email protected]:pg_tblspc]$ ls -l
total 0
lrwxrwxrwx 1 postgres dba 21 May 19 14:34 16458 -> /home/postgres/tblspc
Create a database under this tablespace :
CREATE DATABASE test TABLESPACE new_tblspc;
SELECT
datname,
oid
FROM
pg_database
WHERE
datname = 'test';
datname | oid
---------+-------
test | 16459
(1 row)
[[email protected]:PG_9.6_201608131]$ ls -l /home/postgres/tblspc/PG_9.6_201608131
total 4
drwx------ 2 postgres dba 4096 May 19 14:37 16459
If in new_tblspc Create a new table in the table space , But the database to which the new table belongs is created under the basic directory ( such as postgres database ), that PostgreSQL First in new_tblspc Under the version specific subdirectory in the table space , Create directory name and existing database oid Same subdirectory , Then put the new table file in this directory .
postgres=# CREATE TABLE t_test3 (id int) tablespace new_tblspc;
postgres=# SELECT relname, oid FROM pg_class where relname = 't_test3';
relname | oid
---------+-------
t_test3 | 16460
(1 row)
postgres=# SELECT pg_relation_filepath('16460');
pg_relation_filepath
----------------------------------------------
pg_tblspc/16458/PG_9.6_201608131/13269/16460
(1 row)
[[email protected]:tblspc]$ ls -l /home/postgres/tblspc/
total 4
drwx------ 4 postgres dba 4096 May 19 14:55 PG_9.6_201608131
[[email protected]:tblspc]$ ls -l /home/postgres/tblspc/PG_9.6_201608131/
total 8
drwx------ 2 postgres dba 4096 May 19 14:55 13269
drwx------ 2 postgres dba 4096 May 19 14:37 16459
[[email protected]:tblspc]$ ls -l /home/postgres/tblspc/PG_9.6_201608131/13269/
total 0
-rw------- 1 postgres dba 0 May 19 14:55 16460
Internal layout of heap table file
Data files ( Heap table 、 Indexes , It also includes free space mapping and visibility mapping ) The interior is divided into fixed length pages , Or block , Size defaults to 8192B(8KB).
The pages in each file are from 0 Start numbering in order , These numbers are called block numbers . If the file is full ,PostgreSQL Just add a new blank page at the end of the file to increase the length of the file . The layout inside the page depends on the type of data file .
The page of the table contains three types of data :
Heap tuples —— That is, the data record itself . They are stacked sequentially from the bottom of the page .
Row pointer —— Each row pointer occupies 4B, Save a pointer to the heap tuple . They are also called project pointers (item pointer). Row pointers form a simple array , It plays the role of tuple index . Each index entry is from 1 Start numbering in sequence , It's called the offset number . When adding a new meta group to a page , A corresponding new row pointer will also be put into the array , And point to the newly added tuple .
First data —— The starting position of the page is assigned by the structure PageHeaderData The first data defined . It's the size of 24B, Contains metadata about the page .
Structure PageHeaderData Defined in src/include/storage/bufpage.h. The main member variables of this structure are as follows :
pd_lsn—— The last change on this page was written xlog Record the corresponding lsn. It's a 8B Unsigned integer , And WAL Mechanisms related to .
pd_checksum—— The checksum value of this page .
pd_lower、pd_upper——pd_lower Point to the end of the line pointer ,pd_upper Point to the starting position of the latest heap tuple .
pd_special—— This field will be used in the index page , In the heap table page, it points to the end of the page .( In the index page, it points to the starting position of the special space , A special space is a special data area used only by indexes , Contains specific data , The specific content depends on the type of index )
The free space between the end of the row pointer and the starting position of the latest metagroup is called free space or void .
To identify tuples in the table , Tuple identifiers are used inside the database (tuple identifier,TID).TID Consists of a pair of values , They are respectively the block number of the page to which the tuple belongs and the offset number of the row pointer pointing to the tuple .
Besides , The size is more than about 2KB(8KB Quarter of ) The heap tuple of will use a method called TOAST(The Oversized-Attribute Storage Technique, Super large attribute storage technology ) Methods to store and manage .
The way to read and write tuples
Suppose a table consists of only one page , And the page contains only one heap tuple . Of this page pd_lower Pointer to the first row , And the line pointer and pd_upper All point to the first heap tuple .
When writing the second tuple , It will be placed after the first tuple . The second row pointer is written after the first row pointer , And point to the second tuple .pd_lower Change to point to the second row pointer ,pd_upper Change to point to the second heap tuple .
The first data in the page will also be rewritten to the appropriate value .
Here are two typical access methods : Sequential scanning and B Tree index scan .
Sequential scanning —— By scanning the line pointer in each page , Read all tuples in all pages in order .
B Tree index scan —— The index file contains index tuples , The index tuple consists of a key value pair , The key is the indexed column value , The value is... Of the target heap tuple TID. When making index query , First use the key to find , If the corresponding index tuple is found ,PostgreSQL According to TID To read the corresponding heap tuple .

PostgreSQL And support TID scanning .TID Scanning is by using the required tuples TID Direct access to tuples .
CREATE TABLE t_test4 (
id int
);
INSERT INTO t_test4 (
id
)
VALUES (1);
SELECT
ctid,
id
FROM
t_test4
WHERE
ctid = '(0, 1)';
ctid | id
-------+----
(0,1) | 1
(1 row)
EXPLAIN
SELECT
ctid,
id
FROM
t_test4
WHERE
ctid = '(0, 1)';
QUERY PLAN
--------------------------------------------------------
Tid Scan on t_test4 (cost=0.00..4.01 rows=1 width=10)
TID Cond: (ctid = '(0,1)'::tid)
(2 rows)边栏推荐
- 组合数(阶乘的质因子的个数,组合数的计算)
- The value is 0. The other part includes but is not limited to the following. This question is
- svg图片颜色的修改 没有花里胡哨
- Pyth deinitialization averages on many machine decision boundaries, starting on the bus
- 会话技术相关
- Developers share the first chapter of "Book Eating bar: deep learning and mindspire practice"
- How to change the direction of this gracefully
- The progress in the stack will consume functions that cannot meet the needs of the enterprise. We are committed to
- 【C语言】程序环境和预处理操作
- [development technology] spingboot database and Persistence technology, JPA, mongodb, redis
猜你喜欢

Worthington's test of hepatocyte separation system and related optimization schemes

MySQL message queue list structure

Collection of test case design methods

Svg image color modification is not fancy

MySQL service 1 master 2 slave, master master, MHA configuration detailed steps

Worthington mammalian lactate dehydrogenase study -- Characteristics and determination scheme
![[development technology] spingboot database and Persistence technology, JPA, mongodb, redis](/img/fb/bb4a26699e5ec10c6881a4c95ac767.png)
[development technology] spingboot database and Persistence technology, JPA, mongodb, redis

Exploration of new mode of code free production

MLP - Multilayer Perceptron

Leetcode (Sword finger offer) - 11. Minimum number of rotation array
随机推荐
MySQL message queue list structure
嵌入式系统移植【6】——uboot源码结构
An accident caused by MySQL misoperation, and "high availability" can't withstand it
Learning summary | truly record what mindspire two-day training camp can bring to you (1)!
RSA of go language parses jsencrypt with secret key JS the encrypted ciphertext of this library failed
[translation] announce krius -- accelerate your monitoring and adoption of kubernetes
2022 China software products national tour exhibition is about to set sail
Data Lake (19): SQL API reads Kafka data and writes it to iceberg table in real time
1.7.1 right and wrong problem (infix expression)
Combinatorial number (number of prime factors of factorials, calculation of combinatorial number)
Page Jump and redirection in flask framework
Jinglianwen technology provides 3D point cloud image annotation service
RTOS internal skill cultivation (10) | in depth analysis of RTOS kernel context switching mechanism
swagger2的初步使用
Pat grade a 1041 be unique
发送数据1010_1发人员通过 字节的
D2DEngine食用教程(3)———将渲染目标导出为图像文件
(零八)Flask有手就行——数据库迁移Flask-Migrate
Mongo from start to installation and problems encountered
(008) flask is OK if you have a hand -- database migration flask migrate
