A Brief Analysis of the PostgreSQL 15 Source Code (5): pg_control
2022-07-31 03:16:00 【墨天轮】
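Abstract
The PostgreSQL control file stores information initialized by initdb, WAL information, checkpoint information, and more. It lives at $PGDATA/global/pg_control. For as long as the cluster exists, whether running or stopped, a number of tools and processes can read or modify this file. This article collects (almost) every place where pg_control is read or modified and walks through the relevant source code, in the hope of giving a deeper understanding of the PostgreSQL control file.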
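Overview
The overall picture first: five server-side tools, four built-in functions, and one backend process can query or modify the pg_control file. Each is introduced below.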

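Data structures and functions
pg_control is a binary file of exactly 8192 bytes. Its content is the ControlFileData struct written out in binary form; the struct is smaller than the file, and the remaining bytes are zero padding.
- pg_control file size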
#define PG_CONTROL_FILE_SIZE 8192
- The ControlFileData data structure
The source is in src/include/catalog/pg_control.h
/*
 * Contents of pg_control.
 */
typedef struct ControlFileData
{
    uint64      system_identifier;
    uint32      pg_control_version;     /* PG_CONTROL_VERSION */
    uint32      catalog_version_no;     /* see catversion.h */
    DBState     state;                  /* see enum above */
    pg_time_t   time;                   /* time stamp of last pg_control update */
    XLogRecPtr  checkPoint;             /* last check point record ptr */
    CheckPoint  checkPointCopy;         /* copy of last check point record */
    XLogRecPtr  unloggedLSN;            /* current fake LSN value, for unlogged rels */
    XLogRecPtr  minRecoveryPoint;
    TimeLineID  minRecoveryPointTLI;
    XLogRecPtr  backupStartPoint;
    XLogRecPtr  backupEndPoint;
    bool        backupEndRequired;
    int         wal_level;
    bool        wal_log_hints;
    int         MaxConnections;
    int         max_worker_processes;
    int         max_wal_senders;
    int         max_prepared_xacts;
    int         max_locks_per_xact;
    bool        track_commit_timestamp;
    uint32      maxAlign;               /* alignment requirement for tuples */
    double      floatFormat;            /* constant 1234567.0 */
#define FLOATFORMAT_VALUE   1234567.0
    uint32      blcksz;                 /* data block size for this DB */
    uint32      relseg_size;            /* blocks per segment of large relation */
    uint32      xlog_blcksz;            /* block size within WAL files */
    uint32      xlog_seg_size;          /* size of each WAL segment */
    uint32      nameDataLen;            /* catalog name field width */
    uint32      indexMaxKeys;           /* max number of columns in an index */
    uint32      toast_max_chunk_size;   /* chunk size in TOAST tables */
    uint32      loblksize;              /* chunk size in pg_largeobject */
    bool        float8ByVal;            /* float8, int8, etc pass-by-value? */
    uint32      data_checksum_version;
    char        mock_authentication_nonce[MOCK_AUTH_NONCE_LEN];
    pg_crc32c   crc;
} ControlFileData;
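The crc field sits at the very end of the struct so that the checksum can cover every byte before it. The sketch below illustrates that idea on a heavily trimmed, hypothetical struct (MiniControl) with a plain bitwise CRC-32C. PostgreSQL itself uses its own CRC macros and an optimized implementation, so treat the names and the helper functions here as assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily trimmed stand-in for ControlFileData. */
typedef struct MiniControl
{
    uint64_t    system_identifier;
    uint32_t    pg_control_version;
    uint32_t    catalog_version_no;
    uint32_t    crc;            /* must stay the last field */
} MiniControl;

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t
crc32c(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t    crc = 0xFFFFFFFFu;

    while (len--)
    {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Checksum every byte that precedes the crc field. */
static uint32_t
mini_control_crc(const MiniControl *c)
{
    return crc32c(c, offsetof(MiniControl, crc));
}
```

Because the checksum stops at offsetof(MiniControl, crc), storing the result in the crc field does not change the value that a later verification recomputes.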
- get_controlfile
get_controlfile reads the binary pg_control file into a ControlFileData struct.
// Function declaration
ControlFileData *get_controlfile(const char *DataDir, bool *crc_ok_p);
// Key line: read the control file into the struct
r = read(fd, ControlFile, sizeof(ControlFileData));
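The read path can be sketched in a few lines of portable C. This is a simplified illustration, not the PostgreSQL implementation: MiniControl and read_mini_control are hypothetical names, stdio replaces the raw open/read calls, and the CRC verification that get_controlfile reports through crc_ok_p is left out.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical trimmed stand-in for ControlFileData. */
typedef struct MiniControl
{
    uint64_t    system_identifier;
    uint32_t    pg_control_version;
    uint32_t    catalog_version_no;
} MiniControl;

/*
 * Read the leading sizeof(MiniControl) bytes of a binary file into *out.
 * Returns 0 on success, -1 if the file cannot be opened or is too short.
 */
static int
read_mini_control(const char *path, MiniControl *out)
{
    FILE       *fp = fopen(path, "rb");
    size_t      n;

    if (fp == NULL)
        return -1;
    n = fread(out, 1, sizeof(*out), fp);
    fclose(fp);
    return (n == sizeof(*out)) ? 0 : -1;
}
```

Only the first sizeof(struct) bytes are consumed; whatever padding follows in the file is ignored, which matches the single read call quoted above.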
- update_controlfile
update_controlfile writes the contents of a ControlFileData struct to pg_control in binary form.
// Function definition
void update_controlfile(const char *DataDir, ControlFileData *ControlFile, bool do_sync)
// Open the file in binary mode
if ((fd = open(ControlFilePath, O_WRONLY | PG_BINARY, pg_file_create_mode)) == -1)
// Write the contents
if (write(fd, buffer, PG_CONTROL_FILE_SIZE) != PG_CONTROL_FILE_SIZE)
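The write call above always emits PG_CONTROL_FILE_SIZE bytes, so the struct has to be copied into a zero-filled buffer of that size first. The sketch below shows that padding step under simplifying assumptions: MiniControl and write_mini_control are hypothetical, stdio is used instead of open/write, the file is created or truncated rather than opened in place, and the timestamp update, CRC recomputation, and fsync of the real function are omitted.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CONTROL_FILE_SIZE 8192  /* mirrors PG_CONTROL_FILE_SIZE */

/* Hypothetical trimmed stand-in for ControlFileData. */
typedef struct MiniControl
{
    uint64_t    system_identifier;
    uint32_t    pg_control_version;
    uint32_t    catalog_version_no;
} MiniControl;

/*
 * Write *c at the start of a CONTROL_FILE_SIZE-byte file, padding the
 * rest with zeros.  Returns 0 on success, -1 on any I/O failure.
 */
static int
write_mini_control(const char *path, const MiniControl *c)
{
    char        buffer[CONTROL_FILE_SIZE];
    FILE       *fp;
    size_t      n;

    memset(buffer, 0, sizeof(buffer));
    memcpy(buffer, c, sizeof(*c));

    fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    n = fwrite(buffer, 1, sizeof(buffer), fp);
    if (fclose(fp) != 0)
        return -1;
    return (n == sizeof(buffer)) ? 0 : -1;
}
```

Writing the whole fixed-size buffer in one call keeps the file length constant no matter how the struct grows between versions.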
- read_controlfile
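pg_resetwal implements its own read_controlfile function to read pg_control. Its main job is to check the control file's length, version number, CRC, and WAL segment size. Personally, I think read_controlfile is not the most fitting name for what it does.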
static bool
read_controlfile(void)
{
    ……
    if ((fd = open(XLOG_CONTROL_FILE, O_RDONLY | PG_BINARY, 0)) < 0)
    {
        ……
    }
    ……
    len = read(fd, buffer, PG_CONTROL_FILE_SIZE);
    ……
    if (len >= sizeof(ControlFileData) &&
        ((ControlFileData *) buffer)->pg_control_version == PG_CONTROL_VERSION)
    {
        ……
        if (!EQ_CRC32C(crc, ((ControlFileData *) buffer)->crc))
        {
            ……
        }
        ……
        if (!IsValidWalSegSize(ControlFile.xlog_seg_size))
        {
            ……
        }
        return true;
    }
    ……
}
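The last check in read_controlfile rejects implausible WAL segment sizes. As far as I understand the IsValidWalSegSize macro, a size is accepted when it is a power of two between 1 MB and 1 GB; the function below is a stand-alone sketch of that rule. The function name and the two constants are assumptions for illustration and should be checked against the PostgreSQL headers before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bounds, for illustration: 1 MB and 1 GB. */
#define WAL_SEG_MIN_SIZE (1024 * 1024)
#define WAL_SEG_MAX_SIZE (1024 * 1024 * 1024)

/* True if size is a power of two within [1 MB, 1 GB]. */
static bool
is_valid_wal_seg_size(uint32_t size)
{
    bool        power_of_two = size > 0 && (size & (size - 1)) == 0;

    return power_of_two &&
        size >= WAL_SEG_MIN_SIZE &&
        size <= WAL_SEG_MAX_SIZE;
}
```

For example, the common 16 MB segment size passes, while 3 MB (not a power of two) and 512 kB (below the lower bound) are rejected.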
Who reads pg_control
Server-side tool: pg_controldata
This tool is very simple: it reads pg_control and prints its contents.

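Built-in functions
PostgreSQL provides four built-in functions that present the information in the control file by category.
pg_control_init (parameters fixed when initdb initialized the cluster)
pg_control_system (system parameters)
pg_control_checkpoint (checkpoint parameters)
pg_control_recovery (recovery parameters)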

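Server-side tool: pg_checksums
pg_checksums checks, enables, or disables data checksums in a PostgreSQL cluster. The server must be shut down cleanly before pg_checksums is run. When verifying checksums, the exit status is zero if there are no checksum errors and nonzero if at least one checksum failure is detected. When enabling or disabling checksums, the exit status is nonzero if the operation fails.
When verifying checksums, every file in the cluster is scanned. When enabling checksums, every file in the cluster is rewritten in place. When disabling checksums, only the pg_control file is updated.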

Server-side tool: pg_ctl
When pg_ctl promotes a standby, it first checks the standby's state, which must be DB_IN_ARCHIVE_RECOVERY.
static DBState
get_control_dbstate(void)
{
    DBState     ret;
    bool        crc_ok;
    ControlFileData *control_file_data = get_controlfile(pg_data, &crc_ok);

    if (!crc_ok)
    {
        write_stderr(_("%s: control file appears to be corrupt\n"), progname);
        exit(1);
    }

    ret = control_file_data->state;
    pfree(control_file_data);
    return ret;
}
The possible DBState values are:
/*
 * System status indicator.  Note this is stored in pg_control; if you change
 * it, you must bump PG_CONTROL_VERSION
 */
typedef enum DBState
{
    DB_STARTUP = 0,
    DB_SHUTDOWNED,
    DB_SHUTDOWNED_IN_RECOVERY,
    DB_SHUTDOWNING,
    DB_IN_CRASH_RECOVERY,
    DB_IN_ARCHIVE_RECOVERY,
    DB_IN_PRODUCTION
} DBState;
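To make the enum and the promote check concrete, here is a small stand-alone sketch. The enum is copied from above; db_state_name and can_promote are hypothetical helpers, and the description strings are illustrative wording rather than a guaranteed match for what any PostgreSQL tool prints.

```c
#include <stdbool.h>

typedef enum DBState
{
    DB_STARTUP = 0,
    DB_SHUTDOWNED,
    DB_SHUTDOWNED_IN_RECOVERY,
    DB_SHUTDOWNING,
    DB_IN_CRASH_RECOVERY,
    DB_IN_ARCHIVE_RECOVERY,
    DB_IN_PRODUCTION
} DBState;

/* Map a state to a human-readable description (illustrative wording). */
static const char *
db_state_name(DBState state)
{
    switch (state)
    {
        case DB_STARTUP:                return "starting up";
        case DB_SHUTDOWNED:             return "shut down";
        case DB_SHUTDOWNED_IN_RECOVERY: return "shut down in recovery";
        case DB_SHUTDOWNING:            return "shutting down";
        case DB_IN_CRASH_RECOVERY:      return "in crash recovery";
        case DB_IN_ARCHIVE_RECOVERY:    return "in archive recovery";
        case DB_IN_PRODUCTION:          return "in production";
    }
    return "unrecognized status code";
}

/* Promotion only makes sense for a server in archive recovery. */
static bool
can_promote(DBState state)
{
    return state == DB_IN_ARCHIVE_RECOVERY;
}
```

Because the numeric values are stored in pg_control, the order of the enum members is part of the on-disk format, which is why the comment above insists on bumping PG_CONTROL_VERSION when it changes.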
Server-side tool: pg_resetwal
pg_resetwal reads pg_control through its read_controlfile function so that it can modify pg_control afterwards.
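Server-side tool: pg_rewind
pg_rewind likewise reads pg_control, for both the source and the target cluster, so that it can modify the target's pg_control afterwards. In the PostgreSQL 15 source it does this with its own helper, digestControlFile, rather than with pg_resetwal's read_controlfile.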
Who writes pg_control
Server-side tool: pg_checksums
As mentioned above, pg_checksums reads the pg_control file. It also updates it, mainly the value of "Data page checksum version".

Running pg_checksums -e enables checksums and sets "Data page checksum version" in the control file to 1. Running it with -d disables checksums and sets the value to 0.
    /*
     * Finally make the data durable on disk if enabling or disabling
     * checksums.  Flush first the data directory for safety, and then update
     * the control file to keep the switch consistent.
     */
    if (mode == PG_MODE_ENABLE || mode == PG_MODE_DISABLE)
    {
        ControlFile->data_checksum_version =
            (mode == PG_MODE_ENABLE) ? PG_DATA_CHECKSUM_VERSION : 0;

        if (do_sync)
        {
            pg_log_info("syncing data directory");
            fsync_pgdata(DataDir, PG_VERSION_NUM);
        }

        pg_log_info("updating control file");
        update_controlfile(DataDir, ControlFile, do_sync);

        if (verbose)
            printf(_("Data checksum version: %u\n"),
                   ControlFile->data_checksum_version);
        if (mode == PG_MODE_ENABLE)
            printf(_("Checksums enabled in cluster\n"));
        else
            printf(_("Checksums disabled in cluster\n"));
    }
Server-side tool: pg_resetwal
pg_resetwal resets a damaged WAL, or resets the WAL files based on transaction IDs, and updates the pg_control file along the way when necessary.
/*
 * Write out the new pg_control file.
 */
static void
RewriteControlFile(void)
{
    /*
     * Adjust fields as needed to force an empty XLOG starting at
     * newXlogSegNo.
     */
    XLogSegNoOffsetToRecPtr(newXlogSegNo, SizeOfXLogLongPHD, WalSegSz,
                            ControlFile.checkPointCopy.redo);
    ControlFile.checkPointCopy.time = (pg_time_t) time(NULL);

    ControlFile.state = DB_SHUTDOWNED;
    ControlFile.checkPoint = ControlFile.checkPointCopy.redo;
    ControlFile.minRecoveryPoint = 0;
    ControlFile.minRecoveryPointTLI = 0;
    ControlFile.backupStartPoint = 0;
    ControlFile.backupEndPoint = 0;
    ControlFile.backupEndRequired = false;

    /*
     * Force the defaults for max_* settings. The values don't really matter
     * as long as wal_level='minimal'; the postmaster will reset these fields
     * anyway at startup.
     */
    ControlFile.wal_level = WAL_LEVEL_MINIMAL;
    ControlFile.wal_log_hints = false;
    ControlFile.track_commit_timestamp = false;
    ControlFile.MaxConnections = 100;
    ControlFile.max_wal_senders = 10;
    ControlFile.max_worker_processes = 8;
    ControlFile.max_prepared_xacts = 0;
    ControlFile.max_locks_per_xact = 64;

    /* The control file gets flushed here. */
    update_controlfile(".", &ControlFile, true);
}
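The first statement of RewriteControlFile turns a segment number plus an offset into a WAL position. The arithmetic is simply segno * segment size + offset, and the result is conventionally printed as two 32-bit hexadecimal halves. The sketch below works through that calculation; seg_offset_to_lsn and format_lsn are hypothetical helper names, and the offset of 40 bytes used in the usage note is an assumed value for the long page header size on a typical 64-bit build.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* WAL position of byte `offset` inside segment `segno`. */
static uint64_t
seg_offset_to_lsn(uint64_t segno, uint32_t offset, uint32_t wal_segsz)
{
    return segno * (uint64_t) wal_segsz + offset;
}

/* Format a 64-bit WAL position as "high/low" in hexadecimal. */
static void
format_lsn(uint64_t lsn, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%" PRIX32 "/%" PRIX32,
             (uint32_t) (lsn >> 32), (uint32_t) lsn);
}
```

With 16 MB segments, segment 3 and offset 40 give 3 * 0x1000000 + 0x28 = 0x3000028, which format_lsn renders as 0/3000028. RewriteControlFile then stores the same position in both checkPointCopy.redo and checkPoint.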
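Server-side tool: pg_rewind
pg_rewind synchronizes a PostgreSQL data directory with another data directory that was forked from it, bringing it onto the new timeline.
static ControlFileData ControlFile_target;
static ControlFileData ControlFile_source;
static ControlFileData ControlFile_source_after;
It reads the necessary information from the source control file and rewrites the target control file.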
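Summary
pg_control stores four kinds of information: the cluster's initialization parameters, system information, checkpoint information, and recovery information. Several server-side tools read or modify pg_control. This article walked through the code that reads and writes pg_control, in the hope of helping readers understand the PostgreSQL control file.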