PostgreSQL Guide: inside exploration (Chapter 10 basic backup and point in time recovery) - Notes

2022-06-13 04:46:00 【Shallow as the breeze CYF】

Chapter 10: Base Backup and Point-in-Time Recovery

**Online database backups** fall roughly into two categories: logical backups and physical backups. Each has its own advantages and disadvantages.

Logical backups have one drawback: they take too long. For a large database in particular, taking the backup takes a long time, and restoring the database from the backup data may take even longer.
By contrast, a physical backup can back up and restore a large database in a relatively short time, which makes it a very important and practical feature in real systems.

In PostgreSQL, online full physical backup has been available since version 8.0. A snapshot of the whole running database cluster (that is, the physical backup data) is called a base backup.

Point-in-Time Recovery (PITR) was also introduced in version 8.0. It restores a database cluster to any point in time by using a base backup together with the archive logs produced by continuous archiving. For example, even if you make a critical mistake (such as truncating all tables), this feature lets you restore the database to the moment just before the mistake.

This chapter covers the following topics:

What a base backup is
How PITR works
What a timeline ID (TimelineID) is
What a timeline history file is

In version 7.4 and earlier, PostgreSQL supported only logical backups (full logical backup, partial logical backup, and data export).

10.1 Base backup

First, the standard procedure for taking a base backup with the low-level commands is as follows:

Issue the pg_start_backup command.
Take a snapshot of the database cluster with the archiving command of your choice.
Issue the pg_stop_backup command.

This simple procedure is easy for a DBA to carry out because it needs no special tools, only common ones such as a copy command or a similar archiving tool. In addition, **no table locks are acquired during the procedure**, so all users can keep issuing queries without being affected by the backup. This is a big advantage over other open source relational databases.
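
The three steps can be sketched as a small Python helper that only builds the commands and does not run them. This is an illustration under assumptions: the helper name `base_backup_plan`, the use of `psql` and `tar`, and the paths are all hypothetical choices, and the SQL calls only apply to releases that still ship pg_start_backup and pg_stop_backup.

```python
def base_backup_plan(label, data_dir, archive_path):
    """Return the three low-level backup steps as argv lists (nothing is executed)."""
    sql_label = label.replace("'", "''")  # escape single quotes for the SQL literal
    return [
        # Step 1: run a checkpoint and create backup_label.
        ["psql", "-c", f"SELECT pg_start_backup('{sql_label}')"],
        # Step 2: snapshot the cluster with an ordinary archiving tool (tar here).
        ["tar", "-czf", archive_path, "-C", data_dir, "."],
        # Step 3: finish the backup; backup_label is removed from the live cluster.
        ["psql", "-c", "SELECT pg_stop_backup()"],
    ]


plan = base_backup_plan("Weekly Backup", "/usr/local/pgsql/data", "/backup/base.tar.gz")
for argv in plan:
    print(" ".join(argv))
```

Each list could be handed to `subprocess.run` in order; keeping the plan as data makes it easy to review before anything touches the server.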

A simpler way to take a base backup is the pg_basebackup utility, but internally it issues these same low-level commands.

Figure 10.1 Taking a base backup

Because these commands are one of the keys to understanding PITR, the following subsections explore them in detail.

The pg_start_backup and pg_stop_backup functions are defined in src/backend/access/transam/xlogfuncs.c.

10.1.1 `pg_start_backup`

pg_start_backup prepares for taking a base backup. As described in Section 9.8, the recovery process starts from a redo point, so pg_start_backup must perform a checkpoint to explicitly create a redo point at the start of the base backup. Moreover, the location of this checkpoint must be saved in a file other than pg_control, because regular checkpoints may run several times during the backup. pg_start_backup therefore performs the following four operations:

Force full-page write mode.
Switch to a new WAL (Write-Ahead Logging) segment file (version 8.4 or later).
Perform a checkpoint.
Create the backup_label file. This file is created at the top level of the base directory and contains essential information about the base backup itself, such as the location of this checkpoint.

The third and fourth operations are the heart of the command. The first and second make recovery of the database cluster more reliable.

The backup_label file contains the following six items (seven in version 11 or later):

Checkpoint location (CHECKPOINT LOCATION): the LSN of the checkpoint created by this command.
WAL start location (START WAL LOCATION): not used by PITR, but by the streaming replication described in Chapter 11. It is named START WAL LOCATION because a standby server in replication mode reads this value only once, at initial startup.
Backup method (BACKUP METHOD): the method used to take this base backup (either pg_start_backup or pg_basebackup).
Backup source (BACKUP FROM): whether the backup was taken from the primary or from a standby.
Start time (START TIME): the timestamp at which pg_start_backup was executed.
Label (LABEL): the label specified when calling pg_start_backup.
Start timeline (START TIMELINE): the timeline on which the backup started. It is used for a sanity check and was introduced in version 11.

An actual backup_label file from version 9.6 is shown below:

postgres> cat /usr/local/pgsql/data/backup_label
START WAL LOCATION: 0/9000028 (file 000000010000000000000009)
CHECKPOINT LOCATION: 0/9000060
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2018-7-9 11:45:19 GMT
LABEL: Weekly Backup

As you might expect, when this base backup is used to restore a database, PostgreSQL takes CHECKPOINT LOCATION from the backup_label file, reads the checkpoint record at that location from the appropriate archive log, obtains the redo point from that record, and then starts the recovery process from the redo point (the next section covers the details).
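
To make the first of those steps concrete, here is a minimal Python sketch that reads the sample backup_label shown above and extracts CHECKPOINT LOCATION. The function names `parse_backup_label` and `lsn_to_int` are illustrative only; the server does this in C inside read_backup_label, and this sketch does not mirror that code.

```python
def parse_backup_label(text):
    """Split each 'KEY: value' line of a backup_label file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # ignore lines that do not look like 'KEY: value'
            fields[key] = value
    return fields


def lsn_to_int(lsn):
    """Convert an LSN written as 'high/low' in hex into a 64-bit integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


SAMPLE = """\
START WAL LOCATION: 0/9000028 (file 000000010000000000000009)
CHECKPOINT LOCATION: 0/9000060
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2018-7-9 11:45:19 GMT
LABEL: Weekly Backup
"""

label = parse_backup_label(SAMPLE)
checkpoint = lsn_to_int(label["CHECKPOINT LOCATION"])
print(label["CHECKPOINT LOCATION"], hex(checkpoint))
```

Keeping the LSN as an integer makes later comparisons (for example against a timeline switch point) a plain numeric test.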

10.1.2 `pg_stop_backup`

pg_stop_backup performs the following five operations to complete the backup:

Turn full-page writes off again if pg_start_backup forced them on.
Write an XLOG record marking the end of the backup.
Switch the WAL segment file.
Create a backup history file. This file contains the contents of the backup_label file and the timestamp at which pg_stop_backup was executed.
Delete the backup_label file. It is required for recovery from the base backup, but once it has been copied, the original database cluster no longer needs it.

The backup history file is named as follows:

{WAL segment file name}.{offset at the start of the base backup}.backup
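
As a worked example of this naming rule, the following Python sketch derives the file name from a timeline ID and the backup's start LSN. It assumes the default 16 MB WAL segment size and 8-digit uppercase hexadecimal fields; the function name `backup_history_file_name` is made up for illustration.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB segments
SEGMENTS_PER_XLOG_ID = 0x100000000 // WAL_SEGMENT_SIZE  # 256 with 16 MB segments


def backup_history_file_name(timeline_id, start_lsn):
    """Build '{WAL segment file name}.{offset}.backup' for a start LSN like '0/9000028'."""
    high, low = start_lsn.split("/")
    lsn = (int(high, 16) << 32) | int(low, 16)
    segment_no = lsn // WAL_SEGMENT_SIZE   # which segment holds the start LSN
    offset = lsn % WAL_SEGMENT_SIZE        # byte offset inside that segment
    wal_name = "%08X%08X%08X" % (
        timeline_id,
        segment_no // SEGMENTS_PER_XLOG_ID,
        segment_no % SEGMENTS_PER_XLOG_ID,
    )
    return "%s.%08X.backup" % (wal_name, offset)


print(backup_history_file_name(1, "0/9000028"))
```

With the START WAL LOCATION from the sample backup_label (0/9000028 on timeline 1), the segment part comes out as 000000010000000000000009, the same file name that the label itself reports.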

10.2 How Point-in-Time Recovery (PITR) works

Figure 10.2 shows the basic concept of PITR. In PITR mode, PostgreSQL replays the WAL data of the archive logs on top of the base backup, from the redo point created by pg_start_backup up to the point you want to recover to. In PostgreSQL, that point is called the recovery target.

Figure 10.2 Basic concept of PITR

PITR works as follows. Suppose you made a mistake at 12:05:00 GMT on 2018-07-16. You should remove the current database cluster and restore a new one from a base backup taken before that time. Then create a recovery.conf file and set its recovery_target_time parameter to the time of the mistake (here, 12:05 GMT). The recovery.conf file looks like this:

# Place archive logs under /mnt/server/archivedir directory.
restore_command = 'cp /mnt/server/archivedir/%f %p'
recovery_target_time = '2018-7-16 12:05 GMT'

When PostgreSQL starts up, it enters PITR mode if both recovery.conf and backup_label exist in the database cluster.

The PITR process is almost identical to the normal recovery process described in Chapter 9. The only differences are the following two points:

Where are the WAL segments or archive logs read from?
- Normal recovery mode: from the pg_xlog subdirectory (pg_wal in version 10 or later) under the base directory.
- PITR mode: from the archive directory, that is, the location that archive_command writes to and restore_command reads from.
Where is the checkpoint location read from?
- Normal recovery mode: from the pg_control file.
- PITR mode: from the backup_label file.

**The PITR process** is outlined below:

To find the redo point, PostgreSQL reads the value of CHECKPOINT LOCATION from the backup_label file with the internal function read_backup_label.
PostgreSQL reads parameter values from recovery.conf; in this example, restore_command and recovery_target_time.
PostgreSQL starts replaying WAL data from the redo point, whose location is obtained from the checkpoint record at CHECKPOINT LOCATION. To read the WAL data, PostgreSQL runs the command configured in restore_command, which copies each archive log from the archive area to a temporary area (log files copied to the temporary area are removed after use).
In this example, PostgreSQL reads and replays WAL data from the redo point up to the timestamp 2018-7-16 12:05:00, because recovery_target_time is set to that timestamp. If no recovery target is configured in recovery.conf, PostgreSQL replays to the end of the archive logs.
When the recovery process completes, a timeline history file such as 00000002.history is created in the pg_xlog subdirectory (pg_wal in version 10 or later); if log archiving is enabled, a file with the same name is also created in the archive directory. The contents and role of this file are described in the following sections.

The records of commit and abort actions contain the timestamp at which each action completed (the XLOG data portions of the two actions are defined in xl_xact_commit and xl_xact_abort, respectively). Therefore, when a target time is set in recovery_target_time, PostgreSQL can decide whether to continue recovery each time it replays the XLOG record of a commit or abort action. When replaying such a record, PostgreSQL compares the target time with the timestamp written in the record; if the timestamp is later than the target time, the PITR process finishes.

typedef struct xl_xact_commit
{
        TimestampTz     xact_time;          /* time of commit */
        uint32          xinfo;              /* info flags */
        int             nrels;              /* number of RelFileNodes */
        int             nsubxacts;          /* number of subtransaction XIDs */
        int             nmsgs;              /* number of shared invalidation messages */
        Oid             dbId;               /* MyDatabaseId, database OID */
        Oid             tsId;               /* MyDatabaseTableSpace, tablespace OID */
        /* array of RelFileNode(s) to drop at commit */
        RelFileNode     xnodes[1];          /* variable-length array */
        /* array of committed subtransaction XIDs follows */
        /* array of shared invalidation messages follows */
} xl_xact_commit;

typedef struct xl_xact_abort
{
        TimestampTz     xact_time;          /* time of abort */
        int             nrels;              /* number of RelFileNodes */
        int             nsubxacts;          /* number of subtransaction XIDs */
        /* array of RelFileNode(s) to drop at abort */
        RelFileNode     xnodes[1];          /* variable-length array */
        /* array of aborted subtransaction XIDs follows */
} xl_xact_abort;

The function read_backup_label is defined in src/backend/access/transam/xlog.c. The structures xl_xact_commit and xl_xact_abort are defined in src/include/access/xact.h.
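
The timestamp comparison described above can be modelled in a few lines of Python. This is a toy model, not the server's logic: records are plain tuples invented for the example, only commit and abort records carry a time, and other recovery target types and options are ignored.

```python
from datetime import datetime, timezone


def replay_until(records, target_time):
    """Replay records in order; stop at the first commit/abort stamped after target_time."""
    replayed = []
    for kind, xact_time, description in records:
        # Only commit and abort records carry a timestamp to compare.
        if kind in ("commit", "abort") and xact_time > target_time:
            break  # recovery target reached; this record is not applied
        replayed.append(description)
    return replayed


def gmt(hour, minute, second=0):
    """Helper for this example: a time on 2018-07-16 in GMT."""
    return datetime(2018, 7, 16, hour, minute, second, tzinfo=timezone.utc)


records = [
    ("insert", None, "INSERT by xid 100"),
    ("commit", gmt(12, 0), "COMMIT xid 100"),
    ("truncate", None, "TRUNCATE by xid 101"),
    ("commit", gmt(12, 5, 30), "COMMIT xid 101"),
]

replayed = replay_until(records, gmt(12, 5))
print(replayed)
```

In this toy run the TRUNCATE record is replayed but its commit record is not, so that transaction stays uncommitted in the recovered state.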

Why can general archiving tools be used to take a base backup?
The database cluster in the backup may be inconsistent, but **the recovery process is precisely the process of bringing a database cluster into a consistent state. Because PITR is built on the recovery process, it can recover the database cluster even if the base backup is a bundle of inconsistent files**. This is why a base backup can be taken with general archiving tools, without a file system snapshot capability or other special tools.

10.3 Timelines and timeline history files

A timeline in PostgreSQL is used to distinguish the original database cluster from the recovered ones, and it is a central concept of PITR. This section describes two things related to timelines: the timeline ID (TimelineID) and timeline history files.

10.3.1 Timeline ID (`TimelineID`)

Each timeline has a corresponding timeline ID, a 4-byte unsigned integer starting at 1.

Every database cluster is assigned a timeline ID. The original database cluster created by the initdb command has timeline ID 1. Each time a database cluster is recovered, the timeline ID is incremented by 1. In the example of the previous section, the cluster recovered from the original one has timeline ID 2.

Figure 10.3 illustrates the PITR process from the viewpoint of the timeline ID. First, we remove the current database cluster and replace it with an earlier base backup in order to go back to the starting point of recovery; this step is shown by the red curved arrow in the figure. Next, we start the PostgreSQL server, which **replays the WAL data of the archive logs along the initial timeline (timeline ID 1), from the redo point created by pg_start_backup up to the recovery target**; this step is shown by the blue straight arrow. The recovered database cluster is then **assigned the new timeline ID 2**, and PostgreSQL runs on the new timeline.

Figure 10.3 Relationship between the original and the recovered database clusters

As briefly mentioned in Chapter 9, the first 8 digits of a WAL segment file name are the timeline ID of the database cluster that created the segment. When the timeline ID changes, the WAL segment file names change accordingly.

Now let us review the recovery process from the viewpoint of WAL segment files. Suppose we recover the database using two archive log files:

000000010000000000000009 and 00000001000000000000000A (the first 8 digits, 00000001, are the timeline ID). The recovered database cluster is assigned timeline ID 2, and PostgreSQL creates WAL segments starting from 00000002000000000000000A, as shown in Figure 10.4.

Figure 10.4 Relationship between the WAL segment files of the original and the recovered database clusters
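
The relationship between the two file names can be checked with a short Python sketch. It assumes the 24-hex-digit layout of three 8-digit fields used in the names above; `split_wal_name` and `on_timeline` are hypothetical helper names.

```python
def split_wal_name(name):
    """Split a 24-hex-digit WAL segment file name into (timeline, log, segment)."""
    if len(name) != 24:
        raise ValueError("expected 24 hexadecimal digits: %r" % name)
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


def on_timeline(name, new_timeline):
    """Return the name of the same segment position on another timeline."""
    _, log, segment = split_wal_name(name)
    return "%08X%08X%08X" % (new_timeline, log, segment)


last_archived = "00000001000000000000000A"
print(split_wal_name(last_archived))
print(on_timeline(last_archived, 2))
```

Only the first field changes when the timeline changes, which is why the segments of the original and the recovered cluster can sit side by side in one archive directory.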

10.3.2 Timeline history file

When a PITR process completes, a timeline history file named, for example, 00000002.history is created in the archive directory and in the pg_xlog subdirectory (pg_wal in version 10 or later). This file records which timeline the current timeline branched off from, and when.

The naming rule of this file is as follows:

“8-digit new timeline ID”.history

A timeline history file contains at least one line, and each line consists of the following three items:

Timeline ID: the timeline ID of the archive logs used for recovery.
LSN: the LSN at which the WAL segment switch happened.
Reason: a human-readable explanation of why the timeline changed.

A concrete example is shown below:

postgres> cat /home/postgres/archivelogs/00000002.history
1      0/A000198    before 2018-7-9 12:05:00.861324+00

Its meaning is as follows:

The database cluster (timeline ID 2) is based on the base backup whose timeline ID is 1, and was recovered to just before 2018-7-9 12:05:00.861324+00 by replaying the archive logs up to 0/A000198.
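
Reading such a file programmatically is straightforward. The Python sketch below parses the three whitespace-separated items of each line in the 9.3-and-later layout; skipping blank lines and lines starting with `#` is an assumption of this sketch, and `parse_timeline_history` is an illustrative name.

```python
def parse_timeline_history(text):
    """Return a list of (timeline_id, switch_lsn, reason) tuples, one per line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumption: ignore blank lines and comments
        # Split into at most three parts so the reason keeps its inner spaces.
        timeline, lsn, reason = line.split(None, 2)
        entries.append((int(timeline), lsn, reason))
    return entries


SAMPLE_HISTORY = "1      0/A000198    before 2018-7-9 12:05:00.861324+00\n"

history = parse_timeline_history(SAMPLE_HISTORY)
print(history)
```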

In this way, each timeline history file tells the complete history of an individual recovered database cluster. It is also used by the PITR process itself; the next section gives the details.

The format of the timeline history file changed in version 9.3. The formats before and after the change are shown below in simplified form.
Version 9.3 or later:
timelineId    LSN    "reason"
Version 9.2 or earlier:
timelineId    WAL_segment    "reason"

10.4 Point-in-time recovery and the timeline history file

The timeline history file plays an important role in the second and subsequent PITR processes. Trying a second recovery shows how it is used.

Again, suppose you **made another mistake at 12:15:00, this time in the recovered database cluster whose timeline ID is 2. To recover the database cluster in this case, you need to create a new recovery.conf file** like this:

restore_command = 'cp /mnt/server/archivedir/%f %p'
recovery_target_time = '2018-7-16 12:15:00 GMT'
recovery_target_timeline = 2

The recovery_target_time parameter is set to the time of the new mistake, and recovery_target_timeline is set to 2 so that recovery proceeds along that timeline.

Restart the PostgreSQL server to enter PITR mode, and the database is recovered along timeline ID 2, as shown in Figure 10.5.

Figure 10.5 Recovering the database to its state at 12:15 along timeline 2

PostgreSQL reads the value of CHECKPOINT LOCATION from the backup_label file.
PostgreSQL reads parameter values from recovery.conf; in this example, restore_command, recovery_target_time, and recovery_target_timeline.
PostgreSQL reads the timeline history file 00000002.history, which corresponds to the value of recovery_target_timeline.
PostgreSQL replays WAL data in the following steps:
1. From the redo point up to LSN 0/A000198 (the value written in 00000002.history), PostgreSQL reads and replays the WAL data of TimelineID=1 from the appropriate archive logs.
2. From LSN 0/A000198 up to the timestamp 2018-7-9 12:15:00, PostgreSQL reads and replays the WAL data of TimelineID=2 from the appropriate archive logs.
When the recovery process completes, the current timeline ID advances to 3, and a new timeline history file named 00000003.history is created in the pg_xlog subdirectory (pg_wal in version 10 or later) and in the archive directory.
```
postgres> cat /home/postgres/archivelogs/00000003.history
1         0/A000198     before 2018-7-9 12:05:00.861324+00

2         0/B000078     before 2018-7-9 12:15:00.927133+00
```
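
The replay steps above amount to a lookup: for a given LSN, which timeline's WAL should be read? The Python sketch below shows one way to express that lookup from the history entries. It is a simplified model with made-up names (`timeline_for_lsn`), and it treats the switch LSN itself as already belonging to the newer timeline.

```python
def lsn_to_int(lsn):
    """Convert an LSN written as 'high/low' in hex into a 64-bit integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def timeline_for_lsn(history, target_timeline, lsn):
    """Pick the timeline whose WAL covers lsn.

    history holds (timeline_id, switch_lsn) pairs in file order, as read from
    the history file of target_timeline.
    """
    position = lsn_to_int(lsn)
    for timeline_id, switch_lsn in history:
        if position < lsn_to_int(switch_lsn):
            return timeline_id  # still before this timeline was left
    return target_timeline  # past every switch point


history_2 = [(1, "0/A000198")]                    # from 00000002.history
history_3 = [(1, "0/A000198"), (2, "0/B000078")]  # from 00000003.history

print(timeline_for_lsn(history_2, 2, "0/9000060"))
print(timeline_for_lsn(history_2, 2, "0/A000200"))
print(timeline_for_lsn(history_3, 3, "0/A500000"))
```

Recovering along timeline 2 therefore reads timeline 1 segments for the checkpoint at 0/9000060 and timeline 2 segments beyond the switch point.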

When you have run PITR more than once, set the timeline ID explicitly so that the appropriate timeline history file is used.

Thus a timeline history file is not only the history log of a database cluster; it also serves as the recovery instructions that the PITR process refers to.
