PostgreSQL Guide: Exploring the Internals (Chapter 10: Base Backup and Point-in-Time Recovery) - Notes

2022-06-13 04:46:00 Shallow as the breeze CYF

Chapter 10 Base backup and point-in-time recovery


**Online database backups** can be roughly divided into two categories: logical backups and physical backups. Each has its own advantages and disadvantages.

  • Logical backups have one drawback: they take too long to run. Backing up a large database takes a long time, and restoring the database from the backup data may take even longer.

  • Physical backups, by contrast, can back up and restore large databases in a relatively short time, which makes them a very important and practical feature in real systems.

In PostgreSQL, online full physical backups have been available since version 8.0. A snapshot of the whole database cluster taken while it is running (that is, the physical backup data) is called a base backup.

Point-in-Time Recovery (PITR) has also been available since version 8.0. This feature restores a database cluster to any point in time by using a base backup together with the archive logs generated by continuous archiving. For example, even if you make a critical mistake (such as truncating all the tables), this feature lets you restore the database to the point just before the mistake.

This chapter describes the following topics:

  • What a base backup is
  • How PITR works
  • What a timeline ID (TimelineID) is
  • What a timeline history file is

In version 7.4 and earlier, PostgreSQL supported only logical backups (full logical backups, partial logical backups, and data exports).

10.1 Base backup

First, the standard procedure for making a base backup with the low-level commands is as follows:

  1. Issue the pg_start_backup command.
  2. Take a snapshot of the database cluster with the archiving command of your choice.
  3. Issue the pg_stop_backup command.

This simple procedure is easy for DBAs to carry out because it needs no special tools, only common ones such as a copy command or a similar archiving tool. Moreover, **no table locks are required** during the procedure, and all users can issue queries without being affected by the backup operation. This is a big advantage over other open-source relational databases.

A simpler way to make a base backup is the pg_basebackup utility, which issues these low-level commands internally.
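The three-step procedure can be sketched as a small plan builder. This is only an illustration: it prints the commands a DBA script might run and executes nothing. The function name base_backup_steps, the tar invocation, and the destination path /mnt/backup/base.tar.gz are assumptions for the example, and the SQL calls assume a server version that still provides pg_start_backup and pg_stop_backup, as the versions discussed in this chapter do.

```python
def base_backup_steps(label, pgdata, archive_path):
    """Return the three low-level steps as (kind, command) pairs.

    Nothing is executed here; the list only documents the order.
    """
    return [
        # Step 1: checkpoint and create backup_label.
        ("sql", f"SELECT pg_start_backup('{label}');"),
        # Step 2: copy the cluster with any ordinary archiving tool.
        ("shell", f"tar -czf {archive_path} -C {pgdata} ."),
        # Step 3: write the end-of-backup record and remove backup_label.
        ("sql", "SELECT pg_stop_backup();"),
    ]


for kind, command in base_backup_steps(
    "Weekly Backup", "/usr/local/pgsql/data", "/mnt/backup/base.tar.gz"
):
    print(f"{kind:5} {command}")
```

Step 2 can be any copy tool, since the server does not care how the files are copied between the two SQL calls.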

Figure 10.1 Making a base backup

A clear understanding of these commands is one of the keys to understanding PITR, so we explore them in the following subsections.

pg_start_backup and pg_stop_backup are defined in src/backend/access/transam/xlogfuncs.c.

10.1.1 pg_start_backup

pg_start_backup prepares for making a base backup. As described in Section 9.8, the recovery process starts from a redo point, so pg_start_backup must perform a checkpoint to explicitly create a redo point at the start of the base backup. In addition, the location of this checkpoint must be saved in a file other than pg_control, because regular checkpoints may run several times during the backup. pg_start_backup therefore performs the following four operations:

  1. Force the server into full-page write mode.
  2. Switch to a new WAL segment file (version 8.4 or later). (WAL: Write-Ahead Logging)
  3. Perform a checkpoint.
  4. Create the backup_label file. This file is created at the top level of the base directory and holds essential information about the base backup itself, such as the location of this checkpoint.

The third and fourth operations are the heart of this command; the first and second are performed to make recovery of the database cluster more reliable.

The backup_label file contains the following six items (seven in version 11 or later):

  • Checkpoint location (CHECKPOINT LOCATION): the LSN of the checkpoint created by this command.
  • WAL start location (START WAL LOCATION): not used by PITR; it is used by streaming replication, which is described in Chapter 11. It is named START WAL LOCATION because a standby server in replication mode reads this value only once, at initial startup.
  • Backup method (BACKUP METHOD): the method used to make this base backup (either pg_start_backup or pg_basebackup).
  • Backup source (BACKUP FROM): shows whether the backup was taken from the primary or from a standby.
  • Start time (START TIME): the timestamp at which pg_start_backup was executed.
  • Backup label (LABEL): the label specified when calling pg_start_backup.
  • Start timeline (START TIMELINE): the timeline on which the backup started. It is used for a sanity check and was introduced in version 11.

Backup label

An actual example of a backup_label file from version 9.6 is shown below:

postgres> cat /usr/local/pgsql/data/backup_label
START WAL LOCATION: 0/9000028 (file 000000010000000000000009)
CHECKPOINT LOCATION: 0/9000060
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2018-7-9 11:45:19 GMT
LABEL: Weekly Backup

As you may expect, when you restore a database from this base backup, PostgreSQL takes the CHECKPOINT LOCATION from the backup_label file, reads the checkpoint record at that location in the archive logs, obtains the redo point from that record, and then starts the recovery process from the redo point. (The next section covers the details.)
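To make the first of those steps concrete, here is a minimal sketch that reads the sample backup_label shown above and extracts the checkpoint location. It is not the server's read_backup_label; the helper names parse_lsn and parse_backup_label are made up for this example, and the parsing assumes every line has the simple "KEY: value" form seen in the sample.

```python
def parse_lsn(text):
    """Convert an LSN such as '0/9000060' to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def parse_backup_label(content):
    """Split each 'KEY: value' line of a backup_label file into a dict."""
    fields = {}
    for line in content.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # ignore lines that do not look like 'KEY: value'
            fields[key] = value
    return fields


SAMPLE = """\
START WAL LOCATION: 0/9000028 (file 000000010000000000000009)
CHECKPOINT LOCATION: 0/9000060
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2018-7-9 11:45:19 GMT
LABEL: Weekly Backup
"""

label = parse_backup_label(SAMPLE)
checkpoint = parse_lsn(label["CHECKPOINT LOCATION"])
print(label["CHECKPOINT LOCATION"], hex(checkpoint))
```

Converting the LSN to an integer makes it easy to compare positions or to work out which WAL segment holds the checkpoint record.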

10.1.2 pg_stop_backup

pg_stop_backup performs the following five operations to complete the backup:

  1. Turn off full-page write mode if pg_start_backup forced it on.
  2. Write an XLOG record marking the end of the backup.
  3. Switch the WAL segment file.
  4. Create a backup history file. This file contains the contents of the backup_label file and the timestamp at which pg_stop_backup was executed.
  5. Delete the backup_label file. The backup_label file is needed to recover from the base backup, but once it has been copied, the original database cluster no longer needs it.

The backup history file is named as follows:

{WAL segment file name}.{offset at the start of the base backup}.backup
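As a rough illustration of this naming rule, the sketch below builds the file name from the START WAL LOCATION values in the earlier backup_label example. It assumes the default 16 MB WAL segment size and assumes the offset is written as eight uppercase hexadecimal digits; backup_history_file_name is a hypothetical helper, not a PostgreSQL function.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default segment size (16 MB)


def backup_history_file_name(segment_file, start_lsn):
    """Build '{segment}.{offset}.backup' from the START WAL LOCATION values."""
    high, low = start_lsn.split("/")
    lsn = (int(high, 16) << 32) | int(low, 16)
    offset = lsn % WAL_SEGMENT_SIZE  # byte offset inside the segment
    return f"{segment_file}.{offset:08X}.backup"


print(backup_history_file_name("000000010000000000000009", "0/9000028"))
# prints 000000010000000000000009.00000028.backup
```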

10.2 How point-in-time recovery (PITR) works

Figure 10.2 shows the basic concept of PITR. In PITR mode, PostgreSQL replays the WAL data of the archive logs on top of the base backup, starting from the redo point created by pg_start_backup and continuing up to the point you want to recover to. In PostgreSQL, this point is called the recovery target.

Figure 10.2 Basic concept of PITR

PITR works as follows. Suppose you made a mistake at 2018-07-16 12:05:00 GMT. You should remove the current database cluster and restore a new one from the base backup made before that time. Then create a recovery.conf file and set its recovery_target_time parameter to the time of the mistake (12:05 GMT in this case). The recovery.conf file looks like this:

# Place archive logs under /mnt/server/archivedir directory.
restore_command = 'cp /mnt/server/archivedir/%f %p'
recovery_target_time = '2018-7-16 12:05 GMT'

When PostgreSQL starts up, it enters PITR mode if both recovery.conf and backup_label exist in the database cluster.
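The startup rule just stated can be written as a tiny check. This is a simplified sketch of the rule as worded in these notes, not the server's actual startup logic, and it only applies to versions that still use recovery.conf; startup_mode is a hypothetical helper name.

```python
from pathlib import Path


def startup_mode(pgdata):
    """Return 'PITR' when both trigger files exist, per the rule in the text."""
    data_dir = Path(pgdata)
    has_recovery_conf = (data_dir / "recovery.conf").is_file()
    has_backup_label = (data_dir / "backup_label").is_file()
    return "PITR" if has_recovery_conf and has_backup_label else "normal"
```

Calling startup_mode("/usr/local/pgsql/data") on a freshly restored base backup that contains both files would return "PITR".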

The PITR process is almost the same as the normal recovery process described in Chapter 9. The only differences are the following two points:

  1. Where are the WAL segments / archive logs read from?
    • Normal recovery mode: from the pg_xlog subdirectory (pg_wal in version 10 or later) under the base directory.
    • PITR mode: from the archive directory that archive_command writes to (the files are fetched with restore_command).
  2. Where is the checkpoint location read from?
    • Normal recovery mode: from the pg_control file.
    • PITR mode: from the backup_label file.

The **PITR process** is outlined below:

  1. To find the redo point, PostgreSQL reads the CHECKPOINT LOCATION value from the backup_label file with the internal function read_backup_label.

  2. PostgreSQL reads some parameter values from recovery.conf; in this example, restore_command and recovery_target_time.

  3. PostgreSQL starts replaying WAL data from the redo point, whose location is easily obtained from the checkpoint record at CHECKPOINT LOCATION. It runs the command configured in restore_command to copy archive logs from the archive area to a temporary area and reads the WAL data from there (the copied log files are removed after use).

    In this example, PostgreSQL reads and replays WAL data from the redo point up to the timestamp 2018-7-16 12:05:00, because recovery_target_time is set to that timestamp. If no recovery target is set in recovery.conf, PostgreSQL replays to the end of the archive logs.

  4. When the recovery process completes, a timeline history file such as 00000002.history is created in the pg_xlog subdirectory (pg_wal in version 10 or later). If log archiving is enabled, a file with the same name is also created in the archive directory. The following sections describe the contents and role of this file.

Commit and abort records contain the timestamp at which each action completed (the XLOG data portions of the two actions are defined in xl_xact_commit and xl_xact_abort, respectively). Therefore, if a target time is set in recovery_target_time, PostgreSQL can decide whether to continue recovery each time it replays the XLOG record of a commit or abort action. When replaying such a record, PostgreSQL compares the target time with the timestamp written in the record; if the timestamp exceeds the target time, the PITR process finishes.

typedef struct xl_xact_commit
{
        TimestampTz     xact_time;          /* time of commit */
        uint32          xinfo;              /* info flags */
        int             nrels;              /* number of RelFileNodes */
        int             nsubxacts;          /* number of subtransaction XIDs */
        int             nmsgs;              /* number of shared invalidation messages */
        Oid             dbId;               /* MyDatabaseId, database OID */
        Oid             tsId;               /* MyDatabaseTableSpace, tablespace OID */
        /* Array of RelFileNode(s) to drop at commit */
        RelFileNode     xnodes[1];          /* variable-length array */
        /* An array of committed subtransaction XIDs follows */
        /* An array of shared invalidation messages follows */
} xl_xact_commit;

typedef struct xl_xact_abort
{
        TimestampTz     xact_time;          /* time of abort */
        int             nrels;              /* number of RelFileNodes */
        int             nsubxacts;          /* number of subtransaction XIDs */
        /* Array of RelFileNode(s) to drop at abort */
        RelFileNode     xnodes[1];          /* variable-length array */
        /* An array of aborted subtransaction XIDs follows */
} xl_xact_abort;

The function read_backup_label is defined in src/backend/access/transam/xlog.c. The structures xl_xact_commit and xl_xact_abort are defined in src/include/access/xact.h.
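The stop condition described above (compare the timestamp in each commit or abort record with the target time) can be sketched as a short loop. This is a simplified model, not PostgreSQL's implementation: the record tuples, their LSNs, and the helper names replay_until and utc are invented for the example, and only the xact_time comparison is taken from the text.

```python
from datetime import datetime, timezone


def replay_until(records, target_time):
    """Replay WAL records in order; stop before the first commit/abort
    record whose timestamp is later than target_time."""
    replayed = []
    for lsn, kind, xact_time in records:
        # Only commit and abort records carry a timestamp to compare.
        if kind in ("commit", "abort") and xact_time > target_time:
            break
        replayed.append(lsn)
    return replayed


def utc(hour, minute):
    return datetime(2018, 7, 16, hour, minute, tzinfo=timezone.utc)


records = [  # hypothetical (LSN, record kind, timestamp) tuples
    ("0/9000100", "insert", None),
    ("0/9000140", "commit", utc(12, 1)),
    ("0/9000200", "truncate", None),
    ("0/9000240", "commit", utc(12, 6)),
]
print(replay_until(records, utc(12, 5)))
# prints ['0/9000100', '0/9000140', '0/9000200']
```

In this toy run the truncate record is replayed but its commit record is not, so the mistaken transaction never becomes visible in the recovered cluster.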

Why can common archiving tools be used to make a base backup?

**The recovery process is what brings a database cluster into a consistent state, even if the cluster itself is inconsistent. Because PITR is built on the recovery process, it can recover the database cluster even when the base backup is just a bundle of inconsistent files.** This is why we can use common archiving tools to make a base backup, without a file-system snapshot facility or other special tools.

10.3 Timelines and timeline history files

A timeline in PostgreSQL is used to distinguish the original database cluster from the recovered ones, and it is a central concept of PITR. This section describes two things associated with timelines: the timeline ID (TimelineID) and timeline history files.

10.3.1 Timeline ID (TimelineID)

Each timeline has a corresponding timeline ID, a four-byte unsigned integer starting at 1.

Each database cluster is assigned a timeline ID. The timeline ID of the original database cluster created by the initdb command is 1. Whenever a database cluster is recovered, the timeline ID is incremented by 1. In the example of the previous section, for instance, the timeline ID of the cluster recovered from the original one is 2.

Figure 10.3 illustrates the PITR process from the viewpoint of the timeline ID. First, we remove the current database cluster and replace it with the past base backup in order to go back to the starting point of recovery; this step is shown by the red curved arrow in the figure. Next, we start the PostgreSQL server, which **traces along the initial timeline (timeline ID 1) and replays the WAL data in the archive logs from the redo point created by pg_start_backup until the recovery target is reached**; this step is shown by the blue straight arrow. Then **the recovered database cluster is assigned the new timeline ID 2**, and PostgreSQL runs on the new timeline.

Figure 10.3 Relationship between the original and the recovered database clusters

As briefly mentioned in Chapter 9, the first 8 digits of a WAL segment file name equal the timeline ID of the database cluster that created the segment. When the timeline ID changes, the WAL segment file names change accordingly.

Let us review the recovery process from the viewpoint of WAL segment files. Suppose we recover the database using two archive logs, 000000010000000000000009 and 00000001000000000000000A. The newly recovered database cluster is assigned timeline ID 2, and **PostgreSQL creates WAL segments starting from 00000002000000000000000A**, as shown in Figure 10.4.
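The relationship between timeline ID and segment file name can be checked with a few lines of code. The sketch assumes the default 16 MB segment size and uses LSN 0/A000198, the switch point that appears in the timeline history example of Section 10.3.2; wal_segment_file_name is a hypothetical helper, not a server function.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024                # assumed default: 16 MB
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE


def wal_segment_file_name(timeline_id, lsn):
    """Return the 24-hex-digit name of the segment that contains lsn."""
    high, low = lsn.split("/")
    position = (int(high, 16) << 32) | int(low, 16)
    segment_no = position // WAL_SEGMENT_SIZE
    return "{:08X}{:08X}{:08X}".format(
        timeline_id,                       # first 8 digits: timeline ID
        segment_no // SEGMENTS_PER_XLOGID,
        segment_no % SEGMENTS_PER_XLOGID,
    )


print(wal_segment_file_name(1, "0/A000198"))  # prints 00000001000000000000000A
print(wal_segment_file_name(2, "0/A000198"))  # prints 00000002000000000000000A
```

The same LSN maps to two different file names, which is how the original and the recovered cluster can both keep a segment covering that position without overwriting each other.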

Figure 10.4 Relationship of WAL segment files between the original and the recovered database clusters

10.3.2 Timeline history file

When a PITR process completes, a timeline history file with a name such as 00000002.history is created in the archive directory and in the pg_xlog subdirectory (pg_wal in version 10 or later). This file records which timeline the current timeline branched off from, and when.

The naming rule for this file is as follows:

"8-digit new timeline ID".history

A timeline history file contains at least one line, and each line consists of the following three items:

  • Timeline ID: the timeline ID of the archive logs used for the recovery.
  • LSN: the LSN at which the WAL segment switch happened.
  • Reason: a human-readable explanation of why the timeline changed.

A specific example is shown below:

postgres> cat /home/postgres/archivelogs/00000002.history
1      0/A000198    before 2018-7-9 12:05:00.861324+00

This means the following:

The database cluster (timeline ID 2) is based on the base backup whose timeline ID is 1, and it was recovered to the point just before 2018-7-9 12:05:00.861324+00 by replaying the archive logs up to 0/A000198.
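The naming rule and the three-field line format can be illustrated with a short parser for the version 9.3 and later format. Both helper names are made up for this sketch, and skipping blank lines and lines starting with "#" is an assumption about what such a file may contain.

```python
def timeline_history_file_name(timeline_id):
    """Return the file name for a timeline: 8 hex digits plus '.history'."""
    return f"{timeline_id:08X}.history"


def parse_timeline_history(content):
    """Return (parent_timeline_id, switch_lsn, reason) for each entry."""
    entries = []
    for line in content.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumed: blank lines and comments carry no entry
        tli, lsn, reason = line.split(None, 2)
        entries.append((int(tli), lsn, reason))
    return entries


sample = "1      0/A000198    before 2018-7-9 12:05:00.861324+00\n"
print(timeline_history_file_name(2))   # prints 00000002.history
print(parse_timeline_history(sample))
```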

In this way, each timeline history file tells us the complete history of an individual recovered database cluster. It is also used by the PITR process itself, as the next section describes.

The format of the timeline history file changed in version 9.3. Both formats are shown below in simplified form.

Version 9.3 and later:

timelineId    LSN    "reason"

Version 9.2 and earlier:

timelineId    WAL_segment    "reason"

10.4 Point-in-time recovery and timeline history files

The timeline history file plays an important role in the second and subsequent PITR processes. We explore how it is used by attempting a second recovery.

Suppose again that **you made another mistake at 12:15:00, this time on the recovered database cluster whose timeline ID is 2. To recover the database cluster in this case, you need to create a recovery.conf file** like this:

restore_command = 'cp /mnt/server/archivedir/%f %p'
recovery_target_time = '2018-7-16 12:15:00 GMT'
recovery_target_timeline = 2

The parameter recovery_target_time is set to the time of the new mistake, and recovery_target_timeline is set to 2 so that recovery proceeds along that timeline.

Restart the PostgreSQL server to enter PITR mode, and the database is recovered along timeline ID 2, as shown in Figure 10.5.

Figure 10.5 Recovering the database to its state at 12:15:00 along timeline ID 2

  1. PostgreSQL reads the CHECKPOINT LOCATION value from the backup_label file.

  2. It reads some parameter values from recovery.conf; in this example, restore_command, recovery_target_time, and recovery_target_timeline.

  3. PostgreSQL reads the timeline history file 00000002.history, which corresponds to the value of recovery_target_timeline.

  4. PostgreSQL replays WAL data in the following steps:

    1. From the redo point to LSN 0/A000198 (the value written in 00000002.history), PostgreSQL reads and replays the WAL data of TimelineID=1 from the appropriate archive logs.
    2. From LSN 0/A000198 to just before the timestamp 2018-7-9 12:15:00, PostgreSQL reads and replays the WAL data of TimelineID=2 from the appropriate archive logs.
  5. When the recovery process completes, the current timeline ID advances to 3, and a new timeline history file named 00000003.history is created in the pg_xlog subdirectory (pg_wal in version 10 or later) and in the archive directory.

    postgres> cat /home/postgres/archivelogs/00000003.history
    1         0/A000198     before 2018-7-9 12:05:00.861324+00
    
    2         0/B000078     before 2018-7-9 12:15:00.927133+00
    

When you have performed PITR more than once, you should set the timeline ID explicitly so that the appropriate timeline history file is used.

Thus, a timeline history file is not only a history log of the database cluster but also the set of recovery instructions that the PITR process refers to.
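How the history file acts as recovery instructions can be sketched as a lookup: given the entries of the target timeline's history file, decide which timeline's WAL to read for a given LSN. This is a simplified model of step 4 in Section 10.4, with hypothetical helper names; it treats the switch LSN itself as already belonging to the newer timeline.

```python
def parse_lsn(text):
    """Convert an LSN such as '0/A000198' to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def timeline_for_lsn(history, target_timeline, lsn):
    """Pick the timeline whose WAL covers lsn.

    history: (timeline_id, switch_lsn) pairs read from the target
    timeline's history file, oldest first.
    """
    position = parse_lsn(lsn)
    for timeline_id, switch_lsn in history:
        if position < parse_lsn(switch_lsn):
            return timeline_id
    return target_timeline  # past every switch point


history_of_2 = [(1, "0/A000198")]  # contents of 00000002.history
for lsn in ("0/9000028", "0/A000198", "0/B000000"):
    print(lsn, timeline_for_lsn(history_of_2, 2, lsn))
```

With recovery_target_timeline = 2, positions before 0/A000198 resolve to timeline 1 and later positions to timeline 2, matching the two replay steps listed in Section 10.4.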

Copyright notice
This article was written by [Shallow as the breeze CYF]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202280520517937.html