PostgreSQL database startup - the startup background process: the StartupXLOG function enters recovery mode
2022-06-21 07:31:00 【Tertium ferrugosum】
Check whether recovery from WAL must be forced. If the database appears to have been shut down cleanly and there is no recovery signal file, no recovery is assumed to be needed (InRecovery stays false).
/* Check whether we need to force recovery from WAL. If it appears to have been a clean shutdown and we did not have a recovery signal file, then assume no recovery needed. */
if (checkPoint.redo < RecPtr) {
if (wasShutdown) ereport(PANIC, (errmsg("invalid redo record in shutdown checkpoint")));
InRecovery = true;
} else if (ControlFile->state != DB_SHUTDOWNED)
InRecovery = true;
else if (ArchiveRecoveryRequested) {
InRecovery = true; /* force recovery due to presence of recovery signal file */
}
/* Start recovery assuming that the final record isn't lost. */
abortedRecPtr = InvalidXLogRecPtr;
missingContrecPtr = InvalidXLogRecPtr;
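The decision above can be condensed into a small pure function. This is a minimal sketch, not PostgreSQL's code: `LsnSketch`, `DbStateSketch`, `RecoveryDecision` and `decide_recovery` are hypothetical names introduced for illustration, and the PANIC raised by the real code is modeled as a return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for XLogRecPtr and the pg_control state. */
typedef uint64_t LsnSketch;
typedef enum { SKETCH_SHUTDOWNED, SKETCH_IN_PRODUCTION } DbStateSketch;

typedef enum {
    RECOVERY_NOT_NEEDED,
    RECOVERY_NEEDED,
    RECOVERY_INVALID_CHECKPOINT /* the real code raises PANIC here */
} RecoveryDecision;

/* Mirrors the three branches: redo pointer behind the checkpoint record,
 * a state other than a clean shutdown, or an explicit recovery request. */
static RecoveryDecision
decide_recovery(LsnSketch redo, LsnSketch checkpoint_rec, bool was_shutdown,
                DbStateSketch state, bool archive_recovery_requested)
{
    if (redo < checkpoint_rec)
        return was_shutdown ? RECOVERY_INVALID_CHECKPOINT : RECOVERY_NEEDED;
    if (state != SKETCH_SHUTDOWNED)
        return RECOVERY_NEEDED;
    if (archive_recovery_requested)
        return RECOVERY_NEEDED;
    return RECOVERY_NOT_NEEDED;
}

/* Usage: each assertion exercises one branch. */
static void
decide_recovery_demo(void)
{
    assert(decide_recovery(100, 200, false, SKETCH_IN_PRODUCTION, false) == RECOVERY_NEEDED);
    assert(decide_recovery(100, 200, true, SKETCH_SHUTDOWNED, false) == RECOVERY_INVALID_CHECKPOINT);
    assert(decide_recovery(200, 200, true, SKETCH_IN_PRODUCTION, false) == RECOVERY_NEEDED);
    assert(decide_recovery(200, 200, true, SKETCH_SHUTDOWNED, true) == RECOVERY_NEEDED);
    assert(decide_recovery(200, 200, true, SKETCH_SHUTDOWNED, false) == RECOVERY_NOT_NEEDED);
}
```

Only the last case, a clean shutdown checkpoint with no recovery request, leaves recovery switched off.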
When the database enters recovery mode, the state stored in ControlFile must be updated to show that recovery is in progress, and the checkpoint that was just read is saved into ControlFile.
if (InRecovery) {
/* REDO */
int rmid;
/* Update pg_control to show that we are recovering and to show the selected checkpoint as the place we are starting from. We also mark pg_control with any minimum recovery stop point obtained from a backup history file. */
dbstate_at_startup = ControlFile->state;
if (InArchiveRecovery){
// In archive recovery, set the database state to DB_IN_ARCHIVE_RECOVERY
ControlFile->state = DB_IN_ARCHIVE_RECOVERY;
SpinLockAcquire(&XLogCtl->info_lck);
XLogCtl->SharedRecoveryState = RECOVERY_STATE_ARCHIVE; // SharedRecoveryState tells the xlog module whether we are in crash recovery, in archive recovery, or done with recovery
SpinLockRelease(&XLogCtl->info_lck);
} else {
ereport(LOG,(errmsg("database system was not properly shut down; automatic recovery in progress")));
if (recoveryTargetTLI > ControlFile->checkPointCopy.ThisTimeLineID) // If the recovery target timeline is newer than the checkpoint's timeline, log it
ereport(LOG,(errmsg("crash recovery starts in timeline %u and has target timeline %u",ControlFile->checkPointCopy.ThisTimeLineID, recoveryTargetTLI)));
ControlFile->state = DB_IN_CRASH_RECOVERY; // Normal crash recovery mode
SpinLockAcquire(&XLogCtl->info_lck);
XLogCtl->SharedRecoveryState = RECOVERY_STATE_CRASH;
SpinLockRelease(&XLogCtl->info_lck);
}
// Update checkpoints
ControlFile->checkPoint = checkPointLoc;
ControlFile->checkPointCopy = checkPoint;
if (InArchiveRecovery) {
/* initialize minRecoveryPoint if not set yet */ // minRecoveryPoint must not be smaller than the redo pointer
if (ControlFile->minRecoveryPoint < checkPoint.redo) {
ControlFile->minRecoveryPoint = checkPoint.redo;
ControlFile->minRecoveryPointTLI = checkPoint.ThisTimeLineID;
}
}
/* Set backupStartPoint if we're starting recovery from a base backup. Also set backupEndPoint and use minRecoveryPoint as the backup end location if we're starting recovery from a base backup which was taken from a standby. In this case, the database system status in pg_control must indicate that the database was already in recovery. Usually that will be DB_IN_ARCHIVE_RECOVERY but also can be DB_SHUTDOWNED_IN_RECOVERY if recovery previously was interrupted before reaching this point; e.g. because restore_command or primary_conninfo were faulty. Any other state indicates that the backup somehow became corrupted and we can't sensibly continue with recovery. */
if (haveBackupLabel) {
ControlFile->backupStartPoint = checkPoint.redo;
ControlFile->backupEndRequired = backupEndRequired;
// backupFromStandby means the data directory came from a base backup that was taken on a standby. A backup taken on a standby can only be non-exclusive, and the checkpoints created there are restartpoints
if (backupFromStandby){
// If the state at startup is not one of the two in-recovery states, report an error (a backup taken from a standby can only be in one of these two states)
if (dbstate_at_startup != DB_IN_ARCHIVE_RECOVERY && dbstate_at_startup != DB_SHUTDOWNED_IN_RECOVERY)
ereport(FATAL,(errmsg("backup_label contains data inconsistent with control file"), errhint("This means that the backup is corrupted and you will have to use another backup for recovery.")));
ControlFile->backupEndPoint = ControlFile->minRecoveryPoint;
}
}
ControlFile->time = (pg_time_t) time(NULL);
UpdateControlFile(); /* No need to hold ControlFileLock yet, we aren't up far enough */
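The backup-label handling above can be sketched as a function over a reduced control-file structure. This is an illustration under stated assumptions: `ControlSketch`, `DbStateSketch` and `apply_backup_label` are hypothetical names, only the fields touched above are modeled, and the FATAL error is represented by a `false` return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t LsnSketch; /* hypothetical stand-in for XLogRecPtr */

typedef enum {
    SK_SHUTDOWNED,
    SK_SHUTDOWNED_IN_RECOVERY,
    SK_IN_CRASH_RECOVERY,
    SK_IN_ARCHIVE_RECOVERY,
    SK_IN_PRODUCTION
} DbStateSketch;

/* Only the pg_control fields used by this step. */
typedef struct {
    LsnSketch min_recovery_point;
    LsnSketch backup_start_point;
    LsnSketch backup_end_point;
    bool      backup_end_required;
} ControlSketch;

/* Returns false where the real code reports FATAL (inconsistent backup). */
static bool
apply_backup_label(ControlSketch *cf, LsnSketch redo, bool end_required,
                   bool from_standby, DbStateSketch state_at_startup)
{
    cf->backup_start_point = redo;
    cf->backup_end_required = end_required;
    if (from_standby) {
        /* A backup taken on a standby must have been in recovery already. */
        if (state_at_startup != SK_IN_ARCHIVE_RECOVERY &&
            state_at_startup != SK_SHUTDOWNED_IN_RECOVERY)
            return false;
        cf->backup_end_point = cf->min_recovery_point;
    }
    return true;
}

/* Usage: a standby backup, a primary backup, and an inconsistent one. */
static void
apply_backup_label_demo(void)
{
    ControlSketch standby = { .min_recovery_point = 500 };
    bool ok = apply_backup_label(&standby, 300, true, true, SK_IN_ARCHIVE_RECOVERY);
    assert(ok && standby.backup_start_point == 300 && standby.backup_end_point == 500);

    ControlSketch primary = { 0 };
    ok = apply_backup_label(&primary, 300, true, false, SK_IN_PRODUCTION);
    assert(ok && primary.backup_end_point == 0 && primary.backup_end_required);

    ControlSketch broken = { .min_recovery_point = 500 };
    ok = apply_backup_label(&broken, 300, true, true, SK_IN_PRODUCTION);
    assert(!ok);
}
```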
Initialize the local copy of minRecoveryPoint. During crash recovery we want to replay up to the end of WAL. In particular, for a promoted standby, the minRecoveryPoint value in the control file is only updated after the first checkpoint. If the instance crashes before the first post-recovery checkpoint completes, recovery would use a stale location, and the startup process would think there are still invalid page references when it checks data consistency.
/* Initialize our local copy of minRecoveryPoint. When doing crash recovery we want to replay up to the end of WAL. Particularly, in the case of a promoted standby minRecoveryPoint value in the control file is only updated after the first checkpoint. However, if the instance crashes before the first post-recovery checkpoint is completed then recovery will use a stale location causing the startup process to think that there are still invalid page references when checking for data consistency. */
if (InArchiveRecovery){
minRecoveryPoint = ControlFile->minRecoveryPoint;
minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
}else{
minRecoveryPoint = InvalidXLogRecPtr;
minRecoveryPointTLI = 0;
}
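The two minRecoveryPoint steps shown so far (raising the control-file value to at least the redo pointer, then taking the local copy) can be combined in one sketch. `RecoveryPointSketch` and `local_min_recovery_point` are hypothetical names, and the value 0 plays the role of InvalidXLogRecPtr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair of an LSN and the timeline it belongs to. */
typedef struct {
    uint64_t lsn;
    uint32_t tli;
} RecoveryPointSketch;

/* Archive recovery: use the control-file value, raised to the redo pointer
 * if it lies behind it. Crash recovery: return an invalid point (0, 0), so
 * the stale control-file value is ignored and replay runs to the end of WAL. */
static RecoveryPointSketch
local_min_recovery_point(bool in_archive_recovery, RecoveryPointSketch control,
                         uint64_t redo, uint32_t checkpoint_tli)
{
    RecoveryPointSketch invalid = { 0, 0 };

    if (!in_archive_recovery)
        return invalid;
    if (control.lsn < redo) {
        control.lsn = redo;
        control.tli = checkpoint_tli;
    }
    return control;
}

/* Usage: one case per outcome. */
static void
local_min_recovery_point_demo(void)
{
    RecoveryPointSketch behind = { 100, 1 };
    RecoveryPointSketch ahead = { 400, 3 };
    RecoveryPointSketch r;

    r = local_min_recovery_point(true, behind, 250, 2);
    assert(r.lsn == 250 && r.tli == 2);
    r = local_min_recovery_point(true, ahead, 250, 2);
    assert(r.lsn == 400 && r.tli == 3);
    r = local_min_recovery_point(false, ahead, 250, 2);
    assert(r.lsn == 0 && r.tli == 0);
}
```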
Reset the pgstat data and move the backup_label and tablespace_map files out of the way. If there was a backup label file, it has done its job and its information has been propagated into pg_control. The label file must go so that, if we crash during recovery, we resume at the latest recovery restartpoint instead of going all the way back to the backup start point. It is renamed rather than deleted, which seems more prudent. Likewise, if there was a tablespace_map file, it has done its job and the symbolic links have been created; it is renamed out of the way so that a crash during recovery does not create the symlinks again.
pgstat_reset_all(); /* Reset pgstat data, because it may be invalid after recovery. */
/* If there was a backup label file, it's done its job and the info has now been propagated into pg_control. We must get rid of the label file so that if we crash during recovery, we'll pick up at the latest recovery restartpoint instead of going all the way back to the backup start point. It seems prudent though to just rename the file out of the way rather than delete it completely. */
if (haveBackupLabel) {
unlink(BACKUP_LABEL_OLD);
durable_rename(BACKUP_LABEL_FILE, BACKUP_LABEL_OLD, FATAL);
}
/* If there was a tablespace_map file, it's done its job and the symlinks have been created. We must get rid of the map file so that if we crash during recovery, we don't create symlinks again. It seems prudent though to just rename the file out of the way rather than delete it completely. */
if (haveTblspcMap) {
unlink(TABLESPACE_MAP_OLD);
durable_rename(TABLESPACE_MAP, TABLESPACE_MAP_OLD, FATAL);
}
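The "rename out of the way" pattern used for both files can be sketched with plain POSIX calls. This is a simplified illustration, assuming a POSIX system and a writable working directory: `move_out_of_the_way` and the file names are hypothetical, and unlike PostgreSQL's `durable_rename` the sketch does not fsync the file or its directory.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Move `path` to `old_path`, dropping any leftover from an earlier run.
 * Returns 0 on success and -1 on failure. */
static int
move_out_of_the_way(const char *path, const char *old_path)
{
    /* A missing old file is not an error. */
    if (unlink(old_path) != 0 && errno != ENOENT)
        return -1;
    return rename(path, old_path);
}

/* Usage: create a scratch file, move it, and check both names. */
static void
move_out_of_the_way_demo(void)
{
    const char *label = "backup_label.sketch";
    const char *old = "backup_label.sketch.old";

    FILE *f = fopen(label, "w");
    assert(f != NULL);
    fclose(f);

    int rc = move_out_of_the_way(label, old);
    assert(rc == 0);
    (void) rc;

    int label_gone = access(label, F_OK);
    int old_present = access(old, F_OK);
    assert(label_gone != 0);
    assert(old_present == 0);
    (void) label_gone;
    (void) old_present;

    unlink(old); /* clean up the scratch file */
}
```

Keeping the renamed copy preserves the evidence of which backup the cluster was restored from, while a second startup no longer finds the original file name.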
Check that the GUCs used to generate the WAL allow recovery. Because we are in recovery, unlogged relations may be trashed and must be reset. This has to happen before Hot Standby connections are allowed, so that read-only backends do not try to read whatever garbage was left behind. Likewise, delete any saved transaction snapshot files left behind by crashed backends.
CheckRequiredParameterValues(); /* Check that the GUCs used to generate the WAL allow recovery */
/* We're in recovery, so unlogged relations may be trashed and must be reset. This should be done BEFORE allowing Hot Standby connections, so that read-only backends don't try to read whatever garbage is left over from before. */
ResetUnloggedRelations(UNLOGGED_RELATION_CLEANUP);
/* Likewise, delete any saved transaction snapshot files that got left behind by crashed backends. */
DeleteAllExportedSnapshotFiles();
Initialize for Hot Standby, if enabled. Backends are not let in until we have reached the minimum recovery point specified in the control file and have established a recovery snapshot from a running-xacts WAL record.
/* Initialize for Hot Standby, if enabled. We won't let backends in yet, not until we've reached the min recovery point specified in control file and we've established a recovery snapshot from a running-xacts WAL record. */
if (ArchiveRecoveryRequested && EnableHotStandby) {
TransactionId *xids;
int nxids;
InitRecoveryTransactionEnvironment();
if (wasShutdown) oldestActiveXID = PrescanPreparedTransactions(&xids, &nxids);
else oldestActiveXID = checkPoint.oldestActiveXid;
/* Tell procarray about the range of xids it has to deal with */
ProcArrayInitRecovery(XidFromFullTransactionId(ShmemVariableCache->nextXid));
/* Startup subtrans only. CLOG, MultiXact and commit timestamp have already been started up and other SLRUs are not maintained during recovery and need not be started yet. */
StartupSUBTRANS(oldestActiveXID);
/* If we're beginning at a shutdown checkpoint, we know that nothing was running on the primary at this point. So fake-up an empty running-xacts record and use that here and now. Recover additional standby state for prepared transactions. */
if (wasShutdown) {
RunningTransactionsData running;
TransactionId latestCompletedXid;
/* Construct a RunningTransactions snapshot representing a shut down server, with only prepared transactions still alive. We're never overflowed at this point because all subxids are listed with their parent prepared transactions. */
running.xcnt = nxids;
running.subxcnt = 0;
running.subxid_overflow = false;
running.nextXid = XidFromFullTransactionId(checkPoint.nextXid);
running.oldestRunningXid = oldestActiveXID;
latestCompletedXid = XidFromFullTransactionId(checkPoint.nextXid);
TransactionIdRetreat(latestCompletedXid);
Assert(TransactionIdIsNormal(latestCompletedXid));
running.latestCompletedXid = latestCompletedXid;
running.xids = xids;
ProcArrayApplyRecoveryInfo(&running);
StandbyRecoverPreparedTransactions();
}
}
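The faked running-xacts record built for a shutdown checkpoint can be sketched as follows. The types and names (`XidSketch`, `RunningXactsSketch`, `xid_retreat`, `shutdown_snapshot`) are hypothetical; `xid_retreat` imitates the behavior of the `TransactionIdRetreat` macro, which steps back one XID and skips the reserved values below the first normal XID (3).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t XidSketch;
#define FIRST_NORMAL_XID_SKETCH 3u /* XIDs 0..2 are reserved in PostgreSQL */

/* Reduced stand-in for RunningTransactionsData. */
typedef struct {
    int              xcnt;
    int              subxcnt;
    bool             subxid_overflow;
    XidSketch        next_xid;
    XidSketch        oldest_running_xid;
    XidSketch        latest_completed_xid;
    const XidSketch *xids;
} RunningXactsSketch;

/* Step back one XID, wrapping around and skipping the reserved XIDs. */
static XidSketch
xid_retreat(XidSketch xid)
{
    do {
        xid--;
    } while (xid < FIRST_NORMAL_XID_SKETCH);
    return xid;
}

/* Snapshot of a shut-down server: only prepared transactions are alive,
 * and the latest completed XID is the one just before next_xid. */
static RunningXactsSketch
shutdown_snapshot(XidSketch next_xid, XidSketch oldest_active,
                  const XidSketch *prepared, int nprepared)
{
    RunningXactsSketch running;

    running.xcnt = nprepared;
    running.subxcnt = 0;
    running.subxid_overflow = false;
    running.next_xid = next_xid;
    running.oldest_running_xid = oldest_active;
    running.latest_completed_xid = xid_retreat(next_xid);
    running.xids = prepared;
    return running;
}

/* Usage: two prepared transactions survive the shutdown. */
static void
shutdown_snapshot_demo(void)
{
    static const XidSketch prepared[] = { 90, 95 };
    RunningXactsSketch snap = shutdown_snapshot(100, 90, prepared, 2);

    assert(snap.xcnt == 2 && snap.subxcnt == 0 && !snap.subxid_overflow);
    assert(snap.latest_completed_xid == 99);
    assert(xid_retreat(3) == UINT32_MAX); /* wraps past the reserved XIDs */
}
```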
rm_startup is the callback with which each resource manager (RMGR) initializes itself when the startup process replays XLOG records in StartupXLOG. See the article "PostgreSQL database WAL - resource manager RMGR".
Initialize the shared variables in XLogCtl that track WAL replay progress, as if we had just replayed the record before the REDO location (or the checkpoint record itself, if it is a shutdown checkpoint).
/* Initialize shared variables for tracking progress of WAL replay, as if we had just replayed the record before the REDO location (or the checkpoint record itself, if it's a shutdown checkpoint). */
SpinLockAcquire(&XLogCtl->info_lck);
if (checkPoint.redo < RecPtr) XLogCtl->replayEndRecPtr = checkPoint.redo;
else XLogCtl->replayEndRecPtr = EndRecPtr;
XLogCtl->replayEndTLI = ThisTimeLineID;
XLogCtl->lastReplayedEndRecPtr = XLogCtl->replayEndRecPtr;
XLogCtl->lastReplayedTLI = XLogCtl->replayEndTLI;
XLogCtl->recoveryLastXTime = 0;
XLogCtl->currentChunkStartTime = 0;
XLogCtl->recoveryPauseState = RECOVERY_NOT_PAUSED;
SpinLockRelease(&XLogCtl->info_lck);
XLogReceiptTime = GetCurrentTimestamp(); /* Also ensure XLogReceiptTime has a sane value */
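The choice of the initial replay position reduces to one comparison. In this sketch, `LsnSketch` and `initial_replay_end` are hypothetical names; `checkpoint_start` and `checkpoint_end` stand for the start and end positions of the checkpoint record (RecPtr and EndRecPtr above).

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t LsnSketch; /* hypothetical stand-in for XLogRecPtr */

/* If the redo pointer lies before the checkpoint record, replay starts at
 * the redo pointer; otherwise the checkpoint record itself counts as
 * already replayed and replay continues from its end. */
static LsnSketch
initial_replay_end(LsnSketch redo, LsnSketch checkpoint_start,
                   LsnSketch checkpoint_end)
{
    return (redo < checkpoint_start) ? redo : checkpoint_end;
}

/* Usage: an online checkpoint versus a shutdown checkpoint. */
static void
initial_replay_end_demo(void)
{
    assert(initial_replay_end(100, 200, 260) == 100); /* redo behind record */
    assert(initial_replay_end(200, 200, 260) == 260); /* redo at the record */
}
```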
Let the postmaster know that redo has started, so that it can launch the checkpointer to perform restartpoints. This is skipped during crash recovery, because restartpoints can only be performed during archive recovery, and crash recovery is kept simple to avoid introducing bugs that could bite when recovering after a crash. After this point we can no longer assume that we are the only process besides the postmaster. In addition, fsync requests are from now on handled by the checkpointer rather than locally.
/* Let postmaster know we've started redo now, so that it can launch checkpointer to perform restartpoints. We don't bother during crash recovery as restartpoints can only be performed during archive recovery. And we'd like to keep crash recovery simple, to avoid introducing bugs that could affect you when recovering after crash. After this point, we can no longer assume that we're the only process in addition to postmaster! Also, fsync requests are subsequently to be handled by the checkpointer, not locally. */
if (ArchiveRecoveryRequested && IsUnderPostmaster) {
PublishStartupProcessInformation();
EnableSyncRequestForwarding();
SendPostmasterSignal(PMSIGNAL_RECOVERY_STARTED);
bgwriterLaunched = true;
}
If we are already consistent, allow read-only connections immediately.
/* Allow read-only connections immediately if we're consistent already. */
CheckRecoveryConsistency();
Find the first record that logically follows the checkpoint; physically it may precede the checkpoint record.
/* Find the first record that logically follows the checkpoint --- it might physically precede it, though. */
if (checkPoint.redo < RecPtr) {
/* back up to find the record */
XLogBeginRead(xlogreader, checkPoint.redo);
record = ReadRecord(xlogreader, PANIC, false);
}else{
/* just have to read next record after CheckPoint */
record = ReadRecord(xlogreader, LOG, false);
}