PostgreSQL source code study (21): crash recovery ②, transaction log initialization

2022-06-11 03:19:00 Hehuyi_ In

        As we know, WAL records are not flushed to disk in real time. PostgreSQL allocates an XLOG buffer in shared memory to cache log pages, and a log record is first written into that buffer. Before any of this can happen, shared memory has to be requested for the transaction log, and several important structures have to be initialized.

This section covers the following:

  • How much shared memory XLOG needs to request: the XLOGShmemSize function
  • Two important structures: XLogCtlData and XLogCtlInsert
  • Initializing the XLOG shared memory: the XLOGShmemInit function

One: The XLOGShmemSize function

        The user can set the number of pages cached in the XLOG buffer with the wal_buffers parameter. Its default value is -1, which means PostgreSQL computes the number of pages to cache with a heuristic (the XLOGChooseNumBuffers function).
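As a rough illustration of that heuristic, here is a standalone sketch modeled on the documented behavior of wal_buffers = -1: one thirty-second of shared_buffers, with a floor of 8 pages and a ceiling of one WAL segment. The function name `choose_num_wal_buffers` and its parameters are hypothetical stand-ins for the server's global settings; this is not the server's own code.

```c
#include <assert.h>

/*
 * Sketch of the wal_buffers = -1 heuristic (hypothetical helper, not the
 * server's code). All arguments are assumptions standing in for server
 * settings:
 *   shared_buffer_pages - shared_buffers, in pages
 *   wal_segment_bytes   - WAL segment size in bytes (commonly 16 MB)
 *   xlog_page_bytes     - WAL page size in bytes (commonly 8192)
 */
int
choose_num_wal_buffers(int shared_buffer_pages, int wal_segment_bytes,
					   int xlog_page_bytes)
{
	int			pages_per_segment;
	int			xbuffers;

	assert(xlog_page_bytes > 0);
	pages_per_segment = wal_segment_bytes / xlog_page_bytes;

	xbuffers = shared_buffer_pages / 32;	/* 1/32 of shared_buffers */
	if (xbuffers > pages_per_segment)
		xbuffers = pages_per_segment;		/* at most one WAL segment */
	if (xbuffers < 8)
		xbuffers = 8;						/* at least 8 pages */
	return xbuffers;
}
```

Under these assumptions, 16384 shared buffer pages (128 MB of 8 kB pages) with a 16 MB segment yield 512 WAL buffer pages, which is 4 MB.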

        Besides the XLOG buffer itself, shared memory is also needed to store XLOG control information, so the size computed in the function is the sum of several parts.

/*
 * Compute how much shared memory XLOG needs to request.
 */
Size XLOGShmemSize(void)
{
	Size		size;

	/*
	 * If wal_buffers is -1, call XLOGChooseNumBuffers to pick a value.
	 */
	if (XLOGbuffers == -1)
	{
		char		buf[32];

		snprintf(buf, sizeof(buf), "%d", XLOGChooseNumBuffers());
		SetConfigOption("wal_buffers", buf, PGC_POSTMASTER, PGC_S_OVERRIDE);
	}
	// wal_buffers must be greater than 0, because it is the number of cached pages
	Assert(XLOGbuffers > 0);

	/* XLogCtl: the XLOG control information (this structure is covered in detail below) */
	size = sizeof(XLogCtlData);

	/* WAL insertion locks, plus alignment: the lightweight locks taken when inserting log records */
	size = add_size(size, mul_size(sizeof(WALInsertLockPadded), NUM_XLOGINSERT_LOCKS + 1));
	/* xlblocks array: one LSN per XLOG buffer page (the page's start LSN + XLOG_BLCKSZ) */
	size = add_size(size, mul_size(sizeof(XLogRecPtr), XLOGbuffers));
	/* extra alignment padding for XLOG I/O buffers: one extra block so the pages can be block-aligned */
	size = add_size(size, XLOG_BLCKSZ);
	/* and the buffers themselves: the XLOG buffer pages */
	size = add_size(size, mul_size(XLOG_BLCKSZ, XLOGbuffers));

	// Note: this function does not include the size of ControlFileData
	return size;
}
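To make the arithmetic concrete, the following sketch recomputes the same sum outside the server. The helpers `checked_add` and `checked_mul` are simplified stand-ins for `add_size` and `mul_size` (they abort on overflow, whereas the server reports an error), and every size passed in is an assumed value rather than one read from a real build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for add_size(): abort on overflow. */
size_t
checked_add(size_t a, size_t b)
{
	if (a > SIZE_MAX - b)
		abort();
	return a + b;
}

/* Simplified stand-in for mul_size(): abort on overflow. */
size_t
checked_mul(size_t a, size_t b)
{
	if (a != 0 && b > SIZE_MAX / a)
		abort();
	return a * b;
}

/*
 * Mirror the structure of XLOGShmemSize() with every size supplied by the
 * caller (all values are assumptions for illustration).
 */
size_t
xlog_shmem_size_sketch(size_t ctl_size, size_t lock_size, size_t num_locks,
					   size_t recptr_size, size_t page_size, size_t nbuffers)
{
	size_t		size = ctl_size;	/* control structure */

	/* insertion locks, plus one spare slot for alignment */
	size = checked_add(size, checked_mul(lock_size, num_locks + 1));
	/* xlblocks array: one LSN per page */
	size = checked_add(size, checked_mul(recptr_size, nbuffers));
	/* one extra page of alignment padding */
	size = checked_add(size, page_size);
	/* the page buffers themselves */
	size = checked_add(size, checked_mul(page_size, nbuffers));
	return size;
}
```

With an assumed 1024-byte control structure, eight 128-byte padded locks, 8-byte LSNs, and 512 pages of 8192 bytes, the terms are 1024 + 1152 + 4096 + 8192 + 4194304 = 4208768 bytes. The page buffers dominate; the control data and padding add only a few kilobytes.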

Two: Two important structures

1. XLogCtlData

     XLogCtlData holds the current WAL write state, the flush state, the status of the buffer pages, and related information.

/*
 * Total shared-memory state for XLOG.
 */
typedef struct XLogCtlData
{
	XLogCtlInsert Insert;  // embedded XLogCtlInsert structure (described below)

	/* Protected by info_lck: */
	XLogwrtRqst LogwrtRqst;        /* LSNs up to which a write and a flush have been requested */
	XLogRecPtr	RedoRecPtr;		/* a recent copy of Insert->RedoRecPtr, i.e. the redo LSN of the most recent checkpoint */
	FullTransactionId ckptFullXid;	/* nextXid of latest checkpoint */
	XLogRecPtr	asyncXactLSN;	/* LSN of newest async commit/abort */
	XLogRecPtr	replicationSlotMinLSN;	/* oldest LSN needed by any replication slot */

	XLogSegNo	lastRemovedSegNo;	/* latest removed/recycled XLOG segment */

	/* Fake LSN counter, for unlogged relations (tables whose changes are not WAL-logged). Protected by ulsn_lck. Currently used only by GiST. */
	XLogRecPtr	unloggedLSN;
	slock_t		ulsn_lck;

	/* Time and LSN of last xlog segment switch. Protected by WALWriteLock. */
	pg_time_t	lastSegSwitchTime;
	XLogRecPtr	lastSegSwitchLSN;

	/*
	 * LSNs up to which WAL has actually been written and flushed.
	 * Protected by info_lck and WALWriteLock (you must hold either lock to
	 * read it, but both to update).
	 */
	XLogwrtResult LogwrtResult;

	/*
	 * Latest initialized page in the cache (last byte position + 1), i.e. the
	 * end LSN of the last page initialized in the XLOG buffer.
	 */
	XLogRecPtr	InitializedUpTo;

	/*
	 * The XLOG buffer pages and their bookkeeping. These values do not
	 * change after startup, although the pointed-to pages and xlblocks
	 * values certainly do.  xlblocks values are protected by
	 * WALBufMappingLock.
	 */
	char	   *pages;			/* buffers for unwritten XLOG pages */
	XLogRecPtr *xlblocks;		/* 1st byte ptr-s + XLOG_BLCKSZ, i.e. the end LSN of each buffered page */
	int			XLogCacheBlck;	/* highest allocated xlog buffer index */

	/*
	 * Timeline information
	 */
	TimeLineID	ThisTimeLineID; // current timeline
	TimeLineID	PrevTimeLineID; // previous timeline; equal to ThisTimeLineID if no timeline switch has occurred

	/*
	 * SharedRecoveryState indicates if we're still in crash or archive
	 * recovery.  Protected by info_lck.
	 */
	RecoveryState SharedRecoveryState;

	/*
	 * SharedHotStandbyActive indicates if we allow hot standby queries to be
	 * run (that is, queries on a standby server).  Protected by info_lck.
	 */
	bool		SharedHotStandbyActive;

	/*
	 * SharedPromoteIsTriggered indicates if a standby promotion has been
	 * triggered (promoting the standby to primary).  Protected by info_lck.
	 */
	bool		SharedPromoteIsTriggered;

	/*
	 * WalWriterSleeping indicates whether the WAL writer is currently in
	 * low-power mode (and hence should be nudged if an async commit occurs).
	 * Protected by info_lck.
	 */
	bool		WalWriterSleeping;

	/*
	 * recoveryWakeupLatch is used to wake up the startup process to continue
	 * WAL replay, if it is waiting for WAL to arrive or failover trigger file
	 * to appear.
	 */
	Latch		recoveryWakeupLatch;

	/*
	 * During recovery, we keep a copy of the latest checkpoint record here.
	 */
	XLogRecPtr	lastCheckPointRecPtr; // start of the checkpoint record
	XLogRecPtr	lastCheckPointEndPtr; // end+1 of the checkpoint record; used when the checkpointer needs to create a restartpoint
	CheckPoint	lastCheckPoint;       // copy of the checkpoint record itself

	/*
	 * lastReplayedEndRecPtr points to end+1 of the last record successfully
	 * replayed. When we're currently replaying a record, ie. in a redo
	 * function, replayEndRecPtr points to the end+1 of the record being
	 * replayed, otherwise it's equal to lastReplayedEndRecPtr.
	 */
	XLogRecPtr	lastReplayedEndRecPtr;
	TimeLineID	lastReplayedTLI;
	XLogRecPtr	replayEndRecPtr;
	TimeLineID	replayEndTLI;
	/* timestamp of last COMMIT/ABORT record replayed (or being replayed) */
	TimestampTz recoveryLastXTime;

	/*
	 * timestamp of when we started replaying the current chunk of WAL data,
	 * only relevant for replication or archive recovery
	 */
	TimestampTz currentChunkStartTime;
	/* Recovery pause state */
	RecoveryPauseState recoveryPauseState;
	ConditionVariable recoveryNotPausedCV;

	/*
	 * lastFpwDisableRecPtr points to the start of the last replayed
	 * XLOG_FPW_CHANGE record that instructs full_page_writes is disabled.
	 */
	XLogRecPtr	lastFpwDisableRecPtr;

	slock_t		info_lck;		/* locks shared variables shown above */
} XLogCtlData;

// Pointer to the shared structure, initially NULL
static XLogCtlData *XLogCtl = NULL;
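The relationship between `pages`, `xlblocks`, and `XLogCacheBlck` is easier to see with a small model. The sketch below assumes the buffer is used as a ring: a page's slot is its LSN divided by the page size, taken modulo the number of buffers. It also assumes `xlblocks[slot]` stores the LSN just past the end of the page held in that slot, as the `1st byte ptr-s + XLOG_BLCKSZ` comment suggests. All names ending in `_sketch` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 8192u		/* assumed WAL page size */

/* Map an LSN to a slot in a ring of (highest_index + 1) page buffers. */
int
lsn_to_buf_idx_sketch(uint64_t lsn, int highest_index)
{
	assert(highest_index >= 0);
	return (int) ((lsn / PAGE_SIZE_SKETCH) % (uint64_t) (highest_index + 1));
}

/* LSN just past the end of the page that contains lsn. */
uint64_t
page_end_lsn_sketch(uint64_t lsn)
{
	return lsn - (lsn % PAGE_SIZE_SKETCH) + PAGE_SIZE_SKETCH;
}

/*
 * A slot currently holds the page containing lsn only if the end LSN
 * recorded for that slot matches the end of that page.
 */
bool
slot_holds_lsn_sketch(const uint64_t *xlblocks, int highest_index,
					  uint64_t lsn)
{
	int			idx = lsn_to_buf_idx_sketch(lsn, highest_index);

	return xlblocks[idx] == page_end_lsn_sketch(lsn);
}
```

Because many pages map to the same slot over time, the slot index alone is not enough. Comparing the recorded end LSN against the expected one tells a reader whether the slot still holds the page it is looking for or has been reused for a later page.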

2. XLogCtlInsert

     XLogCtlInsert holds the variables needed when inserting log records into the buffer.

/*
 * Shared state data for WAL insertion.
 */
typedef struct XLogCtlInsert
{
	slock_t		insertpos_lck;	/* protects CurrBytePos and PrevBytePos */

	uint64		CurrBytePos; // position where the next record will be inserted
	uint64		PrevBytePos; // start of the previous record; a new record stores it as the LSN of its predecessor

	/*
	 * Padding so that the frequently modified variables above sit on one
	 * cache line and the rarely modified variables below sit on another.
	 * Otherwise, every update to the variables above would invalidate the
	 * cache line holding the variables below and slow down reads of them.
	 */
	char		pad[PG_CACHE_LINE_SIZE];

	/*
	 * Variables related to full-page writes
	 */
	XLogRecPtr	RedoRecPtr;		/* current redo point for insertions */
	bool		forcePageWrites;	/* forcing full-page writes for PITR? */
	bool		fullPageWrites;  // are full-page writes enabled?

	/*
	 * Online backup state (pg_start_backup / pg_stop_backup)
	 */
	ExclusiveBackupState exclusiveBackupState; // state of the exclusive backup
	int			nonExclusiveBackups;          // counter: number of non-exclusive (streaming base) backups currently in progress
	XLogRecPtr	lastBackupStart; // redo location of the latest checkpoint used as the starting point of an online backup

	/*
	 * WAL insertion locks, taken while inserting log records. To improve
	 * concurrency when writing records into the buffer, NUM_XLOGINSERT_LOCKS
	 * locks are allocated; a backend chooses which lock to try based on
	 * MyProc->pgprocno.
	 */
	WALInsertLockPadded *WALInsertLocks;
} XLogCtlInsert;
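The effect of the `pad` member can be checked with `offsetof`. This sketch uses a hypothetical struct, `InsertStateSketch`, and an assumed cache line size of 128 bytes. It only demonstrates that the padding pushes the rarely modified fields at least one cache line away from the frequently modified ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SKETCH 128		/* assumed cache line size */

/* Hypothetical miniature of the layout idea used by XLogCtlInsert. */
typedef struct InsertStateSketch
{
	/* hot: updated on every record insertion */
	uint64_t	curr_byte_pos;
	uint64_t	prev_byte_pos;

	/* keeps the fields below off the cache line dirtied above */
	char		pad[CACHE_LINE_SKETCH];

	/* cold: read often, written rarely */
	uint64_t	redo_rec_ptr;
	bool		full_page_writes;
} InsertStateSketch;

/* Distance in bytes between the last hot field and the first cold field. */
size_t
hot_cold_gap_sketch(void)
{
	return offsetof(InsertStateSketch, redo_rec_ptr) -
		offsetof(InsertStateSketch, prev_byte_pos);
}
```

Since the gap is at least one full cache line, a write to the hot fields cannot touch the line that holds the cold fields. The cost is one cache line of unused memory, which is negligible for a structure that exists once per cluster.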

Three: The XLOGShmemInit function

        This function initializes the shared memory used by XLOG.

void XLOGShmemInit(void)
{
	bool		foundCFile,
				foundXLog;
	char	   *allocptr;
	int			i;
	ControlFileData *localControlFile;    // control file contents read locally

#ifdef WAL_DEBUG   // only when compiled with WAL_DEBUG defined

	/*
	 * Create a memory context for WAL debugging. If an allocation fails the
	 * database may PANIC, but wal_debug is not meant for production use, so
	 * this is not a big problem.
	 */
	if (walDebugCxt == NULL)
	{
		walDebugCxt = AllocSetContextCreate(TopMemoryContext,
											"WAL Debug",
											ALLOCSET_DEFAULT_SIZES);
		MemoryContextAllowInCriticalSection(walDebugCxt, true);
	}
#endif

	// Create, or attach to, the shared XLogCtlData structure (XLogCtl)
	XLogCtl = (XLogCtlData *)
		ShmemInitStruct("XLOG Ctl", XLOGShmemSize(), &foundXLog);

	// Remember any locally read control file in localControlFile, then create, or attach to, the shared ControlFileData structure (ControlFile)
	localControlFile = ControlFile;
	ControlFile = (ControlFileData *)
		ShmemInitStruct("Control File", sizeof(ControlFileData), &foundCFile);

	// If either structure already existed in shared memory
	if (foundCFile || foundXLog)
	{
		/* both should be present or neither */
		Assert(foundCFile && foundXLog);

		/* Initialize local copy of WALInsertLocks */
		WALInsertLocks = XLogCtl->Insert.WALInsertLocks;

		/* If a local copy of the control file exists, free it */
		if (localControlFile)
			pfree(localControlFile);
		return;
	}

	// Zero the XLogCtlData structure (its memory was already allocated above)
	memset(XLogCtl, 0, sizeof(XLogCtlData));

	/*
	 * Already have read control file locally, unless in bootstrap mode. Move
	 * contents into shared memory.
	 */
	if (localControlFile)
	{
		memcpy(ControlFile, localControlFile, sizeof(ControlFileData));
		pfree(localControlFile);
	}

	/*
	 * Since XLogCtlData contains XLogRecPtr fields, its sizeof should be a
	 * multiple of the alignment for same, so no extra alignment padding is
	 * needed here.
	 */
	allocptr = ((char *) XLogCtl) + sizeof(XLogCtlData);
	XLogCtl->xlblocks = (XLogRecPtr *) allocptr;
	memset(XLogCtl->xlblocks, 0, sizeof(XLogRecPtr) * XLOGbuffers);
	allocptr += sizeof(XLogRecPtr) * XLOGbuffers;


	/* WAL insertion locks. Ensure they're aligned to the full padded size */
	allocptr += sizeof(WALInsertLockPadded) -
		((uintptr_t) allocptr) % sizeof(WALInsertLockPadded);
	WALInsertLocks = XLogCtl->Insert.WALInsertLocks =
		(WALInsertLockPadded *) allocptr;
	allocptr += sizeof(WALInsertLockPadded) * NUM_XLOGINSERT_LOCKS;

	for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
	{
		LWLockInitialize(&WALInsertLocks[i].l.lock, LWTRANCHE_WAL_INSERT);
		WALInsertLocks[i].l.insertingAt = InvalidXLogRecPtr;
		WALInsertLocks[i].l.lastImportantAt = InvalidXLogRecPtr;
	}

	/*
	 * Align the start of the page buffers to a full xlog block size boundary.
	 * This simplifies some calculations in XLOG insertion. It is also
	 * required for O_DIRECT.
	 */
	allocptr = (char *) TYPEALIGN(XLOG_BLCKSZ, allocptr);
	XLogCtl->pages = allocptr;
	memset(XLogCtl->pages, 0, (Size) XLOG_BLCKSZ * XLOGbuffers);

	/*
	 * Do basic initialization of XLogCtl shared data. (StartupXLOG will fill
	 * in additional info.)
	 */
	XLogCtl->XLogCacheBlck = XLOGbuffers - 1;
	XLogCtl->SharedRecoveryState = RECOVERY_STATE_CRASH;
	XLogCtl->SharedHotStandbyActive = false;
	XLogCtl->SharedPromoteIsTriggered = false;
	XLogCtl->WalWriterSleeping = false;

	// The locks embedded in XLogCtl are spinlocks
	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
	SpinLockInit(&XLogCtl->info_lck);
	SpinLockInit(&XLogCtl->ulsn_lck);
	InitSharedLatch(&XLogCtl->recoveryWakeupLatch);
	ConditionVariableInit(&XLogCtl->recoveryNotPausedCV);
}
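The pointer arithmetic in this function can be tried in isolation. The sketch below carves one allocated region into an LSN array, a lock array, and page buffers using the same two alignment steps. `ALIGN_UP_SKETCH` is a stand-in for `TYPEALIGN` written for power-of-two alignments, the region comes from `calloc` rather than shared memory, and the sizes (`LOCK_SIZE_SKETCH`, `NUM_LOCKS_SKETCH`, and so on) are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE_SKETCH	8192u	/* assumed WAL page size */
#define LOCK_SIZE_SKETCH	128u	/* assumed padded lock size */
#define NUM_LOCKS_SKETCH	8u		/* assumed number of insertion locks */

/* Round ptr up to a multiple of a power-of-two alignment. */
#define ALIGN_UP_SKETCH(a, ptr) \
	(((uintptr_t) (ptr) + ((a) - 1)) & ~((uintptr_t) ((a) - 1)))

typedef struct LayoutSketch
{
	char	   *base;		/* start of the region; pass to free() */
	size_t		size;		/* total bytes allocated */
	uint64_t   *xlblocks;	/* one LSN per page buffer */
	char	   *locks;		/* NUM_LOCKS_SKETCH padded locks */
	char	   *pages;		/* nbuffers page buffers */
} LayoutSketch;

/* Returns 0 on success, -1 if the allocation fails. */
int
carve_region_sketch(size_t header_size, size_t nbuffers, LayoutSketch *out)
{
	char	   *p;

	/* the header must keep the LSN array that follows it aligned */
	assert(header_size % sizeof(uint64_t) == 0);

	/* same terms as the size calculation: note the +1 lock and +1 page */
	out->size = header_size
		+ LOCK_SIZE_SKETCH * (NUM_LOCKS_SKETCH + 1)
		+ sizeof(uint64_t) * nbuffers
		+ PAGE_SIZE_SKETCH
		+ (size_t) PAGE_SIZE_SKETCH * nbuffers;
	out->base = calloc(1, out->size);
	if (out->base == NULL)
		return -1;

	p = out->base + header_size;
	out->xlblocks = (uint64_t *) p;
	p += sizeof(uint64_t) * nbuffers;

	/* advances by 1..LOCK_SIZE_SKETCH bytes, hence the spare lock slot */
	p += LOCK_SIZE_SKETCH - ((uintptr_t) p) % LOCK_SIZE_SKETCH;
	out->locks = p;
	p += LOCK_SIZE_SKETCH * NUM_LOCKS_SKETCH;

	/* page buffers start on a page-size boundary */
	p = (char *) ALIGN_UP_SKETCH(PAGE_SIZE_SKETCH, p);
	out->pages = p;
	return 0;
}
```

The lock alignment step always moves the pointer forward, even when it is already aligned, so it can consume up to one full lock slot. That is why the size calculation reserves NUM_XLOGINSERT_LOCKS + 1 slots, and the extra XLOG_BLCKSZ in the size calculation plays the same role for the page alignment step.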

References

"PostgreSQL Technology Insider: In-Depth Exploration of Transaction Processing", Chapter 4

https://www.jianshu.com/p/69323c1c9994
