[High Concurrency, High Performance, and High Availability with Massive Data: MySQL in Practice, Part 7] Flushing In-Memory Data to Disk

2022-07-04 05:17:00 【Don't be infatuated with Fage】

WAL (write-ahead logging) requires that before a data change is written to disk, the corresponding log in memory must first be written to disk. When a transaction commits, all the log records it generated must be flushed to disk. If the log was flushed successfully but the database goes down before the data in the buffer pool has been flushed to disk, the database can recover the data from the log on restart. This approach makes data writes more efficient, but it also produces dirty pages in memory. Dirty pages must eventually be written to disk, so InnoDB uses the checkpoint mechanism to make the data durable in the end.
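The WAL rule described above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not how InnoDB is implemented: the class `TinyWalStore` and all of its method names are hypothetical, pages hold a single value, and log records simply carry the new value of a page.

```python
class TinyWalStore:
    """Toy model of write-ahead logging: the log reaches disk before the data pages."""

    def __init__(self):
        self.disk_pages = {}   # durable data pages (survive a crash)
        self.disk_log = []     # durable redo log (survives a crash)
        self.buffer_pool = {}  # in-memory pages (lost on a crash)
        self.log_buffer = []   # in-memory log records (lost on a crash)
        self.dirty = set()     # pages changed in memory but not yet written to disk

    def write(self, page_id, value):
        # Record the change in the log buffer, then modify the page in memory only.
        self.log_buffer.append((page_id, value))
        self.buffer_pool[page_id] = value
        self.dirty.add(page_id)

    def _flush_log(self):
        self.disk_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def commit(self):
        # On commit, every log record generated so far is flushed to disk.
        self._flush_log()

    def checkpoint(self):
        # Log first, then the dirty pages; afterwards the covered log is not needed.
        self._flush_log()
        for page_id in self.dirty:
            self.disk_pages[page_id] = self.buffer_pool[page_id]
        self.dirty.clear()
        self.disk_log.clear()

    def crash_and_recover(self):
        # A crash wipes memory; recovery replays the durable log.
        self.buffer_pool.clear()
        self.log_buffer.clear()
        self.dirty.clear()
        for page_id, value in self.disk_log:
            self.buffer_pool[page_id] = value
            self.dirty.add(page_id)

    def read(self, page_id):
        if page_id in self.buffer_pool:
            return self.buffer_pool[page_id]
        return self.disk_pages.get(page_id)
```

In this sketch a committed write survives `crash_and_recover()` even though its page never reached `disk_pages`, while a write whose log records were still in `log_buffer` is lost.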

1. Introduction to dirty page flushing

Consider this scenario: if the redo log could grow without limit and the buffer pool were large enough, there would be no need to flush the new versions of pages in the buffer pool back to disk. When a crash occurs, all the data in the database system could be recovered to the state at the moment of the crash by replaying the redo log.

But this requires two preconditions:

1. The buffer pool can cache all the data in the database.

2. The redo log can grow without limit.

Neither condition holds in practice, so the checkpoint technique was introduced to solve the problem of flushing dirty pages to disk.

The LSN (log sequence number) records the log sequence number. It is a monotonically increasing integer of type unsigned long long. LSNs appear everywhere in InnoDB's log system: an LSN marks the log position at which a dirty page was modified, and it also records the checkpoint. An LSN pinpoints a specific position in the redo log file. To manage dirty pages, every buffer pool instance maintains a flush list, and the pages on the flush list are sorted by the LSN at which they were modified. As a result, when a redo checkpoint is performed periodically, the oldest page on the flush list (the one with the smallest LSN) can be found quickly.
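The relationship between the flush list and the checkpoint LSN can be sketched as follows. This is a simplified model, not InnoDB's data structure: the class `FlushList` and its methods are hypothetical names, and the sketch assumes each dirty page is tracked by the LSN of the first modification that made it dirty.

```python
from dataclasses import dataclass, field


@dataclass
class FlushList:
    # page_id -> LSN of the first modification that made the page dirty
    oldest_modification: dict = field(default_factory=dict)

    def mark_dirty(self, page_id, lsn):
        # Only the first change counts; later changes do not move the page.
        self.oldest_modification.setdefault(page_id, lsn)

    def flush_oldest(self):
        # Write out the page with the smallest LSN and drop it from the list.
        page_id = min(self.oldest_modification, key=self.oldest_modification.get)
        del self.oldest_modification[page_id]
        return page_id

    def checkpoint_lsn(self, current_lsn):
        # Redo log before the oldest dirty page's LSN is no longer needed for recovery.
        if not self.oldest_modification:
            return current_lsn
        return min(self.oldest_modification.values())
```

With pages dirtied at LSN 100 and 150, the checkpoint can advance only to 100 until the oldest page is flushed, which is why finding that page quickly matters.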

Because of the WAL strategy, the redo log must be persisted every time a transaction commits; only then is the transaction guaranteed not to be lost. Delaying the flushing of dirty pages merges multiple modifications into one write, which avoids the performance problems caused by writing data files frequently.

LSN values can be observed with the SHOW ENGINE INNODB STATUS command. In the LOG section of the output, "Log sequence number" is the current LSN and "Last checkpoint at" is the checkpoint LSN:

mysql> show engine innodb status \G

When a checkpoint happens, under what conditions, and which dirty pages it selects are all quite complex. What a checkpoint does, however, is simply flush dirty pages in the buffer pool back to disk. The variations lie in how many pages are flushed to disk each time, where the dirty pages are taken from, and when the checkpoint is triggered.

2. Flushing via the checkpoint mechanism

The InnoDB storage engine has two kinds of checkpoint: the sharp checkpoint and the fuzzy checkpoint.

1. Sharp checkpoint

1. Overview

Data is flushed synchronously, which blocks write operations and reduces system throughput. All pages within the checkpoint's range are flushed together, so the checkpoint completes only after all of those writes have finished. A sharp checkpoint occurs in the following situations:

1. When the database is shut down, all dirty pages in the buffer pool are flushed to disk.

2. When the log files are full and the log of subsequent operations cannot be written, some log file space must be freed.

3. When dirty pages exceed 90% of the buffer pool.

2. Summary

Sharp checkpoint: forced flushing. It blocks write operations and must write all dirty pages covered by the checkpoint to disk. It is triggered:

① When the database is shut down

② When the redo log is full, that is, when redo log usage exceeds 90%

③ When dirty pages in the buffer pool exceed 90%

2. Fuzzy checkpoint

Data is flushed asynchronously, and the pages within the checkpoint's range may be written at different points in time. This does not affect system throughput; its purpose is precisely to avoid the performance problems caused by sharp checkpoints.

1. Timed buffer pool checkpoint

When the percentage of dirty pages reaches the low water mark defined by the innodb_max_dirty_pages_pct_lwm variable, buffer pool flushing starts. The default low water mark is 10% of the buffer pool pages. The goal is to keep the share of dirty pages from reaching the threshold defined by the innodb_max_dirty_pages_pct variable (default 90). If the percentage of dirty pages in the buffer pool does reach the innodb_max_dirty_pages_pct threshold, a sharp checkpoint is performed to flush buffer pool pages. In MySQL 8.0, buffer pool flushing is performed by page cleaner threads. The number of page cleaner threads is controlled by the innodb_page_cleaners variable, whose default value is 4. However, if the number of page cleaner threads exceeds the number of buffer pool instances, it is automatically set to the number of buffer pool instances.
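The two thresholds above can be read as a simple decision rule. The sketch below is only an illustration of that rule as the article describes it: the function `flush_action` and its return labels are hypothetical, and the defaults mirror innodb_max_dirty_pages_pct_lwm = 10 and innodb_max_dirty_pages_pct = 90.

```python
def flush_action(dirty_pages, total_pages, lwm_pct=10.0, max_pct=90.0):
    """Classify the flushing state from the share of dirty pages in the buffer pool."""
    dirty_pct = 100.0 * dirty_pages / total_pages
    if dirty_pct >= max_pct:
        # Threshold reached: the article describes a sharp checkpoint here.
        return "sharp_checkpoint"
    if dirty_pct >= lwm_pct:
        # Low water mark reached: background flushing starts.
        return "background_flush"
    return "idle"
```

In a running server, the inputs could be taken from the Innodb_buffer_pool_pages_dirty and Innodb_buffer_pool_pages_total status variables.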

The page cleaner thread runs once per second. It scans a fixed distance from the tail of each buffer pool instance's LRU list. That distance is controlled by the innodb_lru_scan_depth parameter, whose default is 1024, so the range of each scan is:

number of buffer pool instances (8) * scan depth (1024)

that is:

innodb_buffer_pool_instances * innodb_lru_scan_depth
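As a worked example of the formula above, the following sketch computes the scan range for the values quoted in the text. The function name is hypothetical, and the 16 KiB page size is an assumption based on InnoDB's default innodb_page_size.

```python
def pages_scanned_per_pass(buffer_pool_instances=8, lru_scan_depth=1024):
    """Upper bound on the pages examined in one pass over all buffer pool instances."""
    return buffer_pool_instances * lru_scan_depth


PAGE_SIZE_BYTES = 16 * 1024  # assumed default InnoDB page size

pages = pages_scanned_per_pass()               # 8 * 1024 = 8192 pages
scanned_mib = pages * PAGE_SIZE_BYTES // 2**20  # 128 MiB of pages per pass
```

With 8 instances and a depth of 1024, each pass covers at most 8192 pages, or 128 MiB at the assumed page size.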

2. Adaptive flushing checkpoint

The adaptive flushing algorithm adjusts the flush rate dynamically, based on how fast redo log is being generated and on the current flush rate. The goal is to smooth overall performance by keeping flushing activity in step with the current workload. Adjusting the flush rate automatically helps avoid the sudden drops in throughput that occur when bursts of I/O from buffer pool flushing eat into the I/O capacity available for ordinary read and write activity. The algorithm does this by tracking the number of dirty pages in the buffer pool and the rate at which redo log records are generated. From this information it decides how many dirty pages to flush from the buffer pool each second, which lets it adapt to sudden changes in workload. This keeps redo log utilization from reaching 75% (once it reaches 75%, asynchronous flushing starts; that threshold is hard-coded and has no controlling parameter).

The innodb_adaptive_flushing_lwm variable defines the low water mark for redo log capacity. When that threshold is exceeded, adaptive flushing is enabled.

3. Async/Sync Flush checkpoint

The Async/Sync Flush checkpoint runs in a dedicated page cleaner thread.

An Async/Sync Flush checkpoint occurs when redo log space is not available. It flushes some of the dirty pages in the buffer pool to disk; once the dirty pages have been written to disk, the redo log space corresponding to those transactions can be released.

The size of the redo log files can be configured with innodb_log_file_size.

Whether an Async Flush checkpoint or a Sync Flush checkpoint is performed is decided by checkpoint_age together with async_water_mark and sync_water_mark.

## checkpoint_age is the latest LSN minus the LSN that has already been flushed to disk

checkpoint_age = redo_lsn - checkpoint_lsn

async_water_mark = 75% * innodb_log_file_size

sync_water_mark = 90% * innodb_log_file_size

1. When checkpoint_age < async_water_mark, no Flush checkpoint is needed. In other words, while more than 25% of the redo log space remains, neither an Async nor a Sync Flush checkpoint is performed.

2. When async_water_mark < checkpoint_age < sync_water_mark, an Async Flush checkpoint is performed. In other words, when less than 25% but more than 10% of the redo log space remains, an Async Flush checkpoint runs, and flushing continues until condition 1 holds again.

3. When checkpoint_age > sync_water_mark, a Sync Flush checkpoint is performed. In other words, when less than 10% of the redo log space remains, a Sync Flush checkpoint runs, and flushing continues until condition 1 holds again. Since MySQL 5.6, neither the Async Flush checkpoint nor the Sync Flush checkpoint blocks user queries.
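The three cases above can be sketched as one function. The name `flush_checkpoint_kind` and the `log_capacity` parameter (standing in for the redo log size used in the formulas) are hypothetical. The text leaves the exact boundary values unspecified, so treating them as belonging to the stricter case is an assumption of this sketch.

```python
def flush_checkpoint_kind(redo_lsn, checkpoint_lsn, log_capacity):
    """Return "none", "async" or "sync" following the water marks described above."""
    checkpoint_age = redo_lsn - checkpoint_lsn
    async_water_mark = log_capacity * 75 // 100
    sync_water_mark = log_capacity * 90 // 100

    if checkpoint_age < async_water_mark:
        return "none"   # more than 25% of the redo log is still free
    if checkpoint_age < sync_water_mark:
        return "async"  # between 10% and 25% of the redo log is free
    return "sync"       # less than 10% of the redo log is free
```

For a log capacity of 1000 units, a checkpoint age of 700 needs no flush checkpoint, 800 triggers the asynchronous one, and 950 triggers the synchronous one.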

Because disk is a relatively slow storage device, the interaction between memory and disk is a relatively slow process.

Because innodb_log_file_size is usually set to a fairly large value, under normal circumstances the first two kinds of checkpoint flush dirty pages to disk. Once they have done so, the redo log space corresponding to those dirty pages is released, so an Async/Sync Flush checkpoint does not usually occur. Also note that, to avoid frequent Async/Sync Flush checkpoints, innodb_log_file_size should be configured relatively large.

4. Dirty Page too much

"Dirty Page too much" means that the buffer pool holds too many dirty pages, so a checkpoint is performed to flush dirty pages to disk and guarantee that enough pages are available in the buffer pool. The threshold is configured with innodb_max_dirty_pages_pct, whose default value is 90. It used to be 75; the increased default allows a larger percentage of dirty pages in the buffer pool. InnoDB tries to flush data from the buffer pool so that the percentage of dirty pages does not exceed this value.

5. Summary

Fuzzy checkpoint: fuzzy flushing. Write operations are not blocked, and the dirty pages covered by the checkpoint do not have to be written to disk at the same time.

① When dirty pages exceed 10% of the buffer pool, the timed flushing mechanism starts and checks once per second. Each check covers the last 1024 pages of the LRU list in each buffer pool instance.

② Adaptive checkpoint: monitors the redo log and, based on how fast it is generated and how much of it is in use, dynamically adjusts the number of data pages flushed to disk, keeping redo log usage below 75%.

③ When redo log usage exceeds 75%, a fuzzy checkpoint is triggered.

After dirty pages have been flushed to disk, the corresponding redo log space can be released.

3. Double Write: flushing dirty pages with double write

The double write process:

① First write a copy of the dirty pages to the doublewrite buffer.

② Then write the dirty pages to the corresponding data files.

This keeps data safe while it is being written.

If the Insert Buffer brings performance improvements to the InnoDB storage engine, Double Write brings it data page reliability.

The doublewrite buffer is a storage area to which InnoDB writes pages flushed from the buffer pool before writing them to their proper positions in the data files. If the operating system crashes while a page is being written to disk, then during recovery the InnoDB storage engine can find a good copy of the page in the doublewrite area of the shared tablespace, copy it to the tablespace file, and then apply the redo log.
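The two-step write and the recovery path can be sketched as a small simulation. This is a toy model under stated assumptions, not InnoDB's on-disk format: `PAGE_SIZE`, `make_page`, `is_intact`, and the `Storage` class are hypothetical, pages are 16 bytes plus a CRC32 checksum, and a crash is modelled as writing only the first half of a page.

```python
import zlib

PAGE_SIZE = 16  # toy page size in bytes; real InnoDB pages are far larger


def make_page(payload: bytes) -> bytes:
    """Pad the payload to PAGE_SIZE and append a 4-byte CRC32 checksum."""
    body = payload.ljust(PAGE_SIZE, b"\0")[:PAGE_SIZE]
    return body + zlib.crc32(body).to_bytes(4, "big")


def is_intact(page: bytes) -> bool:
    """A page is intact when its stored checksum matches its body."""
    if len(page) != PAGE_SIZE + 4:
        return False
    return zlib.crc32(page[:PAGE_SIZE]) == int.from_bytes(page[PAGE_SIZE:], "big")


class Storage:
    def __init__(self):
        self.doublewrite = {}  # page_id -> copy written in step 1
        self.data_file = {}    # page_id -> page at its final position

    def flush_page(self, page_id, page, crash_during_data_write=False):
        # Step 1: write a full copy to the doublewrite buffer.
        self.doublewrite[page_id] = page
        # Step 2: write the page to the data file.
        if crash_during_data_write:
            # Simulated torn write: only the first half of the page reaches disk.
            old = self.data_file.get(page_id, bytes(PAGE_SIZE + 4))
            half = len(page) // 2
            self.data_file[page_id] = page[:half] + old[half:]
            return
        self.data_file[page_id] = page

    def recover(self):
        # Restore every torn data page from its intact doublewrite copy.
        repaired = []
        for page_id, copy in self.doublewrite.items():
            if not is_intact(self.data_file.get(page_id, b"")) and is_intact(copy):
                self.data_file[page_id] = copy
                repaired.append(page_id)
        return repaired
```

The point of the design is that the redo log alone cannot repair a half-written page, because it describes changes to a page that must itself be readable; the doublewrite copy supplies that readable page before the redo log is applied.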


Copyright notice
This article was written by [Don't be infatuated with Fage]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202141648161275.html