[MySQL in practice with massive data: high concurrency, high performance, and high availability, part 7] Flushing in-memory data to disk
2022-07-04 05:17:00 【Don't be infatuated with Fage】
Write-ahead logging (WAL) requires that before a data change is written to disk, the corresponding log records in memory are written to disk first. When a transaction commits, all the log records it generated must be flushed to disk. If the log was flushed successfully but the database crashes before the pages in the buffer pool are flushed, the database can recover the data from the log on restart. This approach makes writes more efficient, but it also leaves dirty pages in memory. Dirty pages must eventually be written to disk, so InnoDB uses a checkpoint mechanism to make the data durable.
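The WAL rule described above can be illustrated with a toy model. This is a minimal sketch, not InnoDB's actual implementation: the class `ToyEngine` and all of its fields and methods are hypothetical names invented for the example, and plain Python containers stand in for disk and memory.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEngine:
    """Toy write-ahead-logging model; not InnoDB's real data structures."""
    disk_pages: dict = field(default_factory=dict)   # durable data files
    disk_log: list = field(default_factory=list)     # durable redo log
    buffer_pool: dict = field(default_factory=dict)  # volatile pages, may be dirty
    log_buffer: list = field(default_factory=list)   # volatile redo records

    def write(self, key, value):
        # Change the page in memory and record the change in the log buffer.
        self.buffer_pool[key] = value
        self.log_buffer.append((key, value))

    def commit(self):
        # WAL rule: the redo records reach disk before commit returns.
        self.disk_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def crash(self):
        # Everything held in memory is lost; only disk state survives.
        self.buffer_pool.clear()
        self.log_buffer.clear()

    def recover(self):
        # Replay the durable redo log on top of the data files.
        for key, value in self.disk_log:
            self.disk_pages[key] = value
        self.buffer_pool = dict(self.disk_pages)


engine = ToyEngine()
engine.write("row1", "v1")
engine.commit()            # the log is durable, the dirty page is not flushed yet
engine.crash()
engine.recover()
print(engine.disk_pages)   # {'row1': 'v1'}
```

The committed change survives the crash even though its page was never flushed, which is exactly why dirty pages can be written back lazily.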
1. Introduction to dirty page flushing
Consider this scenario: if the redo log could grow without limit and the buffer pool were large enough, there would be no need to flush the new versions of pages in the buffer pool back to disk. After a crash, the entire database could be restored to its state at the moment of the crash by replaying the redo log.
But this depends on two prerequisites:
1. The buffer pool can cache all the data in the database.
2. The redo log can grow without limit.
Neither holds in practice, so the checkpoint technique was introduced to solve the problem of flushing dirty pages to disk.
The LSN (log sequence number) records the log sequence number. It is a monotonically increasing integer of type unsigned long long. The LSN appears everywhere in InnoDB's log system: it marks the log sequence number at which a dirty page was modified, and it also records the checkpoint. An LSN identifies a specific position in the redo log file. To manage dirty pages, every buffer pool instance maintains a flush list, and the pages on the flush list are ordered by the LSN at which they were modified. When a redo checkpoint is taken periodically, the oldest page on the flush list (the one with the smallest LSN) can therefore be found quickly.
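The idea of a flush list ordered by LSN can be sketched as follows. This is only an illustration under stated assumptions: `FlushList` and its methods are hypothetical names, and a binary heap is used here for brevity, which is not necessarily how InnoDB stores the list.

```python
import heapq


class FlushList:
    """Toy flush list: dirty pages ordered by the LSN of their first change."""

    def __init__(self):
        self._heap = []    # entries are (oldest_modification_lsn, page_id)
        self._dirty = {}   # page_id -> oldest_modification_lsn

    def mark_dirty(self, page_id, lsn):
        # A page keeps the LSN of its first unflushed modification.
        if page_id not in self._dirty:
            self._dirty[page_id] = lsn
            heapq.heappush(self._heap, (lsn, page_id))

    def flush_oldest(self):
        # Flush the page with the smallest LSN and return its id.
        _lsn, page_id = heapq.heappop(self._heap)
        del self._dirty[page_id]
        return page_id

    def checkpoint_lsn(self, current_lsn):
        # Every change below this LSN is already on disk.
        return self._heap[0][0] if self._heap else current_lsn


fl = FlushList()
fl.mark_dirty("page_7", 100)
fl.mark_dirty("page_3", 140)
fl.mark_dirty("page_7", 180)                 # page_7 stays at LSN 100
print(fl.checkpoint_lsn(current_lsn=200))    # 100
fl.flush_oldest()
print(fl.checkpoint_lsn(current_lsn=200))    # 140
```

Because the oldest dirty page bounds the checkpoint, flushing it is what lets the checkpoint LSN advance and redo space be reclaimed.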
Because of the WAL strategy, the redo log must be persisted every time a transaction commits; only then is the transaction guaranteed not to be lost. Deferring the flush of dirty pages merges multiple modifications to the same page into one write, which avoids the performance problems caused by writing data files too often.
The LSN can be observed with the SHOW ENGINE INNODB STATUS command:
mysql> show engine innodb status \G
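The LOG section of that output lists several LSN counters. As a rough sketch, the snippet below parses them from a text sample and computes the distance between the current LSN and the last checkpoint. The numbers in `SAMPLE_LOG_SECTION` are made up for illustration, and the helper `parse_lsns` is a hypothetical name; the exact layout of the status output can vary between MySQL versions.

```python
import re

# Illustrative sample only: these LSN values are invented, not real output.
SAMPLE_LOG_SECTION = """\
---
LOG
---
Log sequence number          2735016708
Log flushed up to            2735016708
Pages flushed up to          2734998400
Last checkpoint at           2734998400
"""


def parse_lsns(status_text):
    """Extract LSN counters from the LOG section of the status text."""
    labels = {
        "Log sequence number": "current_lsn",
        "Log flushed up to": "flushed_lsn",
        "Pages flushed up to": "pages_flushed_lsn",
        "Last checkpoint at": "checkpoint_lsn",
    }
    result = {}
    for label, name in labels.items():
        pattern = rf"^{re.escape(label)}\s+(\d+)"
        match = re.search(pattern, status_text, re.MULTILINE)
        if match:
            result[name] = int(match.group(1))
    return result


lsns = parse_lsns(SAMPLE_LOG_SECTION)
print(lsns["current_lsn"] - lsns["checkpoint_lsn"])  # 18308
```

The difference printed at the end is the amount of redo, in bytes of LSN space, that has been generated since the last checkpoint.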
When a checkpoint happens, under which conditions, and which dirty pages it selects are all complicated questions. What a checkpoint does, however, is simple: it flushes dirty pages from the buffer pool back to disk. The variants differ only in how many pages they flush each time, where they take the dirty pages from, and what triggers them.
2. Flushing to disk with the checkpoint mechanism
Inside the InnoDB storage engine there are two kinds of checkpoint: Sharp Checkpoint and Fuzzy Checkpoint.
1. Sharp checkpoint
1. Overview
A sharp checkpoint flushes data synchronously. It blocks write operations and reduces system throughput. All pages covered by the checkpoint are flushed together, so the checkpoint completes only after all of those writes have finished. A sharp checkpoint occurs in the following cases:
1. When the database shuts down, all dirty pages in the buffer pool are flushed to disk.
2. When the log files are full, the log records of subsequent operations cannot be written, so some log file space has to be freed.
3. When dirty pages exceed 90% of the buffer pool.
2. Summary
Sharp checkpoint: a forced flush. It blocks write operations, and every dirty page covered by the checkpoint must be written to disk. It is triggered:
① When the database shuts down.
② When the redo log is full, or when redo log usage exceeds 90%.
③ When dirty pages in the buffer pool exceed 90%.
2. Fuzzy checkpoint
A fuzzy checkpoint flushes data asynchronously, so the pages covered by one checkpoint may be written at different points in time. It does not block writes and therefore has far less effect on system throughput. Its purpose is to avoid the performance problems caused by a sharp checkpoint.
1. Periodic buffer pool checkpoint
Buffer pool flushing starts when the percentage of dirty pages reaches the low watermark defined by the innodb_max_dirty_pages_pct_lwm variable. The default low watermark is 10% of the buffer pool pages. The goal is to keep the dirty page percentage from reaching the threshold defined by the innodb_max_dirty_pages_pct variable (default 90). If the percentage of dirty pages in the buffer pool does reach the innodb_max_dirty_pages_pct threshold, a sharp checkpoint is performed to flush buffer pool pages. In MySQL 8.0, buffer pool flushing is performed by page cleaner threads. The number of page cleaner threads is controlled by the innodb_page_cleaners variable, whose default value is 4. However, if the number of page cleaner threads exceeds the number of buffer pool instances, innodb_page_cleaners is automatically set to the same value as innodb_buffer_pool_instances.
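The two thresholds in the paragraph above can be read as a simple decision rule. The following is a minimal sketch assuming the default values quoted in the text; `flushing_mode` and its return labels are hypothetical names chosen for the example and do not correspond to any MySQL API.

```python
def flushing_mode(dirty_pages, total_pages, lwm_pct=10.0, max_pct=90.0):
    """Classify flushing pressure from the dirty page percentage.

    lwm_pct and max_pct mirror the defaults quoted in the text for
    innodb_max_dirty_pages_pct_lwm and innodb_max_dirty_pages_pct.
    """
    dirty_pct = 100.0 * dirty_pages / total_pages
    if dirty_pct >= max_pct:
        return "sharp"       # threshold reached: the text describes a sharp checkpoint
    if dirty_pct >= lwm_pct:
        return "background"  # low watermark reached: page cleaners start flushing
    return "idle"            # below the low watermark: nothing to do yet


print(flushing_mode(500, 10_000))    # idle (5% dirty)
print(flushing_mode(2_000, 10_000))  # background (20% dirty)
print(flushing_mode(9_500, 10_000))  # sharp (95% dirty)
```

The wide gap between the two defaults is what gives background flushing room to work before writes have to be blocked.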
The page cleaner thread runs once per second. It scans a fixed distance from the tail of each buffer pool instance's LRU list. That distance is controlled by the innodb_lru_scan_depth parameter, which defaults to 1024, so the number of pages scanned in each pass is:
number of buffer pool instances (8) * scan depth (1024)
that is:
innodb_buffer_pool_instances * innodb_lru_scan_depth
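As a quick worked example of the formula above, the sketch below multiplies the two values from the text and converts the result to a data volume. The function name is hypothetical, and the 16 KiB page size is an assumption (the default innodb_page_size) that the article does not state.

```python
def pages_scanned_per_pass(buffer_pool_instances=8, lru_scan_depth=1024):
    """Pages examined across all buffer pool instances in one pass."""
    return buffer_pool_instances * lru_scan_depth


PAGE_SIZE_BYTES = 16 * 1024  # assumed: default innodb_page_size of 16 KiB

pages = pages_scanned_per_pass()
print(pages)                                   # 8192
print(pages * PAGE_SIZE_BYTES // (1024 ** 2))  # 128 (MiB)
```

Under these assumptions, up to 8192 pages, or about 128 MiB of page data, are examined every second, which is why raising innodb_lru_scan_depth increases background work.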
2. Adaptive Flushing checkpoint
The adaptive flushing algorithm dynamically adjusts the flush rate based on how fast redo log is generated and on the current flush rate. The goal is to smooth overall performance by keeping flushing activity in step with the current workload. Adjusting the flush rate automatically helps avoid the sudden drops in throughput that occur when a burst of I/O from buffer pool flushing eats into the I/O capacity available for ordinary reads and writes. The adaptive flushing algorithm avoids this by tracking the number of dirty pages in the buffer pool and the rate at which redo log records are generated. From this information it decides how many dirty pages to flush from the buffer pool per second, which lets it adapt to sudden changes in workload. This keeps redo log usage from reaching 75% (once usage reaches 75%, asynchronous flushing starts; this value is hard coded and has no controlling parameter).
The innodb_adaptive_flushing_lwm variable defines a low watermark for redo log capacity. When that threshold is exceeded, adaptive flushing is enabled.
3. Async/Sync Flush checkpoint
Async/Sync Flush checkpoint runs in its own page cleaner thread.
Async/Sync Flush checkpoint occurs when the redo log is unavailable, that is, when no reusable redo space is left. Some dirty pages in the buffer pool are flushed to disk, and once those pages are written, the redo log space held for the corresponding transactions can be released.
The size of a redo log file can be configured with innodb_log_file_size.
Whether an Async Flush checkpoint or a Sync Flush checkpoint is performed is decided by checkpoint_age together with async_water_mark and sync_water_mark.
## checkpoint_age is the latest LSN minus the LSN that has already been flushed to disk
checkpoint_age = redo_lsn - checkpoint_lsn
async_water_mark = 75% * innodb_log_file_size
sync_water_mark = 90% * innodb_log_file_size
1. When checkpoint_age < async_water_mark, no Flush checkpoint is needed. In other words, while more than 25% of the redo log space remains, neither Async nor Sync Flush checkpoint runs.
2. When async_water_mark < checkpoint_age < sync_water_mark, Async Flush checkpoint runs. In other words, when less than 25% but more than 10% of the redo log space remains, dirty pages are flushed until condition 1 holds again.
3. When checkpoint_age > sync_water_mark, Sync Flush checkpoint runs. In other words, when less than 10% of the redo log space remains, dirty pages are flushed until condition 1 holds again. Since MySQL 5.6, neither Async Flush checkpoint nor Sync Flush checkpoint blocks user queries.
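The three cases above amount to a small decision function. This is a sketch that follows the article's formulas literally: `flush_checkpoint_kind` is a hypothetical name, `log_capacity` stands for the innodb_log_file_size value used in the formulas, and the handling of exact equality with a watermark is an assumption because the text does not specify it.

```python
def flush_checkpoint_kind(redo_lsn, checkpoint_lsn, log_capacity):
    """Return 'none', 'async' or 'sync' following the three rules above."""
    checkpoint_age = redo_lsn - checkpoint_lsn
    async_water_mark = 0.75 * log_capacity
    sync_water_mark = 0.90 * log_capacity
    if checkpoint_age < async_water_mark:
        return "none"    # more than 25% of the redo space is still free
    if checkpoint_age < sync_water_mark:
        return "async"   # between 10% and 25% of the redo space is free
    return "sync"        # less than 10% free (equality treated as sync: assumption)


capacity = 1000  # illustrative redo capacity in LSN units
print(flush_checkpoint_kind(1500, 1000, capacity))  # none  (age 500)
print(flush_checkpoint_kind(1800, 1000, capacity))  # async (age 800)
print(flush_checkpoint_kind(1950, 1000, capacity))  # sync  (age 950)
```

Flushing the oldest dirty pages raises checkpoint_lsn, which shrinks checkpoint_age and moves the system back toward the first case.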
Because disk is a relatively slow storage device, moving data between memory and disk is a relatively slow process, and it is worth avoiding forced flushes.
Because innodb_log_file_size is normally set to a relatively large value, the first two kinds of checkpoint usually flush dirty pages to disk in time. Once they do, the redo log space corresponding to those dirty pages is released, so Async/Sync Flush checkpoint does not usually occur. For the same reason, to avoid frequent Async/Sync Flush checkpoints, innodb_log_file_size should be configured fairly large.
4. Dirty Page too much
Dirty Page too much means that the buffer pool holds too many dirty pages. A checkpoint is performed to flush dirty pages to disk so that enough pages remain available in the buffer pool. The limit is configured with innodb_max_dirty_pages_pct. Its default value is 90; it was previously 75, and the higher default allows a larger percentage of dirty pages in the buffer pool. InnoDB tries to flush data from the buffer pool so that the percentage of dirty pages does not exceed this value.
5. Summary
Fuzzy checkpoint: a fuzzy flush. Write operations are not blocked, and the dirty pages covered by one checkpoint do not have to be written to disk at the same time.
① When dirty pages in the buffer pool exceed 10%, periodic flushing starts and checks once per second. Each check covers the last 1024 pages of the LRU list in each buffer pool instance.
② Adaptive checkpoint: checks the redo log and dynamically adjusts the number of pages flushed to disk according to the redo generation rate and redo log usage, keeping redo log usage from exceeding 75%.
③ When redo log usage exceeds 75%, a fuzzy checkpoint is triggered.
After dirty pages are flushed to disk, the corresponding redo log space can be released.
3. Doublewrite: flushing dirty pages with double writes
The doublewrite process:
① First write a copy of the dirty pages to the doublewrite buffer.
② Then write the dirty pages to their data files.
This keeps page writes safe.
If the Insert Buffer brings performance improvements to the InnoDB storage engine, then doublewrite brings reliability to its data pages.
The doublewrite buffer is a storage area to which InnoDB first writes pages flushed from the buffer pool, before writing them to their proper positions in the data files. If the operating system crashes in the middle of a page write, then during recovery InnoDB can find a good copy of the page in the doublewrite area of the shared tablespace, copy it to the tablespace file, and then apply the redo log.
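The two-step write and the recovery path described above can be modeled in a few lines. This is a toy sketch under loud assumptions: `ToyStorage`, its fields, and the `crash_mid_write` flag are hypothetical, dictionaries stand in for files, a CRC32 stands in for the page checksum, and a torn page is simulated by leaving the second half of the page as stale zero bytes.

```python
import zlib


def checksum(data: bytes) -> int:
    # CRC32 stands in for a page checksum in this sketch.
    return zlib.crc32(data)


class ToyStorage:
    """Toy model of doublewrite; dictionaries stand in for files on disk."""

    def __init__(self):
        self.doublewrite = {}  # page_id -> (data, checksum)
        self.data_file = {}    # page_id -> (data, checksum)

    def flush_page(self, page_id, data, crash_mid_write=False):
        # Step 1: write a full copy to the doublewrite area.
        self.doublewrite[page_id] = (data, checksum(data))
        # Step 2: write the page to its place in the data file.
        if crash_mid_write:
            half = len(data) // 2
            torn = data[:half] + b"\x00" * (len(data) - half)
            self.data_file[page_id] = (torn, checksum(data))  # torn page
            return
        self.data_file[page_id] = (data, checksum(data))

    def recover(self):
        # Replace every page whose checksum does not match with its good copy.
        repaired = []
        for page_id, (data, stored) in list(self.data_file.items()):
            if checksum(data) != stored:
                self.data_file[page_id] = self.doublewrite[page_id]
                repaired.append(page_id)
        return repaired


storage = ToyStorage()
storage.flush_page("p1", b"ABCDEFGH", crash_mid_write=True)
print(storage.recover())           # ['p1']
print(storage.data_file["p1"][0])  # b'ABCDEFGH'
```

Redo records describe changes to a page, so they cannot repair a page that is itself corrupt; the intact copy from the doublewrite area gives redo a valid page to apply to.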