[MySQL in practice for massive data: high concurrency, high performance, high availability, part 7] Flushing in-memory data to disk
2022-07-04 05:17:00 【Don't be infatuated with Fage】
WAL (write-ahead logging) requires that before a data change is written to disk, the log records held in memory are written to disk first. When a transaction commits, all of the log it generated must be flushed to disk. If the log was flushed successfully but the database crashes before the pages in the buffer pool reach disk, the database can recover the data from the log on restart. This approach makes writes more efficient, but it also leaves dirty pages in memory. Dirty pages must eventually be written to disk, so InnoDB uses the checkpoint mechanism to make the data durable.
1. Introduction to dirty page flushing
Consider this scenario: if the redo log could grow without bound and the buffer pool were large enough, there would be no need to flush the new versions of pages in the buffer pool back to disk, because after a crash the whole database could be restored to the moment of the crash by replaying the redo log.
But this requires two preconditions:
1. The buffer pool can cache all of the data in the database;
2. The redo log can grow without bound.
Neither holds in practice, so the checkpoint technique was introduced to solve the problem of flushing dirty pages to disk.
The LSN (log sequence number) records a position in the log. It is a monotonically increasing integer of type unsigned long long. LSNs appear throughout InnoDB's log system: an LSN marks the log position at which a dirty page was modified, and it is also used to record the checkpoint. An LSN maps to a specific position in the redo log files. To manage dirty pages, each buffer pool instance maintains a flush list, and the pages on the flush list are ordered by the LSN at which they were modified. When a redo checkpoint runs periodically, the oldest page on the flush list (the one with the smallest LSN) can be found quickly.
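The flush-list idea can be sketched in Python. This is a toy model under stated assumptions, not InnoDB's actual data structure: the class `FlushList` and its method names are hypothetical, and the model assumes a page that is already dirty keeps the LSN of its first modification. It only shows why keeping dirty pages in LSN order makes the checkpoint position cheap to find.

```python
from collections import OrderedDict


class FlushList:
    """Toy flush list: dirty pages ordered by the LSN of their first modification."""

    def __init__(self):
        # page_id -> LSN of the first unflushed modification.
        # Insertion order equals LSN order because LSNs only grow.
        self._pages = OrderedDict()

    def mark_dirty(self, page_id, lsn):
        # Assumption: a page that is already dirty keeps its older LSN.
        if page_id not in self._pages:
            self._pages[page_id] = lsn

    def flush_oldest(self):
        # Simulate writing the oldest dirty page to disk and drop it from the list.
        return self._pages.popitem(last=False)

    def checkpoint_lsn(self, current_lsn):
        # Redo older than the oldest dirty page is no longer needed for recovery.
        if not self._pages:
            return current_lsn
        return next(iter(self._pages.values()))


flush_list = FlushList()
flush_list.mark_dirty("page-7", 100)
flush_list.mark_dirty("page-3", 140)
flush_list.mark_dirty("page-7", 180)  # already dirty, keeps LSN 100
print(flush_list.checkpoint_lsn(current_lsn=200))  # 100
flush_list.flush_oldest()
print(flush_list.checkpoint_lsn(current_lsn=200))  # 140
```

In this sketch the checkpoint can only advance as far as the head of the list, which is why the oldest page is the one worth flushing first.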
Because of the WAL strategy, the redo log must be persisted every time a transaction commits; only then are committed transactions guaranteed not to be lost. Flushing dirty pages lazily has the effect of merging multiple modifications to the same page into one write, which avoids the performance cost of writing the data files too often.
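A minimal Python sketch of this trade-off follows. Everything in it is an assumption made for illustration (the class `TinyEngine`, its fields, and the use of plain dictionaries as "disk"); it is not how InnoDB is implemented. It shows the two properties the paragraph describes: committed changes survive a crash through log replay, and several changes to one page collapse into a single page write.

```python
class TinyEngine:
    """Toy write-ahead-logging model; not InnoDB's actual design."""

    def __init__(self):
        self.disk_pages = {}   # durable data file contents
        self.redo_log = []     # durable redo log: (lsn, key, value)
        self.buffer_pool = {}  # volatile in-memory pages
        self.dirty = set()     # keys changed in memory but not yet on disk
        self.lsn = 0

    def commit(self, key, value):
        # WAL rule: the redo record is durable before the commit returns.
        self.lsn += 1
        self.redo_log.append((self.lsn, key, value))
        self.buffer_pool[key] = value
        self.dirty.add(key)

    def flush_dirty_pages(self):
        # Several commits to the same key are merged into one page write.
        writes = len(self.dirty)
        for key in self.dirty:
            self.disk_pages[key] = self.buffer_pool[key]
        self.dirty.clear()
        return writes

    def crash_and_recover(self):
        # Memory is lost; replaying the redo log rebuilds the committed state.
        self.buffer_pool = {}
        self.dirty = set()
        for _lsn, key, value in self.redo_log:
            self.disk_pages[key] = value


engine = TinyEngine()
engine.commit("row-1", "v1")
engine.commit("row-1", "v2")
engine.commit("row-2", "v1")
print(engine.flush_dirty_pages())  # 2 page writes for 3 commits
```

A real engine would also truncate the log after a checkpoint; the sketch leaves that out to keep the recovery path visible.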
The LSN can be observed with the SHOW ENGINE INNODB STATUS command:
mysql> show engine innodb status \G
When a checkpoint happens, under what conditions, and which dirty pages it selects are all fairly complex. What a checkpoint does is always the same: it flushes dirty pages from the buffer pool back to disk. The variants differ in how many pages they flush at a time, where they take the dirty pages from, and when they are triggered.
2. Flushing through the checkpoint mechanism
Inside the InnoDB storage engine there are two kinds of checkpoint: the sharp checkpoint and the fuzzy checkpoint.
1.sharp checkpoint
1. Overview
A sharp checkpoint flushes data synchronously. It blocks write operations and reduces system throughput. All pages covered by the checkpoint are flushed together, so the checkpoint completes only after every one of those writes has finished. A sharp checkpoint happens in the following cases:
1. When the database shuts down, all dirty pages in the buffer pool are flushed to disk.
2. When the log files are full and the log for subsequent operations cannot be written, so some log file space has to be freed.
3. When dirty pages exceed 90% of the buffer pool.
2. Summary
sharp checkpoint: a forced flush. It blocks write operations, and every dirty page covered by the checkpoint must be written to disk. It is triggered:
① When the database shuts down.
② When the redo log is full (redo log usage exceeds 90%).
③ When dirty pages in the buffer pool exceed 90%.
2.fuzzy checkpoint
A fuzzy checkpoint flushes data asynchronously; the pages covered by one checkpoint may reach disk at different times. It does not reduce system throughput, and its purpose is to avoid the performance problems caused by sharp checkpoints.
1. Periodic buffer pool checkpoint
When the percentage of dirty pages reaches the low water mark defined by the innodb_max_dirty_pages_pct_lwm variable, buffer pool flushing starts. The default low water mark is 10% of buffer pool pages. The aim is to keep the dirty page percentage from reaching the threshold defined by innodb_max_dirty_pages_pct (default 90). If the percentage of dirty pages in the buffer pool does reach innodb_max_dirty_pages_pct, a sharp checkpoint is performed to flush buffer pool pages. In MySQL 8.0, buffer pool flushing is performed by page cleaner threads. Their number is controlled by the innodb_page_cleaners variable, which defaults to 4. However, if the number of page cleaner threads exceeds the number of buffer pool instances, it is automatically set to the number of buffer pool instances.
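The two thresholds can be read as a small decision function. The sketch below is only an illustration of the rule as this article states it: the function name `flushing_state` and the labels it returns are hypothetical, and the two defaults mirror innodb_max_dirty_pages_pct_lwm (10) and innodb_max_dirty_pages_pct (90).

```python
def flushing_state(dirty_pages, total_pages, lwm_pct=10.0, max_pct=90.0):
    """Classify buffer pool flushing by dirty page percentage (illustrative only)."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    dirty_pct = 100.0 * dirty_pages / total_pages
    if dirty_pct >= max_pct:
        # The article treats this case as a sharp checkpoint.
        return "forced"
    if dirty_pct >= lwm_pct:
        # Low water mark reached: background flushing starts.
        return "background"
    return "idle"


print(flushing_state(500, 10000))   # idle (5% dirty)
print(flushing_state(2500, 10000))  # background (25% dirty)
print(flushing_state(9500, 10000))  # forced (95% dirty)
```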
The cleaner thread runs once per second. It scans a fixed distance from the tail of each buffer pool instance's LRU list; that distance is controlled by the innodb_lru_scan_depth parameter, which defaults to 1024. The range covered by each scan is therefore:
number of buffer pool instances (8) * scan depth (1024)
that is:
innodb_buffer_pool_instances * innodb_lru_scan_depth
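As a quick worked example of that product, the Python sketch below computes the scan range for the values used above. The function name `pages_scanned_per_pass` is hypothetical; the 16 KiB figure is InnoDB's default page size (innodb_page_size) and is used only to convert the page count into bytes.

```python
def pages_scanned_per_pass(buffer_pool_instances=8, lru_scan_depth=1024):
    """Upper bound on LRU pages examined in one pass over all instances."""
    return buffer_pool_instances * lru_scan_depth


PAGE_SIZE_BYTES = 16 * 1024  # default innodb_page_size

pages = pages_scanned_per_pass()
print(pages)                                     # 8192
print(pages * PAGE_SIZE_BYTES // (1024 * 1024))  # 128 (MiB)
```

Because the work per second grows with both variables, raising innodb_lru_scan_depth on a server with many buffer pool instances multiplies the scanning cost.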
2.Adaptive Flushing Checkpoint
The adaptive flushing algorithm dynamically adjusts the flush rate based on the speed at which redo log is generated and the current rate of flushing. The goal is to smooth overall performance by keeping flushing activity in step with the current workload. Adjusting the flush rate automatically helps avoid the sudden drops in throughput that occur when a burst of flushing I/O eats into the I/O capacity available for ordinary reads and writes. The algorithm does this by tracking the number of dirty pages in the buffer pool and the rate at which redo log records are generated. From this information it decides how many dirty pages to flush from the buffer pool per second, which lets it adapt to sudden changes in workload and keeps redo log usage from reaching 75% (once usage reaches 75%, asynchronous flushing starts; this value is hard-coded and has no controlling parameter).
The innodb_adaptive_flushing_lwm variable defines a low water mark for redo log capacity. When that threshold is crossed, adaptive flushing is enabled.
3.Async/Sync Flush Checkpoint
The Async/Sync Flush checkpoint is executed in a separate page cleaner thread.
An Async/Sync Flush checkpoint occurs when the redo log has no usable space left. It flushes some of the dirty pages in the buffer pool to disk; once those dirty pages are on disk, the redo log that belongs to them can be released.
The size of each redo log file is configured with innodb_log_file_size; the total redo log capacity is that value multiplied by innodb_log_files_in_group.
Whether an Async Flush checkpoint or a Sync Flush checkpoint runs is decided by checkpoint_age together with async_water_mark and sync_water_mark.
## checkpoint_age is the newest LSN minus the LSN that has already been flushed to disk
checkpoint_age = redo_lsn - checkpoint_lsn
async_water_mark = 75% * innodb_log_file_size
sync_water_mark = 90% * innodb_log_file_size
1. When checkpoint_age < async_water_mark, no flush checkpoint is needed. In other words, while more than 25% of the redo log space remains free, neither an Async nor a Sync Flush checkpoint runs.
2. When async_water_mark < checkpoint_age < sync_water_mark, an Async Flush checkpoint runs. In other words, when less than 25% but more than 10% of the redo log space remains free, an Async Flush checkpoint flushes pages until condition 1 holds again.
3. When checkpoint_age > sync_water_mark, a Sync Flush checkpoint runs. In other words, when less than 10% of the redo log space remains free, a Sync Flush checkpoint flushes pages until condition 1 holds again. Since MySQL 5.6, neither the Async nor the Sync Flush checkpoint blocks user query threads.
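The three cases above can be sketched as one Python function. The function name `flush_mode` and the parameter `log_capacity` are assumptions made for this example; `log_capacity` stands in for the redo log size used in the water mark formulas. The article does not say what happens when checkpoint_age is exactly equal to a water mark, so the sketch assumes the more urgent mode applies.

```python
def flush_mode(redo_lsn, checkpoint_lsn, log_capacity):
    """Return 'none', 'async' or 'sync' following the article's water marks."""
    checkpoint_age = redo_lsn - checkpoint_lsn
    async_water_mark = 0.75 * log_capacity
    sync_water_mark = 0.90 * log_capacity

    if checkpoint_age < async_water_mark:
        return "none"   # more than 25% of the redo log is still free
    if checkpoint_age < sync_water_mark:
        return "async"  # between 10% and 25% free
    return "sync"       # less than 10% free


# Hypothetical numbers: a log capacity of 100 units.
print(flush_mode(redo_lsn=1000, checkpoint_lsn=950, log_capacity=100))  # none
print(flush_mode(redo_lsn=1000, checkpoint_lsn=920, log_capacity=100))  # async
print(flush_mode(redo_lsn=1000, checkpoint_lsn=905, log_capacity=100))  # sync
```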
Because disk is a relatively slow storage device, moving data between memory and disk is a relatively slow process.
Because innodb_log_file_size is normally set to a fairly large value, the first two kinds of checkpoint usually flush dirty pages to disk in time. Once they have done so, the redo log space belonging to those pages is released, and an Async/Sync Flush checkpoint does not usually occur. To keep Async/Sync Flush checkpoints from occurring frequently, innodb_log_file_size should be configured fairly large.
4.Dirty Page too much
Dirty Page too much means there are too many dirty pages in the buffer pool, so a checkpoint runs to flush dirty pages to disk and guarantee that enough pages are available in the buffer pool. The threshold is configured with innodb_max_dirty_pages_pct. Its default value is 90; it used to be 75, and the higher default allows a larger percentage of dirty pages in the buffer pool. InnoDB tries to flush data from the buffer pool so that the percentage of dirty pages does not exceed this value.
5. Summary
fuzzy checkpoint: a fuzzy flush. Write operations are not blocked, and the dirty pages covered by one checkpoint do not have to be written to disk at the same time.
① When dirty pages in the buffer pool exceed 10%, the periodic flushing mechanism starts and checks once per second. Each check covers the last 1024 pages of the LRU list in each buffer pool instance.
② Adaptive checkpoint: it watches the redo log and dynamically adjusts the number of pages flushed according to how quickly redo log is generated and how much of the redo log is in use, keeping redo log usage below 75%.
③ When redo log usage exceeds 75%, a fuzzy checkpoint is triggered.
After dirty pages have been flushed to disk, the corresponding redo log space can be released.
3. Double Write: flushing dirty pages with double write
The double write process:
① First, a copy of the dirty pages is written to the doublewrite buffer.
② Then the dirty pages are written to their positions in the data files.
This keeps data safe while it is being written.
If the Insert Buffer brings the InnoDB storage engine better performance, Double Write brings it reliability of data pages.
The doublewrite buffer is a storage area to which InnoDB writes pages flushed from the buffer pool before writing them to their proper positions in the data files. If the operating system crashes in the middle of writing a page to disk, then during recovery the InnoDB storage engine can find a good copy of the page in the doublewrite area of the shared tablespace, copy it into the tablespace file, and then apply the redo log.
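The recovery path can be illustrated with a small Python simulation. It is a sketch under stated assumptions rather than InnoDB's on-disk format: the class `Storage`, the `crash_midway` flag, and the use of CRC32 as the page checksum are all choices made for this example. A torn page is simulated by storing only half of the new bytes together with the checksum of the full page.

```python
import zlib


def checksum(data: bytes) -> int:
    # Stand-in for a page checksum; CRC32 is an assumption of this sketch.
    return zlib.crc32(data)


class Storage:
    """Toy model of a data file protected by a doublewrite area."""

    def __init__(self):
        self.doublewrite = {}  # page_id -> (data, checksum)
        self.data_file = {}    # page_id -> (data, checksum)

    def write_page(self, page_id, data, crash_midway=False):
        # Step 1: a full copy goes to the doublewrite area first.
        self.doublewrite[page_id] = (data, checksum(data))
        # Step 2: write the page to its real position in the data file.
        if crash_midway:
            # Torn page: only half of the bytes reach the data file.
            self.data_file[page_id] = (data[: len(data) // 2], checksum(data))
            return
        self.data_file[page_id] = (data, checksum(data))

    def recover(self):
        # Replace every page whose checksum does not match its contents.
        repaired = []
        for page_id, (data, stored) in list(self.data_file.items()):
            if checksum(data) != stored:
                self.data_file[page_id] = self.doublewrite[page_id]
                repaired.append(page_id)
        return repaired


storage = Storage()
storage.write_page(1, b"A" * 16)
storage.write_page(2, b"B" * 16, crash_midway=True)
print(storage.recover())  # [2]
```

The redo log alone cannot repair page 2 in this scenario, because redo records describe changes to a page that is assumed to be intact; the doublewrite copy restores that intact starting point before redo is applied.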