02 _ Log system: how does an SQL UPDATE statement execute?
2022-06-11 15:12:00 【cjh-Java】
Previously, we looked at how a query statement is executed and introduced the modules involved along the way. As you will recall, a query statement generally passes through the connector, analyzer, optimizer, executor, and other functional modules before finally reaching the storage engine.
So what does the execution flow of an update statement look like?
You may have heard DBA colleagues say that MySQL can be restored to its state at any second within the past half month. Impressive as that is, have you ever wondered how it is done?
Let's start with an update statement on a table. Below is the statement that creates the table, which has a primary key ID and an integer field c:
mysql> create table T(ID int primary key, c int);
To add 1 to the value of c in the row where ID=2, the SQL statement is written like this:
mysql> update T set c=c+1 where ID=2;
I have already introduced the basic execution path of an SQL statement. I'll bring that diagram back here so you can review it. To begin with, we can say for certain that an update statement goes through the same path as a query statement.

Before executing a statement you need to connect to the database; this is the connector's job.
As we said earlier, whenever a table is updated, the query cache entries associated with that table are invalidated, so this statement clears all cached results for table T. That is why we generally do not recommend using the query cache.
Next, the analyzer determines through lexical and syntactic analysis that this is an update statement. The optimizer decides to use the index on ID. Then the executor carries out the actual work: it finds the row and updates it.
Unlike the query flow, the update flow also involves two important log modules, the main characters of today's discussion: redo log and binlog (the archive log). Anyone who works with MySQL cannot avoid these two terms, and I will keep coming back to them later. That said, there is a lot that is interesting in the design of redo log and binlog, and the ideas behind them can be applied in your own programs.
Important log module: redo log
You may remember the story "Kong Yiji", in which the tavern keeper has a chalkboard used specifically for recording what customers owe on credit. If only a few people buy on credit, he can write their names and amounts on the board. But if many do, the board will eventually run out of room, so the keeper must also have a dedicated ledger for credit accounts.
When someone wants to buy on credit or pay off a debt, the keeper generally has two options:
- One is to open the ledger right away and add or deduct the amount;
- The other is to jot the transaction down on the chalkboard first and reconcile it with the ledger after closing time.
When business at the counter is brisk, the keeper will certainly choose the latter, because the former is too cumbersome. First, he has to find that person's running total. Think about it: with dozens of pages to go through, he may need his reading glasses just to find the name, then take out the abacus to do the calculation, and finally write the result back into the ledger.
The whole process is tiresome just to think about. By comparison, jotting it on the chalkboard first is much easier. Without the chalkboard, the keeper would have to leaf through the ledger for every single transaction. Wouldn't that be unbearably slow?
MySQL has the same problem. If every update had to be written to disk, and the disk had to locate the corresponding record before updating it, the I/O cost and lookup cost of the whole process would be high. To solve this, MySQL's designers use an idea similar to the tavern keeper's chalkboard to make updates more efficient.
This cooperation between chalkboard and ledger is what MySQL calls WAL. WAL stands for Write-Ahead Logging, and its key point is to write the log first and write the data to disk later; that is, write on the chalkboard first and update the ledger when things are quiet.
Specifically, when a record needs to be updated, the InnoDB engine first writes the change to redo log (the chalkboard) and updates memory; at that point the update is considered complete. InnoDB then applies the change to disk at an appropriate time, often when the system is relatively idle, just as the keeper does after closing.
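To make the write-ahead idea concrete, here is a minimal Python sketch. It is a toy model and not how InnoDB is implemented: the class `TinyWAL` and its fields are names invented for illustration, a list stands in for redo log, and two dictionaries stand in for memory and the data files.

```python
class TinyWAL:
    """Toy write-ahead log: log first, update memory, flush to 'disk' later."""

    def __init__(self, disk):
        self.disk = dict(disk)   # stands in for the data files (the ledger)
        self.memory = {}         # cached rows (a stand-in for pages in memory)
        self.redo_log = []       # stands in for redo log (the chalkboard)

    def read(self, key):
        # Load the row into memory on first access.
        if key not in self.memory:
            self.memory[key] = self.disk[key]
        return self.memory[key]

    def update(self, key, value):
        self.redo_log.append((key, value))  # 1. write the log record first
        self.memory[key] = value            # 2. update memory; the update now counts as done

    def flush(self):
        # Later, when idle: apply the logged changes to the data files, then clear the log.
        for key, value in self.redo_log:
            self.disk[key] = value
        self.redo_log.clear()


db = TinyWAL({2: 0})
db.update(2, db.read(2) + 1)
before_flush = db.disk[2]   # the data file still holds the old value
db.flush()
after_flush = db.disk[2]    # the data file now holds the new value
```

The point of the sketch is only the ordering: the update returns after the log append and the memory change, and the slower write to the data files happens separately.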
If there is not much credit business today, the keeper can wait until closing to reconcile. But what if there is a lot one day and the chalkboard fills up? Then the keeper has to put down what he is doing, transfer some of the chalkboard records into the ledger, and erase them from the board to make room for new entries.
Similarly, InnoDB's redo log has a fixed size. For example, it can be configured as a group of 4 files of 1 GB each, so this "chalkboard" can record 4 GB of operations in total. Writing starts at the beginning, and on reaching the end it wraps back to the beginning, as shown in the figure below.

write pos is the position of the current record; it moves forward as records are written, and after reaching the end of file 3 it returns to the start of file 0. checkpoint is the current position to erase from; it also moves forward and wraps around. Before a record is erased, it must be applied to the data files.
The space between write pos and checkpoint is the empty part of the "chalkboard" and can be used to record new operations. If write pos catches up with checkpoint, the "chalkboard" is full and no new updates can be executed; InnoDB has to stop, erase some records, and push checkpoint forward.
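The interplay of write pos and checkpoint can be sketched with a small circular buffer. This is a simplified illustration, assuming one slot per log record instead of byte offsets across files; `RedoRing`, `advance_checkpoint`, and `data_file` are hypothetical names, not InnoDB internals.

```python
class RedoRing:
    """Toy fixed-size circular log with a write position and a checkpoint."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.write_pos = 0     # total records written so far (slot = write_pos % capacity)
        self.checkpoint = 0    # total records erased so far (slot = checkpoint % capacity)
        self.data_file = []    # records already applied to the data files

    def free_slots(self):
        return self.capacity - (self.write_pos - self.checkpoint)

    def advance_checkpoint(self, n=1):
        # Apply the oldest records to the data file before erasing them.
        for _ in range(min(n, self.write_pos - self.checkpoint)):
            idx = self.checkpoint % self.capacity
            self.data_file.append(self.slots[idx])
            self.slots[idx] = None
            self.checkpoint += 1

    def append(self, record):
        if self.free_slots() == 0:
            # write pos has caught up with checkpoint: make room first.
            self.advance_checkpoint()
        self.slots[self.write_pos % self.capacity] = record
        self.write_pos += 1


ring = RedoRing(capacity=4)
for i in range(6):
    ring.append(f"op{i}")
```

After six appends into four slots, the two oldest records have been pushed to `data_file` and their slots reused, which mirrors the "stop, erase some records, push checkpoint forward" behavior described above.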
With redo log, InnoDB can guarantee that no previously committed records are lost even if the database restarts abnormally. This capability is called crash-safe.
To understand crash-safe, recall the credit-record example. As long as a credit record has been written on the chalkboard or in the ledger, then even if the keeper later forgets it, say because the tavern closed unexpectedly for a few days, the accounts can still be settled after reopening from the data in the ledger and on the chalkboard.
Important log module: binlog
As we discussed, MySQL as a whole has two parts: the Server layer, which mainly handles things at the level of MySQL functionality, and the engine layer, which is responsible for storage. The chalkboard described above, redo log, is a log specific to the InnoDB engine, while the Server layer has its own log, called binlog (the archive log).
You will surely ask: why are there two logs?
Because MySQL did not originally have the InnoDB engine. MySQL's own engine was MyISAM, but MyISAM has no crash-safe capability, and binlog can only be used for archiving. InnoDB was brought into MySQL as a plugin by another company. Since binlog alone cannot provide crash safety, InnoDB uses a separate logging system, redo log, to achieve it.
There are three differences between the two kinds of logs .
1. redo log is specific to the InnoDB engine; binlog is implemented in MySQL's Server layer and can be used by all engines.
2. redo log is a physical log that records "what change was made on which data page"; binlog is a logical log that records the original logic of the statement, such as "add 1 to field c of the row with ID=2".
3. redo log is written circularly, so its fixed space eventually runs out; binlog is appended. "Appended" means that when a binlog file reaches a certain size, writing switches to the next file and earlier logs are not overwritten.
With a conceptual understanding of the two logs, let's look at the internal flow when the executor and the InnoDB engine execute this simple update statement.
1. The executor first asks the engine for the row with ID=2. ID is the primary key, so the engine finds the row by searching the tree directly. If the data page containing the row is already in memory, it is returned to the executor right away; otherwise it is first read from disk into memory and then returned.
2. The executor takes the row data from the engine and adds 1 to the value; for example, if it was N, it is now N+1. With this new row in hand, it calls the engine interface to write it.
3. The engine updates the new row in memory and records the update in redo log, which is now in the prepare state. It then tells the executor that execution is finished and the transaction can be committed at any time.
4. The executor generates the binlog for this operation and writes it to disk.
5. The executor calls the engine's commit-transaction interface, the engine changes the redo log it just wrote to the commit state, and the update is complete.
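The five steps above can be sketched in Python as an exchange between two objects. This is an illustrative model only: `Engine`, `Executor`, `write_row`, and the dictionary-based redo log are hypothetical names and structures chosen for the sketch, not MySQL's actual interfaces.

```python
class Engine:
    """Toy storage engine whose redo log entries carry a state."""

    def __init__(self, rows):
        self.rows = dict(rows)   # in-memory row data: ID -> c
        self.redo_log = {}       # txn_id -> {"change": ..., "state": ...}

    def get_row(self, row_id):
        return self.rows[row_id]

    def write_row(self, txn_id, row_id, value):
        # Step 3: update memory and record the change in redo log as "prepare".
        self.rows[row_id] = value
        self.redo_log[txn_id] = {"change": (row_id, value), "state": "prepare"}

    def commit(self, txn_id):
        # Step 5: switch the redo log entry to "commit".
        self.redo_log[txn_id]["state"] = "commit"


class Executor:
    def __init__(self, engine):
        self.engine = engine
        self.binlog = []         # stand-in for the Server-layer binlog

    def update_add_one(self, txn_id, row_id):
        c = self.engine.get_row(row_id)                 # step 1: fetch the row
        new_c = c + 1                                   # step 2: compute N+1
        self.engine.write_row(txn_id, row_id, new_c)    # step 3: redo log in prepare
        self.binlog.append(                             # step 4: write binlog
            (txn_id, f"update T set c=c+1 where ID={row_id}"))
        self.engine.commit(txn_id)                      # step 5: redo log to commit


engine = Engine({2: 0})
executor = Executor(engine)
executor.update_add_one(txn_id=1, row_id=2)
```

Note that the binlog write sits between the two redo log state changes; that ordering is what the rest of the article examines.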
Here is the execution flowchart for this update statement. Light boxes are executed inside InnoDB; dark boxes are executed in the executor.

You may have noticed that the last three steps look a bit roundabout: the redo log write is split into two steps, prepare and commit. This is "two-phase commit".
Two-phase commit
Why is "two-phase commit" necessary? It keeps the two logs logically consistent. To explain, we have to go back to the question at the start of the article: how can a database be restored to its state at any second within the past half month?
As mentioned earlier, binlog records all logical operations, and it is written by appending. If your DBA promises recovery to any point within half a month, the backup system must keep all binlogs from the last half month, and it must also take full backups of the database periodically. How often "periodically" is depends on how important the system is: it can be once a day or once a week.
When you need to recover to a specific second, for example when you discover at two in the afternoon that a table was deleted by mistake at noon and you need the data back, you can do this:
- First, find the most recent full backup (if you are lucky, it may be from last night) and restore it to a temporary database;
- Then, starting from the time of that backup, take the archived binlogs in order and replay them up to the moment just before the table was deleted at noon.
Your temporary database now matches the production database as it was before the mistake. You can then take the table data from the temporary database and restore it to the production database as needed.
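The recovery procedure just described, a full backup plus binlog replay up to a chosen moment, can be sketched as follows. This is a deliberately simplified model: real binlog events are not `(timestamp, row_id, delta)` tuples, and `restore_to` is a hypothetical helper written for this illustration.

```python
def restore_to(full_backup, backup_time, binlog_events, target_time):
    """Rebuild the table state as of just before target_time.

    full_backup:   dict of row ID -> c, as it was at backup_time
    binlog_events: list of (timestamp, row_id, delta), sorted by timestamp
    """
    temp_db = dict(full_backup)      # restore the backup into a temporary database
    for ts, row_id, delta in binlog_events:
        if ts <= backup_time:
            continue                 # already reflected in the backup
        if ts >= target_time:
            break                    # stop just before the mistaken operation
        temp_db[row_id] = temp_db.get(row_id, 0) + delta
    return temp_db


backup = {2: 1}                      # full backup taken at t=100
events = [(90, 2, 1), (110, 2, 1), (120, 2, 1), (130, 2, -99)]  # t=130 is the mistake
restored = restore_to(backup, backup_time=100, binlog_events=events, target_time=130)
```

Here the event at t=90 is skipped because the backup already contains it, the events at t=110 and t=120 are replayed, and replay stops before the mistaken event at t=130.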
Now that we have covered the recovery process, let's return to why the logs need "two-phase commit". A proof by contradiction works well here.
Because redo log and binlog are two independent pieces of logic, without two-phase commit you would have to either write redo log first and then binlog, or the other way around. Let's see what goes wrong in each case.
Take the earlier update statement again. Suppose the row with ID=2 currently has c = 0, and suppose a crash occurs while the update is executing, after the first log has been written but before the second. What happens?
1. Write redo log first, then binlog. Suppose the MySQL process restarts abnormally after redo log is written but before binlog is. As we said, once redo log is written the data can be recovered even if the system crashes, so after recovery the value of c in this row is 1. But because the crash happened before binlog was written, binlog has no record of this statement, and the binlog saved in later backups lacks it too. If you then use this binlog to restore a temporary database, that database misses this update: the restored row has c = 0, which differs from the original database.
2. Write binlog first, then redo log. If the crash comes after binlog is written, redo log has not been written yet, so the transaction is invalid after crash recovery and the value of c in this row is 0. But binlog has already recorded "change c from 0 to 1". When binlog is later used for recovery, one extra transaction appears: the restored row has c = 1, which differs from the original database.
As you can see, without "two-phase commit", the state of the database may be inconsistent with the state of a database recovered from its logs.
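The two failure cases can be replayed in a few lines of Python. This is a toy simulation under strong simplifying assumptions: each log is a list, crash recovery of the original database simply replays redo log, and a rebuilt database simply replays binlog on top of a backup where c = 0. The function `run_update` is a hypothetical name for this sketch.

```python
def run_update(order, crash_after_first):
    """Return (c in the original database after crash recovery,
               c in a database rebuilt from backup plus binlog).

    order is the sequence in which the two logs are written when
    two-phase commit is NOT used. Row ID=2 starts with c = 0.
    """
    redo_log, binlog = [], []
    logs = {"redo": redo_log, "binlog": binlog}
    for i, name in enumerate(order):
        logs[name].append("c=c+1")
        if crash_after_first and i == 0:
            break                      # crash between the two log writes
    original = 0 + len(redo_log)       # crash recovery replays redo log
    rebuilt = 0 + len(binlog)          # backup plus binlog replay
    return original, rebuilt


redo_first = run_update(("redo", "binlog"), crash_after_first=True)
binlog_first = run_update(("binlog", "redo"), crash_after_first=True)
no_crash = run_update(("redo", "binlog"), crash_after_first=False)
```

In both crash scenarios the two values in the returned pair disagree, in opposite directions, while the run without a crash agrees on c = 1.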
You might ask: isn't the probability very low, given that you rarely need to restore a temporary database?
Not really. Recovering data after a mistake is not the only time this procedure is used. When you need to scale out, that is, add more replicas to increase the system's read capacity, the common practice today is a full backup plus binlog replay, and this inconsistency would leave your production primary and replicas out of sync.
In short, both redo log and binlog can represent the commit state of a transaction, and two-phase commit keeps these two states logically consistent.
Summary
Today I introduced the two most important logs in MySQL: the physical log redo log and the logical log binlog.
redo log is used to guarantee crash safety. When innodb_flush_log_at_trx_commit is set to 1, the redo log of every transaction is persisted directly to disk. I recommend setting this parameter to 1 so that no data is lost after MySQL restarts abnormally.
When sync_binlog is set to 1, the binlog of every transaction is persisted to disk. I also recommend setting this parameter to 1 so that no binlog is lost after MySQL restarts abnormally.
I also introduced "two-phase commit", which is closely tied to MySQL's logging system. Two-phase commit is a common way to keep data logically consistent across systems, and even if you do not work on database kernels, you may well use it in everyday development.