当前位置:网站首页>Series of how MySQL works (VIII) 14 figures explain the atomicity of MySQL transactions and the principle of undo logging

Series of how MySQL works (VIII) 14 figures explain the atomicity of MySQL transactions and the principle of undo logging

2022-07-05 06:19:00 Zhangbaipei


One 、 What is? undo journal

 

How to understand undo journal

Database transactions are mysql The smallest logical unit of operation , A transaction can contain one or more sql sentence , these sql Either all of them succeed or all of them fail , This is one of the four characteristics of database transactions Atomicity .

The key to the implementation of atomic performance is to rollback in case of failure , It depends on undo log journal . The transaction will back up the data to be changed to undo log in (undo log The data before the change will be saved , This is a row level historical data ), If an error occurs or the user executes rollback, You can go through undo log Restore the data to the state before the transaction started .

 

We can think of it this way undo What does the log record :

  • When inserting a record , At least write down the primary key value of this record , In this way, when rolling back, you only need to delete the record corresponding to the primary key value ;

  • When deleting a record , At least write down all the data in this record , In this way, when rolling back, insert the records composed of these contents into the table ;

  • When modifying a record , At least write down the old values of the updated columns , In this way, it is good to update these columns to the old values when rolling back .

 

 

undo Log type and format

undo The log is divided into 3 class :

  • insert Type of undo journal (TRX_UNDO_INSERT_REC)

  • delete Operation of the undo journal (TRX_UNDO_DEL_MARK_REC)

  • Update the undo journal (TRX_UNDO_UPD_EXIST_REC)

 

and redo The difference between logs is , One redo Log one change of a page , So one sql Will produce multiple redo journal ; And one undo Log the modification of a record , So one sql It will only produce one undo journal ( In some cases 2 strip ).

In a transaction , When adding, deleting, modifying, and querying a table, a transaction will be allocated to this transaction id, If the transaction is full of query statements , Then this transaction will not be allocated id.

Previously, we said that there are several hidden columns in the page records of the primary key index , One of the hidden columns is transactions id(trx_id), Indicates which transaction has recently modified this line of records .

undo The log will be saved to undo Page ,undo The general format of the log includes : The undo Log in undo In page address of page 、undo The table where the record corresponding to the log is located id、undo Log number 、undo Log type 、 The next one undo The address of the log .

Each write operation of each user data record corresponds to one undo Record , Therefore, each leaf node record of the cluster index corresponds to one undo journal , So how to know a data record undo Where are the logs stored ? Primary key B+ There is a record in the page of the tree roll_pointer Hidden fields , It points to the corresponding undo Log in undo The address in the page , Here's the picture :

 

Two 、undo What is in the log

Here are the introduction 3 Types of logs and their corresponding behaviors .

 

insert The operation corresponds to undo journal

 

If a transaction executes a insert operation , Want to rollback this operation , Just according to the primary key of the record just inserted id Delete the record . therefore insert Operation of the undo The log only needs to record the new line id that will do . If the primary key is a joint index , Then you need to record all field values in the union index .

When we insert a record into a table , You need to insert a record into the clustered index and all secondary indexes . But generate undo When the log , You only need to generate a row for the clustered index undo journal . Later on delete Operation and update The operation is also generated only for the changes of the rows of the clustered index undo The log .

 

 

delete The log corresponding to the operation

In business delete What happens to a record ?

Let's look at the picture below , Deleted records in a page will be linked “ Garbage linked list ” in , The space occupied by records in the garbage linked list is reusable , page page header Of page_free Attribute points to the head node of the garbage linked list ,deleted_flag Indicates whether the record has been deleted .

 

When I'm in business delete A record , There are two stages :

1、 Of the record to be deleted deleted_flag Set as 1( In fact, the record will be modified trx_id、roll_pointer value , But here we don't pay attention to ), This stage is called delete mark. At this time, the record will be in an intermediate state between deleted and undeleted .

 

2、 When the transaction commits ,pruge The thread will move the record from the normal record linked list to the head node of the garbage linked list , That is to delete the record . This stage becomes pruge.

 

Generate a record delete undo When the log , Only the first stage happened , If the operation is rolled back , Just put the deleted_flag Set as 0 that will do .

Here is delete Type of undo Log format (delete_mark type ):

Compared with insert Of undo journal , It has an extra field : Old recorded transactions id and Old record undo Log pointer roll_pointer、 All index field values of old records .

 

Q1: Why delete Of undo The log should record all index field values of the record ?

Because deleting a record is also from all secondary indexes B+ Delete the corresponding record in the tree , Write down the values of all index fields to delete the corresponding records in the secondary index when the transaction is committed .

 

Q2: When a new record is inserted into the page , How to use reusable space ?

When inserting a new record , First, judge whether the space occupied by the record corresponding to the head node of the garbage linked list can accommodate the new record , If you can't, you can directly apply to the page for new space to store this record ( Instead of trying to traverse the entire garbage list , To find a node that can hold new records ), If you can, reuse it directly , And let page_free Point to the next node of the garbage linked list .

If the space occupied by the newly inserted record is less than that occupied by the garbage chain header node record , There will be debris . As new records are inserted more and more , There will be more and more pieces , When there are enough fragments ,innodb The records in the page will be reorganized , The process of organization is to apply for a temporary page , Insert the records of the page into this temporary page closely in turn , Then copy the contents of the temporary page to this page , This process will cost more performance .

 

 

update The operation corresponds to undo journal

There are two kinds of : Update primary key update and Do not update the primary key update.

Do not update the primary key update Can be divided into in-place update and Non in place updates .

 

a. Modify in place

The so-called in place update is All columns updated and Its column before update It takes up as much storage space .

For example, there is a record like this :

Execute the following update sentence

update t set key1 = "ABCD" and col = " pistol " where id = 2;

In this case , Modified fields key1 and col There is no change in the occupied space , This is local modification .

It is easiest to modify the behavior at the bottom in place , Directly modify the new value in place in the field of the corresponding record on the page .

 

b. Non local modification

If All columns updated and Its column before update The storage space occupied is different ( Non local modification ), Then you need to delete the old records in the primary key index , Then insert the new record in the page where the old record is located . The deletion mentioned here is no longer a fake deletion delete mark, But really move the records to the garbage linked list .

If sql Also changed the value of the secondary index , Then delete the record on the page of the secondary index ( Is false delete delete mark), And insert new records in another secondary index page .

If the storage space occupied by the newly created record does not exceed that occupied by the old record , Then you can directly reuse the storage space occupied by the old records added to the garbage linked list , Otherwise, you need to apply for a new space in the page for new records , If there is no space available , Just split the page , And then insert a new record .

Local modification and non local modification will generate a update_exist Type of undo journal .

The following is the update without changing the primary key value undo Log format (update_exist), Meeting Record the values of all updated columns before updating

 

c. Update primary key update

Suppose the update operation puts id = 5 Change it to id = 1000, Update the primary key update The underlying behavior is to delete the page in the primary key index id=5 The record of ( False deletion delete mark), And add id=1000 Record to another page .

This process will produce a delete Type of undo journal and One  insert Type of undo journal .

So update the primary key update Will produce 2 strip undo journal .

In fact, there is another kind called TRX_ UNDO_ UPD _ DEL_ REC Of undo Log types are not introduced here , Mainly to avoid introducing too much complexity , After all, I know undo log The principle of log is our original intention .

 

 

3、 ... and 、 Version chain and undo Storage of logs

 

Version chain

update and delete These two kinds of undo The log records used roll_pointer , Let's assume that a record passes through in a transaction 4 Time modification , Then this record will generate 4 strip undo journal , Every one of them undo The log will record the previous of the data line undo The log roll_pointer, And the data row itself records its latest undo The log roll_pointer, So these undo Logs are organized into a version chain .

 

undo The organizational structure of logs

undo The log is stored in undo In the log page , and undo Pages are organized together in the form of linked lists ,undo There are two kinds of pages :insert Type of undo journal ( It's just inside insert Type of undo journal ) and update Type of undo journal ( discharge update and delete Type of undo journal ).

These two different types of undo Log pages are used insert undo Linked list and update undo Linked list management .

The reason why undo The log is divided into 2 Two categories: , Because insert Type of undo The log can be deleted directly after the transaction is committed , And other types of undo The log also needs to be MVCC( Multi version concurrency control ) service ,

 

You can't delete , Therefore, their treatment needs to be treated differently . The following figure shows undo journal , They are closely connected .

 

Temporary table and ordinary table undo Logs will also be managed separately , Therefore, a transaction has at most 4 strip undo Linked list , And each transaction will be created 4 This one undo Linked list :

 

In a transaction , For different rows of different tables DML All generated by the operation undo The logs are stored here 4 In a linked list .

 

Two transactions create different 2 A list of chains , Such as transaction A And transaction B All data lines X Make changes , Then these two modifications produce undo The log will be placed in two different update undo In the list .( And all the changes of a certain record in different transactions undo Logs are linked by version chains , And here, undo Page list doesn't matter .)

 

 

Each of them undo The first page of the linked list will put some control information of the linked list , For example, affairs. id.

From this we can imagine a How transactions are rolled back Of , The corresponding items will be remembered during the transaction undo The page number of the head node and the end node of the linked list , Just traverse these forward from the end node undo On page undo journal , And execute these in sequence undo The reverse operation of log can realize rollback .

 

One transaction in one undo Generated by linked list undo Logs are called a group undo journal , For example, the transaction in the above figure generates 4 Group undo journal . In some cases , After a transaction is committed , Other subsequent transactions can reuse this undo Page linked list , Instead of creating new undo Linked list , This will result in a undo Pages may hold multiple sets of transactions undo journal ( This is to save undo Page space ). The chain header node will record the next group and the previous group undo The offset of the log in the page .

 

Now let's talk about reuse undo Page things .

 

reusing undo page

innodb Each transaction will be allocated separately undo Page linked list ( Can be allocated separately at most 4 A linked list ), This is to improve the concurrent execution of multiple transaction writes undo Log performance ( Because every transaction puts undo The linked list must be locked before the log is written ).

But this will cause waste undo The problem of page space . such as , A transaction may only produce 3 individual undo journal , This business is undo The linked list only has 1 individual undo page , And this undo The page only needs to lose space . In fact, new affairs can continue to this undo The log heap of page continues to be added undo journal , Multiple transactions have been reused or shared undo Purpose of the page .

So innodb In some cases, different transactions will be reused undo In the page list undo page . reusing undo Pages need to meet 2 Conditions :

1、 The linked list contains only one undo page ;

2、 The undo Pages use less space than the entire page 3/4, otherwise undo When there is only a little space left on the page, reuse is likely to apply for new pages , Resulting in a lot of space left in the new page, causing waste .

 

insert Linked list and update The reuse strategies of linked lists are different , because insert undo The log can be discarded directly after the transaction is committed , So reuse insert undo The page can directly cover the old undo journal . and update undo Page reuse cannot overwrite old undo The log can only be appended , Because of these undo Logs should be used for MVCC.

 

reusing undo A transaction of a page cannot be a concurrent transaction , It must be after a transaction , Only another transaction can reuse the previous transaction undo page . in other words , A business undo Logs are closely linked within the page , There will be no multiple transactions undo The condition that logs are interleaved in one page .

 

Rollback segment

be-all undo Pages are stored in segments for management , And one undo The page linked list corresponds to one undo paragraph , apply undo Page time is also determined by undo Duan applied to the district , But these undo paragraph (undo log segment) Not the rollback segment we want to talk about in this section (rollback segment).

all undo The head node of the list (first undo page) The page number of will be saved to a separate page ,undo The page number of the chain header node is called undo slot, We can think of this single page as storing all undo The warehouse of the base node of the linked list . And this separate page will be put into a separate paragraph , This segment is the rollback segment (rollback segment).

There is only one page in the rollback segment , This page is called rollback segment header , Or directly called rollback page . Through this page, you can find the page number of the head node of the specified linked list .

A rollback segment has only 1024 individual undo slot, One slot There's a corresponding one undo Linked list . This means that a rollback segment can support at most 1024 All transactions are executed at the same time , To increase the number of concurrent transactions ,innodb Will exist 128 Rollback segments hold more undo The head node of the list , Therefore, it can support 1024*128 All transactions are executed at the same time .

this 128 The page number of the rollback page of the rollback segment will be stored in the system table space 5 Page No :

 

undo The role of logs in crash recovery

If a transaction has not been committed when the system crashes , But part of the transaction redo The log has been flushed , So in order to ensure the atomicity of transactions , All changes in this transaction should not be restored , That is to say, this part of the disc has been brushed redo The log is incomplete , Not for these redo Log recovery data . But in fact mysql Will be incomplete for these redo Log data recovery .

here undo Logs will play a role in crash recovery ,mysql Will scan the firm for undo The first node of the list , Look inside TRX_UNDO_ACTIVE attribute , It indicates whether the transaction corresponding to the linked list is active . If active , It indicates that the transaction was not committed when the system crashed , Then the system will follow the undo Log rollback just recovered redo Log data to ensure atomicity .

 

 




More content, please pay attention to WeChat official account.
zbpblog WeChat official account
Reprinted from : Zhang baipei IT Technology blogs > MySQL How to run the series ( 8、 ... and )14 Zhang Tu made it clear MySQL Transaction atomicity and undo Log principle
原网站

版权声明
本文为[Zhangbaipei]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/186/202207050614047732.html