
Correctly understanding MySQL MVCC

2022-06-27 08:17:00 【Little moon 6】

What is MVCC?

MVCC stands for Multi-Version Concurrency Control. It is a concurrency control method, generally used in database management systems to provide concurrent access to the database, and in programming languages to implement transactional memory.

In MySQL InnoDB, MVCC exists mainly to improve concurrency by handling read-write conflicts in a better way: even when a read and a write conflict, reads can proceed without locks and without blocking.

What are current reads and snapshot reads?

Before studying MVCC, we first need to understand what current reads and snapshot reads are in MySQL InnoDB.

Current read

Operations such as select ... lock in share mode (shared lock), select ... for update, update, insert, and delete (exclusive locks) are all current reads. Why "current"? Because they read the latest version of the record, and while reading they must guarantee that no other concurrent transaction can modify that record, so the record is locked.

Snapshot read

A plain select without locking is a snapshot read: a non-blocking read that takes no lock. This holds only when the isolation level is not SERIALIZABLE; at the SERIALIZABLE level, snapshot reads degrade to current reads. Snapshot reads exist to improve concurrent performance, and they are implemented on top of multi-version concurrency control, that is, MVCC. MVCC can be thought of as a variant of row locking that avoids locking in many cases and therefore costs less. Because it is based on multiple versions, a snapshot read does not necessarily return the latest version of the data; it may return an earlier historical version.

Put simply, MVCC exists to handle read-write conflicts without locking, and the "read" here means a snapshot read, not a current read. A current read is really a locking operation, an implementation of pessimistic locking.
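The difference between the two kinds of read can be sketched with a toy model in Python. This is only an illustration under simplifying assumptions: the names Row, snapshot_read, and current_read are hypothetical, a threading.Lock stands in for an InnoDB row lock, and the lock is released right after each call instead of being held until the end of a transaction.

```python
import threading


class Row:
    """Toy row that keeps every committed version, oldest first."""

    def __init__(self, value):
        self.versions = [value]
        self.lock = threading.Lock()  # stands in for a row lock

    def write(self, value):
        # Writers always take the lock before adding a new version.
        with self.lock:
            self.versions.append(value)

    def snapshot_read(self, snapshot_index):
        # No lock: return the version that was current when the reader
        # took its snapshot, even if newer versions exist by now.
        return self.versions[snapshot_index]

    def current_read(self):
        # Takes the lock and returns the newest version.
        with self.lock:
            return self.versions[-1]


row = Row("Jerry")
snapshot = len(row.versions) - 1  # the reader remembers its snapshot point
row.write("Tom")                  # a concurrent writer commits a change

print(row.snapshot_read(snapshot))  # Jerry
print(row.current_read())           # Tom
```

The snapshot read never touches the lock, which is why it cannot block a writer or be blocked by one.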

The relationship between current reads, snapshot reads, and MVCC

Strictly speaking, MVCC refers to the concept of "maintaining multiple versions of a piece of data so that read and write operations do not conflict". It is only an abstract concept.
To realize this concept, MySQL has to provide concrete features, and the snapshot read is one of them: it is the concrete non-blocking read that MySQL offers to implement the MVCC model. The current read, by contrast, is the concrete feature that implements pessimistic locking.
Going one step further, the snapshot read is itself an abstraction. In MySQL, the MVCC model is concretely implemented by three implicit fields, the undo log, and the Read View; see the section on the implementation principle of MVCC below for details.

What problems does MVCC solve, and what are its advantages?

There are three database concurrency scenarios:

Read-read: no problem arises, so no concurrency control is needed
Read-write: there are thread-safety issues that can break transaction isolation, leading to dirty reads, phantom reads, and non-repeatable reads
Write-write: there are thread-safety issues that can cause lost updates, of both the first type and the second type

What are the benefits of MVCC?

Multi-version concurrency control (MVCC) is a lock-free concurrency control scheme for read-write conflicts. Transactions are assigned monotonically increasing timestamps, a version is saved for every modification, each version is associated with a transaction timestamp, and a read operation sees only the snapshot of the database as it was before the transaction started. MVCC therefore solves the following problems for the database:

When the database is read and written concurrently, reads do not block writes and writes do not block reads, which improves concurrent read/write performance
For snapshot reads, it also addresses transaction isolation problems such as dirty reads, phantom reads, and non-repeatable reads, but it cannot solve the lost update problem
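The timestamp idea described above can be sketched in a few lines of Python. This is a minimal model and not InnoDB's actual design: VersionedStore and its methods are hypothetical names, and every write is assumed to commit immediately.

```python
import itertools


class VersionedStore:
    """Toy multi-version store keyed by increasing timestamps."""

    def __init__(self):
        self._clock = itertools.count(1)  # monotonically increasing timestamps
        self._versions = {}               # key -> list of (commit_ts, value)

    def begin(self):
        # A transaction's timestamp marks the snapshot it reads from.
        return next(self._clock)

    def write(self, key, value):
        # Every change is stored as a new version tagged with a timestamp.
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        return ts

    def read(self, key, start_ts):
        # Return the newest version written before the reader started.
        visible = [v for ts, v in self._versions.get(key, []) if ts < start_ts]
        return visible[-1] if visible else None


store = VersionedStore()
store.write("age", 24)    # timestamp 1
reader = store.begin()    # timestamp 2
store.write("age", 30)    # timestamp 3

print(store.read("age", reader))         # 24
print(store.read("age", store.begin()))  # 30
```

The first reader keeps seeing 24 no matter how many later writes arrive, and neither side waits for the other.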

To sum up
In short, MVCC was proposed by engineers who were dissatisfied with the poor performance of relying on pessimistic locks alone to resolve read-write conflicts. With MVCC available, a database can use two combinations:

MVCC + pessimistic locking

MVCC resolves read-write conflicts, and pessimistic locking resolves write-write conflicts

MVCC + optimistic locking

MVCC resolves read-write conflicts, and optimistic locking resolves write-write conflicts
Combinations like these maximize concurrent database performance while resolving the problems caused by both read-write and write-write conflicts
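For the write-write half of the second combination, optimistic locking is commonly done with a version number that is checked at update time. The Python sketch below shows that pattern in memory; Account, VersionConflict, and update_if_unchanged are hypothetical names, and in a real database the check would typically live in the WHERE clause of the UPDATE statement.

```python
class VersionConflict(Exception):
    """Raised when the row changed since it was read."""


class Account:
    def __init__(self, balance):
        self.balance = balance
        self.version = 0


def read(account):
    # Read the value together with the version it belongs to.
    return account.balance, account.version


def update_if_unchanged(account, new_balance, expected_version):
    # Compare-and-set: apply the write only if nobody else wrote first.
    if account.version != expected_version:
        raise VersionConflict("row was modified by another writer")
    account.balance = new_balance
    account.version += 1


acct = Account(100)
bal_a, ver_a = read(acct)  # writer A reads
bal_b, ver_b = read(acct)  # writer B reads the same version

update_if_unchanged(acct, bal_a + 10, ver_a)  # A succeeds
try:
    update_if_unchanged(acct, bal_b + 20, ver_b)  # B works from stale data
except VersionConflict:
    bal_b, ver_b = read(acct)                     # B re-reads and retries
    update_if_unchanged(acct, bal_b + 20, ver_b)

print(acct.balance)  # 130
```

Without the version check, B's write would have overwritten A's and the final balance would have been 120, which is exactly the lost update that MVCC alone does not prevent.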

How MVCC is implemented

MVCC is implemented in the database to resolve read-write conflicts. Its implementation relies mainly on the three implicit fields in each record, the undo log, and the Read View. Let's look at these three concepts first

Implicit fields

In addition to the columns we define ourselves, each row also contains fields that the database defines implicitly, such as DB_TRX_ID, DB_ROLL_PTR, and DB_ROW_ID

DB_TRX_ID

6 bytes, the ID of the most recent modifying (update or insert) transaction: it records the ID of the transaction that created this record or last modified it

DB_ROLL_PTR

7 bytes, the rollback pointer, which points to the previous version of this record (stored in the rollback segment)

DB_ROW_ID

6 bytes, an implicit auto-incrementing ID (hidden primary key). If the table has no primary key and no suitable unique index, InnoDB automatically builds a clustered index on DB_ROW_ID

There is also a hidden delete flag. When a record is updated or deleted, the old record is not physically removed right away; its delete flag is changed instead

To recap, DB_ROW_ID is the unique implicit primary key that the database generates for the row by default, DB_TRX_ID is the ID of the transaction that last operated on the record, and DB_ROLL_PTR is the rollback pointer, which works together with the undo log and points to the previous version
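As a mental model, a row can be pictured as the user columns plus these hidden fields. The Python dataclass below is only an illustration: RowVersion and its attribute names are hypothetical, and the real on-disk record format is considerably more involved. The byte sizes are the ones listed above.

```python
from dataclasses import dataclass
from typing import Optional

# Sizes of the hidden columns, in bytes, as listed in the text.
HIDDEN_COLUMN_BYTES = {"DB_TRX_ID": 6, "DB_ROLL_PTR": 7, "DB_ROW_ID": 6}


@dataclass
class RowVersion:
    # User-defined columns
    name: str
    age: int
    # Hidden columns maintained by the storage engine
    db_row_id: int                        # implicit primary key
    db_trx_id: Optional[int]              # last transaction to modify the row
    db_roll_ptr: Optional["RowVersion"]   # previous version, if any
    deleted: bool = False                 # delete flag


row = RowVersion(name="Jerry", age=24, db_row_id=1,
                 db_trx_id=None, db_roll_ptr=None)

print(row.db_roll_ptr is None)            # True: no older version yet
print(sum(HIDDEN_COLUMN_BYTES.values()))  # 19 bytes of hidden columns
```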

undo log

There are two main kinds of undo log:

insert undo log

The undo log generated when a transaction inserts a new record. It is needed only if the transaction rolls back, and it can be discarded immediately after the transaction commits

update undo log

The undo log generated when a transaction performs an update or delete. It is needed not only for transaction rollback but also for snapshot reads, so it cannot be deleted at will. Only when no snapshot read or transaction rollback involves the log is it cleaned up by the purge thread

purge
From the analysis above we can see that, to implement InnoDB's MVCC mechanism, an update or delete only sets the deleted_bit of the old record and does not actually remove the obsolete record.
To save disk space, InnoDB has a dedicated purge thread that cleans up records whose deleted_bit is true. So as not to disturb the normal operation of MVCC, the purge thread maintains a read view of its own (equivalent to the read view of the oldest active transaction in the system). If a record's deleted_bit is true and its DB_TRX_ID is visible to the purge thread's read view, the record can safely be removed.
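The purge rule can be sketched as a simple filter in Python. This is a deliberately simplified model: the purge function and the record dictionaries are hypothetical, and "visible to the purge thread's read view" is approximated here as "DB_TRX_ID is smaller than the ID of the oldest active transaction".

```python
def purge(records, oldest_active_trx_id):
    """Return the records that must be kept after a purge pass."""
    kept = []
    for rec in records:
        # A record may go only if it is delete-marked AND the transaction
        # that marked it is older than every transaction still active.
        removable = rec["deleted"] and rec["db_trx_id"] < oldest_active_trx_id
        if not removable:
            kept.append(rec)
    return kept


records = [
    {"id": 1, "deleted": True, "db_trx_id": 5},    # old delete: safe to purge
    {"id": 2, "deleted": True, "db_trx_id": 12},   # may still be needed
    {"id": 3, "deleted": False, "db_trx_id": 3},   # live record
]

remaining = purge(records, oldest_active_trx_id=10)
print([r["id"] for r in remaining])  # [2, 3]
```

Record 2 survives because a transaction that started before transaction 12 might still need to see the row as it was before the delete.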

What really helps MVCC is the update undo log. The undo log is in effect a chain of old record versions stored in the rollback segment. It works as follows:

1. Suppose a transaction inserts a new record into the person table: name is Jerry, age is 24, the implicit primary key is 1, and we assume the transaction ID and rollback pointer are NULL

2. Now transaction 1 modifies the name of that record, changing it to Tom

When transaction 1 modifies the row, the database first places an exclusive lock on it

It then copies the row into the undo log as an old version, so the undo log now holds a copy of the current row
After the copy, it changes the row's name to Tom and sets the hidden transaction ID field to the ID of the current transaction 1 (we assume IDs start at 1 and increase from there). The rollback pointer is set to point to the copy in the undo log, meaning "that was my previous version"

After the transaction commits, the lock is released

3. Then transaction 2 modifies the same record in the person table, changing age to 30

When transaction 2 modifies the row, the database first locks the row
It then copies the row into the undo log as an old version. It finds that the row already has an undo log entry, so the newest old version is inserted at the head of the linked list, in front of the row's existing undo log entry

It changes the row's age to 30 and sets the hidden transaction ID field to the ID of the current transaction 2, that is, 2. The rollback pointer is set to point to the copy that was just written to the undo log
The transaction commits and the lock is released

From the above we can see that modifications to the same record, whether by different transactions or by the same transaction, turn that record's undo log into a linear list of record versions, that is, a linked list. The head of the undo log chain is the most recent old version and the tail is the oldest. (As mentioned earlier, undo log nodes may be cleaned up by the purge thread. The first node in this example, the insert undo log, may in fact already be gone after its transaction commits; it is kept here only for demonstration.)
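The three steps above can be replayed with a small Python sketch of the version chain. It is an illustration only: Version, update, and history are hypothetical names, row locking is omitted, and the "copy to the undo log" is modeled by keeping a reference to the old immutable version.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Version:
    name: str
    age: int
    db_trx_id: Optional[int]
    db_roll_ptr: Optional["Version"]  # previous version in the undo chain


def update(current, trx_id, **changes):
    # The old version stays untouched (our stand-in for the undo log copy);
    # the new version records the writer's ID and points back to the old one.
    return replace(current, db_trx_id=trx_id, db_roll_ptr=current, **changes)


def history(row):
    # Walk the chain from the newest version to the oldest.
    chain = []
    while row is not None:
        chain.append((row.name, row.age, row.db_trx_id))
        row = row.db_roll_ptr
    return chain


row = Version("Jerry", 24, None, None)  # step 1: the insert
row = update(row, 1, name="Tom")        # step 2: transaction 1
row = update(row, 2, age=30)            # step 3: transaction 2

for version in history(row):
    print(version)
# ('Tom', 30, 2)
# ('Tom', 24, 1)
# ('Jerry', 24, None)
```

The newest old version always sits right behind the current row, which matches the "insert at the head of the list" behavior described in step 3.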

Read View

What is a Read View?

Put simply, a Read View is the view produced when a transaction performs a snapshot read. At the moment the transaction executes the snapshot read, a snapshot of the current database system is generated, which records and maintains the IDs of the transactions currently active in the system. (Each transaction is assigned an ID when it starts, and the IDs increase, so the newer the transaction, the larger its ID.)

The Read View is used mainly for visibility checks. When a transaction performs a snapshot read, it creates a Read View and uses it as the condition for deciding which version of the data the current transaction can see: possibly the latest data, or possibly one of the versions recorded in the row's undo log.
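The text here does not spell out the exact comparison rules, so the Python sketch below uses a simplified version of the rules commonly described for InnoDB. Treat it as an assumption: ReadView, its field names, and read_visible are hypothetical. A version is taken to be visible if it was written by the reading transaction itself, or by a transaction that had already committed when the view was created.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ReadView:
    active_trx_ids: FrozenSet[int]  # transactions active when the view was made
    next_trx_id: int                # ID the next new transaction would receive
    creator_trx_id: int             # transaction that owns this view

    def is_visible(self, db_trx_id: int) -> bool:
        if db_trx_id == self.creator_trx_id:
            return True    # a transaction always sees its own changes
        if db_trx_id >= self.next_trx_id:
            return False   # written by a transaction that started later
        # Otherwise visible only if the writer was no longer active.
        return db_trx_id not in self.active_trx_ids


def read_visible(chain, view):
    # chain: (db_trx_id, value) pairs, newest version first.
    for trx_id, value in chain:
        if view.is_visible(trx_id):
            return value
    return None


chain = [(4, "age=30"), (2, "name=Tom"), (1, "name=Jerry")]
view = ReadView(active_trx_ids=frozenset({3, 4}),
                next_trx_id=5, creator_trx_id=3)

print(read_visible(chain, view))  # name=Tom
```

Transaction 4 was still active when the view was created, so its version is skipped and the reader falls back along the chain to the version written by transaction 2.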

Original article: https://blog.csdn.net/SnailMann/article/details/94724197
