MySQL 7: Transactions

2022-06-12 10:00:00 【A god given dream never wakes up】

MySQL 7. `InnoDB` Transactions

Transactions are one of the features that distinguish a database from a file system. A transaction moves the database from one consistent state to another. When a transaction commits, the database guarantees that either all of its changes are saved or none of them are. The ACID properties were covered in the previous chapter and are not repeated here.

7.1 Understanding transactions

7.1.1 Overview

A detailed explanation of ACID is omitted here.

7.1.2 Classification of transactions

Flat transactions
Flat transactions with savepoints
Chained transactions
Nested transactions
Distributed transactions

Flat transactions are the simplest type of transaction and also the one used most often in production. In a flat transaction all operations are at the same level: it starts with BEGIN and ends with COMMIT or ROLLBACK.

A flat transaction ends in one of three ways:

Normal completion: 96%
The application requests a rollback: 3%
Rollback forced by a timeout: 1%

Flat transactions with savepoints: some transactions are large, and it is wasteful to throw away all of the work because one operation failed. A savepoint lets the transaction return to a state it reached earlier, after it started. Savepoints are created with SAVE WORK, and a transaction can have several of them; when processing fails, the program is free to choose which earlier savepoint to roll back to.

Chained transactions can be seen as a variant of the savepoint model, except that a rollback can only go back to the most recent savepoint.

Nested transactions form a hierarchy. A top-level transaction controls the transactions at every level below it. The transactions nested under the top-level transaction are called subtransactions, and each one controls a local change.

Distributed transactions are usually flat transactions that run in a distributed environment.

7.2 Implementation of transactions

The previous chapter showed that transaction isolation is implemented with locks. The redo log guarantees the atomicity and durability of transactions, and the undo log guarantees their consistency.

7.2.1 redo

1. Basic concepts

The redo log has two parts: the redo log buffer in memory and the redo log file on disk. The redo log is what makes transactions durable. When a transaction commits, InnoDB uses the force-log-at-commit mechanism: the COMMIT completes only after all of the transaction's log has been persisted to the redo log file. The redo log is written sequentially. To make sure the log really reaches the disk, each write is followed by an fsync call. fsync performance depends on the disk, so disk performance determines transaction commit performance. As mentioned earlier, InnoDB lets you choose when fsync is called, so that it is not forced on every commit. That improves performance considerably, but if the database crashes, the most recent transactions can be lost.

MySQL also has a binary log (binlog), which is used for point-in-time (PIT) recovery and for master-slave replication. On the surface both logs record operations on the database, but they differ fundamentally.

First, the binlog is produced at the MySQL server level, so every storage engine generates it, not only InnoDB transactions; the redo log belongs to InnoDB alone. Second, the content differs: the binlog is a logical log that records the SQL statements, while the redo log is a physical log that records the changes made to each data page. Finally, the binlog is written once, when the transaction commits, whereas redo log entries are written continuously while the transaction runs.

2. log block

Inside InnoDB the redo log is stored in units of 512 bytes. Both the redo log buffer and the redo log files are organized in such blocks, called redo log blocks. If the redo log generated for a page is larger than one block can hold, it is split across several redo log blocks. Because a redo log block is the same size as a disk sector (512 bytes), a redo log write is atomic and does not need the doublewrite technique.

Each block has a 12-byte log block header and an 8-byte log block tailer, so a block can store 492 bytes of log data. The header consists of these fields:

name	size (bytes)	purpose
`LOG_BLOCK_HDR_NO`	4	The log buffer is made up of log blocks, like an array, and this field marks the block's position in that array. It is 4 bytes long, but the first bit is the flush bit, so the maximum value is 2G.
`LOG_BLOCK_HDR_DATA_LEN`	2	The space used in the log block. When the block is full the value is `0x200`, meaning all 512 bytes are in use.
`LOG_BLOCK_FIRST_REC_GROUP`	2	The offset of the first log record in the log block. If this value equals `LOG_BLOCK_HDR_DATA_LEN`, the block contains no new log record.
`LOG_BLOCK_CHECKPOINT_NO`	4	The 4-byte checkpoint value at the time this `log block` was last written.

For example, suppose the redo log of transaction T1 takes 762 bytes and the redo log of transaction T2 takes 100 bytes. Because each log block can hold only 492 bytes, the log buffer looks as shown in Figure 7-8: T1 fills the first block and spills 270 bytes into the second block, where T2 follows it. The LOG_BLOCK_FIRST_REC_GROUP of the second log block is therefore 282 (270 + 12).
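The arithmetic of this example can be checked with a small sketch. It is a toy model, not InnoDB code: the function name `layout` is made up for illustration, and two details are assumptions of the sketch, namely that a partly filled block reports header plus payload bytes as its data length, and that a block in which no record starts is reported with `None`.

```python
BLOCK_SIZE = 512      # size of one redo log block, per the text
HEADER_SIZE = 12      # log block header
TAILER_SIZE = 8       # log block tailer
PAYLOAD_SIZE = BLOCK_SIZE - HEADER_SIZE - TAILER_SIZE  # 492 bytes of log data


def layout(record_sizes):
    """Pack redo records back to back and describe each resulting block.

    Returns one (data_len, first_rec_group) tuple per block.
    first_rec_group is None when no record starts in that block
    (a simplification chosen for this sketch).
    """
    blocks = []
    used, first_rec = 0, None  # payload bytes used, offset of first new record
    for size in record_sizes:
        remaining, at_record_start = size, True
        while remaining > 0:
            if used == PAYLOAD_SIZE:            # current block is full: close it
                blocks.append((BLOCK_SIZE, first_rec))
                used, first_rec = 0, None
            if at_record_start and first_rec is None:
                first_rec = HEADER_SIZE + used  # offset counted from block start
            at_record_start = False
            take = min(remaining, PAYLOAD_SIZE - used)
            used += take
            remaining -= take
    if used:
        data_len = BLOCK_SIZE if used == PAYLOAD_SIZE else HEADER_SIZE + used
        blocks.append((data_len, first_rec))
    return blocks


# T1 = 762 bytes, T2 = 100 bytes, as in the example above
print(layout([762, 100]))  # [(512, 12), (382, 282)]
```

The second tuple shows the value from the text: the first record that starts in the second block (T2) begins at offset 282.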

[Figure 7-8: redo logs of T1 and T2 in the log buffer]

log block tailer

name	size (bytes)	purpose
`LOG_BLOCK_TRL_NO`	4	Same value as `LOG_BLOCK_HDR_NO`

3. log group

A log group is a logical concept with no file of its own. It consists of several redo log files, all of the same size. The redo log files store the log blocks that were previously held in the log buffer. InnoDB flushes log blocks from the log buffer to disk under the following rules:

When a transaction commits
When half of the log buffer space is in use
At a log checkpoint

Log blocks are appended to the end of the current redo log file. When that file is full, writing moves on to the next redo log file in round-robin fashion.

Besides the log blocks flushed from the log buffer, a redo log file also stores some other information, which takes up 2 KB in total:

name	size (bytes)
log file header	512
checkpoint1	512
empty	512
checkpoint2	512

This information is stored only in the first redo log file of each log group. The other redo log files in the group reserve the same space but do not store the information. Because of this header, writes to the redo log file are not strictly sequential: in addition to appending log blocks, InnoDB also has to update parts of the first 2 KB.

4. Redo log format

InnoDB manages its storage in pages, so its redo log format is page based as well. Redo log records of different types share a common header format:

redo_log_type: the type of the redo log record.
space: the tablespace ID.
page_no: the offset of the page.

The redo log body that follows the header stores different content depending on the record type, for example for inserting a record into a page or deleting one from it.

As of InnoDB 1.2 there are 51 redo log types in total, and more will probably be added as functionality grows.

5. LSN
LSN stands for Log Sequence Number. In the InnoDB storage engine the LSN occupies 8 bytes and increases monotonically. It represents:

The total amount of redo log written
The position of the checkpoint
The version of a page

The LSN is the total number of bytes that transactions have written to the redo log. For example, if the current LSN is 1,000 and transaction T1 writes 100 bytes of redo log, the LSN becomes 1,100; if transaction T2 then writes 200 bytes, the LSN becomes 1,300. The LSN therefore records the total volume of redo log, measured in bytes.
The LSN is recorded not only in the redo log but also in every page. The page header holds a value, FIL_PAGE_LSN, which records the LSN of that page, that is, the LSN at the time the page was last flushed. Because the redo log records changes page by page, the page LSN can be used to decide whether a page needs recovery. For example, suppose the LSN of page P1 is 10,000, and at startup InnoDB finds that the LSN written to the redo log is 13,000 and that the transaction has committed. The database must then recover by applying the redo log to page P1. Conversely, redo log records whose LSN is lower than the LSN of P1 do not need to be replayed, because the LSN in P1 shows that the page has already been flushed up to that position.
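The two rules above (the LSN grows by the number of bytes written, and a page needs a redo record only if the record is newer than the page) can be sketched as follows. The function names are hypothetical, and treating an equal LSN as "no redo needed" is an assumption of this sketch.

```python
def advance_lsn(lsn, bytes_written):
    """Return the LSN after each transaction appends its redo bytes."""
    for n in bytes_written:
        lsn += n
    return lsn


def needs_redo(page_lsn, record_lsn):
    """A redo record is applied only if it is newer than the page.

    Equality is treated as "already flushed" (an assumption of this sketch).
    """
    return record_lsn > page_lsn


# T1 writes 100 bytes, then T2 writes 200 bytes, starting from LSN 1,000
print(advance_lsn(1000, [100, 200]))   # 1300

# Page P1 has LSN 10,000; the redo log reaches LSN 13,000
print(needs_redo(10_000, 13_000))      # True
print(needs_redo(10_000, 9_000))       # False
```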

6. recovery

Because the checkpoint is the LSN up to which pages have been flushed to disk, recovery only needs to replay the part of the log that starts at the checkpoint. In the example of Figure 7-12, the database crashes when the checkpoint LSN is 10,000, so recovery replays only the log in the LSN range 10,000 to 13,000.

The redo log records the physical modification of pages. If an insert causes a B+ tree split, more pages may have to be logged. In addition, because the redo log is a physical log, it is idempotent. Idempotence is defined as:

f(f(x)) = f(x)

Some DBAs and developers mistakenly believe that setting the binary log format to ROW makes the binary log idempotent. That is wrong. A simple example: an INSERT in the binary log is not idempotent, because replaying it can insert the same record several times, whereas the redo log record for that INSERT is idempotent.
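The difference can be illustrated with a toy model. It is only a sketch: a "page" is a plain byte array, a "table" is a plain list without a unique key, and both function names are invented for the illustration. Real redo records and binlog events are far richer.

```python
def apply_physical(page, offset, data):
    """Physical redo: overwrite bytes at a fixed offset in the page."""
    page[offset:offset + len(data)] = data
    return page


def apply_logical_insert(rows, row):
    """Logical replay of an INSERT: append the row to the table."""
    rows.append(row)
    return rows


# Replaying the physical record twice gives the same page as replaying it once
once = apply_physical(bytearray(8), 2, b"ab")
twice = apply_physical(apply_physical(bytearray(8), 2, b"ab"), 2, b"ab")
print(once == twice)   # True: f(f(x)) = f(x)

# Replaying the logical INSERT twice produces a duplicate row
once = apply_logical_insert([], (1, "a"))
twice = apply_logical_insert(apply_logical_insert([], (1, "a")), (1, "a"))
print(once == twice)   # False
```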

7.2.2 undo

1. Basic concepts

As described above, redo records the physical changes a transaction makes to data pages, and its job is to make transactions durable. It cannot roll a transaction back; that is what the undo log is for. When a transaction modifies the database it generates not only redo log but also undo log, and the undo log is used when the transaction fails or a rollback is executed.

Unlike redo, undo is stored in a special segment inside the database called the undo segment, which lives in the shared tablespace. Also unlike redo, undo is a logical log. A database serves many concurrent transactions at the same time, so a rollback does not physically restore a data page to the way it looked before the transaction started: that would be too expensive and could affect the work other transactions have done on the same page.

For each INSERT, the InnoDB storage engine performs a DELETE; for each DELETE, it performs an INSERT; for each UPDATE, it performs the reverse UPDATE that puts back the row as it was before the change.
Besides rollback, undo also serves MVCC: the InnoDB storage engine implements MVCC through undo. When a user reads a row that is currently held by another transaction, the reading transaction can fetch the previous version of the row from undo and thereby perform a non-locking read.
Finally, and most importantly, undo log generates redo log: every undo log write is accompanied by redo log, because the undo log itself needs durability protection.
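The logical nature of undo (each change is reversed by its opposite operation) can be sketched with a toy transaction over a dictionary. This is an illustration only: `ToyTransaction` is a made-up class, the table is a plain `dict` keyed by primary key, and nothing here reflects how InnoDB stores undo records.

```python
class ToyTransaction:
    """Records a logical inverse for every change and replays them on rollback."""

    def __init__(self, table):
        self.table = table      # primary key -> row value
        self.undo_log = []      # (inverse operation, key, old value)

    def insert(self, pk, value):
        self.table[pk] = value
        self.undo_log.append(("DELETE", pk, None))       # INSERT is undone by DELETE

    def delete(self, pk):
        old = self.table.pop(pk)
        self.undo_log.append(("INSERT", pk, old))        # DELETE is undone by INSERT

    def update(self, pk, value):
        self.undo_log.append(("UPDATE", pk, self.table[pk]))  # reverse UPDATE
        self.table[pk] = value

    def rollback(self):
        # Apply the inverse operations in reverse order of the original changes
        while self.undo_log:
            op, pk, old = self.undo_log.pop()
            if op == "DELETE":
                del self.table[pk]
            else:               # INSERT or UPDATE: put the old value back
                self.table[pk] = old


table = {1: "a", 2: "b"}
tx = ToyTransaction(table)
tx.insert(3, "c")
tx.update(1, "A")
tx.delete(2)
tx.rollback()
print(table == {1: "a", 2: "b"})   # True
```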

2. undo Storage management

The InnoDB storage engine has rollback segments. Each rollback segment records 1024 undo log segments, and undo pages are allocated within each undo log segment. Starting with version 1.1, InnoDB supports up to 128 rollback segments, which raises the limit on concurrent online transactions to 128 × 1024. Starting with InnoDB 1.2, rollback segments can be tuned further with these parameters:

innodb_undo_directory
innodb_undo_logs
innodb_undo_tablespaces

innodb_undo_directory sets the path of the rollback segment files. This means rollback segments can be stored outside the shared tablespace, in separate tablespaces. The default value is ".", the current directory of the InnoDB storage engine.
innodb_undo_logs sets the number of rollback segments; the default is 128. In InnoDB 1.2 it replaces the innodb_rollback_segments parameter of earlier versions.
innodb_undo_tablespaces sets the number of files that make up the rollback segments, so that the rollback segments can be spread evenly over several files.

Note: when a transaction allocates a page in an undo log segment and writes undo log, that process also writes redo log.

After a transaction commits, its undo log and the pages that hold it cannot be deleted right away, because other transactions may still need the undo log to read earlier versions of a row. On commit the undo log is put on a linked list, and the purge thread decides whether the undo log and its pages can finally be deleted.

Giving every transaction its own undo page would waste a great deal of storage space, so InnoDB is designed to reuse undo pages. When a transaction commits, its undo log is first put on the linked list; InnoDB then checks whether less than 3/4 of the undo page is in use. If so, the page can be reused, and new undo log is written after the existing undo log in that page. As a result, one undo page can hold the undo log of different transactions, and the purge operation involves discrete disk reads.

3. undo log format
In the InnoDB storage engine, undo log is divided into:

insert undo log
The insert undo log is the undo log generated by INSERT operations. A record created by an INSERT is visible only to the transaction itself and not to other transactions (transaction isolation requires this), so this undo log can be deleted directly after the transaction commits, with no purge operation needed. In the figure, * marks fields that are stored compressed. The next field at the start of an insert undo log holds the position of the next undo log (2 bytes). type_cmpl takes one byte and records the undo type; for an insert undo log the value is always 11. undo_no records the transaction ID and table_id records the table object the undo log belongs to; both values are stored compressed. The part that follows records all primary key columns and their values. During rollback these values are used to locate the record, which is then deleted.
update undo log
The update undo log is the undo log generated by DELETE and UPDATE operations. It may be needed for MVCC, so it cannot be deleted when the transaction commits. On commit it is put on the undo log linked list and waits for the purge thread to delete it. An update undo log records more information and takes more space. The fields next, start, undo_no and table_id have the same meaning as in the insert undo log. Because update undo logs are further classified, type_cmpl can take these values:
- 12 TRX_UNDO_UPD_EXIST_REC: update a record that is not delete-marked
- 13 TRX_UNDO_UPD_DEL_REC: update a delete-marked record back to not deleted
- 14 TRX_UNDO_DEL_MARK_REC: mark a record as deleted
  The next part records the update_vector, which lists the columns changed by the UPDATE. Information about every modified column is recorded in the undo log. Depending on the undo log type, changes made to index columns may also need to be recorded.

4. Viewing undo information

7.2.3 purge

As the previous section explained, a DELETE does not remove the row directly. It only sets the row's delete flag to 1; the data is not really deleted and is still in the B+ tree, and the secondary indexes on the row are not touched either. The real deletion is deferred and finally carried out by the purge operation.

purge finishes UPDATE and DELETE operations. The final deletion or update cannot be carried out as soon as the transaction completes: because the database supports MVCC, other transactions may still be using the old version. purge checks whether any transaction still references the data; if none does, the real update or deletion is performed.

One undo page may hold the undo log of several transactions, and later transactions are always located further back in the page. That order, however, is not the global order of transactions, while undo has to be processed in transaction order. InnoDB therefore maintains a linked list of transactions internally (the history list).

For example, purge first takes trx1 from the list and goes to the undo page where trx1 is stored. After processing trx1 it finds trx3 in the same page and processes it as well. After trx3 it finds trx5, but trx5 is still referenced by other transactions, so purge returns to the history list, picks up trx4, and moves to the undo page where trx4 is stored.

Benefit: this avoids random reads and writes and improves purge efficiency.

Variable_name	Value
innodb_purge_batch_size	300

innodb_purge_batch_size sets how many undo pages each purge run cleans up. Cleaning up more pages means more undo pages are available for reuse next time, but each run also consumes more resources.

Variable_name	Value
`innodb_max_purge_lag` (controls the length of the history list; when the list is longer than this value, `DML` operations are delayed by `delay` milliseconds)	0
`innodb_max_purge_lag_delay` (the maximum time a `DML` operation is delayed)	0

delay = ((length(history list) - innodb_max_purge_lag)*10) - 5

Note: the delay applies to each row. If a DML statement changes 10 rows, the total delay is 10 × delay. The delay cannot exceed innodb_max_purge_lag_delay.
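The formula and the per-row rule can be written out as a small sketch. Several points are assumptions of the sketch rather than statements from the text: a value of 0 for either variable is treated as "no limit", no delay is applied while the history list is not longer than innodb_max_purge_lag, and the cap is treated as being in the same unit as delay (check the server documentation for the actual unit). The function names are made up.

```python
def purge_delay(history_len, max_purge_lag, max_purge_lag_delay=0):
    """Per-row DML delay according to the formula above.

    Assumptions of this sketch: 0 disables the respective limit, and the
    cap is expressed in the same unit as the computed delay.
    """
    if max_purge_lag == 0 or history_len <= max_purge_lag:
        return 0
    delay = (history_len - max_purge_lag) * 10 - 5
    if max_purge_lag_delay > 0:
        delay = min(delay, max_purge_lag_delay)
    return delay


def statement_delay(rows, per_row_delay):
    """The delay is applied per row, so a statement pays it once per row."""
    return rows * per_row_delay


per_row = purge_delay(history_len=1100, max_purge_lag=1000)
print(per_row)                        # 995
print(statement_delay(10, per_row))   # 9950
```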

7.2.4 group commit

In InnoDB, committing a transaction involves two steps:

Modify the transaction's information in memory and write the log to the redo log buffer
Call fsync to make sure the log in the redo log buffer is written to disk (this disk interaction limits transaction concurrency)

Group commit lets one fsync in the second step persist the log of several transactions at once.

Before InnoDB 1.2, turning on the binary log (binlog) for replication had a large performance cost. To keep InnoDB's redo log and the server-level binlog consistent, a two-phase transaction is used between the two, in three steps:

When the transaction commits, InnoDB performs the prepare step
The MySQL server layer writes the binlog; the binlog may be written several times, and the sync_binlog parameter controls how many binlog writes happen between syncs to disk
InnoDB performs the two commit steps listed above

The order of the two logs has to match (the transaction commit order must equal the binlog write order, which backups depend on), so at least two fsync calls are needed. And because the binlog is written one transaction after another, group commit of the redo log stops working.

InnoDB uses the prepare_commit_mutex lock to keep the transaction commit order and the binlog write order consistent.

As the flow above shows, only one transaction at a time can write the binlog, so performance is low. MySQL 5.6 and later versions improve on this.

When the MySQL server layer commits transactions, it puts them into a queue in order. The first transaction in the queue is called the leader and the others are called followers.

Flush: write each transaction's binlog to memory
Sync: flush the in-memory binlog to disk; if the queue holds several transactions, a single fsync writes them all
Commit: the leader calls the storage engine commit for each transaction in queue order; the InnoDB storage engine itself supports group commit

If the queue holds only one transaction, the result is about the same as before, or even slightly worse. The more transactions commit together, the more pronounced the group commit effect and the larger the performance gain. binlog_max_flush_queue_time controls how long the Flush stage waits: a longer wait gathers more binlog data into one write, but it also slows down the response time of each transaction.
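The effect of the three stages on the number of fsync calls can be sketched with a toy model. Everything here is hypothetical: `FakeBinlog` only counts fsync calls, transactions are plain strings, and the real server-side queueing and locking are not modelled.

```python
class FakeBinlog:
    """Hypothetical stand-in for the binlog; it only counts fsync calls."""

    def __init__(self):
        self.buffer = []        # events written to memory, not yet durable
        self.durable = []       # events made durable by an fsync
        self.fsync_calls = 0

    def write(self, txn):
        self.buffer.append(txn)

    def fsync(self):
        self.durable.extend(self.buffer)
        self.buffer.clear()
        self.fsync_calls += 1


def commit_serially(binlog, txns):
    """One transaction at a time: every commit pays for its own fsync."""
    for txn in txns:
        binlog.write(txn)
        binlog.fsync()
    return list(txns)


def commit_as_group(binlog, queue):
    """Leader/follower group commit over one queue of transactions."""
    leader = queue[0]
    for txn in queue:           # Flush: write every transaction's binlog to memory
        binlog.write(txn)
    binlog.fsync()              # Sync: one fsync for the whole queue
    committed = list(queue)     # Commit: engine commit in queue order
    return leader, committed


serial, grouped = FakeBinlog(), FakeBinlog()
commit_serially(serial, ["t1", "t2", "t3"])
leader, committed = commit_as_group(grouped, ["t1", "t2", "t3"])
print(serial.fsync_calls, grouped.fsync_calls)   # 3 1
```

With a queue of one transaction both paths cost one fsync, which matches the remark that the result is then no better than before.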

7.3 Transaction control statement

START TRANSACTION | BEGIN: explicitly starts a transaction.
COMMIT: in its simplest form, just issue COMMIT. The more verbose COMMIT WORK is almost equivalent. COMMIT commits the transaction and makes all changes made to the database permanent.
ROLLBACK: in its simplest form, just issue ROLLBACK. ROLLBACK WORK is almost equivalent. A rollback ends the user's transaction and undoes all uncommitted changes.
SAVEPOINT identifier: creates a savepoint in the transaction. A transaction can have more than one savepoint.
RELEASE SAVEPOINT identifier: deletes a savepoint of the transaction. If the named savepoint does not exist, the statement raises an error.
ROLLBACK TO [SAVEPOINT] identifier: used together with SAVEPOINT. It rolls the transaction back to the marked point without rolling back any work done before that point. For example, you can issue two UPDATE statements, then a SAVEPOINT, then two DELETE statements. If a DELETE raises an exception and you catch it and issue ROLLBACK TO SAVEPOINT, the transaction rolls back to the specified savepoint: all work done by the DELETE statements is undone, while the work done by the UPDATE statements is unaffected.
SET TRANSACTION: sets the isolation level of the transaction. The InnoDB storage engine provides READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ and SERIALIZABLE.
START TRANSACTION and BEGIN can both start a transaction explicitly from the MySQL command line. In a stored procedure, however, the MySQL parser treats BEGIN as the start of a BEGIN…END block, so only START TRANSACTION can be used to start a transaction there.
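The UPDATE / SAVEPOINT / DELETE scenario described above can be tried without a server. The sketch below uses Python's built-in `sqlite3` module as a stand-in, which is an assumption made purely for convenience: SQLite accepts the same SAVEPOINT and ROLLBACK TO SAVEPOINT statements, but it is not InnoDB and other behaviour differs. The table `t`, the savepoint name `sp1` and the function `run_demo` are made up for the illustration.

```python
import sqlite3


def run_demo():
    # isolation_level=None puts the connection in autocommit mode,
    # so the transaction boundaries below are exactly the ones we issue.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
    conn.execute("INSERT INTO t VALUES (1, 'a'), (2, 'b'), (3, 'c')")

    conn.execute("BEGIN")
    conn.execute("UPDATE t SET v = 'A' WHERE id = 1")
    conn.execute("UPDATE t SET v = 'B' WHERE id = 2")
    conn.execute("SAVEPOINT sp1")
    conn.execute("DELETE FROM t WHERE id = 3")
    # Pretend the DELETE hit an error: undo only the work done after sp1
    conn.execute("ROLLBACK TO SAVEPOINT sp1")
    conn.execute("COMMIT")

    rows = conn.execute("SELECT id, v FROM t ORDER BY id").fetchall()
    conn.close()
    return rows


print(run_demo())   # [(1, 'A'), (2, 'B'), (3, 'c')]
```

The two UPDATE statements survive the partial rollback, while the DELETE is undone and row 3 is still present.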

7.4 SQL statements that commit implicitly

DDL statements: ALTER DATABASE…UPGRADE DATA DIRECTORY NAME, ALTER EVENT, ALTER PROCEDURE, ALTER TABLE, ALTER VIEW, CREATE DATABASE, CREATE EVENT, CREATE INDEX, CREATE PROCEDURE, CREATE TABLE, CREATE TRIGGER, CREATE VIEW, DROP DATABASE, DROP EVENT, DROP INDEX, DROP PROCEDURE, DROP TABLE, DROP TRIGGER, DROP VIEW, RENAME TABLE, TRUNCATE TABLE.
Operations that implicitly modify the MySQL schema: CREATE USER, DROP USER, GRANT, RENAME USER, REVOKE, SET PASSWORD.
Administrative statements: ANALYZE TABLE, CACHE INDEX, CHECK TABLE, LOAD INDEX INTO CACHE, OPTIMIZE TABLE, REPAIR TABLE.
Note: I find that Microsoft SQL Server database administrators and developers often overlook the implicit commit performed by DDL statements, because in Microsoft SQL Server even DDL can be rolled back. This is completely different from the InnoDB storage engine and from Oracle.
Also note that TRUNCATE TABLE is DDL, so although its result is the same as a DELETE of the whole table, it cannot be rolled back (this too differs from Microsoft SQL Server).

7.5 Statistics of transaction operations

Number of requests per second (Question Per Second, QPS)

Transaction processing capacity per second (Transaction Per Second, TPS)

Note: TPS is calculated as (com_commit + com_rollback) / time. This assumes that every transaction is committed explicitly; implicit commits and rollbacks (the default is autocommit=1) are not counted in the com_commit and com_rollback variables.
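As a worked example of the formula, the sketch below computes TPS from two samples of the counters taken some seconds apart. The function name `tps` and the sample numbers are invented for the illustration; how the counter values are fetched from the server is left out.

```python
def tps(before, after, seconds):
    """(com_commit + com_rollback) / time, using the change between two samples."""
    commits = after["Com_commit"] - before["Com_commit"]
    rollbacks = after["Com_rollback"] - before["Com_rollback"]
    return (commits + rollbacks) / seconds


# Hypothetical counter samples taken 10 seconds apart
sample_1 = {"Com_commit": 1000, "Com_rollback": 50}
sample_2 = {"Com_commit": 1600, "Com_rollback": 60}
print(tps(sample_1, sample_2, 10))   # 61.0
```

As the note above says, the result undercounts if the workload relies on implicit commits.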

7.6 The isolation level of the transaction

The four isolation levels defined by the SQL standard are:

READ UNCOMMITTED
READ COMMITTED
REPEATABLE READ
SERIALIZABLE
As far as I know, most users worry about the performance of the SERIALIZABLE isolation level. According to Jim Gray in the book Transaction Processing, however, the cost of SERIALIZABLE and of a weaker level is almost the same, and SERIALIZABLE may even be better. Choosing the REPEATABLE READ isolation level in the InnoDB storage engine therefore costs no performance, and likewise switching to READ COMMITTED will not bring a big performance boost.
In the InnoDB storage engine, the following command sets the transaction isolation level for the current session or globally:

SET [GLOBAL | SESSION] TRANSACTION ISOLATION LEVEL
{
    READ UNCOMMITTED
    | READ COMMITTED
    | REPEATABLE READ
    | SERIALIZABLE
}

7.7 Distributed transactions

7.7.1 Distributed transactions in MySQL

The InnoDB storage engine supports XA transactions and uses them to implement distributed transactions.

A distributed transaction allows several independent transactional resources to take part in one global transaction. The transactional resources are usually relational database systems, but they can be other kinds of resources too. A global transaction requires that all participating transactions either commit or roll back together, which raises the bar on the original ACID requirements. In addition, when distributed transactions are used, the transaction isolation level of the InnoDB storage engine must be set to SERIALIZABLE.

An XA transaction consists of one or more resource managers (Resource Managers), a transaction manager (Transaction Manager) and an application program (Application Program).

Resource manager (RM): provides access to transactional resources. A database is usually a resource manager.
Transaction manager (TM): coordinates the transactions that take part in the global transaction. It has to communicate with every resource manager involved in the global transaction.
Application program (AP): defines the boundaries of the transaction and specifies the operations in the global transaction.
In MySQL distributed transactions, the resource manager is the MySQL database and the transaction manager is the client connected to the MySQL server. Figure 7-22 shows a distributed transaction model.

[Figure 7-22: distributed transaction model]

Distributed transactions use two-phase commit. In the first phase, every node taking part in the global transaction prepares (PREPARE) and tells the transaction manager that it is ready to commit. In the second phase, the transaction manager tells the resource managers whether to ROLLBACK or COMMIT. If any node reports that it cannot commit, all nodes are told to roll back. Compared with a local transaction, a distributed transaction therefore needs one extra PREPARE step: only after receiving agreement from all nodes does it go on to COMMIT or ROLLBACK.
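The decision rule of two-phase commit can be sketched in a few lines. This is a toy model under stated assumptions: `ResourceManager` and `two_phase_commit` are hypothetical names, a node's vote is a fixed flag, and failures between the two phases, timeouts and recovery are not modelled.

```python
class ResourceManager:
    """Toy participant: votes in phase 1, obeys the decision in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


def two_phase_commit(resource_managers):
    """Transaction manager: commit only if every node voted yes in PREPARE."""
    votes = [rm.prepare() for rm in resource_managers]    # phase 1
    if all(votes):                                         # phase 2
        for rm in resource_managers:
            rm.commit()
        return "COMMIT"
    for rm in resource_managers:
        rm.rollback()
    return "ROLLBACK"


print(two_phase_commit([ResourceManager("db1"), ResourceManager("db2")]))
# COMMIT
print(two_phase_commit([ResourceManager("db1"), ResourceManager("db2", can_commit=False)]))
# ROLLBACK
```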

XA statements

XA {START|BEGIN} xid [JOIN|RESUME]
XA END xid [SUSPEND [FOR MIGRATE]]
XA PREPARE xid
XA COMMIT xid [ONE PHASE]
XA ROLLBACK xid
XA RECOVER

7.7.2 Internal XA transactions

MySQL has another kind of distributed transaction, which runs between a storage engine and a plug-in or between one storage engine and another. It is called an internal XA transaction; the most common case is between the binlog and the InnoDB storage engine.

MySQL uses internal XA transactions to keep master and slave data consistent.

7.8 Bad transaction habits

7.8.1 Committing in a loop

Do not commit repeatedly inside a loop, whether the commit is explicit or implicit.

7.8.2 Using autocommit

Autocommit is not a good habit. When developing an application it is better to leave transaction control to the developer, so that the program decides where a transaction starts and ends. Developers must also be aware of the problems autocommit can cause.

7.8.3 Using automatic rollback

The advantage of controlling transactions in the program, rather than relying on automatic rollback, is that the user can find out what caused the error.

7.9 Long transactions

Long transactions (Long-Lived Transactions) are, as the name implies, transactions that take a long time to execute.

Optimization: split the large transaction into small transactions and process them in batches.

Original site

Copyright notice
This article was written by [A god given dream never wakes up]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/03/202203010528446761.html