
MySQL 5.7 introduced what is officially called the enhanced multi-threaded slave (MTS), which greatly improved the replication latency problem. It is fair to say that from MySQL 5.7 onward, replication lag is largely a solved problem.
MTS in 5.7 works as follows: the master groups concurrent transactions based on group commit, and the slave's SQL thread distributes the transactions within one group commit to worker threads, which apply them in parallel.

MySQL 5.6 Parallel replication architecture

MySQL 5.7 Parallel replication principle

MySQL 5.6 introduced database-based parallel replication, but almost nobody paid attention to it. After a quiet period, MySQL 5.7 arrived, and its parallel replication caught the attention of DBAs.

MySQL 5.7 can be called true parallel replication. The main reason is that replay on the slave is consistent with execution on the master: however transactions were executed in parallel on the master, that is how they are replayed in parallel on the slave. The per-database restriction is gone, and there is no special requirement on the binary log format (database-based parallel replication had no such requirement either).

According to MySQL's official plans, the original design was to support table-level and row-level parallel replication, with row-level parallel replication implemented by parsing ROW-format binary logs (WL#4648). What was finally delivered, however, is called MTS (Prepared transactions slave parallel applier) in the development plan; see WL#6314. The idea behind this kind of parallel replication was first proposed by Kristian of MariaDB and already appeared in MariaDB 10; for many people who chose MariaDB, parallel replication was one of the most important features. MTS provides transaction-level parallelism and, to some extent, row-level parallelism as well (transactions are processed down to rows).

So how does MySQL 5.7 implement parallel replication?

order commit (group commit) -> logical clock -> MTS

Master

Group commit

Group commit reduces the number of operations needed to write the binary log by grouping transactions. When transactions commit at the same time, they are written to the binary log in a single operation. If transactions can commit successfully at the same time, they do not share any locks, which means they do not conflict and can be executed in parallel on the slave. So by adding group commit information to the binary log on the master, slaves can safely run these transactions in parallel.

First, MySQL 5.7 parallel replication rests on one premise: all transactions that are already in the prepare stage can be committed in parallel. They can of course also be committed in parallel on the slave, because transactions at this stage do not conflict: they have already acquired all the resources they need. Conversely, if there is a conflict, the later transaction can only continue after the transaction holding the resource finishes, so it cannot have entered the prepare stage. This is a new approach to parallel replication. It does away with the complex and inefficient work of conflict-avoiding distribution algorithms and waiting strategies.

In one sentence, the idea of MySQL 5.7 parallel replication is: the transactions of one group commit can be replayed in parallel, because they have all entered the prepare stage and therefore do not conflict with each other (otherwise they could not have been committed together).

Given the above, the key questions are:

How do we determine which transactions are in the prepare stage?
How does the generated binlog tell the slave which transactions can be replicated in parallel?

For compatibility with the database-based parallel replication of MySQL 5.6, 5.7 introduces the new variable slave-parallel-type, which accepts these values:

DATABASE (the default; database-based parallel replication)
LOGICAL_CLOCK (parallel replication based on group commit)

GTID support for parallel replication

So how do we know whether transactions are in the same group? Earlier versions of MySQL did not provide this information.

In MySQL 5.7, the design stores the group commit information in the GTID event.

So what happens if gtid_mode is set to OFF and the user has not enabled GTID?

MySQL 5.7 also introduces a binary log event type called Anonymous_Gtid (ANONYMOUS_GTID_LOG_EVENT).

For example:

mysql> SHOW BINLOG EVENTS in 'mysql-bin.000006';
+------------------+-----+----------------+-----------+-------------+-----------------------------------------------+
| Log_name         | Pos | Event_type     | Server_id | End_log_pos | Info                                          |
+------------------+-----+----------------+-----------+-------------+-----------------------------------------------+
| mysql-bin.000006 | 4   | Format_desc    | 88        | 123         | Server ver: 5.7.7-rc-debug-log, Binlog ver: 4 |
| mysql-bin.000006 | 123 | Previous_gtids | 88        | 194         |                                               |
| mysql-bin.000006 | 194 | Anonymous_Gtid | 88        | 259         | SET @@SESSION.GTID_NEXT= 'ANONYMOUS'          |
| mysql-bin.000006 | 259 | Query          | 88        | 330         | BEGIN                                         |
| mysql-bin.000006 | 330 | Table_map      | 88        | 373         | table_id: 108 (aaa.t)                         |
| mysql-bin.000006 | 373 | Write_rows     | 88        | 413         | table_id: 108 flags: STMT_END_F               |
......

This means that in MySQL 5.7, even when GTID is not enabled, every transaction is preceded by an Anonymous_Gtid event, and this event carries the group commit information. Conversely, when GTID is enabled, there is no Anonymous_Gtid event, and the group commit information is recorded in the regular (non-anonymous) GTID event.

PREVIOUS_GTIDS_LOG_EVENT
Indicates the position of the last GTID in the previous binlog. There is exactly one per binlog file, and it is empty when GTID is not enabled.
GTID_LOG_EVENT
- When GTID is enabled, a GTID event is added before each statement (DML/DDL) is executed, recording the current global transaction ID.
- In MySQL 5.7 the group commit information is also stored in the GTID event. Two key fields, last_committed and sequence_number, identify the group commit information.
- On the master there is a global counter. Before each storage engine commit, the counter is incremented. Before a transaction enters the prepare stage, the current value of the global counter is stored in the transaction. This value is called the commit-parent (that is, last_committed).

Slave

LOGICAL_CLOCK (implemented by order commit) is what carries the group commit information to the slave.

However, the SHOW BINLOG EVENTS output above shows no group commit information. With the mysqlbinlog tool, the group commit information inside the events becomes visible:

$ mysqlbinlog mysql-bin.000006 | grep last_committed
#150520 14:23:11 server id 88 end_log_pos 259  CRC32 0x4ead9ad6 GTID last_committed=0 sequence_number=1
#150520 14:23:11 server id 88 end_log_pos 1483 CRC32 0xdf94bc85 GTID last_committed=0 sequence_number=2
#150520 14:23:11 server id 88 end_log_pos 2708 CRC32 0x0914697b GTID last_committed=0 sequence_number=3
#150520 14:23:11 server id 88 end_log_pos 3934 CRC32 0xd9cb4a43 GTID last_committed=0 sequence_number=4
#150520 14:23:11 server id 88 end_log_pos 5159 CRC32 0x06a6f531 GTID last_committed=0 sequence_number=5
#150520 14:23:11 server id 88 end_log_pos 6386 CRC32 0xd6cae930 GTID last_committed=0 sequence_number=6
#150520 14:23:11 server id 88 end_log_pos 7610 CRC32 0xa1ea531c GTID last_committed=6 sequence_number=7
#150520 14:23:11 server id 88 end_log_pos 8834 CRC32 0x96864e6b GTID last_committed=6 sequence_number=8
#150520 14:23:11 server id 88 end_log_pos 10057 CRC32 0x2de1ae55 GTID last_committed=6 sequence_number=9
#150520 14:23:11 server id 88 end_log_pos 11280 CRC32 0x5eb13091 GTID last_committed=6 sequence_number=10
#150520 14:23:11 server id 88 end_log_pos 12504 CRC32 0x16721011 GTID last_committed=6 sequence_number=11
#150520 14:23:11 server id 88 end_log_pos 13727 CRC32 0xe2210ab6 GTID last_committed=6 sequence_number=12
#150520 14:23:11 server id 88 end_log_pos 14952 CRC32 0xf41181d3 GTID last_committed=12 sequence_number=13
...

The last_committed and sequence_number values above represent the so-called LOGICAL_CLOCK.

Compared with earlier versions, the MySQL 5.7 binary log contains two additional fields: last_committed and sequence_number.

last_committed is the number of the most recently committed transaction at the time this transaction committed. When a transaction enters the prepare stage, it records the sequence_number of the last committed transaction as its own last_committed. Transactions with the same last_committed are in the same group and can be replayed in parallel.
- For example, there are 6 transactions above with last_committed 0. This means the group committed 6 transactions, and these 6 transactions can be replayed in parallel on the slave.
sequence_number increases sequentially. Each transaction has its own sequence number, which it receives when it commits.
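The grouping rule above can be checked mechanically. The following is a minimal sketch, assuming lines shaped like the mysqlbinlog output shown earlier; the function name `group_by_last_committed` is hypothetical and not part of any MySQL tool.

```python
import re

# Matches the logical clock fields as they appear in the mysqlbinlog output above.
LOGICAL_CLOCK = re.compile(r"last_committed=(\d+)\s+sequence_number=(\d+)")


def group_by_last_committed(lines):
    """Map each last_committed value to the sequence_numbers that share it."""
    groups = {}
    for line in lines:
        match = LOGICAL_CLOCK.search(line)
        if match is None:
            continue  # not a GTID / Anonymous_Gtid header line
        last_committed = int(match.group(1))
        sequence_number = int(match.group(2))
        groups.setdefault(last_committed, []).append(sequence_number)
    return groups


sample = [
    "#150520 14:23:11 server id 88 end_log_pos 259  CRC32 0x4ead9ad6 GTID last_committed=0 sequence_number=1",
    "#150520 14:23:11 server id 88 end_log_pos 1483 CRC32 0xdf94bc85 GTID last_committed=0 sequence_number=2",
    "#150520 14:23:11 server id 88 end_log_pos 7610 CRC32 0xa1ea531c GTID last_committed=6 sequence_number=7",
]
groups = group_by_last_committed(sample)
```

Each key of `groups` is one group in the sense used here, and the length of its list is the number of transactions the slave could replay in parallel.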

One more detail: the last_committed of the next transaction group equals the sequence_number of the last transaction in the previous group. This is easy to understand, because transactions are committed in order. The smallest sequence_number in a group is always greater than the group's last_committed. (This description is not rigorous: in later 5.7 versions the rules for parallel apply on the slave were optimized. It is left unchanged here for ease of understanding, and the lock-based parallel rules are easier to read once this idea is clear.)

Both values are scoped to a single binlog file. Whenever the binlog file is rotated (flush binary logs), counting restarts, with last_committed starting at 0 and sequence_number at 1, as in the output above.

How does MySQL group these transactions?

This raises an important technical question: how does MySQL group these transactions?

To answer it, we first need to understand how MySQL commits a transaction.

1. Two-phase commit of a transaction

Committing a transaction involves two main steps:

Prepare phase (Storage Engine (InnoDB) Transaction Prepare Phase)
At this point the SQL has executed successfully, and the xid information and the in-memory redo and undo logs have been generated. The prepare method is then called to complete the first phase. The prepare method actually does very little: it sets the transaction state to TRX_PREPARED and flushes the redo log to disk.
Commit phase (Storage Engine (InnoDB) Commit Phase)
1. Write the binlog.
  If prepare succeeded in all storage engines involved in the transaction, TC_LOG_BINLOG::log_xid is called to write the SQL statement to the binlog.
  (write() writes the in-memory binary log data to the file system cache, and fsync() persists the binary log data from the file system cache to disk.)
  At this point the transaction is considered committed. Otherwise, ha_rollback_trans is called to roll back the transaction, and the SQL statement is not written to the binlog.
2. Tell the engine to commit.
  Finally, the engine's commit is called to complete the transaction commit. This clears the undo information, flushes the redo log, and sets the transaction state to TRX_NOT_STARTED.
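The control flow of the two phases can be sketched as follows. This is an illustrative model only: `Engine` and `two_phase_commit` are hypothetical names, the binlog is a plain list, and no disk I/O is modelled; only the state names TRX_PREPARED and TRX_NOT_STARTED come from the text above.

```python
class Engine:
    """Toy storage engine that only tracks the transaction state."""

    def __init__(self, fail_prepare=False):
        self.state = "ACTIVE"
        self.fail_prepare = fail_prepare  # assumption: lets us simulate a failed prepare

    def prepare(self):
        if self.fail_prepare:
            return False
        self.state = "TRX_PREPARED"  # the redo log would be flushed here
        return True

    def commit(self):
        self.state = "TRX_NOT_STARTED"  # undo cleared, transaction finished

    def rollback(self):
        self.state = "ROLLED_BACK"  # hypothetical label for this sketch


def two_phase_commit(engines, binlog, statement):
    """Prepare every engine, then write the binlog, then commit the engines."""
    if all(engine.prepare() for engine in engines):
        binlog.append(statement)  # once in the binlog, the transaction counts as committed
        for engine in engines:
            engine.commit()
        return True
    for engine in engines:
        engine.rollback()  # nothing is written to the binlog
    return False


binlog = []
ok = two_phase_commit([Engine()], binlog, "INSERT INTO t VALUES (1)")
failed = two_phase_commit([Engine(fail_prepare=True)], binlog, "INSERT INTO t VALUES (2)")
```

The point of the ordering is that the binlog write sits between prepare and engine commit, so a statement reaches the binlog only if every engine prepared successfully.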

（ If you don't understand this paragraph, just look at the diagram above .）

2. Order commit: the foundation of LOGICAL_CLOCK parallel replication

Internally, MySQL commits through the ordered_commit function. Its logic is shown in the following diagram:

As the diagram shows, every transaction that commits (calls ordered_commit) first joins a queue.

Committing consists of three steps, FLUSH, SYNC, and COMMIT, and there are correspondingly three queues.

The first queue a transaction joins is the FLUSH queue:
1. If the queue is empty when a transaction joins, that transaction becomes the leader and commits on behalf of the other transactions.
2. Transactions that join later find the queue non-empty, so they wait in the queue for the leader to complete the commit for them. In the diagram above, transactions 2-6 are the followers that benefit from the leader's work, and transaction 1 is the leader.
3. Note that the leader does not wait indefinitely for more transactions to join. There is a time limit, which starts when the leader joins: the leader waits binlog_group_commit_sync_delay microseconds and then performs a group commit. If binlog_group_commit_sync_no_delay_count transactions arrive before the wait expires, the group commit is performed immediately.
- Once the leader has taken the transactions out of the queue, other transactions can join the queue again. The first one to join becomes a leader, but it has to wait: another group is doing FLUSH at that moment, and only after that FLUSH finishes can the next leader take its members through FLUSH.
- Only one group can do FLUSH at a time. These are waiting transaction group 2 and waiting transaction group 3 in the figure above; their leaders do FLUSH in turn.
- Several important things happen during FLUSH:
  1. The order is guaranteed to be the order in which the transactions joined the queue.
  2. If a new transaction commits while the queue is empty, it can join the FLUSH queue. However, because the FLUSH critical section is occupied, the new transaction group must wait.
  3. Each transaction is assigned a sequence_number. For the first transaction in the group, the group's last_committed is set to sequence_number - 1.
  4. The GTID event carrying last_committed and sequence_number is flushed to the binlog file.
  5. The binlog content generated by the current transaction is flushed to the binlog file.

This completes FLUSH for one transaction. The remaining transactions in the group are flushed in turn, and then SYNC follows.

After FLUSH completes, the FLUSH critical section becomes free, and the group waiting for it can do its FLUSH.
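Steps 3 to 5 of the FLUSH stage can be modelled in a few lines. This is a simplified sketch, not MySQL's implementation: the class `FlushStage` and its list-based binlog are assumptions, and queueing, leader election, and locking are left out. Group sizes 6, 6, 1 are chosen to mirror the mysqlbinlog output shown earlier.

```python
class FlushStage:
    """Assigns sequence_number and last_committed to one group at a time."""

    def __init__(self):
        self.next_sequence_number = 1
        self.binlog = []  # (transaction, last_committed, sequence_number) tuples

    def flush_group(self, group):
        # For the first transaction of the group, last_committed is sequence_number - 1;
        # every other member of the group shares that value.
        last_committed = self.next_sequence_number - 1
        for transaction in group:  # queue order is preserved
            self.binlog.append((transaction, last_committed, self.next_sequence_number))
            self.next_sequence_number += 1


stage = FlushStage()
stage.flush_group(["trx%d" % n for n in range(1, 7)])   # first group of 6
stage.flush_group(["trx%d" % n for n in range(7, 13)])  # second group of 6
stage.flush_group(["trx13"])                            # start of the third group
```

With these inputs the recorded pairs are (0, 1) through (0, 6), then (6, 7) through (6, 12), then (12, 13), the same pattern as in the mysqlbinlog listing.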
SYNC queue

If the SYNC critical section is free, SYNC is performed directly. If another transaction group is already doing SYNC, this group has to wait.
COMMIT queue

In the COMMIT step, what actually happens is the storage engine commit. The parameter binlog_order_commits affects the commit behavior.
- If it is ON, the commit becomes a serial operation, and the commit order is the queue order.
- If it is OFF, the commit does not happen here. Instead, every transaction (leader and followers) performs its own storage engine commit in finish_commit (FINISH).
  - The finish_commit of each transaction in the group happens after the leader has completed the COMMIT step and reached the DONE step. At that point the leader wakes every transaction waiting to commit and tells it to continue, and each transaction then runs finish_commit.
  - After that, the leader runs its own finish_commit. In this way the transactions of a group are committed step by step.

That is how order commit works, and it is the foundation of LOGICAL_CLOCK parallel replication.

Because order commit groups all transactions and gives each one a sequence number, the slave, once it has this information, can safely distribute transactions according to the sequence numbers.

Explore: the effect of binlog_group_commit_sync_delay and binlog_group_commit_sync_no_delay_count on group commit

In terms of time, the interval between the leader joining the queue and the leader taking out all queued transactions is very short, so not many transactions arrive within it.

Only under heavy load, when many transactions commit, does concurrency increase (the number of transactions per group grows).

This also makes sense: when the load on the primary is low, why would the slave need much concurrency? The slave only falls behind when the load on the primary is high.

If needed, you can also tune binlog_group_commit_sync_delay and binlog_group_commit_sync_no_delay_count on the primary.

binlog_group_commit_sync_delay specifies how long to delay a transaction's commit so that more transactions join the group commit, which reduces the number of fsync calls. The unit is microseconds (1/1,000,000 second), and the maximum is 1000000, that is, 1 second.
binlog_group_commit_sync_no_delay_count specifies the number of transactions to wait for. Once this count is reached, the wait ends and the transactions are committed without waiting for the rest of binlog_group_commit_sync_delay. The wait never lasts longer than the time set by binlog_group_commit_sync_delay.

Both parameters increase the share of transactions that commit as a group on the primary, and thereby increase MTS parallelism on the slave.
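The interplay of the two parameters can be illustrated with a small simulation. This is a simplified model under stated assumptions, not MySQL's actual scheduling: the function `group_commits` is hypothetical, the delay window is assumed to start when the first transaction of a group arrives, and arrival times are given in microseconds.

```python
def group_commits(arrivals, delay, no_delay_count):
    """Split (time_us, name) arrivals, sorted by time, into commit groups.

    A group closes when `delay` microseconds have passed since its first
    member arrived, or as soon as it holds `no_delay_count` transactions
    (0 disables the count-based cutoff).
    """
    groups = []
    current = []
    deadline = None
    for time_us, name in arrivals:
        if current and time_us > deadline:
            groups.append(current)  # the delay expired before this arrival
            current = []
        if not current:
            deadline = time_us + delay  # this transaction opens a new window
        current.append(name)
        if no_delay_count and len(current) >= no_delay_count:
            groups.append(current)  # count reached: commit without further delay
            current = []
    if current:
        groups.append(current)
    return groups


arrivals = [(0, "a"), (100, "b"), (900, "c"), (1500, "d"), (1600, "e"),
            (1700, "f"), (1800, "g"), (1900, "h"), (5000, "i")]
by_delay_and_count = group_commits(arrivals, delay=1000, no_delay_count=5)
by_small_count = group_commits(arrivals, delay=1000, no_delay_count=2)
```

In the first call the group a, b, c is closed by the delay, while d through h is closed early by the count. Lowering the count to 2 yields smaller groups, and therefore less parallelism for the slave to exploit.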

Schematic of transaction group commit and the logical clock (order commit):

Assume the following configuration:

binlog_group_commit_sync_delay = 1000

binlog_group_commit_sync_no_delay_count = 5

In the figure:

T0 -> T1 -> ... -> T6: each interval is one binlog_group_commit_sync_delay = 1000 time window, and the red dashed lines divide each window into 5 equal parts.

T0 is the moment at which ten sessions, session1 through session10, start their transactions simultaneously.

tn-m denotes the m-th commit of session n, at the position shown.

At T1, the binlog_group_commit_sync_delay = 1000 time limit is reached. This group commit contains (ignoring which transaction is the leader):
```
t1-1,last_committed=0, sequence_number=1

t2-1,last_committed=0, sequence_number=2

t3-1,last_committed=0, sequence_number=3

t5-1,last_committed=0, sequence_number=4
```
At T2, the binlog_group_commit_sync_delay = 1000 time limit is reached again. This group commit contains (ignoring which transaction is the leader):
```
t2-2,last_committed=4, sequence_number=5

t4-1,last_committed=4, sequence_number=6

t7-1,last_committed=4, sequence_number=7

t8-1,last_committed=4, sequence_number=8
```
At T3, the binlog_group_commit_sync_delay = 1000 time limit is reached again. This group commit contains (ignoring which transaction is the leader):
```
t3-2,last_committed=8, sequence_number=9

t8-2,last_committed=8, sequence_number=10

t9-1,last_committed=8, sequence_number=11
```
At T3a, the binlog_group_commit_sync_delay = 1000 time limit has not been reached, but 5 commits have occurred, which reaches the binlog_group_commit_sync_no_delay_count = 5 limit, so a group commit happens now. This group commit contains (ignoring which transaction is the leader):
```
t1-2,last_committed=11, sequence_number=12

t2-3,last_committed=11, sequence_number=13

t6-1,last_committed=11, sequence_number=14

t7-2,last_committed=11, sequence_number=15

t8-3,last_committed=11, sequence_number=16
```
At T4a, the binlog_group_commit_sync_delay = 1000 time limit has again not been reached, but 5 commits have occurred, which reaches the binlog_group_commit_sync_no_delay_count = 5 limit, so a group commit happens now. This group commit contains (ignoring which transaction is the leader):
```
t1-3,last_committed=16, sequence_number=17

t2-4,last_committed=16, sequence_number=18

t4-2,last_committed=16, sequence_number=19

t5-2,last_committed=16, sequence_number=20

t8-4,last_committed=16, sequence_number=21
```
An Easter egg: when transaction t10-1 commits, a group commit is performed immediately. Why?
- Because after the group commit at T4a, the delay of 1000 (5 time units) expires at exactly the moment t10-1 commits.
- Because after the group commit at T4a, the commit of t10-1 brings the count to exactly the limit of 5.

This group commit contains (ignoring which transaction is the leader):
```
t3-3,last_committed=21, sequence_number=22

t6-2,last_committed=21, sequence_number=23

t7-3,last_committed=21, sequence_number=24

t9-2,last_committed=21, sequence_number=25

t10-1,last_committed=21, sequence_number=26
```

How the slave distributes transactions to multiple threads

With the principle of order commit understood, it is easy to see how the slave distributes transactions:

The slave applies transaction by transaction. Every transaction has a GTID event, and therefore a last_committed and a sequence_number value.

1. Distribution based on last_committed works as follows:

The master records the last sequence_number of the previous group as the last_committed of the next group. The smallest sequence_number in a group is therefore always greater than the group's last_committed, and the next group's last_committed is always greater than the smallest sequence_number of the previous group (because it equals the largest one).

The SQL thread takes a new transaction and reads its last_committed and sequence_number.
It compares that last_committed with the smallest sequence_number among the transactions currently being executed (the low water mark, lwm). (Within a group, the smallest sequence_number is always greater than last_committed.)
If the last_committed is less than lwm, the transaction belongs to the same group as the ones being executed. No wait is needed, and the SQL thread assigns it to an idle worker thread.
The SQL thread keeps statistics to find an idle worker thread. If none is idle, the SQL thread waits until one becomes available. It then packages the current transaction and hands it to the selected worker, which applies it while the SQL thread moves on to the next transaction.
If the last_committed is greater than or equal to lwm, the transaction does not belong to the current group. It starts a new group and has to wait.
The SQL thread waits for lwm to advance. Once every transaction up to sequence_number = last_committed has been executed, the previous group is complete, and the SQL thread starts distributing the transactions of the new group to worker threads for parallel apply.
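The distribution steps above can be sketched as a small planner. This is a simplified model, assuming an unlimited number of workers and that a wait ends only when every transaction with sequence_number up to the new last_committed has finished; `plan_waves` is a hypothetical name, and a "wave" here means a run of transactions the SQL thread can hand out without blocking.

```python
def plan_waves(transactions):
    """transactions: (name, last_committed, sequence_number) in binlog order."""
    waves = []
    executing = {}  # sequence_number -> name of transactions still running
    for name, last_committed, sequence_number in transactions:
        if executing and min(executing) <= last_committed:
            # New group: wait until everything up to last_committed has finished.
            executing = {seq: trx for seq, trx in executing.items()
                         if seq > last_committed}
            waves.append([])
        if not waves:
            waves.append([])
        waves[-1].append(name)
        executing[sequence_number] = name
    return waves


binlog = [
    ("t3-3", 21, 22), ("t6-2", 21, 23), ("t7-3", 21, 24),
    ("t9-2", 21, 25), ("t10-1", 21, 26), ("new", 26, 27),
]
waves = plan_waves(binlog)
```

For the sample data, the five transactions with last_committed=21 land in one wave and the transaction with last_committed=26 waits for them, which matches the rule that a new group starts only after the previous one has finished.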

Principle schematic reference ：

Transaction schematic reference ：

t3-3,last_committed=21, sequence_number=22

t6-2,last_committed=21, sequence_number=23

t7-3,last_committed=21, sequence_number=24

t9-2,last_committed=21, sequence_number=25

t10-1,last_committed=21, sequence_number=26

new,last_committed=26, sequence_number=27

Suppose the SQL thread has just handed transaction t3-3 to a worker thread:
- The SQL thread takes transaction t6-2 and reads its last_committed and sequence_number (21, 23).
  - Its last_committed (21) is less than the smallest sequence_number currently executing (22), so it is in the same group as the executing transactions. No wait is needed.
- The SQL thread takes transaction t7-3 and reads its last_committed and sequence_number (21, 24).
  - Its last_committed (21) is less than the smallest sequence_number currently executing (22), so it is in the same group as the executing transactions. No wait is needed.
  ……
- The SQL thread takes transaction new and reads its last_committed and sequence_number (26, 27).
  - Its last_committed (26) is greater than or equal to the smallest sequence_number currently executing (22), so it starts a new group and has to wait.
  - Once the SQL thread sees that the executed sequence_number has reached the transaction's last_committed, the new group can start to apply.
When transaction t10-1 finishes, the executed sequence_number (26) equals the last_committed (26) of the waiting transaction. The previous group is complete, and the SQL thread starts distributing the transactions of the last_committed=26 group to worker threads for parallel apply.

Commit-Parent-Based Scheme overview (WL#7165)

On the master there is a global counter. The counter is incremented before each storage engine commit.
On the master, before a transaction enters the prepare stage, the current value of the global counter is stored in the transaction. This value is called the commit-parent (last_committed).
On the master, the commit-parent is stored in the binlog at the start of the transaction.
On the slave, two transactions with the same commit-parent can be executed in parallel.

This commit-parent is the last_committed we see in the binlog. Transactions with the same commit-parent, that is, the same last_committed, are considered one group and can be replayed in parallel.
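The four rules above translate into a very small amount of state. The sketch below is illustrative: `assign_logical_clock` is a hypothetical name, and the event order in `timeline` is an assumed interleaving of prepare (P) and commit (C) points for seven transactions, chosen only to exercise the rules.

```python
def assign_logical_clock(events):
    """events: ("P" | "C", transaction) pairs in the order they happen.

    Returns {transaction: (last_committed, sequence_number)}.
    """
    counter = 0          # the global counter on the master
    commit_parent = {}   # value read when the transaction enters prepare
    result = {}
    for kind, transaction in events:
        if kind == "P":
            commit_parent[transaction] = counter
        else:
            counter += 1  # incremented before the storage engine commit
            result[transaction] = (commit_parent[transaction], counter)
    return result


# Assumed timeline, for illustration only.
timeline = [
    ("P", "Trx1"), ("P", "Trx2"), ("P", "Trx3"), ("C", "Trx1"),
    ("P", "Trx4"), ("C", "Trx2"), ("P", "Trx5"), ("P", "Trx6"),
    ("C", "Trx3"), ("C", "Trx4"), ("C", "Trx5"), ("P", "Trx7"),
    ("C", "Trx6"), ("C", "Trx7"),
]
clock = assign_logical_clock(timeline)
```

In this timeline Trx5 and Trx6 both read the counter after Trx2 committed and before Trx3 did, so they share commit-parent 2 and would be replayed in parallel on the slave.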

The problem with distribution based on last_committed (Commit-Parent-Based Scheme)

In one sentence: the Commit-Parent-Based Scheme reduces the parallelism of replication.

Explanation of the following figure:

The horizontal dashed lines show transactions progressing over time.
P is the point at which a transaction reads its commit-parent value (last_committed), before it enters the prepare stage. It can be thought of simply as the point where the locks are taken.
C is the point at which a transaction increments the global counter (sequence). It can be thought of simply as the point where the locks are released.
The commit-parent (last_committed) read at P is the largest sequence_number among the C points of all transactions completed by then.
- For example:
  - For Trx4, the commit-parent (last_committed) read at P is the largest sequence_number among the C points of all completed transactions, which is 1, the sequence_number at Trx1's C. At that moment Trx1 has completed but Trx2 has not.
  - For Trx5, the commit-parent (last_committed) read at P is 2, the sequence_number at Trx2's C.
  - For Trx6, the commit-parent (last_committed) read at P is also 2, the sequence_number at Trx2's C. Trx5 and Trx6 therefore have the same commit-parent (last_committed) and can be replayed in parallel.
The figure shows that:
- Trx5 and Trx6 can execute concurrently, because their commit-parent is the same; both were set by Trx2.
- Trx4 and Trx5 cannot execute concurrently.
- Trx6 and Trx7 cannot execute concurrently.
Note, however, that Trx4 and Trx5, and likewise Trx6 and Trx7, each hold their own locks at the same time, and the transactions do not conflict. Executing them concurrently on the slave would cause no problem.
From this example we learn:
- Under the last_committed rule, Trx4, Trx5, and Trx6 hold their locks at the same time, but Trx4 cannot execute concurrently with the other two, because the last_committed it picked up differs from theirs.
- Trx6 and Trx7 hold their locks at the same time, but Trx7 cannot execute concurrently, for the same reason.
In fact, Trx4 could run in parallel with Trx5 and Trx6, and Trx6 could run in parallel with Trx7. If that were possible, parallel replication would perform better.
So MySQL officially improved the parallel replication mechanism and proposed a new approach: the Lock-Based Scheme.
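The lost parallelism can be made concrete by listing transaction pairs whose P-to-C intervals overlap but whose commit-parents differ. This is a sketch under assumptions: the figure is not reproduced here, so `TIMELINE` is an assumed ordering of P and C points consistent with the description above, and `missed_pairs` is a hypothetical helper.

```python
from itertools import combinations

# Assumed order of prepare (P) and commit (C) points for Trx1..Trx7.
TIMELINE = "P1 P2 P3 C1 P4 C2 P5 P6 C3 C4 C5 P7 C6 C7".split()


def missed_pairs(timeline):
    """Pairs that hold locks at the same time but have different commit-parents."""
    start, end, parent = {}, {}, {}
    counter = 0
    for position, event in enumerate(timeline):
        kind, transaction = event[0], "Trx" + event[1:]
        if kind == "P":
            start[transaction] = position
            parent[transaction] = counter  # commit-parent read at P
        else:
            counter += 1
            end[transaction] = position
    missed = []
    for a, b in combinations(sorted(start), 2):
        overlapping = start[a] < end[b] and start[b] < end[a]
        if overlapping and parent[a] != parent[b]:
            missed.append((a, b))
    return missed


missed = missed_pairs(TIMELINE)
```

Under this assumed timeline the pairs named in the text, (Trx4, Trx5), (Trx4, Trx6), and (Trx6, Trx7), all appear in `missed`, while (Trx5, Trx6) does not because those two share a commit-parent.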

Note: the steps above are described per transaction, but actual processing distributes event by event. If a transaction has already been assigned a worker and a new event still belongs to that transaction, the event is simply handed to that same worker.

By the distribution principle above, transactions with the same last_committed value execute at the same time, and a later group has to wait until the earlier group has finished. The implementation looks roughly like the figure below:

As the figure shows, transactions are assigned to worker threads at random, but execution has to proceed row by row. The more transactions in a row, the higher the parallelism, which also indicates a higher instantaneous load on the primary.

2. Lock-interval-based parallel rules, starting in MySQL 5.7 (WL#7165)

Implementation: if two transactions hold their own locks at the same time, they can execute concurrently.

The earlier principle needs to be amended as follows:

For last_committed, the master no longer records the last sequence_number of the previous group as the last_committed of the next group. Instead, it records the MySQL global variable global.max_committed_transaction (the largest sequence_number among all transactions whose lock interval has ended). The smallest sequence_number in a group is still greater than the group's last_committed, but last_committed no longer has to equal the largest sequence_number of the previous group.

By the lock-based rule, last_committed is in fact the sequence number of the last transaction in the previous group whose C comes before the first Prepare of this group. In other words, if the last few transactions of the previous group have lock intervals that overlap with the first few transactions of the current group, the current group's last_committed is the sequence number of the transaction just before those overlapping ones.

Lock-Based Scheme brief introduction （WL#7165）

First , A definition is called lock interval The concept of , meaning ： The interval during which a transaction holds a lock .

When the storage engine commits , The first lock is released ,lock interval end .
When the last lock is acquired ,lock interval Start .

Assume ： The last lock is acquired in binlog_prepare Stage .

Suppose there are two transactions ：Trx1、Trx2.Trx1 Precede Trx2. that ,

If and only if Trx1、Trx2 Of lock interval There is overlap , Can be executed in parallel .

Case 1: Tx0 and Tx1 each hold their own locks during the same period (their lock intervals overlap).
In other words, the two transactions hold their locks at the same time without conflict, so they can be applied in parallel. Overlapping lock intervals mean parallel apply is possible.
Conversely, if there is a gap between the end of Trx1's lock interval and the start of Trx2's lock interval, the two cannot be executed in parallel.

Case 2: for Tx0 and Tx1, the periods from prepare to commit do not overlap (no lock interval overlap), so it is impossible to tell whether the two transactions could have held their locks at the same time.
Therefore these two transactions cannot be applied in parallel.
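The overlap rule can be sketched in a few lines of Python. This is only an illustration: the `LockInterval` type and the integer timestamps are assumptions made for the example, not MySQL structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LockInterval:
    """Hypothetical model: start = last lock acquired, end = first lock released."""
    start: int
    end: int


def can_apply_in_parallel(a: LockInterval, b: LockInterval) -> bool:
    """True when both transactions held all of their locks at some common moment."""
    return a.start < b.end and b.start < a.end


trx1 = LockInterval(start=1, end=5)
trx2 = LockInterval(start=3, end=8)  # starts before trx1 ends: overlap
trx3 = LockInterval(start=6, end=9)  # starts after trx1 ends: gap

print(can_apply_in_parallel(trx1, trx2))  # True
print(can_apply_in_parallel(trx1, trx3))  # False
```

The check is symmetric, so the order in which the two transactions are passed does not matter.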
MySQL maintains a global variable, global.max_committed_transaction: the largest sequence_number among all transactions that have ended their lock interval.
L denotes the start of the lock interval.
- At L (the start of the lock interval), MySQL assigns this global value (also written global.max_committed_timestamp) to a per-transaction variable named transaction.last_committed.
C denotes the end of the lock interval.
- At C (the end of the lock interval), MySQL assigns each transaction a logical timestamp named transaction.sequence_number.

Both timestamps, transaction.sequence_number and transaction.last_committed, are stored in the binlog.
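As a rough sketch of this bookkeeping, the toy model below captures last_committed at L and assigns sequence_number at C. The `LogicalClock` class, its method names and the event order are all hypothetical; the real server does this inside its commit pipeline.

```python
class LogicalClock:
    """Toy model of the master-side bookkeeping described above (not MySQL code)."""

    def __init__(self):
        self.next_sequence_number = 1
        self.max_committed_transaction = 0
        self.last_committed = {}  # transaction name -> value captured at L
        self.binlog = []          # (name, last_committed, sequence_number)

    def prepare(self, name):
        # L: the lock interval starts; remember the largest finished sequence_number.
        self.last_committed[name] = self.max_committed_transaction

    def commit(self, name):
        # C: the lock interval ends; assign the next logical timestamp.
        sequence_number = self.next_sequence_number
        self.next_sequence_number += 1
        self.max_committed_transaction = max(
            self.max_committed_transaction, sequence_number
        )
        self.binlog.append((name, self.last_committed.pop(name), sequence_number))


clock = LogicalClock()
clock.prepare("A")
clock.prepare("B")
clock.commit("A")
clock.prepare("C")  # A has finished, B is still holding its locks
clock.commit("B")
clock.commit("C")

for entry in clock.binlog:
    print(entry)
# ('A', 0, 1)
# ('B', 0, 2)
# ('C', 1, 3)
```

Transaction C gets last_committed=1 rather than 2 because B had not yet ended its lock interval when C entered prepare, which is exactly the case where C and B may be applied in parallel.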

From the analysis above, we can state the condition for executing a transaction on the slave:

A transaction can be executed concurrently if the smallest sequence_number among all currently executing transactions is greater than that transaction's last_committed. (This sentence is convoluted; if it does not click, the plain-language explanation below should help.)

A plain-language look at the Lock-Based Scheme

Let's set writeset aside for now to avoid confusion; understanding this scheme first makes the writeset principle easier to follow.

With the commit-parent approach, a transaction's last_committed is always equal to the sequence_number of the last transaction in the previous group.
With the lock-interval approach this no longer holds: a transaction's last_committed is not necessarily the sequence_number of the last transaction in the previous group, but the largest sequence_number among all transactions that had ended their lock interval.
An example:
Lock-Based Scheme example

…

t1,last_committed=0, sequence_number=3

t2,last_committed=3, sequence_number=4

t3,last_committed=3, sequence_number=5

t4,last_committed=3, sequence_number=6

t5,last_committed=3, sequence_number=7

t6,last_committed=6, sequence_number=8

t7,last_committed=6, sequence_number=9

t8,last_committed=9, sequence_number=10

Transaction t1 (last_committed=0, sequence_number=3): the first worker thread picks up this transaction and starts working.
Transaction t2 (last_committed=3, sequence_number=4): t2 cannot start until t1 completes, because its last_committed=3 is not less than sequence_number=3 of the transaction currently executing. These two transactions can only run serially.
Although the first 2 transactions may be assigned to different worker threads, they actually run serially, just like single-threaded replication.

Once the transaction with sequence_number=3 completes, all transactions with last_committed=3 can be executed concurrently: t2 together with the following three.

t3,last_committed=3, sequence_number=5

t4,last_committed=3, sequence_number=6

t5,last_committed=3, sequence_number=7

Once the transactions with sequence_number 4, 5 and 6 (t2, t3 and t4) have completed, the following two can be executed:
```
t6,last_committed=6, sequence_number=8
t7,last_committed=6, sequence_number=9
```

Because last_committed=6 is less than sequence_number=7 of the transaction still executing, they can run in parallel.

In other words, while t5 (last_committed=3, sequence_number=7) is still executing, the transactions with sequence_number=8 and sequence_number=9 can run concurrently with it.
These three transactions may finish in any order.
Because the lock intervals of these three transactions overlap, they can execute concurrently without affecting one another.
Only when all preceding transactions have completed can the following transaction run:
```
t8,last_committed=9, sequence_number=10
```
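The walkthrough can be replayed with a small simulation of the slave-side rule (start a transaction only when every executing transaction has a sequence_number greater than its last_committed). This is a sketch, not the coordinator's real code: `can_start` and `simulate` are made-up names, and the assumption that the oldest executing transaction always finishes first is only there to make the run deterministic.

```python
from collections import namedtuple

Trx = namedtuple("Trx", "name last_committed sequence_number")

TRANSACTIONS = [
    Trx("t1", 0, 3), Trx("t2", 3, 4), Trx("t3", 3, 5), Trx("t4", 3, 6),
    Trx("t5", 3, 7), Trx("t6", 6, 8), Trx("t7", 6, 9), Trx("t8", 9, 10),
]


def can_start(trx, executing):
    """Rule from the text: every executing sequence_number must exceed last_committed."""
    return all(e.sequence_number > trx.last_committed for e in executing)


def simulate(transactions):
    """Return the set of executing transactions each time new ones are dispatched."""
    pending = list(transactions)
    executing = []
    snapshots = []
    while pending or executing:
        started = False
        # Dispatch in binlog order; stop at the first transaction that must wait.
        while pending and can_start(pending[0], executing):
            executing.append(pending.pop(0))
            started = True
        if started:
            snapshots.append([t.name for t in executing])
        # Assumption for the sketch: the oldest executing transaction finishes next.
        oldest = min(executing, key=lambda t: t.sequence_number)
        executing.remove(oldest)
    return snapshots


for wave in simulate(TRANSACTIONS):
    print(" + ".join(wave))
# t1
# t2 + t3 + t4 + t5
# t5 + t6 + t7
# t8
```

The third line of output shows t5 still running while t6 and t7 are dispatched, which is the cross-group parallelism that the commit-parent rule would not allow.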
Even more confused after reading that? No problem, don't fight it; look at the following instead:
First, treat transaction Tx1 in the diagram as a reference and otherwise ignore it; its only purpose is to provide a last_committed for Tx2.
Tx2~Tx5 form the first group and Tx6~Tx7 the second group, distinguished by background color.
You can see:
1. Transactions Tx2~Tx5 all have overlapping lock intervals, so these 4 transactions can be applied in parallel and belong to one group.
2. Transaction Tx6 has no lock interval overlap with Tx4, so Tx6 cannot run in parallel with Tx4. It cannot join the previous group and has to start a new one.
3. The last transaction of the first group, Tx5, and the second group's transactions Tx6 and Tx7 have overlapping lock intervals. Although they span groups, these 3 transactions satisfy the parallel rule and can run in parallel.
4. The last_committed of the second group is 6, not sequence_number=7 of the last transaction in the first group. (Why? See below.)
In fact, the second group's last_committed is derived from this rule:

Several key points in time:
1. The moment the first transaction of the second group enters prepare is called point A (last_committed).
2. When A occurs, the largest sequence_number among all first-group transactions that have ended their lock interval is called point B.
3. The commit time of Tx5, the last transaction of the first group, is called point C (sequence_number).
When the prepare at A happens, there is a gap between B and A (that is, Tx4 and Tx6 have no lock overlap), so Tx4 and Tx6 cannot run in parallel, and Tx6, which enters prepare at A, starts a new group.
At point A, Tx6 takes the largest sequence_number among all transactions that have ended their lock interval as its own last_committed, so Tx6 has last_committed=6.

In summary: a group's last_committed is the sequence_number of the last transaction in the previous group whose lock interval does not overlap with this group's transactions.

Conclusion:
- Transactions whose lock intervals overlap can be applied in parallel, but a gap between any two transactions (no lock interval overlap) starts a new group.
- Grouping only avoids lock conflicts; being in different groups does not by itself rule out parallelism. (Whether or not a real lock conflict exists, transactions whose lock intervals do not overlap are pessimistically assumed to conflict and are not run in parallel.)
- Whether two transactions can run in parallel depends on one thing only: whether their lock intervals overlap. So even transactions in different groups may be applied in parallel if their lock intervals overlap.

MySQL 5.7 Parallel replication test

The following figure shows the slave server's QPS after MTS is enabled. The test tool is sysbench with a single-table full update workload. The results show that performance is best with 16 threads, where slave QPS exceeds 25,000; raising the number of parallel threads further to 32 brings no additional gain. Single-threaded replay reaches only about 4,000 QPS, which shows how much MySQL 5.7 MTS improves performance. Because the test uses a single table, the MySQL 5.6 MTS mechanism is of no help here.

Parallel replication configuration and tuning

master_info_repository
After enabling MTS, be sure to set master_info_repository to TABLE, which can improve performance by 50%~80%. This is because with parallel replication enabled, updates to the master.info file become much more frequent, and contention for that resource increases accordingly.
slave_parallel_workers
If slave_parallel_workers is set to 0, MySQL 5.7 falls back to the original single-threaded replication. If it is set to 1, the SQL thread becomes the coordinator thread but only 1 worker thread replays, which is still single-threaded replication. The two differ in performance, however: because of the extra hop through the coordinator thread, slave_parallel_workers=1 performs worse than 0, with a drop of about 20% in testing, as shown in the figure below:

This raises another issue: if the load on the master is light, group commit is not very effective and each group may well contain only 1 transaction. When the slave replays, parallel replication is enabled but performance is worse than the original single thread, so the delay increases. Clever readers, have you thought about how to optimize this?
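One way to see whether the master is producing groups of only 1 transaction is to count how many transactions share each last_committed value. The sketch below assumes the input is text lines containing `last_committed=N sequence_number=M` pairs, in the form used by the example earlier in this article; the sample lines are made up.

```python
import re
from collections import Counter

PAIR = re.compile(r"last_committed=(\d+)\s+sequence_number=(\d+)")


def group_sizes(lines):
    """Count how many transactions share each last_committed value."""
    sizes = Counter()
    for line in lines:
        match = PAIR.search(line)
        if match:
            sizes[int(match.group(1))] += 1
    return dict(sizes)


sample = [
    "last_committed=0 sequence_number=1",
    "last_committed=1 sequence_number=2",
    "last_committed=2 sequence_number=3",
    "last_committed=2 sequence_number=4",
]

sizes = group_sizes(sample)
print(sizes)  # {0: 1, 1: 1, 2: 2}
print(sum(sizes.values()) / len(sizes))  # average transactions per group
```

An average close to 1 suggests that the slave has little to run in parallel, whatever the number of workers.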

slave_preserve_commit_order
MySQL 5.7 MTS can replicate in parallel at a finer granularity, but this requires setting slave_parallel_type to LOGICAL_CLOCK. Setting only LOGICAL_CLOCK causes a problem, though: the slave may then commit transactions in a different order than the one recorded in the relay log, so data consistency cannot be guaranteed. To make transactions commit in relay log order, enable the slave_preserve_commit_order parameter.
With this parameter on, an executing worker thread waits until all earlier transactions have committed. While a slave thread is waiting for other workers to commit their transactions, its status is "Waiting for preceding transaction to commit".
So although MySQL 5.7 MTS lets the slave apply the relay log in parallel, the commit step is still performed in order, and workers may have to wait.
When slave_preserve_commit_order is enabled, slave_parallel_type can only be LOGICAL_CLOCK, and in MySQL 5.7 binary logging (log_bin) and log_slave_updates must also be enabled on the slave. With cascading replication, LOGICAL_CLOCK may mean that the farther a slave is from the master, the lower its parallelism.
Testing showed, however, that in MySQL 5.7.18 setting this parameter does not guarantee that the commit order on the slave matches the relay log.
From MySQL 5.7.19 onward, with the parameter set, the commit order on the slave matches the relay log (so for production use of MTS, version 5.7.19 or later is the safe choice).

With all that said, enabling the enhanced multi-threaded slave is simple; just set the following:

# slave
slave_parallel_type = LOGICAL_CLOCK
slave_parallel_workers = 16
slave_pending_jobs_size_max = 2147483648
slave_preserve_commit_order = 1
master_info_repository = TABLE
relay_log_info_repository = TABLE
relay_log_recovery = ON
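The constraints mentioned in this section can be turned into a small sanity check. The function below is a hypothetical helper, not part of any MySQL tool: it takes option names and values as a plain dict (for instance, read from a configuration file by some other means) and returns warnings based only on the notes above. The fallback values used for missing options are the MySQL 5.7 defaults.

```python
def check_mts_settings(settings):
    """Return a list of warnings for a dict of option name -> value."""
    warnings = []

    workers = int(settings.get("slave_parallel_workers", 0))
    if workers == 0:
        warnings.append("slave_parallel_workers=0: single-threaded replication")
    elif workers == 1:
        warnings.append("slave_parallel_workers=1: coordinator overhead, no parallelism")

    parallel_type = str(settings.get("slave_parallel_type", "DATABASE")).upper()
    preserve = str(settings.get("slave_preserve_commit_order", "0")).upper() in ("1", "ON")
    if preserve and parallel_type != "LOGICAL_CLOCK":
        warnings.append("slave_preserve_commit_order needs slave_parallel_type=LOGICAL_CLOCK")

    if str(settings.get("master_info_repository", "FILE")).upper() != "TABLE":
        warnings.append("master_info_repository should be TABLE")

    return warnings


recommended = {
    "slave_parallel_type": "LOGICAL_CLOCK",
    "slave_parallel_workers": 16,
    "slave_preserve_commit_order": 1,
    "master_info_repository": "TABLE",
}
print(check_mts_settings(recommended))  # []
print(check_mts_settings({"slave_parallel_workers": 1, "slave_preserve_commit_order": "ON"}))
```

The second call reports three warnings: one worker only, commit-order preservation without LOGICAL_CLOCK, and the file-based master info repository.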

With MTS in use, replication can still be monitored with SHOW SLAVE STATUS\G, but MySQL 5.7 also adds the following tables to the performance_schema schema for finer-grained monitoring:

mysql> show tables like 'replication%';
+---------------------------------------------+
| Tables_in_performance_schema (replication%) |
+---------------------------------------------+
| replication_applier_configuration           |
| replication_applier_status                  |
| replication_applier_status_by_coordinator   |
| replication_applier_status_by_worker        |
| replication_connection_configuration        |
| replication_connection_status               |
| replication_group_member_stats              |
| replication_group_members                   |
+---------------------------------------------+
8 rows in set (0.00 sec)

The replication_applier_status_by_worker table shows how each worker thread is doing:

mysql> select * from replication_applier_status_by_worker;
+--------------+-----------+-----------+---------------+--------------------------------------------+-------------------+--------------------+----------------------+
| CHANNEL_NAME | WORKER_ID | THREAD_ID | SERVICE_STATE | LAST_SEEN_TRANSACTION                      | LAST_ERROR_NUMBER | LAST_ERROR_MESSAGE | LAST_ERROR_TIMESTAMP |
+--------------+-----------+-----------+---------------+--------------------------------------------+-------------------+--------------------+----------------------+
|              |         1 |        32 | ON            | 0d8513d8-00a4-11e6-a510-f4ce46861268:96604 |                 0 |                    | 0000-00-00 00:00:00  |
|              |         2 |        33 | ON            | 0d8513d8-00a4-11e6-a510-f4ce46861268:97760 |                 0 |                    | 0000-00-00 00:00:00  |
+--------------+-----------+-----------+---------------+--------------------------------------------+-------------------+--------------------+----------------------+
2 rows in set (0.00 sec)

So how do you tell whether the slave's MTS parallelism is adequate? A simple method (from Mr. Jiang) is to look in the performance_schema schema. For example, the following SQL counts the number of transactions executed by each worker thread; aggregating on top of that gives the degree of MTS parallelism:

SELECT thread_id, count_star
FROM performance_schema.events_transactions_summary_by_thread_by_event_name
WHERE thread_id IN (SELECT thread_id FROM performance_schema.replication_applier_status_by_worker);

If the parallelism is low, or the work is spread unevenly across threads, the parallel effect is poor and you can try to optimize. In that case, adjust the parameters binlog_group_commit_sync_delay and binlog_group_commit_sync_no_delay_count on the master. The former is how long (in microseconds) a commit waits before the binary log is synced; the latter is the number of transactions to wait for before that delay is cut short and the group is committed together. Overall, the goal is to raise the share of transactions committed in groups on the master, thereby increasing MTS parallelism on the slave.
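The "further aggregation" on the per-worker counts could look like the sketch below. It assumes the rows of the query shown earlier have already been fetched as `(thread_id, count_star)` pairs; the function name and the numbers are made up for illustration.

```python
def worker_shares(rows):
    """Map each worker thread_id to its share of all applied transactions."""
    total = sum(count for _, count in rows)
    if total == 0:
        return {}
    return {thread_id: count / total for thread_id, count in rows}


rows = [(32, 600), (33, 300), (34, 100), (35, 0)]  # made-up counts

shares = worker_shares(rows)
for thread_id, share in shares.items():
    print(f"thread {thread_id}: {share:.0%} of transactions")

busy = sum(1 for share in shares.values() if share > 0)
print(f"{busy} of {len(rows)} workers applied at least one transaction")
```

A distribution where one or two threads carry almost all of the work is the uneven case described above, and is a hint that group commit on the master is not producing large enough groups.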

The Enhanced Multi-Threaded Slave introduced in MySQL 5.7 goes some way toward solving MySQL's long-standing replication delay problem. However, the current MTS mechanism is based on group commit: put simply, the slave replays transactions the way the master executed them in parallel. That leaves one weakness: if parallelism on the master is low, the slave's parallel mechanism loses much of its effect. The writeset-based MTS in MySQL 8.0 is the final solution. With it, two transactions can be executed in parallel on the slave as long as the records they update do not overlap, without needing to be in the same group. Even if the master executes in a single thread, the slave can still replay in parallel. I believe this is the most complete solution and the final form of MTS.

Finally, to use MTS on MySQL 5.7, use the latest version available, at least 5.7.19, which fixes many bugs.

reference information

http://www.ywnds.com/?p=3894

The book "Operation and Maintenance Internal Reference"

Mr. Jiang's official account

http://mysql.taobao.org/monthly/2017/12/03/

https://mp.weixin.qq.com/s/XbWMdVTl9qz1nSwL3l56XQ
