MySQL Interview Notes
2022-06-12 16:25:00 【Future shadow】
How MySQL Works Under the Hood
Query structure
# Style 1:
SELECT ..., ..., ...
FROM ..., ..., ...
WHERE join conditions between tables
AND filter conditions without group functions
GROUP BY ..., ...
HAVING filter conditions with group functions
ORDER BY ... ASC/DESC
LIMIT ..., ...
# Style 2:
SELECT ..., ..., ...
FROM ... JOIN ...
ON join conditions between tables
JOIN ...
ON ...
WHERE filter conditions without group functions
AND/OR filter conditions without group functions
GROUP BY ..., ...
HAVING filter conditions with group functions
ORDER BY ... ASC/DESC
LIMIT ..., ...
# Where:
# (1) FROM: which tables to filter from
# (2) ON: filters the Cartesian product when joining multiple tables
# (3) WHERE: conditions for filtering rows from the tables
# (4) GROUP BY: grouping
# (5) HAVING: filters the aggregated results again
# (6) ORDER BY: sorting
# (7) LIMIT: pagination
SELECT execution order
FROM -> WHERE -> GROUP BY -> HAVING -> SELECT fields -> DISTINCT -> ORDER BY -> LIMIT
SELECT DISTINCT player_id, player_name, COUNT(*) AS num  # step 5
FROM player JOIN team ON player.team_id = team.team_id   # step 1
WHERE height > 1.80                                      # step 2
GROUP BY player.team_id                                  # step 3
HAVING num > 2                                           # step 4
ORDER BY num DESC                                        # step 6
LIMIT 2                                                  # step 7
As the SELECT statement executes these steps, each step produces a virtual table that is passed as input to the next step.
How SQL statements execute
SELECT starts with the FROM step. If the query joins multiple tables, this stage proceeds as follows:
- 1. First compute the Cartesian product via CROSS JOIN, producing virtual table vt1-1
- 2. Filter with ON on top of vt1-1, producing virtual table vt1-2
- 3. Add outer rows. A LEFT, RIGHT, or FULL join involves outer rows: they are added on top of vt1-2, producing virtual table vt1-3
With the base data of the query (virtual table vt1) in hand, the WHERE stage filters it into vt2.
Next come the GROUP BY and HAVING stages, which group vt2 and filter the groups, producing intermediate tables vt3 and vt4.
Then the SELECT and DISTINCT stages produce intermediate tables vt5-1 and vt5-2.
The ORDER BY stage sorts by the specified fields, producing vt6.
Finally the LIMIT stage takes the specified rows, producing vt7.
Storage engines
InnoDB engine
The default engine since MySQL 5.5
- The default transactional engine, designed to handle a large number of short-lived transactions; guarantees complete commit and rollback of transactions
- Caches both indexes and actual data, so its memory requirements are high, and memory size has a decisive impact on performance
- Designed for maximum performance when handling huge volumes of data
MyISAM engine
The default storage engine before MySQL 5.5
- Provides many features, including full-text indexing, compression, and spatial functions. Does not support transactions or row-level locks, and cannot recover safely after a crash
- Offers fast access; suited to applications that do not require transactional integrity, or that are dominated by SELECT and INSERT
- Additionally stores table statistics (such as the row count) as constants
Archive engine
Designed for data archiving.
| Feature | Supported |
|---|---|
| Compressed data | Yes |
| Backup / point-in-time recovery (implemented in the server, not the storage engine) | Yes |
| Geospatial data types | Yes |
| Data encryption (implemented in the server) | Yes |
| Update of data dictionary statistics | Yes |
| Lock granularity | Row lock |
| Data caching | No |
| Foreign keys | No |
| Full-text search indexes | No |
| Clustered indexes | No |
| Geospatial indexes | No |
| Hash indexes | No |
| Index caching | No |
| B-tree indexes | No |
| MVCC | No |
| Storage limit | None |
| Transactions | No |
| Cluster databases | No |
Memory engine
Tables live in memory; the logical medium is RAM, so response is fast. Data is lost when the mysqld daemon crashes, and the stored data must use fixed-length row formats.
Features:
- Supports both hash indexes and B+Tree indexes
- An order of magnitude faster than MyISAM
- Table size is mainly bounded by two parameters: max_rows (specified at table creation) and max_heap_table_size (default 16MB)
- Data files and index files are stored separately
- Data is easily lost; short life cycle
Use cases:
- A small target data set accessed frequently. The data is kept in memory, so an oversized data set causes memory overflow
- The data is temporary and must be available immediately
- It does not matter if the data in a MEMORY table is suddenly lost
Other engines
- Merge engine: manages a collection made up of multiple MyISAM tables
- NDB engine: the storage engine dedicated to MySQL Cluster, also called the NDB Cluster storage engine; mainly used in MySQL Cluster distributed environments
Comparison of common engines
| Feature | MyISAM | InnoDB | MEMORY | MERGE | NDB |
|---|---|---|---|---|---|
| Storage limit | Yes | 64TB | Yes | No | Yes |
| Transaction safety | | Supported | | | |
| Locking granularity | Table lock | Row lock | Table lock | Table lock | Row lock |
| B-tree indexes | Supported | Supported | Supported | Supported | Supported |
| Hash indexes | | | Supported | | Supported |
| Full-text indexes | Supported | | | | |
| Clustered indexes | | Supported | | | |
| Data caching | | Supported | Supported | | Supported |
| Index caching | Caches indexes only, not data | Caches both indexes and data | Supported | Supported | Supported |
| Data compression | Supported | | | | |
| Disk usage | Low | High | N/A | Low | Low |
| Memory usage | Low | High | Medium | Low | High |
| Bulk insert speed | High | Low | High | High | High |
| Foreign keys | | Supported | | | |
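To see which engines a server offers and to choose one per table, the standard statements below apply (the `archive_log` table and its columns are hypothetical examples):

```sql
-- List all storage engines this MySQL server supports
SHOW ENGINES;

-- Choose the engine per table at creation time
CREATE TABLE archive_log (
  id  BIGINT,
  msg VARCHAR(255)
) ENGINE = ARCHIVE;

-- Inspect the engine of an existing table
SHOW TABLE STATUS LIKE 'archive_log';
```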
More on InnoDB tables
Advantages of InnoDB tables
- Easy to operate, improves database performance, low maintenance cost
- If the server crashes due to hardware or software failure, no extra action is needed after restart: InnoDB crash recovery automatically finalizes changes committed before the crash, undoes uncommitted work, and resumes from the crash point
- The InnoDB storage engine maintains a buffer pool in main memory, so frequently used data is served directly from memory. This caching applies to many kinds of information and speeds up processing
- InnoDB not only supports concurrent reads and writes, it also buffers changed data before flushing it to disk
- Indexes can be created or dropped without affecting performance or availability
- When processing large volumes of data, InnoDB takes full advantage of two or more CPUs for maximum performance
InnoDB and the ACID model
1. Atomicity mainly involves InnoDB transactions. Related MySQL features include:
- The autocommit setting
- The COMMIT statement
- The ROLLBACK statement
- Table data in the INFORMATION_SCHEMA database
2. Consistency mainly involves the internal InnoDB processes that protect data from crashes. Related MySQL features include:
- The InnoDB doublewrite buffer
- InnoDB crash recovery
3. Isolation applies at the transaction level. Related MySQL features include:
- The autocommit setting
- The SET TRANSACTION ISOLATION LEVEL statement
- The low-level details of InnoDB locking
4. Durability mainly involves the interaction between MySQL software features and the hardware configuration. Because hardware varies so much in complexity and diversity, there are no concrete rules for durability. Related MySQL features include:
- The InnoDB doublewrite buffer, configured via the innodb_doublewrite option
- The innodb_flush_log_at_trx_commit configuration option
- The sync_binlog configuration option
- The innodb_file_per_table configuration option
- The write cache of the storage device
- The battery-backed cache of the storage device
- The operating system running MySQL
- Uninterrupted power supply
- The backup strategy
- For distributed or hosted applications, above all the physical location of the hardware and the network conditions
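The durability-related options named above can be inspected and set at runtime. A minimal sketch (the option names are real MySQL system variables; the values shown are illustrative, not recommendations):

```sql
-- Inspect the current durability settings
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
SHOW VARIABLES LIKE 'sync_binlog';

-- 1 = flush the redo log to disk at every commit (full durability)
SET GLOBAL innodb_flush_log_at_trx_commit = 1;
-- 1 = sync the binary log to disk at every commit
SET GLOBAL sync_binlog = 1;
```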
InnoDB architecture
- Buffer pool: a region of main memory that caches table and index data in use. It lets frequently used data be read straight from memory, which speeds up processing
- Change buffer: a special data structure that caches changes to secondary index pages when the affected index pages are not in the buffer pool. When the index pages are later loaded into the buffer pool by other reads, the cached changes are merged in
- Adaptive hash index: with the right workload and enough memory, it lets InnoDB behave like an in-memory database without sacrificing transactional reliability or performance
- Redo log buffer: holds data to be written to the redo log; the log is flushed to disk periodically. A large redo log buffer lets big transactions run without writing to disk before they commit
- System tablespace: contains the InnoDB data dictionary, the doublewrite buffer, the change buffer, and undo logs, and may also contain table and index data. Because it is shared by multiple tables, the system tablespace is considered a shared tablespace
- Doublewrite buffer: located in the system tablespace; data pages flushed from the buffer pool are written here first. Only after a page has been flushed and written to the doublewrite buffer does InnoDB write it to its proper location
- Undo log: the collection of undo records belonging to a transaction; it describes how to undo the transaction's most recent changes
- File-per-table tablespaces: each one is created in its own data file rather than in the system tablespace. Each such tablespace consists of a single .ibd data file, created in the database directory by default
- General tablespaces: shared InnoDB tablespaces created with the CREATE TABLESPACE syntax. They can be created outside the MySQL data directory, can hold multiple tables, and support tables of all row formats
- Undo tablespaces: consist of one or more files containing undo logs
- Temporary tablespaces: user-created temporary tables and disk-based internal temporary tables are both created in temporary tablespaces
- Redo log: a disk-based data structure used during crash recovery to correct data. During normal operation the redo log encodes requests that change InnoDB table data; after an unexpected crash, incomplete changes are replayed automatically during initialization
Transactions
Transaction: a set of logical operations that moves data from one state to another.
Transaction principle: all operations in a transaction execute as a single unit of work.
- Either the transaction commits (COMMIT) and all its changes are saved permanently
- Or the database management system (DBMS) discards all changes and the transaction rolls back to its initial state
The ACID properties of transactions
- Atomicity: a transaction is an indivisible unit of work; either all its modifications succeed, or they all fail and roll back
  - Guaranteed by the undo log (rollback log)
- Consistency: before and after a transaction, the data moves from one legal state to another legal state
  - Guaranteed by durability + atomicity + isolation
- Isolation: the execution of one transaction must not be interfered with by other transactions; the operations and data used inside a transaction are isolated from other concurrent transactions
  - Guaranteed by MVCC (multi-version concurrency control) or the locking mechanism
- Durability: once a transaction commits, its changes to the database are permanent; subsequent operations and database failures must not affect them
  - Guaranteed by the redo log
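The all-or-nothing unit of work can be sketched as a transfer between two rows of a hypothetical `account` table:

```sql
START TRANSACTION;
UPDATE account SET balance = balance - 100 WHERE id = 1;
UPDATE account SET balance = balance + 100 WHERE id = 2;
COMMIT;   -- both updates become permanent together
-- On any error, issuing ROLLBACK instead restores the state
-- that existed before START TRANSACTION
```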
The states of a transaction

- Active: the database operations of the transaction are in progress
- Partially committed: the last operation of the transaction has completed, but since all operations run in memory, their effects have not yet been flushed to disk
- Failed: while in the active or partially committed state, the transaction hits an error (a database error, an operating system error, a power failure) and cannot continue, or its execution is deliberately stopped
- Aborted: a partially executed transaction has entered the failed state; the operations it performed must be restored to the pre-transaction state, undoing its effect on the current database. This undo process is called rollback; once rollback completes, the database is back in its original state and the transaction is aborted
- Committed: once a partially committed transaction has synced all of its modified data to disk, it is in the committed state
Problems caused by concurrent transactions
The MySQL server allows multiple client connections, which means it can process multiple transactions at the same time.
When multiple transactions are processed concurrently, dirty reads, non-repeatable reads, and phantom reads may occur:
- Dirty read: reading data that another transaction has not yet committed
- Non-repeatable read: reading the same data twice within a transaction yields different values
- Phantom read: reading the same range twice within a transaction yields a different number of rows
Ordered by severity: dirty read > non-repeatable read > phantom read
Transaction isolation levels
The SQL standard defines four isolation levels to avoid these phenomena; the higher the isolation level, the lower the efficiency.
- Read uncommitted: changes made by a transaction are visible to other transactions even before it commits
- Read committed: changes made by a transaction become visible to other transactions only after it commits
- Repeatable read: the data seen during a transaction stays consistent with what was seen when the transaction started; this is InnoDB's default isolation level
- Serializable: read-write locks are placed on records; when multiple transactions read and write the same record and a read-write conflict arises, the later transaction must wait for the earlier one to finish before it can proceed
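The level can be chosen per session with standard MySQL syntax, sketched below (the `@@transaction_isolation` variable name applies to MySQL 8.0; older versions use `tx_isolation`):

```sql
-- Switch this session to InnoDB's default level explicitly
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;

-- Verify the current level
SELECT @@transaction_isolation;
```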
The phenomena that concurrent transactions can exhibit at each isolation level are as follows:

How are these four isolation levels implemented?
Read uncommitted: simply read the latest data
Read committed and repeatable read: implemented with a Read View, which can be understood as a snapshot
- Read committed regenerates a Read View before every statement executes
- Repeatable read generates one Read View when the transaction starts and uses that Read View for the entire transaction
Serializable: read-write locks
How the Read View works in MVCC

A Read View has four important fields:
- creator_trx_id: the transaction id of the transaction that created this Read View
- m_ids: the list of ids of the transactions active in the database when the Read View was created
- min_trx_id: the smallest transaction id among the transactions active in the database when the Read View was created
- max_trx_id: not the maximum value in m_ids, but the id the database would assign to the next transaction at the moment the Read View was created, i.e. the largest transaction id so far + 1

For tables using the InnoDB storage engine, each clustered index record contains the following two hidden columns:
- trx_id: when a transaction modifies a clustered index record, its transaction id is recorded in the trx_id hidden column
- roll_pointer: every time a clustered index record is modified, the old version is written to the undo log; this hidden column is a pointer to each old version, through which the pre-modification record can be found
After a Read View is created, a record's trx_id falls into one of three cases:
- trx_id < min_trx_id: this version of the record was generated by a transaction that had already committed before the Read View was created, so it is visible to the current transaction
- trx_id >= max_trx_id: this version was generated by a transaction started after the Read View was created, so it is not visible to the current transaction
- min_trx_id <= trx_id < max_trx_id: check whether trx_id appears in the Read View's m_ids list
  - In the list: the transaction that generated this version is still active (not yet committed), so this version is not visible to the current transaction
  - Not in the list: the transaction that generated this version has committed, so this version is visible to the current transaction
Controlling how concurrent transactions access the same record through this version chain is what we call MVCC (multi-version concurrency control).
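The snapshot behavior can be sketched with two sessions under REPEATABLE READ (a hypothetical table `t` with one row, id = 1 and balance = 100):

```sql
-- Session A
START TRANSACTION;
SELECT balance FROM t WHERE id = 1;      -- returns 100; Read View created

-- Session B
UPDATE t SET balance = 200 WHERE id = 1; -- B's trx_id recorded in the row
COMMIT;

-- Session A (same transaction, same Read View)
SELECT balance FROM t WHERE id = 1;      -- still 100: B's version fails
                                         -- the visibility check above
COMMIT;
SELECT balance FROM t WHERE id = 1;      -- a new Read View: now 200
```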
Indexes

MySQL's official definition: an index is a data structure that helps MySQL retrieve data efficiently.
The figure below shows MySQL's architecture; indexes and data live in the storage engine.
Advantages and disadvantages of indexes
Advantages
- Improve data retrieval efficiency and reduce I/O cost; the main reason to create an index
- A unique index guarantees the uniqueness of each row of data in the table
- For referential integrity of data, indexes speed up joins between tables
- Significantly reduce the time spent grouping and sorting in queries, lowering CPU consumption
Disadvantages
- Creating and maintaining indexes takes time
- Indexes occupy disk space: besides the space used by the table data, every index occupies a certain amount of physical space of its own, stored on disk
- They slow down table updates: when data in the table is inserted, deleted, or modified, the indexes must be maintained dynamically as well
When indexes are (and are not) needed
Indexes are needed when:
- The field has a uniqueness constraint, such as a product code
- The field is frequently used in WHERE query conditions; if the query condition involves more than one field, a composite index can be created
- The field is frequently used in GROUP BY or ORDER BY; the query then needs no extra sort, because the records in the index B+Tree are already sorted
Indexes are not needed when:
- The field is not used in WHERE, GROUP BY, or ORDER BY. The value of an index is fast lookup; a field that cannot be used to locate rows usually does not need an index
- The field contains a lot of duplicate data, such as a gender field (male/female). MySQL's query optimizer generally ignores an index and performs a full table scan when it finds that one value accounts for a high percentage of the table's rows
- The table has too little data
- The field is updated frequently: maintaining the order of the B+Tree means the index is rebuilt over and over, a process that hurts database performance
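The first three "needed" cases can be sketched in one hypothetical table (names and columns are illustrative):

```sql
CREATE TABLE product (
  id   INT PRIMARY KEY,
  code VARCHAR(32) NOT NULL,
  name VARCHAR(64),
  UNIQUE KEY uk_code (code)       -- uniqueness constraint -> unique index
);

-- Columns frequently queried together -> one composite index
CREATE INDEX idx_name_code ON product (name, code);
```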
Classification of indexes
Indexes can be classified from four angles:
- Data structure: B+Tree index, hash index, full-text index
- Physical storage: clustered index (primary key index), secondary index (auxiliary index)
- Field property: primary key index, unique index, ordinary index, prefix index
- Number of fields: single-column index, composite index
Classification by data structure
From the data-structure angle, MySQL's common indexes are the B+Tree index, the hash index, and the full-text index.
The index types supported by each storage engine are not necessarily the same; below are the index types supported by MySQL's common storage engines.

InnoDB became MySQL's default storage engine in MySQL 5.5, and the B+Tree index is the index type storage engines use most.
When a table is created, the InnoDB storage engine chooses a column for the clustered index key depending on the situation:
- If there is a primary key, it is used as the clustered index key by default
- If there is no primary key, the first UNIQUE column that does not contain NULL values is used as the clustered index key
- If neither exists, InnoDB automatically generates a hidden auto-increment id column as the clustered index key
All other indexes are secondary indexes, also called auxiliary or non-clustered indexes. The default primary key index and secondary indexes are both B+Tree indexes.
The leaf nodes of the primary key index store the actual data; the leaf nodes of a secondary index store primary key values.
Back-to-table lookup: first search the secondary index B+Tree for the value, find the corresponding leaf node, and obtain the primary key value; then search the primary key index B+Tree for the corresponding leaf node to get the whole row. This process, which searches two B+Trees to find the data, is called a back-to-table lookup.
Covering index: when the queried data can be found in the leaf nodes of the secondary index B+Tree (e.g. only the primary key value is queried), there is no need to search the primary key index. This is called a covering index: a single B+Tree search finds the data.
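The two lookup paths can be sketched with a hypothetical table (reusing the `player` columns from the earlier example):

```sql
CREATE TABLE player (
  id     INT PRIMARY KEY,       -- clustered index: leaves hold the full row
  name   VARCHAR(64),
  height DECIMAL(3, 2),
  KEY idx_name (name)           -- secondary index: leaves hold (name, id)
);

-- Covering index: id is already in the secondary index leaf,
-- so one B+Tree search is enough
SELECT id FROM player WHERE name = 'Tom';

-- Back-to-table: height lives only in the clustered index, so the
-- secondary index lookup is followed by a primary key lookup (two trees)
SELECT height FROM player WHERE name = 'Tom';
```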
Why does MySQL InnoDB choose the B+Tree as its index data structure?
B+Tree vs B-tree
A B+Tree stores data only in leaf nodes, whereas a B-tree stores data in non-leaf nodes as well, so a single B+Tree node holds less data and the same number of disk I/Os can examine more nodes. In addition, B+Tree leaf nodes are linked by a doubly linked list, which suits the range-based sequential scans common in MySQL; a B-tree cannot do this.
B+Tree vs binary tree
For a B+Tree with N leaf nodes, the search complexity is O(log_d N), where d is the maximum number of children a node may have.
In practice d is greater than 100, which keeps the height of the B+Tree at about 3-4 levels even when the data reaches tens of millions of rows, so a single lookup needs only 3-4 disk I/O operations to reach the target data.
Each parent node of a binary tree can have only 2 children, so the search complexity is O(log2 N), giving a much deeper tree than a B+Tree; retrieving the target data from a binary tree therefore costs many more disk I/Os.
B+Tree vs hash
Hash tables are very efficient for equality queries, with O(1) search complexity, but they are not suited to range queries. This is why B+Tree indexes have a much broader range of application scenarios than hash indexes.
Classification by physical storage
From the physical-storage angle, indexes divide into clustered indexes (primary key indexes) and secondary indexes (auxiliary indexes).
The difference between the two:
- The leaf nodes of the primary key index B+Tree store the actual data; all complete rows are stored in the leaf nodes of the primary key index B+Tree
- The leaf nodes of a secondary index B+Tree store primary key values, not the actual data
When a query uses a secondary index and the requested data can be found in the secondary index itself, no back-to-table lookup is needed; this is a covering index. If the requested data is not in the secondary index, the secondary index is searched first to find the corresponding leaf node and obtain the primary key value, and then the primary key index is searched to retrieve the row; this is a back-to-table lookup.
Classification by field property
From the field-property angle, indexes divide into primary key, unique, ordinary, and prefix indexes.
- Primary key index: built on the primary key field, usually created together with the table; a table has at most one primary key index, and the indexed column cannot contain NULL values
- Unique index: built on a UNIQUE field; a table can have multiple unique indexes, the indexed column's values must be unique, but NULL values are allowed
- Ordinary index: built on an ordinary field, one that is neither the primary key nor UNIQUE
- Prefix index: built on the first few characters of a character-type field rather than on the whole field; it can be created on char, varchar, binary, and varbinary columns. Prefix indexes reduce the storage the index occupies and improve query efficiency
Classification by number of fields
From the field-count angle, indexes divide into single-column indexes and composite (joint) indexes.
Single-column index: an index built on a single column, such as the primary key index.
Composite index: an index built on multiple columns.
Leftmost-prefix matching: the index is matched starting from its leftmost column.
The premise of using the index is that the key is ordered.
When the leftmost-prefix matching of a composite index stops: on a range query (>, <, BETWEEN, LIKE), matching stops; the range column itself can use the composite index, but the columns after it cannot.
Index condition pushdown: while traversing a composite index, fields contained in the index are checked first and unqualified records are filtered out directly, reducing the number of back-to-table lookups.
Improving index efficiency: when building a composite index, put the most selective fields first, so the highly selective fields are more likely to be used by SQL.
Selectivity of a column is the number of distinct values divided by the total number of rows in the table:

selectivity = COUNT(DISTINCT column) / COUNT(*)
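The formula above can be measured directly; a sketch against the hypothetical `player` table:

```sql
-- Values close to 1 (e.g. a code column) make good leading index columns;
-- values close to 0 (e.g. a gender column) do not
SELECT COUNT(DISTINCT name) / COUNT(*) AS selectivity FROM player;
```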
Index optimization
Common index optimization methods: prefix index optimization, covering index optimization, auto-increment primary keys, preventing index invalidation.
Prefix index optimization
Building the index on the first few characters of a string field shrinks the index entries, lets an index page hold more index values, and effectively improves the query speed of the index.
Limitations of prefix indexes:
- ORDER BY cannot use a prefix index
- A prefix index cannot serve as a covering index
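A minimal sketch, with a hypothetical table and column and an illustrative prefix length of 10:

```sql
-- Index only the first 10 characters of a long string column
CREATE INDEX idx_email_prefix ON customer (email(10));
```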
Covering index optimization
Covering index: all the fields the SQL query needs can be found in the leaf nodes of the index B+Tree, so the records are returned from the secondary index query itself without consulting the clustered index, avoiding the back-to-table operation.
The benefit of a covering index: there is no need to fetch the whole row, which saves a great deal of I/O.
Use an auto-increment primary key
When creating a table, set the primary key to auto-increment by default.
The primary key index InnoDB creates is a clustered index by default, with the data stored in the leaf nodes of the B+Tree. Rows within a leaf node are stored in primary key order; whenever a new row is inserted, the database places it in the appropriate leaf node according to its primary key value.
With an auto-increment primary key, each new row is appended in order after the current last position; no existing data needs to move, and when a page fills up, a new page is opened automatically. Every insert is an append with no data movement, so this insertion pattern is very efficient.
With a non-auto-increment primary key, each inserted key value is effectively random, so every new row may land in the middle of an existing data page, forcing existing data to move to make room, sometimes even copying data from one page to another. This situation is usually called a page split. Page splits cause a lot of memory fragmentation and a loose index structure, which hurts query efficiency.
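A minimal sketch of the recommended setup, with hypothetical table and column names:

```sql
-- Auto-increment primary key: every insert appends to the last leaf page,
-- avoiding the page splits described above
CREATE TABLE orders (
  id         BIGINT AUTO_INCREMENT PRIMARY KEY,
  created_at DATETIME NOT NULL
);

INSERT INTO orders (created_at) VALUES (NOW());  -- id assigned in order
```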
Index columns are best declared NOT NULL
- NULL values in an index column make the optimizer's index selection more complicated and harder to optimize; for example, when index statistics are computed, COUNT omits rows whose value is NULL
- NULL is a meaningless value, yet it occupies physical space and causes storage overhead: InnoDB's default row format, COMPACT, uses at least 1 byte of space to store the NULL-value list

Preventing index invalidation
Having an index does not mean the query will actually use it, so keep in mind what causes an index to be invalidated, and avoid writing query statements that invalidate indexes.
Common cases of index invalidation:
- Left-sided or both-sided fuzzy matching, i.e. LIKE '%xx' or LIKE '%xx%'; both invalidate the index
- Applying a calculation, a function, or a type conversion to the index column in the query condition; in all these cases the index is invalidated
- A composite index must be used according to the leftmost-prefix matching rule, otherwise the index is invalidated
- In a WHERE clause, if the condition column before OR is an index column but the condition column after OR is not, the index is invalidated
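The cases above can be sketched against a hypothetical `t_user` table with an index on `name` only:

```sql
SELECT * FROM t_user WHERE name LIKE '%lin';         -- leading % -> no index
SELECT * FROM t_user WHERE LENGTH(name) = 3;         -- function   -> no index
SELECT * FROM t_user WHERE name = 'lin' OR age = 18; -- age has no index
                                                     --           -> no index
-- By contrast, a right-only prefix can still use the index:
SELECT * FROM t_user WHERE name LIKE 'lin%';
```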

Locks
Types of locks
By scope, locks divide into global locks, table-level locks, and row-level locks.
Global lock
Acquire the global lock -> flush tables with read lock
Once executed, the entire database becomes read-only; other threads performing the following operations will block:
- Inserting, deleting, or updating data, e.g. insert, delete, update statements
- Changing table structure, e.g. alter table, drop table statements
Release the global lock -> unlock tables
What is the global lock used for?
Mainly for full logical backups of the database, so that during the backup no updates to data or table structure can make the backup file's data differ from what is expected.
What is the drawback of a global lock?
The entire database is read-only. If the database holds a lot of data, the backup takes a long time, and during that time the business can only read data, not update it, bringing the business to a standstill.
Taking a global lock affects the business; how can this be avoided?
If the storage engine supports transactions at the repeatable read isolation level, start a transaction before backing up the database: a Read View is created first and used for the entire transaction, and thanks to MVCC support the business can still update data during the backup.
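The transactional alternative to the global lock can be sketched as follows (the syntax is standard MySQL; the backup reads themselves are elided):

```sql
-- A consistent-snapshot backup transaction instead of a global lock
-- (REPEATABLE READ + MVCC, as described above)
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION WITH CONSISTENT SNAPSHOT;
-- ... read table data for the backup here ...
COMMIT;
```

mysqldump's --single-transaction option works in essentially this way.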
Table locks
MySQL has several table-level locks: table locks; metadata locks (MDL); intention locks; the AUTO-INC lock
Table locks:
Table-level shared lock (read lock) -> lock tables t_student read;
Table-level exclusive lock (write lock) -> lock tables t_student write;
Besides restricting reads and writes by other threads, a table lock also restricts the current thread's own subsequent reads and writes
Avoid table locks on InnoDB tables: their granularity is too coarse and they hurt performance
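A minimal two-session sketch of the table lock behavior (t_student is a hypothetical table):

```sql
-- Session A: take a table-level read lock
LOCK TABLES t_student READ;
SELECT * FROM t_student;            -- OK
-- INSERT INTO t_student VALUES (...);  -- fails: A itself holds only a read lock

-- Session B: reads succeed, but any write blocks until A unlocks

-- Session A: release the lock, unblocking B's writers
UNLOCK TABLES;
```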
Metadata lock (MDL):
There is no need to take MDL explicitly; it is added automatically whenever we operate on a table
- CRUD operations on a table take an MDL read lock
- Changing a table's structure takes an MDL write lock
MDL ensures that while a user is running CRUD operations, other threads cannot change the table structure
- While a thread is executing a select statement (holding an MDL read lock), another thread that wants to change the table structure (requesting an MDL write lock) blocks until the select finishes (releasing the MDL read lock)
- While a thread is changing the table structure (holding an MDL write lock), other threads running CRUD operations (requesting an MDL read lock) block until the structure change finishes (releasing the MDL write lock)
MDL needs no explicit call, so when is it released? -> MDL is released only when the transaction commits; it is held for the whole duration of the transaction
Why does a thread that cannot yet acquire an MDL write lock also block later queries that only request the read lock? -> MDL requests form a queue, and a write-lock request in the queue has higher priority than read-lock requests
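The queueing behavior can be reproduced with three sessions (t_student hypothetical):

```sql
-- Session A: a long transaction holding an MDL read lock
BEGIN;
SELECT * FROM t_student;

-- Session B: requests an MDL write lock; blocks behind A
ALTER TABLE t_student ADD COLUMN age INT;

-- Session C: a plain SELECT now ALSO blocks, queued behind B's write-lock request
SELECT * FROM t_student;

-- Session A: COMMIT releases the MDL read lock, unblocking B and then C
COMMIT;
```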
Intention locks:
Purpose of intention locks: to quickly determine whether any record in a table is locked
- Before adding a shared lock to some records of an InnoDB table, an intention shared lock must be taken on the table
- Before adding an exclusive lock to some records of an InnoDB table, an intention exclusive lock must be taken on the table
- Without intention locks, taking a table-level exclusive lock would require traversing all records to check whether any of them holds an exclusive lock, which is slow
- With intention locks, taking a table-level exclusive lock only requires checking whether the table carries an intention exclusive lock; there is no need to traverse the records
Intention locks are table-level locks; intention locks do not conflict with each other; they conflict with table-level read/write locks; they do not conflict with row-level locks
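Intention locks are taken implicitly by the row-locking statements (MySQL 8.0 syntax; FOR SHARE was written LOCK IN SHARE MODE before 8.0; t_order hypothetical):

```sql
BEGIN;
-- Takes an intention shared lock (IS) on the table, then shared row locks
SELECT * FROM t_order WHERE id = 1 FOR SHARE;
-- Takes an intention exclusive lock (IX) on the table, then exclusive row locks
SELECT * FROM t_order WHERE id = 1 FOR UPDATE;
-- Another session's LOCK TABLES t_order WRITE would now block on the IS/IX lock
COMMIT;
```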
AUTO-INC lock:
A special table-level locking mechanism: the lock is not held until the transaction commits, but released as soon as the insert statement finishes
While one transaction holds the AUTO-INC lock, other transactions inserting into the table block, which guarantees that the field decorated with AUTO_INCREMENT gets increasing values on insert
Starting with MySQL 5.1.22, the InnoDB storage engine provides a lightweight lock to implement auto-increment
- When innodb_autoinc_lock_mode = 0, only the AUTO-INC lock is used
- When innodb_autoinc_lock_mode = 2, only the lightweight lock is used
- When innodb_autoinc_lock_mode = 1, the two are mixed: the lightweight lock is used when the number of rows to insert is known, the AUTO-INC lock when it is not
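The mode is set at server startup but can be inspected at runtime:

```sql
-- 0: AUTO-INC (table) lock only; 1: mixed; 2: lightweight lock only (the 8.0 default)
SELECT @@innodb_autoinc_lock_mode;
-- Note: with mode 2 and statement-based binlog, replicated auto-increment values
-- can diverge between master and slave, so row-based binlog is recommended
```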
Row-level locks
Row-level locks fall into three categories:
- Record Lock: locks a single record
- Gap Lock: locks a range, excluding the record itself
- Next-Key Lock: the combination of Record Lock + Gap Lock; locks a range, including the record itself
Locking rules
Row-level locking rules are complex; different scenarios lock differently, and in some scenarios a next-key lock degenerates into a record lock or a gap lock
Locking rules may differ between versions; the following assumes MySQL 8.0.26
- Unique index, equality query
- The queried record exists: degenerates into a record lock
- The queried record does not exist: degenerates into a gap lock
- Non-unique index, equality query
- The queried record exists: an extra gap lock is added
- The queried record does not exist: degenerates into a gap lock
Range queries on unique and non-unique indexes differ in that:
- A unique index can, when certain conditions are met, degenerate into a gap lock or a record lock
- A non-unique index never degenerates
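With a hypothetical table t_order (id primary key) containing id values 1, 5, 10, the unique-index rules read as follows (MySQL 8.0.26 assumed, as above):

```sql
BEGIN;
-- Equality, record exists: the next-key lock degenerates to a record lock on id = 5
SELECT * FROM t_order WHERE id = 5 FOR UPDATE;
-- Equality, record absent: degenerates to a gap lock on the gap (5, 10)
SELECT * FROM t_order WHERE id = 7 FOR UPDATE;
-- The held locks can be inspected from another session:
SELECT * FROM performance_schema.data_locks;
COMMIT;
```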
Logs
Classification of logs
MySQL has several log types for different kinds of information: the slow query log, general query log, error log, and binary log. MySQL 8 added two more: the relay log and the data definition statement log
- Slow query log: records all queries whose execution time exceeds long_query_time, which helps with query optimization
- General query log: records the start and end time of each connection and every command sent to the database server; helpful for reproducing the scenario of an operation, troubleshooting, and auditing database operations
- Error log: records problems with starting, running, or stopping the MySQL service; useful for understanding the server's state and maintaining it
- Binary log: records all statements that change data; used for data synchronization between master and slave servers and for lossless recovery after a server failure
- Relay log: used in a master-slave architecture; an intermediate file in which the slave stores the binary log contents received from the master. The slave reads the relay log to replay the master's operations
- Data definition statement log: records the metadata operations performed by data definition statements
Apart from the binary log, the other logs are text files. By default, all logs are created in the MySQL data directory
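For instance, the slow query log can be switched on at runtime (the threshold value is illustrative):

```sql
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;   -- seconds; slower queries get logged
SHOW VARIABLES LIKE 'slow_query_log_file';
-- Binary logging, by contrast, is configured at startup (log-bin in my.cnf)
SHOW BINARY LOGS;
```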
Binary log
Three formats
binlog has three formats: Row, Statement, Mixed
- Row: the default format; it does not record the context of the SQL statement, only which row was modified and how
- Advantage: a row-level log records the modification details of each row very clearly, and there is no case in which calls to stored procedures, functions, or triggers fail to replicate correctly
- Statement: every SQL statement that modifies data is recorded in the binlog
- Advantage: there is no need to record every row change, so the binlog is smaller, which saves IO and improves performance
- Mixed: provided since version 5.1.8; a mix of the two formats
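The format can be inspected and switched per session (the binlog file name below is illustrative):

```sql
SHOW VARIABLES LIKE 'binlog_format';
SET SESSION binlog_format = 'STATEMENT';
-- Recorded events can then be inspected with:
SHOW BINLOG EVENTS IN 'binlog.000001' LIMIT 5;
```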
Write mechanism
During transaction execution, the log is first written to the binlog cache; when the transaction commits, the binlog cache is written into the binlog file. A transaction's binlog cannot be split apart and must be written in one piece, so the system allocates a binlog cache for each thread

The timing of write and fsync is controlled by the parameter sync_binlog, whose default is 0 (1 since MySQL 5.7.7)
- 0: every commit only does write, leaving the OS to decide when to fsync. Performance improves, but if the machine crashes, the binlog still in the page cache is lost
- 1: every commit does fsync, like the redo log flushing process
- N: every commit does write, but fsync is done only after N transactions have accumulated
When there is an IO bottleneck, setting sync_binlog to a larger value can improve performance. Likewise, if the machine crashes, the binlog of the most recent N transactions is lost
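The trade-off can be adjusted at runtime (the value 100 is illustrative):

```sql
SELECT @@sync_binlog;
SET GLOBAL sync_binlog = 100;  -- fsync once per 100 committed transactions
```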
binlog vs redolog
- redolog: a physical log, generated by the InnoDB storage engine layer; it records "what change was made on which data page"
- binlog: a logical log, generated by the MySQL Server layer; it records "the original logic of a statement"
Two-phase commit
While an update statement executes, both logs are recorded, redolog and binlog, on a per-transaction basis
The redolog can be written continuously while the transaction executes, whereas the binlog is written only when the transaction commits
To keep the two logs logically consistent, the InnoDB storage engine uses a two-phase commit scheme: the redolog write is split into prepare and commit phases, with the binlog written in between

Master-slave replication
Purpose
- Read/write splitting
- Data backup
- High availability
Principle
Master-slave synchronization is based on the binlog. The replication process involves 3 threads: one on the master and two on the slave

- Binary log dump thread (Binlog dump thread): a master thread. When a slave's I/O thread connects, the master sends its binary log to the slave; while reading an event, the master locks the binlog and releases the lock once the read finishes
- The slave's I/O thread connects to the master and requests the updated binlog; it reads the updated portion of the master's binlog and copies it into the local relay log
- The slave's SQL thread reads the slave's relay log and replays the events in it, keeping the slave's data in sync with the master
In short, three steps:
- Master writes the operation to its binlog
- Slave copies the Master's binlog into its relay log
- Slave replays the events in the relay log, applying the changes to its own database. MySQL replication is asynchronous and serialized, and resumes from the saved position after a restart
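The slave side of this setup is wired together with statements like the following; the host, credentials, and log coordinates are placeholders (MySQL 8.0.22+ renames these to CHANGE REPLICATION SOURCE TO / START REPLICA):

```sql
CHANGE MASTER TO
  MASTER_HOST = '192.0.2.10',         -- placeholder address
  MASTER_USER = 'repl',               -- replication account
  MASTER_PASSWORD = '***',
  MASTER_LOG_FILE = 'binlog.000003',  -- coordinates from SHOW MASTER STATUS
  MASTER_LOG_POS = 157;
START SLAVE;          -- starts the I/O thread and the SQL thread
SHOW SLAVE STATUS\G   -- check Slave_IO_Running / Slave_SQL_Running
```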
Data consistency under synchronization
Requirements for master-slave synchronization
- The data in the read library and the write library must be consistent
- Writes must go to the write library
- Reads must go to the read library
Causes of master-slave delay
Under normal network conditions, the main source of master-slave delay is the gap between when the standby finishes receiving the binlog and when it finishes executing the transaction
- The slave's machine performs worse than the master's
- The slave is under heavy load
- Large transactions are being executed
The direct symptom of master-slave delay: the slave consumes the relay log more slowly than the master produces the binlog
Ways to reduce master-slave delay
- Reduce the probability of concurrent large multi-threaded transactions and optimize the business logic
- Optimize SQL, avoid slow SQL, and reduce batch operations; writing scripts in an update-sleep pattern is recommended
- Improve the slave's machine configuration to narrow the efficiency gap between the master writing the binlog and the slave reading it
- Keep the link short: place the master and slave servers as close together as possible, increase port bandwidth, and reduce the network latency of binlog transmission
- Force real-time-sensitive business onto the master; use the slave only for disaster recovery and backup
Solving the data consistency problem
If the data being operated on lives in a single database, a write lock on the record during updates prevents inconsistent reads; but then the slave serves only as a backup, with no read/write splitting to share the master's read load

With read/write splitting, solving master-slave data inconsistency means solving the master-slave data replication problem. Ranked from weak to strong consistency, there are 3 replication modes: asynchronous replication, semi-synchronous replication, group replication
Asynchronous replication
The master commits a transaction and returns to the client without waiting for any slave to acknowledge the binlog; if the master crashes, the latest transactions may never reach the slaves
Semi-synchronous replication
The master waits until at least one slave acknowledges receiving the binlog before returning success to the client, which shortens the window of data loss but still does not guarantee that the slave has applied the transaction
Group replication
Neither asynchronous nor semi-synchronous replication can ultimately guarantee data consistency
Group replication, MGR (MySQL Group Replication), is a new data replication technology introduced in MySQL 5.7.17, based on Paxos-protocol state machine replication
How does MGR work?
Multiple nodes form a replication group. A read-write (RW) transaction must pass through the consistency protocol layer and can commit only when a majority of nodes (N/2+1) agree; a read-only (RO) transaction needs no in-group agreement and can COMMIT directly
There are multiple nodes in a replication group, each maintaining its own copy of the data; the consistency protocol layer provides atomic messages and globally ordered messages, which guarantees data consistency within the group
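A single node of an MGR group is bootstrapped roughly as follows; the addresses and group UUID are placeholders, and a real deployment needs further settings (server_id, GTID mode, etc.):

```sql
INSTALL PLUGIN group_replication SONAME 'group_replication.so';
SET GLOBAL group_replication_group_name = 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee';
SET GLOBAL group_replication_local_address = '192.0.2.10:33061';
SET GLOBAL group_replication_group_seeds = '192.0.2.10:33061,192.0.2.11:33061';
SET GLOBAL group_replication_bootstrap_group = ON;  -- only on the first node
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF;
-- Verify membership:
SELECT * FROM performance_schema.replication_group_members;
```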

Reference material
Illustrated MySQL | xiaolincoding (xiaolincoding.com)