MySQL advanced - index optimization (super detailed)
2022-06-30 18:10:00 【Noblegasesgoo】
Performance analysis
MySQL Query Optimizer
MySQL has an optimizer module dedicated to optimizing SELECT statements. Its main job is to use the statistics collected by the system to give the client's query the execution plan it considers best (what it considers best is not necessarily what a DBA would consider best, and this part takes the most time).
When a client sends a query to MySQL, the command parser module classifies the request. If it is a SELECT, it is forwarded to the MySQL Query Optimizer (MQO). The MQO first optimizes the query as a whole: it evaluates constant expressions and replaces them with constant values, and it simplifies and transforms the query conditions, for example by removing useless or obvious conditions and adjusting the structure. It then analyzes the hint information in the query (if any) to see whether the hints fully determine the execution plan. If there are no hints, or the hints are not enough to fully determine the plan, it reads the statistics of the objects involved, performs the corresponding calculations and analysis for the query, and produces the final execution plan.
MySQL Common bottlenecks
CPU
- CPU saturation usually occurs when data is loaded into memory or read from disk.
IO
- A disk I/O bottleneck occurs when the data being loaded is much larger than the memory capacity.
Server hardware
- Hardware performance bottlenecks: use top, free, iostat, and vmstat to check the performance status of the system.
EXPLAIN (key point)
What is it?
Use the EXPLAIN keyword to simulate how the optimizer executes a SQL query, so that you can see how MySQL handles your statement and analyze the performance bottlenecks of your query or table structure.
In MySQL 5.7, derived tables are merged into the outer query as an optimization. To see the select_type values directly, temporarily turn this feature off (it is on by default).
-- Turn off derived-table merging globally (takes effect for new connections)
set global optimizer_switch='derived_merge=off';
-- Turn off derived-table merging (current session only)
set session optimizer_switch='derived_merge=off';
-- Turn derived-table merging back on (current session only)
set session optimizer_switch='derived_merge=on';
What can it do?
- Show the order in which tables are read
- Show which indexes could be used
- Show the operation type of each data read
- Show which indexes are actually used
- Show the references between tables
- Show how many rows of each table the optimizer estimates it will examine
How do you use it?
Put EXPLAIN in front of the SQL statement.
Analysis of each field
SQL for the test cases
CREATE TABLE t1(id INT(10) AUTO_INCREMENT,content VARCHAR(100) NULL , PRIMARY KEY (id));
CREATE TABLE t2(id INT(10) AUTO_INCREMENT,content VARCHAR(100) NULL , PRIMARY KEY (id));
CREATE TABLE t3(id INT(10) AUTO_INCREMENT,content VARCHAR(100) NULL , PRIMARY KEY (id));
CREATE TABLE t4(id INT(10) AUTO_INCREMENT,content VARCHAR(100) NULL , PRIMARY KEY (id));
INSERT INTO t1(content) VALUES(CONCAT('t1_',FLOOR(1+RAND()*1000)));
INSERT INTO t2(content) VALUES(CONCAT('t2_',FLOOR(1+RAND()*1000)));
INSERT INTO t3(content) VALUES(CONCAT('t3_',FLOOR(1+RAND()*1000)));
INSERT INTO t4(content) VALUES(CONCAT('t4_',FLOOR(1+RAND()*1000)));
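As a minimal sketch of the "EXPLAIN + SQL statement" usage, assuming the t1 table created above, the statements below show the plain tabular form, the JSON form, and how SHOW WARNINGS can be used after EXPLAIN to print the statement as the optimizer rewrote it. The exact output depends on the MySQL version and the data.

```sql
-- Plain tabular output: one row per table access in the plan
EXPLAIN SELECT * FROM t1 WHERE id = 1;

-- JSON output (MySQL 5.6 and later) adds cost estimates and more detail
EXPLAIN FORMAT=JSON SELECT * FROM t1 WHERE id = 1;

-- Right after a plain EXPLAIN (5.7 and later), SHOW WARNINGS prints
-- the statement as rewritten by the optimizer
EXPLAIN SELECT * FROM t1 WHERE id = 1;
SHOW WARNINGS;
```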
id field
The sequence number of each SELECT in the query: a set of numbers that indicates the order in which the SELECT clauses or table operations are executed.
Case 1: same id
When the ids are the same, the rows are executed from top to bottom.
EXPLAIN SELECT *
FROM t1, t2, t3
WHERE
t1.id = t2.id
AND t2.id = t3.id;
In this case, the first condition after WHERE, t1.id = t2.id, joins t1 to t2, and the t2 rows are in turn matched through t2.id = t3.id.
The execution order may also turn out to be t1 -> t3 -> t2.
Case 2: different ids
EXPLAIN SELECT t2.*
FROM t2
WHERE id = (SELECT t1.id
FROM t1
WHERE id = (
SELECT t3.id
FROM t3));
From the result we can see that for subqueries the id increases; the larger the id, the higher the priority and the earlier it is executed.
You can think of it as the priority of parentheses.
id increases -> larger id -> higher priority -> executed first
Case 3: same and different ids together
EXPLAIN SELECT t2.*
FROM (SELECT t3.id
FROM t3) AS s1, t2
WHERE s1.id = t2.id;
Rows with the same id can be treated as one group and are executed from top to bottom within the group. Across groups, the larger the id, the higher the priority, and the group with the higher priority is executed first.
This result surprised me at first, because no derived table appeared. After looking into it: a derived table is not a good query plan. It is essentially a temporary table, and if it is too large it may have to be created on disk instead of in memory, so creating and accessing it consumes I/O resources. That is presumably why the MySQL 5.7 optimizer automatically optimizes derived tables by merging them into the outer query.
For example:
-- Before optimization
SELECT * FROM (SELECT * FROM t1) AS s1;
-- After optimization
SELECT * FROM t1;
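To observe the merge described above, one option is to run the same EXPLAIN with derived_merge switched on and then off in the current session. This is a sketch that assumes the t1 table from the test cases; the comments describe what is expected, not captured output.

```sql
-- Inspect the current optimizer flags (look for derived_merge=on or off)
SELECT @@optimizer_switch;

-- With merging on (the 5.7 default) the derived table is expected to be merged away
SET SESSION optimizer_switch = 'derived_merge=on';
EXPLAIN SELECT * FROM (SELECT * FROM t1) AS s1;

-- With merging off, an extra row with select_type = DERIVED is expected
SET SESSION optimizer_switch = 'derived_merge=off';
EXPLAIN SELECT * FROM (SELECT * FROM t1) AS s1;

-- Switch merging back on for this session
SET SESSION optimizer_switch = 'derived_merge=on';
```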
select_type
The type of query, mainly used to distinguish simple queries, union queries, subqueries, and other complex queries.
What values are there?
Let's go through them one by one.
- SIMPLE
A simple SELECT query that contains no subqueries or UNION.
- PRIMARY
If the query contains any complex subparts, the outermost query is marked PRIMARY; it is executed last.
The 5.7 optimizer optimizes this case away directly, so a screenshot from someone else was used here.
- DERIVED
Subqueries in the FROM list are marked DERIVED. MySQL executes these subqueries recursively and puts the results in a temporary table.
- SUBQUERY
A subquery contained in the SELECT or WHERE list.
- DEPENDENT SUBQUERY (may not show up in 5.7, whose optimizer rewrites many such subqueries)
A subquery in the SELECT or WHERE list that depends on the outer query.
DEPENDENT SUBQUERY vs. SUBQUERY
Rule of thumb: a dependent subquery typically returns multiple values and is evaluated against the outer rows | a plain subquery returns a single value.
- UNCACHEABLE SUBQUERY (may not show up in 5.7)
A subquery whose result cannot be cached and must be re-evaluated for each row of the outer query.
- UNION
If a second SELECT appears after UNION, it is marked UNION;
if the UNION is contained in a subquery in the FROM clause, the outer SELECT is marked DERIVED.
- UNION RESULT
The SELECT that retrieves the result from the temporary table generated by the UNION.
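The following statements are a small sketch for seeing several select_type values side by side, assuming the t1 and t2 test tables. The expected values in the comments follow the definitions above; the optimizer may still rewrite a query differently depending on version and settings.

```sql
-- Expected select_type values: PRIMARY, UNION, UNION RESULT
EXPLAIN
SELECT id, content FROM t1
UNION
SELECT id, content FROM t2;

-- A scalar subquery in WHERE; expected: PRIMARY for t1, SUBQUERY for t2
EXPLAIN
SELECT * FROM t1
WHERE id = (SELECT MAX(id) FROM t2);

-- A UNION inside FROM; expected: PRIMARY, DERIVED, UNION, UNION RESULT
EXPLAIN
SELECT * FROM (SELECT id FROM t1 UNION SELECT id FROM t2) AS u;
```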
table
The table that this row of the output refers to.
- When there is a subquery in the FROM clause, the table column has the form <derivedN>, meaning this row reads the result of the query whose id is N.
- When there is a UNION, the table column has the form <unionM,N>, where M and N are the ids of the SELECT rows that take part in the union.
type (very important)
type indicates the join type of this row (also called the access type or query type). From this value you can tell roughly how many rows this step of the query has to examine.
Its value is closely tied to whether the SQL has been optimized.
The common values, from best to worst, are: system > const > eq_ref > ref > range > index > ALL
In general, to ensure efficiency, a statement should reach at least the range level, and ideally ref. range is generally used for range lookups, so in other words, everything other than a range lookup should be optimized to the ref level.
What values are there?
Let's go through the cases that can occur one by one:
NULL
- Means MySQL can resolve the query during the optimization phase, so no table or index needs to be accessed during execution.
system / const
- MySQL can optimize part of the query and convert it into a constant (run SHOW WARNINGS to see the optimized result). This mainly applies to lookups by primary key or unique index: because the values do not repeat, at most one row can match, so the lookup is fast. system is a special case of const: when the (temporary) table contains only one row, the type is system.
eq_ref
- Unique index scan: for each index key, exactly one row in the table matches. Common in joins on a primary key or unique index.
ref
- Non-unique index scan that returns all rows matching a single value. It is still an index access, but because it may find more than one matching row, it is a mixture of lookup and scan.
range
- Retrieves only the rows in a given range, using an index to select them; the key column shows which index is used. It usually appears when the WHERE clause uses BETWEEN, <, >, IN, and similar operators.
- A range scan is better than a full table scan because it only needs to start at one point in the index and end at another, without scanning the whole index.
index
- Full Index Scan. The difference between index and ALL is that index traverses only the index tree. This is usually faster than ALL, because index files are usually smaller than data files.
- In other words, ALL and index both read the whole table, but index reads from the index while ALL reads the data rows themselves.
ALL
- Full Table Scan: traverses the entire table to find the matching rows.
index_merge
- The query needs to combine several indexes; this usually occurs in SQL that contains the OR keyword.
ref_or_null
- When a column needs both an equality or join condition and a NULL check, the query optimizer chooses the ref_or_null access type.
- By contrast, a primary key id generally cannot be NULL, so it will not produce ref_or_null.
index_subquery
- Uses an index to resolve the subquery, so the full table is no longer scanned.
unique_subquery
- Similar to index_subquery, but the subquery uses a unique index.
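The access types above can be explored with a few statements against the t1 and t2 test tables. This is only a sketch: idx_t1_content is a hypothetical helper index added for the illustration, and with tables this small the optimizer may pick a different type than the one named in each comment.

```sql
-- Hypothetical helper index, added only for this illustration
CREATE INDEX idx_t1_content ON t1(content);

-- const: lookup of a single row by primary key
EXPLAIN SELECT * FROM t1 WHERE id = 1;

-- eq_ref: for each row of the driving table, at most one row of the
-- other table is found through its primary key
EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.id = t2.id;

-- ref: equality on a non-unique index, several rows may match
EXPLAIN SELECT * FROM t1 WHERE content = 't1_500';

-- range: bounded scan of an index
EXPLAIN SELECT * FROM t1 WHERE id BETWEEN 1 AND 10;

-- index: the query is answered by walking the whole secondary index
EXPLAIN SELECT content FROM t1;

-- ALL: no usable index for the predicate, so every row is read
EXPLAIN SELECT * FROM t2 WHERE content = 't2_500';

-- Remove the helper index again
DROP INDEX idx_t1_content ON t1;
```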
possible_keys
- Shows the indexes that could be applied to this table, one or more.
- If an index exists on a column involved in the query, it is listed here, but the query does not necessarily use it.
key
- The index actually used. If NULL, no index is used.
- If the query uses a covering index, that index overlaps with the columns in the SELECT list.
- In other words, when you query a column that has a matching index, key shows that index instead of NULL.
key_len
- The number of bytes of the index that are used. From this column you can work out how much of the index the query uses; without losing accuracy, shorter is usually better.
- It shows the maximum possible length, not necessarily the length actually used.
- That is, it is calculated from the table definition rather than from the data retrieved.
- key_len helps you check whether the index is being used to its full extent.
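Because key_len is derived from the table definition, it can be worked out by hand. The sketch below uses a hypothetical table key_len_demo (not part of the original article) with the utf8 character set, where each character may take up to 3 bytes, a VARCHAR adds 2 length bytes, and a nullable column adds 1 byte. The values in the comments are what the arithmetic gives if the optimizer chooses the index.

```sql
-- Hypothetical table used only for this illustration
CREATE TABLE key_len_demo (
  id   INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(24) NOT NULL,
  age  INT NOT NULL,
  pos  VARCHAR(20) NULL,
  KEY idx_name_age_pos (name, age, pos)
) CHARSET utf8;

-- name only: 24 chars * 3 bytes + 2 length bytes = 74
EXPLAIN SELECT * FROM key_len_demo WHERE name = 'July';

-- name + age: 74 + 4 bytes (INT) = 78
EXPLAIN SELECT * FROM key_len_demo WHERE name = 'July' AND age = 23;

-- name + age + pos: 78 + 20 * 3 + 2 length bytes + 1 NULL-flag byte = 141
EXPLAIN SELECT * FROM key_len_demo WHERE name = 'July' AND age = 23 AND pos = 'dev';

DROP TABLE key_len_demo;
```

A key_len smaller than the full total is a quick sign that only a prefix of the composite index is being used.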
ref
- Shows which columns or constants are compared with the index named in the key column to select rows; where possible it is a constant (const).
rows
- The number of rows MySQL estimates it must examine to execute the query.
Extra (important)
Contains additional information that does not fit in the other columns but is very important.
Using filesort
- Means MySQL sorts the data with an extra sorting pass instead of reading the rows in index order.
- A sort that MySQL cannot complete by using an index is called a "file sort".
- Treat this as a serious warning sign.
- It appears, for example, when a composite index exists but the ORDER BY uses only one of its later columns, so the index order cannot be used.
Using temporary
- A temporary table is used to hold intermediate results; MySQL uses a temporary table while sorting the query result.
- Common with ORDER BY and GROUP BY.
- To use an index for grouping, the GROUP BY columns must follow the order in which the index columns were defined, without skipping any. Otherwise this Extra value is likely to appear, usually together with Using filesort.
Using index
Means the SELECT uses a covering index and avoids reading the table's data rows, which is efficient.
If Using where appears at the same time, the index is also used to look up key values; if it does not, the index is only used to read data, not to perform lookups.
Overlay index
- Simply put: all the columns I am querying are contained in the composite index I built, so the query can be answered from the index alone. It is effectively an index scan.
- An index is an efficient way to find rows, but a database can also use an index to fetch a column's data without reading the entire row. Index leaf nodes store the data they index, so when the needed data can be obtained by reading the index alone, there is no need to read the row.
- An index that contains (covers) all the columns needed by the SELECT list and the WHERE conditions is called a covering index.
SELECT id , name FROM t_xxx WHERE age = 18;
- Suppose there is a composite index idx_id_name_age_xxx that contains the three columns id, name, and age. The query then reads the indexed columns directly instead of fetching other data from the row, which is more efficient.
- This seems well suited to large data volumes and to queries that always read the same fixed set of columns.
- Note: to use a covering index, the select list must fetch only the columns that are needed, and those columns must be included in the covering index. Do not use SELECT *, because indexing all columns together makes the index file too large and degrades query performance.
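A covering index can be tried out on a small throwaway table. The sketch below uses a hypothetical table cover_demo and index idx_age_name (both invented for this illustration); it puts age first so that the WHERE condition can also use the index for the lookup.

```sql
-- Hypothetical table and index, used only for this illustration
CREATE TABLE cover_demo (
  id    INT PRIMARY KEY AUTO_INCREMENT,
  name  VARCHAR(20) NOT NULL,
  age   INT NOT NULL,
  email VARCHAR(50) NOT NULL,
  KEY idx_age_name (age, name)
) ENGINE=InnoDB;

-- Covered: age and name are in the index, and InnoDB secondary indexes
-- also carry the primary key (id); Extra is expected to show "Using index"
EXPLAIN SELECT id, name FROM cover_demo WHERE age = 18;

-- Not covered: email is not in the index, so the data rows must be read
EXPLAIN SELECT id, name, email FROM cover_demo WHERE age = 18;

DROP TABLE cover_demo;
```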
Using where
- Indicates that a WHERE filter is applied.
Using join buffer
Indicates that the join buffer is used.
Occurs when two tables are joined:
- The driven table has no index on the join column (for a LEFT JOIN the left table is the driving table; for an INNER JOIN it is usually the table with less data). Adding an index on the driven table's join column solves this, and type becomes ref.
If there are many joins, increase join_buffer_size in the configuration file.
Impossible WHERE
- Indicates that the WHERE clause is always false, so no rows can be selected.
Select tables optimized away
- Without a GROUP BY clause, MIN/MAX operations are optimized through an index, or COUNT(*) is optimized for the MyISAM storage engine. The result does not have to be computed in the execution phase; the optimization is completed while the execution plan is generated.
Example
Execution order?
t2 -> t1 -> t3 -> <derived3> -> <union1,4>
Index optimization
Optimization analysis
SQL performance degrades, with long execution times and long waits, for reasons such as:
- The query statement is poorly written
- No index, too many subqueries.
- Index failure
- An index was built but is not used, or no index was built.
- Too many joins in associated queries
- This may be caused by a design flaw in the database, or be unavoidable.
- It often causes long execution times.
- Server tuning and parameter settings
- Buffers, number of threads, and so on.
- This often leads to long waits at peak concurrency.
Single table optimization
Build table SQL
CREATE TABLE IF NOT EXISTS `article` (
`id` INT(10) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
`author_id` INT(10) UNSIGNED NOT NULL,
`category_id` INT(10) UNSIGNED NOT NULL,
`views` INT(10) UNSIGNED NOT NULL,
`comments` INT(10) UNSIGNED NOT NULL,
`title` VARBINARY(255) NOT NULL,
`content` TEXT NOT NULL
);
INSERT INTO `article`(`author_id`, `category_id`, `views`, `comments`, `title`, `content`) VALUES
(1, 1, 1, 1, '1', '1'),
(2, 2, 2, 2, '2', '2'),
(1, 1, 3, 3, '3', '3');
SELECT * FROM article;
Case SQL
-- Find the article id with the most views where category_id is 1 and comments is greater than 1
SELECT id,author_id FROM article WHERE category_id = 1 AND comments > 1 ORDER BY views DESC LIMIT 1;
Optimization starts
EXPLAIN SELECT `id`,`author_id` FROM `article` WHERE `category_id` = 1 AND `comments` > 1 ORDER BY `views` DESC LIMIT 1;
We can see that type is ALL (full table scan). It is a simple single-table query, yet Using filesort appears. Taken together, this is the worst case.
We can use an index to move the access type toward range:
-- Build a composite index
ALTER TABLE `article` ADD INDEX idx_article_ccv(`category_id`,`comments`,`views`);
Using filesort has not been eliminated, although range is already within the acceptable range.
Why is there still a file sort instead of an index sort, even though we built an index?
- Because of how a BTree index works: it sorts by category_id first, then by comments when category_id is equal, then by views when comments is equal.
- In our SQL statement, the comments column sits in the middle of the composite index,
- and the condition comments > 1 is a range (hence type is range),
- so MySQL cannot use the index for the views part that follows; that is, index columns after a range condition are not usable.
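The effect of the range condition can be checked by comparing it with an equality condition on the same column. This sketch assumes the article table and a composite index idx_article_ccv on (category_id, comments, views); the comments state what the BTree ordering implies rather than captured output.

```sql
-- Range on the middle column: matching entries are ordered by (comments, views),
-- not by views alone, so a filesort is still expected
EXPLAIN SELECT id, author_id FROM article
WHERE category_id = 1 AND comments > 1
ORDER BY views DESC LIMIT 1;

-- Equality on the middle column: within one (category_id, comments) pair
-- the entries are already ordered by views, so no filesort is expected
EXPLAIN SELECT id, author_id FROM article
WHERE category_id = 1 AND comments = 3
ORDER BY views DESC LIMIT 1;
```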
After this analysis, let's build a composite index that leaves out the column used in the range condition.
-- Delete the first index
DROP INDEX idx_article_ccv ON article;
-- Create a new index
CREATE INDEX idx_article_cv ON article(`category_id`,`views`);
The final analysis:
This is the result we want most: type = ref, ref = const, and Using filesort is gone from Extra.
Association query optimization
Build table SQL
CREATE TABLE IF NOT EXISTS `class` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`card` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE IF NOT EXISTS `book` (
`bookid` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`card` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`bookid`)
);
CREATE TABLE IF NOT EXISTS `phone` (
`phoneid` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`card` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`phoneid`)
);
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO class(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
INSERT INTO phone(card) VALUES(FLOOR(1 + (RAND() * 20)));
Case 1: two tables
Case SQL
-- Two tables
SELECT * FROM class LEFT JOIN book ON class.card = book.card;
Optimization starts
-- Analyze the following statements
EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card;
ALL and Using join buffer show up right away, which is not what we want. We aim to optimize toward ref, range, or const.
Try adding an index to eliminate ALL.
ALTER TABLE `book` ADD INDEX idx_c(`card`);
We can see that the second row is now ref and Using join buffer is gone, but the first row is still ALL. This follows from how a left join works: the LEFT JOIN condition determines how rows are searched in the right table, while every row of the left table is returned anyway. So the right table is the key point and must have an index.
Let's see whether the ALL in the first row can be eliminated too, by creating an index on the card column of the left table.
ALTER TABLE `class` ADD INDEX idx_c(`card`);
It works: the ALL in the first row becomes index, which is acceptable.
But in general it is enough to stop at the previous step. Extra indexes take up space, and since the rows estimate stays the same, we might as well save the space.
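The same reasoning applies in mirror image to a RIGHT JOIN. The statements below are a sketch that assumes the class and book tables above; they only illustrate which side needs the index and how to check which indexes currently exist.

```sql
-- LEFT JOIN: every class row is returned, so the lookups happen in book
-- and the index belongs on book.card
EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card;

-- RIGHT JOIN: every book row is returned, so the lookups happen in class
-- and the index belongs on class.card
EXPLAIN SELECT * FROM class RIGHT JOIN book ON class.card = book.card;

-- List the indexes that currently exist on each table
SHOW INDEX FROM class;
SHOW INDEX FROM book;
```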
Case 2: three tables
Case SQL
-- Delete the redundant index first
DROP INDEX idx_c ON class;
DROP INDEX idx_c ON book;
-- The SQL to analyze in this case
SELECT * FROM class LEFT JOIN book ON class.card = book.card LEFT JOIN phone ON book.card = phone.card;
Optimization starts
First analyze this statement
EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card LEFT JOIN phone ON book.card = phone.card;
As we can see, this is an ordinary statement that has not been optimized yet. It can be optimized with the same rules as the two-table case.
We start by adding indexes to the two right-hand tables of the left joins.
-- Add new indexes
ALTER TABLE `book` ADD INDEX B (`card`);
ALTER TABLE `phone` ADD INDEX P (`card`);
Optimized to an acceptable level.
MySQL FILESORT (key point)
Single-pass sort
What is it?
It reads all the columns the query needs from disk, sorts them in the buffer by the ORDER BY columns, and then scans the sorted list to produce the output. It is more efficient because it avoids reading the data a second time and turns random I/O into sequential I/O, but it uses more space because it keeps every row in memory.
Two-pass sort
What is it?
Before MySQL 4.1, two-pass sort was used: as the name suggests, it scans the disk twice to get the data. It reads the row pointers and the ORDER BY columns, sorts them, then scans the sorted list and re-reads the corresponding rows according to the values in the list.
In short: read the sort columns from disk, sort them in the buffer, then read the other columns from disk.
Compare
- Two-pass sort has to go to disk both to get the sort columns and to fetch the rows afterwards. Two I/O operations make it slower.
- Single-pass sort keeps the data being sorted in memory and saves one I/O operation, so it is faster, but it needs enough memory.
Conclusion and question
Single-pass sort was introduced later, so overall it performs better than two-pass sort.
However, in sort_buffer, single-pass sort takes up much more space than two-pass sort because it fetches all the columns. The total size of the fetched data may therefore exceed the capacity of sort_buffer. In that case only a sort_buffer-sized chunk can be read and sorted at a time (creating a tmp file and doing a multi-way merge), then the next sort_buffer-sized chunk is read and sorted, and so on, which leads to many I/O operations.
The goal was to save one I/O operation, but the result is a lot more I/O, which is not worth it.
This is where operations staff or a DBA needs to adjust the sort_buffer size in the MySQL configuration file according to the workload.
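Before changing the configuration file, the relevant settings can be inspected and tried out per session. This is a sketch using standard server variables; the 2 MB value is an arbitrary example, not a recommendation.

```sql
-- Current per-session sort buffer size, in bytes
SHOW VARIABLES LIKE 'sort_buffer_size';

-- Row-length threshold used by older versions to choose between the
-- single-pass and two-pass algorithms (deprecated in recent 8.0 releases)
SHOW VARIABLES LIKE 'max_length_for_sort_data';

-- How many merge passes sorts have needed; a growing value suggests
-- that sorts are spilling out of the sort buffer
SHOW GLOBAL STATUS LIKE 'Sort_merge_passes';

-- Try a larger buffer for the current session only (example value)
SET SESSION sort_buffer_size = 2 * 1024 * 1024;
```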
ORDER BY optimization
MySQL supports two ways of sorting, FileSort and Index. Index is efficient: it means MySQL scans the index itself to complete the sort. FileSort is less efficient.
Before we start optimizing an ORDER BY clause, we need to know what we are aiming for: for ORDER BY, try to sort by index and avoid FILESORT.
Case analysis
Build table SQL
CREATE TABLE tblA(
id int primary key not null auto_increment,
age INT,
birth TIMESTAMP NOT NULL,
name varchar(200)
);
INSERT INTO tblA(age,birth,name) VALUES(22,NOW(),'abc');
INSERT INTO tblA(age,birth,name) VALUES(23,NOW(),'bcd');
INSERT INTO tblA(age,birth,name) VALUES(24,NOW(),'def');
CREATE INDEX idx_A_ageBirth ON tblA(age,birth,name);
SELECT * FROM tblA;
Case 1
Statement 1: the leading index column is used in WHERE, but ORDER BY puts a later index column first and the earlier one after it. The order does not match the index, so filesort appears.
Statement 2: the leading column is used in WHERE, and ORDER BY uses the first and second index columns in order.
Statement 3: the leading column is used in WHERE, and the sort uses the leading column, which is indexed.
Statement 4: the leading column is used in WHERE, but the sort uses a later column. An index column that does not appear in the WHERE clause but does appear in the ORDER BY clause makes index sort fail, so filesort appears.
Case 2
Statement 1: index sort fails because the ORDER BY does not start with the leading index column.
Statement 2: index sort fails because the ORDER BY does not start with the leading index column.
Statement 3: index sort works. The WHERE clause uses the index columns in order and the index only stops being usable after the range condition, so there is no full table scan, and the ORDER BY starts with the leading column.
Statement 4: index sort fails. The index is stored in ascending order, and forcing a different direction in ORDER BY (for example ascending on one column and descending on another) prevents it from being used for the sort.
Case summary
- For ORDER BY, try to sort by index and avoid FILESORT.
- Sort on index columns wherever possible and follow the leftmost prefix rule of the index.
- If the sort is not on index columns, filesort has two algorithms: two-pass sort and single-pass sort.
- ORDER BY sorts by index in two cases:
- The ORDER BY uses the leftmost column(s) of the index.
- The columns of the WHERE clause and the ORDER BY clause together satisfy the leftmost prefix rule.
- If the WHERE clause contains an index range query (EXPLAIN shows range), index sort for ORDER BY may fail.
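The rules in the summary can be tried against tblA and its index idx_A_ageBirth on (age, birth, name). The statements below are illustrative only and are not the statements from the original screenshots; the comments say what the index order implies, and a real EXPLAIN may differ with version and data.

```sql
-- ORDER BY starts with the leading column: the index order can be used
EXPLAIN SELECT * FROM tblA WHERE age > 20 ORDER BY age;
EXPLAIN SELECT * FROM tblA WHERE age > 20 ORDER BY age, birth;

-- Equality on age, then ORDER BY the next column: still the index order
EXPLAIN SELECT * FROM tblA WHERE age = 22 ORDER BY birth;

-- Range on age, then ORDER BY birth only: "Using filesort" is expected
EXPLAIN SELECT * FROM tblA WHERE age > 20 ORDER BY birth;

-- ORDER BY does not start with the leading column: "Using filesort" is expected
EXPLAIN SELECT * FROM tblA ORDER BY birth;

-- Mixed directions do not match an ascending index: "Using filesort" is expected
EXPLAIN SELECT * FROM tblA ORDER BY age ASC, birth DESC;

-- The same direction on every column can be served by scanning the index backward
EXPLAIN SELECT * FROM tblA ORDER BY age DESC, birth DESC;
```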
GROUP BY optimization
Let's start with an example. GROUP BY essentially sorts first and then groups, so a temporary table may be created.
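As a small sketch of that behavior, assuming tblA and its index idx_A_ageBirth on (age, birth, name) from the ORDER BY case, grouping on the leading index column can be compared with grouping on a later column. The comments describe the expected tendency, not captured output.

```sql
-- Grouping on the leading index column can read the groups in index order
EXPLAIN SELECT age, COUNT(*) FROM tblA GROUP BY age;

-- Grouping on a non-leading column cannot; "Using temporary" is expected
-- (older versions typically add "Using filesort" as well)
EXPLAIN SELECT birth, COUNT(*) FROM tblA GROUP BY birth;
```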
Prevent index failure
Case study SQL
CREATE TABLE staffs (
id INT PRIMARY KEY AUTO_INCREMENT,
NAME VARCHAR (24) NULL DEFAULT '' COMMENT 'full name',
age INT NOT NULL DEFAULT 0 COMMENT 'age',
pos VARCHAR (20) NOT NULL DEFAULT '' COMMENT 'position',
add_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'entry time'
) CHARSET utf8 COMMENT 'employee records';
INSERT INTO staffs(NAME,age,pos,add_time) VALUES('z3',22,'manager',NOW());
INSERT INTO staffs(NAME,age,pos,add_time) VALUES('July',23,'dev',NOW());
INSERT INTO staffs(NAME,age,pos,add_time) VALUES('2000',23,'dev',NOW());
INSERT INTO staffs(NAME,age,pos,add_time) VALUES(null,23,'dev',NOW());
ALTER TABLE staffs ADD INDEX idx_staffs_nameAgePos(name, age, pos);
The leftmost prefix rule (key point)
If an index covers more than one column, follow the leftmost prefix rule: the query must start from the leftmost column of the index and must not skip columns in the index.
Start with the leftmost condition. If the index is to be used, you have to begin with its leftmost column; you cannot drop it or skip over it. It is like a building: the third floor cannot be built without the first and second floors.
The middle column cannot be skipped either: you cannot build the third floor directly on top of the first.
The index idx_staffs_nameAgePos was created on name, age, pos in that order. A full-value match is a query that matches all of these columns in order.
So following the leftmost prefix rule reduces the chance of index failure.
EXPLAIN SELECT * FROM staffs WHERE NAME = 'July';
EXPLAIN SELECT * FROM staffs WHERE NAME = 'July' AND age = 25;
-- Full-value match, my favorite
EXPLAIN SELECT * FROM staffs WHERE NAME = 'July' AND age = 25 AND pos = 'dev';
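For contrast with the matching queries above, the following sketch shows queries that break the leftmost prefix rule. It assumes the staffs table and idx_staffs_nameAgePos on (name, age, pos); the comments describe what the rule predicts.

```sql
-- Leading column missing: the index cannot be used for the lookup
EXPLAIN SELECT * FROM staffs WHERE age = 23 AND pos = 'dev';
EXPLAIN SELECT * FROM staffs WHERE pos = 'dev';

-- Middle column skipped: only the name part of the index is used,
-- which shows up as a key_len covering name alone
EXPLAIN SELECT * FROM staffs WHERE NAME = 'July' AND pos = 'dev';
```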
Do not operate on index columns
Do not perform any operation on an index column, such as a calculation, a function, or an automatic or manual type conversion, because doing so invalidates the index and leads to a full table scan.
-- Wrong: applying the left function to an indexed column
EXPLAIN SELECT * FROM staffs WHERE left(NAME,4) = 'July';
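One way to avoid the function is to express the same intent on the bare column. The sketch below assumes the staffs table and its index; the LIKE form is roughly equivalent to the LEFT form for this data, and it leaves the indexed column untouched so that a range scan becomes possible.

```sql
-- Function on the indexed column: the index on NAME cannot be used for the lookup
EXPLAIN SELECT * FROM staffs WHERE LEFT(NAME, 4) = 'July';

-- Prefix match on the bare column: a range scan on the index is possible
EXPLAIN SELECT * FROM staffs WHERE NAME LIKE 'July%';
```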
Storage engine problem
The storage engine cannot use the column to the right of the range condition in the index
Comparing the two results, the second query uses only two of the index columns and the third column's part of the index is not used: everything after the range column is invalid. So range queries like this should be optimized wherever possible.
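A pair of statements along the following lines can be used to reproduce the comparison. They are a sketch assuming the staffs table and idx_staffs_nameAgePos on (name, age, pos), not the exact statements from the original screenshots.

```sql
-- All three columns matched by equality: the whole index is usable
EXPLAIN SELECT * FROM staffs WHERE NAME = 'July' AND age = 23 AND pos = 'dev';

-- Range on age: name and age are used, pos (to the right of the range) is not,
-- so key_len is expected to be shorter and type to be range
EXPLAIN SELECT * FROM staffs WHERE NAME = 'July' AND age > 20 AND pos = 'dev';
```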
Use covering indexes where possible
Reduce the use of SELECT *, as the title says.
Not-equal comparisons in MySQL (important)
Using not-equal (!= or <>) usually prevents MySQL from using the index and causes a full table scan. Note that the index on the column compared with != cannot be used, so the whole index becomes invalid.
is not null and is null
The former usually cannot use the index; the latter can.
Use LIKE sparingly (key point)
With a condition where LIKE starts with a wildcard ('%xxxx'), the MySQL index fails and the query becomes a full table scan.
like 'abc%'
The type is range, so the index can be used.
But what about cases where %xxx% is unavoidable?
- We can use a covering index to solve it.
CREATE TABLE `tbl_user` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`NAME` VARCHAR(20) DEFAULT NULL,
`age` INT(11) DEFAULT NULL,
email VARCHAR(20) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
INSERT INTO tbl_user(NAME,age,email) VALUES('1aa1',21,'[email protected]');
INSERT INTO tbl_user(NAME,age,email) VALUES('2aa2',222,'[email protected]');
INSERT INTO tbl_user(NAME,age,email) VALUES('3aa3',265,'[email protected]');
INSERT INTO tbl_user(NAME,age,email) VALUES('4aa4',21,'[email protected]');
INSERT INTO tbl_user(NAME,age,email) VALUES('aa',121,'[email protected]');
-- Before creating the index
-- No secondary index exists yet, so every statement below does a full table scan regardless of the selected columns
EXPLAIN SELECT NAME,age FROM tbl_user WHERE NAME LIKE '%aa%';
EXPLAIN SELECT id FROM tbl_user WHERE NAME LIKE '%aa%';
EXPLAIN SELECT NAME FROM tbl_user WHERE NAME LIKE '%aa%';
EXPLAIN SELECT age FROM tbl_user WHERE NAME LIKE '%aa%';
EXPLAIN SELECT id,NAME FROM tbl_user WHERE NAME LIKE '%aa%';
EXPLAIN SELECT id,NAME,age FROM tbl_user WHERE NAME LIKE '%aa%';
EXPLAIN SELECT NAME,age FROM tbl_user WHERE NAME LIKE '%aa%';
EXPLAIN SELECT * FROM tbl_user WHERE NAME LIKE '%aa%';
EXPLAIN SELECT id,NAME,age,email FROM tbl_user WHERE NAME LIKE '%aa%';
-- Create a composite index
CREATE INDEX idx_user_nameAge ON tbl_user(NAME,age);
-- DROP INDEX idx_user_nameAge ON tbl_user
-- after index
EXPLAIN SELECT * FROM tbl_user WHERE NAME =800 AND age = 33;
To sum up: to solve this problem, build a composite index and select only columns that the index contains. For example, with a composite index on three columns a, b, c, the query may select only a, b, c and the primary key (which InnoDB stores in every secondary index). Selecting any column outside these means the index no longer covers the query, and the index is not used.
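Applied to the tbl_user table above, a sketch of a covered and a non-covered '%aa%' query, run after idx_user_nameAge has been created:

```sql
-- id, NAME and age are all contained in idx_user_nameAge (NAME, age) plus the primary key,
-- so the index covers the query even though the pattern starts with a wildcard
EXPLAIN SELECT id, NAME, age FROM tbl_user WHERE NAME LIKE '%aa%';

-- email is not part of the index, so the query is no longer covered
EXPLAIN SELECT id, NAME, age, email FROM tbl_user WHERE NAME LIKE '%aa%';
```

In the covered case the Extra column is expected to include Using index. The index is still scanned in full, but the table rows are not read.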
Type conversion
Type conversion invalidates the index. For example, if a string column is compared with a value that is not wrapped in single quotes, MySQL converts the type implicitly and the index cannot be used.
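A sketch of the two cases on the tbl_user table above, where NAME is a VARCHAR column and idx_user_nameAge has been created:

```sql
-- NAME is compared with a number: implicit conversion, the index cannot be used to locate rows
EXPLAIN SELECT * FROM tbl_user WHERE NAME = 800 AND age = 33;

-- Quoted string literal: the types match, so idx_user_nameAge can be used
EXPLAIN SELECT * FROM tbl_user WHERE NAME = '800' AND age = 33;
```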
Use the OR keyword sparingly
OR should also be used sparingly, because conditions joined with OR can also invalidate the index.
Summary
Optimization summary chart ( important )
Example summary diagram
General advice
- For single-column indexes, choose the index that filters the current query best.
- When choosing a composite index, place the column that filters the current query best as far to the left in the index column order as possible.
- When choosing a composite index, prefer one that contains more of the columns in the current query's WHERE clause.
- Where possible, analyze the statistics and adjust the query so that the appropriate index is chosen.
Summary of join query optimization
Make sure the join column of the driven table is indexed (driven table: the table after JOIN, that is, the table being looked up).
MySQL automatically selects the small table as the driving table. Because the driving table is fully scanned anyway, the fewer rows scanned the better.
With LEFT JOIN, choose the small table as the driving table and the large table as the driven table; RIGHT JOIN is the reverse (with LEFT JOIN the left table is always the driving table and the right table the driven table; with RIGHT JOIN it is the opposite).
With INNER JOIN, MySQL picks the table with the smaller result set as the driving table.
Do not put subqueries in the driven table, because the index may not be used.
Reduce the number of joins as much as possible.
Always remember: small tables drive large tables.
Optimize the statements in the inner level of a nested loop first.
When the join column of the driven table cannot be guaranteed to be indexed and memory is sufficient, do not be too stingy with the join buffer setting (join_buffer_size).
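The first and last points can be sketched as follows. The tables demo_class and demo_book and the index name are hypothetical, introduced only for illustration, and the buffer size is an arbitrary example value.

```sql
-- Hypothetical demo tables, not part of the original notes
CREATE TABLE IF NOT EXISTS demo_class (
  id   INT NOT NULL AUTO_INCREMENT,
  card INT NOT NULL,
  PRIMARY KEY (id)
) ENGINE=INNODB;

CREATE TABLE IF NOT EXISTS demo_book (
  bookid INT NOT NULL AUTO_INCREMENT,
  card   INT NOT NULL,
  PRIMARY KEY (bookid)
) ENGINE=INNODB;

-- LEFT JOIN: demo_class (left) is the driving table and demo_book (right) is the driven table,
-- so the index goes on the join column of the driven table
CREATE INDEX idx_demo_book_card ON demo_book(card);

EXPLAIN SELECT * FROM demo_class LEFT JOIN demo_book ON demo_class.card = demo_book.card;

-- If the driven table's join column cannot be indexed, a larger join buffer may help
-- (4 MB here is only an illustrative value)
SET SESSION join_buffer_size = 4194304;
```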
Summary of EXISTS and IN subquery optimization
With an index:
- INNER JOIN is best, IN is second, and EXISTS is worst.
- inner join > in > exists.
Without an index:
- Small table drives large table
  - The join approach needs DISTINCT, and DISTINCT without an index is expensive, therefore exists > in > join.
- Large table drives small table
  - IN and EXISTS perform about the same and both are relatively poor, with EXISTS slightly better by a little over 5%. INNER JOIN is much faster because it uses the join buffer.
  - inner join > exists > in
  - LEFT JOIN is the slowest.
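For reference, the three forms being compared can be sketched like this. The tables demo_a and demo_b are hypothetical and exist only for illustration. Which form is fastest depends on indexes, table sizes and MySQL version, so it is worth checking each with EXPLAIN.

```sql
-- Hypothetical demo tables, not part of the original notes
CREATE TABLE IF NOT EXISTS demo_a (
  id   INT NOT NULL,
  name VARCHAR(20),
  PRIMARY KEY (id)
) ENGINE=INNODB;

CREATE TABLE IF NOT EXISTS demo_b (
  id   INT NOT NULL,
  a_id INT NOT NULL,
  PRIMARY KEY (id)
) ENGINE=INNODB;

-- IN subquery
EXPLAIN SELECT a.* FROM demo_a a WHERE a.id IN (SELECT b.a_id FROM demo_b b);

-- EXISTS subquery
EXPLAIN SELECT a.* FROM demo_a a WHERE EXISTS (SELECT 1 FROM demo_b b WHERE b.a_id = a.id);

-- INNER JOIN: DISTINCT is needed because one demo_a row may match several demo_b rows
EXPLAIN SELECT DISTINCT a.* FROM demo_a a INNER JOIN demo_b b ON b.a_id = a.id;
```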
This article is based on the Shang Silicon Valley MySQL advanced tutorial.
The notes are synced to a Gitee repository. Feel free to grab them; stars and corrections are welcome: https://gitee.com/noblegasesgoo/notes
If anything here is wrong, I hope the experts in the comments will discuss and correct it. Let's keep the community healthy together; wrong knowledge should not be tolerated.
—————————————————————— Love you noblegasesgoo