Does MySQL not in use an index? Let's find out

2022-06-09 08:59:00 Dada Liu__

1. Create test data

Generating the test data requires a stored procedure. That was covered briefly in the previous article, so I won't repeat it here.

Create the table:

CREATE TABLE test (
id INT NOT NULL AUTO_INCREMENT,
second_key INT,
text VARCHAR(20),
field_4 VARCHAR(20),
status VARCHAR(10),
create_date date,
PRIMARY KEY (id),
KEY idx_second_key (second_key)
) Engine=InnoDB CHARSET=utf8;

Insert 1,000,000 rows:

call test_insert(1000000);

Part of the data is shown below:

mysql> select * from test.test limit 10;
+----+------------+------+------------+--------+-------------+
| id | second_key | text | field_4    | status | create_date |
+----+------------+------+------------+--------+-------------+
|  1 |          0 | t0   | 367a170042 | good   | 1974-04-02  |
|  2 |         10 | t1   | 14fcc361da | good   | 1981-02-06  |
|  3 |         20 | t2   | ad27ff39dd | good   | 1987-12-14  |
|  4 |         30 | t3   | cc25aba017 | good   | 1994-10-20  |
|  5 |         40 | t4   | ce6e4bacb1 | good   | 1974-04-10  |
|  6 |         50 | t5   | b0eb6d3801 | good   | 1981-02-13  |
|  7 |         60 | t6   | bb005167b1 | good   | 1987-12-21  |
|  8 |         70 | t7   | 37ea9bb71f | good   | 1994-10-27  |
|  9 |         80 | t8   | 300393e7e5 | good   | 1974-04-17  |
| 10 |         90 | t9   | 89e861ceb6 | good   | 1981-02-21  |
+----+------------+------+------------+--------+-------------+
10 rows in set (0.00 sec)

2. explain in detail

Running explain select * from test\G gives the following output:

mysql> explain select * from test\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 996473
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

We need to focus on these fields:

  • type
  • rows
  • Extra

Let's go through them one by one.
type shows the access type MySQL uses to execute the statement. The possible values are system, const, eq_ref, ref, fulltext, ref_or_null, index_merge, unique_subquery, index_subquery, range, index, and ALL.

From best to worst, they rank as follows:
system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL. As a rule of thumb, make sure a query reaches at least the range level, and preferably ref.
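As a small illustration of this ordering, here is a Python sketch that ranks the type values and checks whether a plan reaches at least the range level. The names ACCESS_TYPES, rank, and is_at_least are made up for this example and are not part of MySQL.

```python
# Access types as listed in the article, from best to worst.
ACCESS_TYPES = [
    "system", "const", "eq_ref", "ref", "fulltext", "ref_or_null",
    "index_merge", "unique_subquery", "index_subquery", "range",
    "index", "ALL",
]


def rank(access_type: str) -> int:
    """Return the position of an access type in the list; lower is better."""
    return ACCESS_TYPES.index(access_type)


def is_at_least(access_type: str, target: str = "range") -> bool:
    """True if access_type is as good as, or better than, target."""
    return rank(access_type) <= rank(target)


if __name__ == "__main__":
    # Check a few type values against the "at least range" rule of thumb.
    for t in ("const", "ref", "range", "index", "ALL"):
        print(t, is_at_least(t))
```

A check like this could be applied to the type column of each explain row to flag the queries that fall below range.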

  1. system: fairly rare. It appears when the engine is MyISAM or Memory and the table has exactly one record, so the row can be fetched directly. It is uncommon enough to ignore.
  2. const: the query matches the primary key or a unique secondary index against a constant, for example where id = 1
mysql> explain select * from test where id=1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)
Note: with const, MySQL knows that at most one row can match, so the optimizer reads that row once before executing the query, which is very fast. When the primary key is compared with a constant in the where clause, MySQL turns the query into a constant lookup (efficient).
  3. eq_ref: in a join, the joined table is accessed by an equality match on its primary key or a unique index (see table b in the output below).
mysql> explain select a.second_key from test a left join test1 b on a.id=b.id where a.second_key = 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
   partitions: NULL
         type: ref
possible_keys: idx_second_key
          key: idx_second_key
      key_len: 5
          ref: const
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.a.id
         rows: 1
     filtered: 100.00
        Extra: Using index
2 rows in set, 1 warning (0.00 sec)
  4. ref and ref_or_null: a non-unique index is matched by equality against a constant. ref_or_null is the same, except that MySQL also searches for rows containing NULL, as in where second_key = 10 or second_key is null
  5. fulltext and index_merge are uncommon, so we skip them.
  6. unique_subquery and index_subquery: an in subquery is resolved by an equality lookup on a unique index or on an ordinary (non-unique) index, respectively.
  7. range: a range query that uses an index, for example where second_key > 10 and second_key < 90
mysql> explain select second_key from test  where second_key >10 and second_key < 90\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: range
possible_keys: idx_second_key
          key: idx_second_key
      key_len: 5
          ref: NULL
         rows: 7
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)
  8. index: the query uses an index, but the entire index has to be scanned.
mysql> explain select second_key from test\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: index
possible_keys: NULL
          key: idx_second_key
      key_len: 5
          ref: NULL
         rows: 996473
     filtered: 100.00
        Extra: Using index
1 row in set, 1 warning (0.00 sec)
  9. ALL: self-explanatory. No index is used, and the whole table is scanned.

Next is rows: the number of rows MySQL estimates it will have to scan to execute the statement.
Last comes the most important field, Extra. Although the name suggests it is only supplementary, it matters, because it helps you pin down how MySQL actually executes the statement. Let's go through the key values.

  1. Using index: the query conditions and the returned columns are all stored in the index, so the index covers the query and there is no need to go back to the table, for example select second_key from test where second_key = 10
  2. Using index condition: the classic index condition pushdown. The query uses the index, but not as an exact match; conditions are checked against the index entries during the scan, and only then are the matching rows fetched from the table, for example explain select * from test where second_key > 9000000 and second_key like '%0'\G
mysql> explain select * from test where second_key > 9000000  and second_key like '%0'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: range
possible_keys: idx_second_key
          key: idx_second_key
      key_len: 5
          ref: NULL
         rows: 202716
     filtered: 100.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)
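  3. Using where: MySQL filters the rows it reads with the where condition. It typically appears when a condition on a column without a usable index leads to a full table scan, for example select * from test where text = 't'
  4. Using filesort: the sort cannot be done through an index, so the rows have to be sorted in memory or on disk, for example select * from test where text = 't' order by text desc limit 10. Note that the output below shows only Using where: the where clause pins text to a single value, so there appears to be nothing left to sort.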
mysql> explain select * from test where text = 't' order by text desc\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 996473
     filtered: 10.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

As you may have noticed, both the type values and the Extra values are listed from best to worst performance, so when optimizing SQL, aim for the values nearer the front. That is a brief introduction to the key fields, but we are still one step short of being able to analyze whether not in uses an index: we need to understand how MySQL indexes work. Below is a diagram of a B+ tree index, which is the structure behind MySQL's indexes.

3. How indexes work

MySQL builds one tree per index, and we should be able to picture these trees in our heads. The two trees I have in mind look like this.

  1. The first tree is the primary key index. Each Page is the most important unit of a B+ tree, the page, which we also call a node here. Non-leaf nodes store no row data, only pointers to child nodes; leaf nodes store the primary key together with all other column values. Nodes at the same level are linked to their left and right neighbors by two-way pointers, forming a doubly linked list. Each block inside a page can be thought of as a record, and the records within a page are linked by one-way pointers into a singly linked list. Pages, and the records within each page, are ordered from left to right by ascending primary key.
  2. The second tree is the secondary index. Non-leaf nodes store no row data, only pointers to child nodes; leaf nodes store the secondary index value and the primary key. Pages, and the records within each page, are ordered from left to right by ascending secondary index value. Those are the main differences from the primary key index; everything else is the same.
     [Figures: B+ tree diagrams of the primary key index and the secondary index; images not available]

Let's start by analyzing how a query uses the index:

select * from test where second_key = 40;

The query flow of this statement is:

  1. Because second_key has an index, the search uses the tree built for the secondary index idx_second_key.
  2. Searching Page 1, we find that the record we need lies under Page 12.
  3. Searching Page 12, we find that the record we need is in the leaf node Page 27.
  4. We traverse the records in Page 27 from left to right until we reach record 40.
  5. We read the primary key stored in record 40: id 4.
  6. The secondary index does not hold the full row, so we have to go back to the table: we search the primary key index tree for id 4.
  7. Following the same steps as before, we finally find the row in the Page 27 node and return it.
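The steps above can be sketched in Python. This is a toy model, not InnoDB's implementation: each index is a sorted list searched with bisect rather than a paged B+ tree, and the rows are made-up data in which id n has second_key n * 10, which matches the walkthrough (second_key 40 maps to id 4). All names here are hypothetical.

```python
from bisect import bisect_left

# Toy "primary key index": primary key -> full row (hypothetical data).
ROWS = {i: {"id": i, "second_key": i * 10, "text": f"t{i}"} for i in range(1, 10)}
PRIMARY_KEYS = sorted(ROWS)

# Toy "secondary index": (second_key, id) pairs sorted by second_key.
SECONDARY = sorted((row["second_key"], pk) for pk, row in ROWS.items())


def lookup_secondary(value):
    """Return the primary keys whose second_key equals value."""
    # (value,) sorts before every (value, id) pair, so this finds the first match.
    pos = bisect_left(SECONDARY, (value,))
    ids = []
    while pos < len(SECONDARY) and SECONDARY[pos][0] == value:
        ids.append(SECONDARY[pos][1])
        pos += 1
    return ids


def lookup_primary(pk):
    """Back-to-table step: fetch the full row by primary key."""
    pos = bisect_left(PRIMARY_KEYS, pk)
    if pos < len(PRIMARY_KEYS) and PRIMARY_KEYS[pos] == pk:
        return ROWS[pk]
    return None


def select_all_where_second_key(value):
    """Model of: select * from test where second_key = value"""
    return [lookup_primary(pk) for pk in lookup_secondary(value)]


if __name__ == "__main__":
    print(select_all_where_second_key(40))
```

The point of the sketch is the two separate searches: one in the secondary index to get the id, and a second one in the primary key index to get the row.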

Let's also run explain to check: type is ref, an equality match on a non-unique index.

mysql> explain select * from test where second_key = 40 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: ref
possible_keys: idx_second_key
          key: idx_second_key
      key_len: 5
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

That was a very simple query. Now let's look at a slightly more complicated one.

select * from test where second_key > 10 and second_key < 50;

The query flow of this statement is:

  1. Because second_key has an index, the search uses the tree built for the secondary index idx_second_key.
  2. Because the index is ordered ascending from left to right, we first look for second_key > 10. Descending from the root as explained earlier, we land on the 2nd record of Page 23.
  3. Because the leaf nodes form a doubly linked list, there is no need to search from the root again. We simply compare records from left to right and stop at the first one whose value is >= 50, which is the 1st record of Page 16.
  4. The result is the records 20, 30, and 40 in Page 23 and Page 27.
  5. Then we go back to the table: we look up the rows for primary keys 2, 3, and 4, which correspond to 20, 30, and 40, and return the data.
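The range scan can be sketched the same way. Again this is a toy model under assumed data (id n has second_key n * 10), with a sorted list standing in for the linked leaf pages; range_scan is a hypothetical helper, not a MySQL function.

```python
from bisect import bisect_right

# Toy secondary index: (second_key, id) pairs sorted by second_key.
# Hypothetical data: id n has second_key n * 10.
SECONDARY = [(n * 10, n) for n in range(1, 10)]


def range_scan(low, high):
    """Return the (second_key, id) entries with low < second_key < high."""
    # One search locates the first entry whose second_key is greater than low.
    pos = bisect_right(SECONDARY, (low, float("inf")))
    hits = []
    # Walk the entries from left to right and stop once second_key >= high.
    while pos < len(SECONDARY) and SECONDARY[pos][0] < high:
        hits.append(SECONDARY[pos])
        pos += 1
    return hits


if __name__ == "__main__":
    entries = range_scan(10, 50)
    print(entries)                      # index entries for 20, 30, 40
    print([pk for _, pk in entries])    # primary keys to look up in the table
```

Only the start position needs a search; everything after that is a sequential walk, which is why a narrow range is cheap.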

Run explain again: type is range, meaning a range query that uses an index, and Extra now has a value. Using index condition means that the range query checks the condition against the index before going back to the table.

mysql> explain select * from test where second_key > 10 and second_key < 50 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: range
possible_keys: idx_second_key
          key: idx_second_key
      key_len: 5
          ref: NULL
         rows: 3
     filtered: 100.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

4. How not in works

Now we come to the main point of this article. Do you know whether the following statement uses the index?

select * from test where second_key not in(10,30,50);
mysql> explain select * from test where second_key not in(10,30,50)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: ALL
possible_keys: idx_second_key
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 996473
     filtered: 50.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

As you can see, the index is not used. Let's try another statement:

select second_key from test where second_key not in(10,30,50);

Run explain again. This time the index is used.

mysql> explain select second_key from test where second_key not in(10,30,50)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: range
possible_keys: idx_second_key
          key: idx_second_key
      key_len: 5
          ref: NULL
         rows: 498239
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

So why wasn't the index used the first time?
MySQL's optimizer weighs the alternatives when choosing an index. If it estimates that a full table scan is cheaper than scanning the index and then going back to the table, it chooses the full table scan. In our example, the full table scan reads 996473 rows with no lookups back to the table. Using the index would mean not only scanning 498239 index entries but also going back to the table 498239 times. MySQL judges that all those lookups cost more than a single full table scan, so by default the optimizer goes for the full table scan.
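The comparison can be written down as a deliberately simplified cost model in Python. The weights are assumptions made for illustration (one unit per row or index entry read, one unit per lookup back to the table); MySQL's real cost model weighs I/O and CPU differently, and this sketch does not model limit.

```python
def full_scan_cost(table_rows: int) -> int:
    """Toy cost: one unit per row read in a full table scan."""
    return table_rows


def index_cost(index_rows: int, needs_table_lookup: bool) -> int:
    """Toy cost: one unit per index entry, plus one per lookup back to the table."""
    lookups = index_rows if needs_table_lookup else 0
    return index_rows + lookups


def choose_plan(table_rows: int, index_rows: int, needs_table_lookup: bool) -> str:
    """Pick the cheaper access type under this toy model."""
    if index_cost(index_rows, needs_table_lookup) < full_scan_cost(table_rows):
        return "range"
    return "ALL"


if __name__ == "__main__":
    # Row estimates taken from the explain output above.
    print(choose_plan(996473, 498239, needs_table_lookup=True))   # select *          -> ALL
    print(choose_plan(996473, 498239, needs_table_lookup=False))  # select second_key -> range
```

With select second_key the index covers the query, so the lookup term disappears and the index wins, which agrees with the two explain results.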

So what if I insist on select * and still want the index to be used?

The first way:

select * from test where second_key not in(10,30,50) limit 5;

Running explain gives:

mysql> explain select * from test  where second_key not in(10,30,50) limit 5\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: range
possible_keys: idx_second_key
          key: idx_second_key
      key_len: 5
          ref: NULL
         rows: 498239
     filtered: 100.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

With limit added, the optimizer finds that using the index and going back to the table is the cheaper plan. So as long as not in is used sensibly, it can use the index. And in real-world tables with many records, MySQL will generally not estimate ALL to be the cheaper plan.

The second way (forcing the index with force index):

 select * from test force index(idx_second_key) where second_key not in(10,30,50);

Running explain gives:

mysql> explain select * from test force index(idx_second_key) where second_key not in(10,30,50)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: range
possible_keys: idx_second_key
          key: idx_second_key
      key_len: 5
          ref: NULL
         rows: 498239
     filtered: 100.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

Finally, let's look at how not in uses the index, so that you can use not in with more confidence.

select * from test where second_key not in(10,30,50) limit 5;

When this statement is executed, its where clause is effectively broken down as follows:

select * from test where 
(second_key < 10) 
or 
(second_key > 10 and second_key < 30) 
or 
(second_key > 30 and second_key < 50) 
or 
(second_key > 50);

After the decomposition, the condition is equivalent to 4 open intervals. For each interval, MySQL locates the starting record once and then scans along the index.
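The decomposition can be sketched in Python as follows. The function names are hypothetical, None stands for an unbounded end of an interval, and NULL handling is left out of this sketch.

```python
def not_in_to_ranges(values):
    """Turn the values of a not in list into open intervals (low, high).

    None marks an unbounded end, so (None, 10) means second_key < 10.
    """
    points = sorted(set(values))
    bounds = [None] + points + [None]
    return list(zip(bounds[:-1], bounds[1:]))


def matches(x, ranges):
    """True if x lies strictly inside one of the open intervals."""
    return any(
        (low is None or x > low) and (high is None or x < high)
        for low, high in ranges
    )


if __name__ == "__main__":
    ranges = not_in_to_ranges([10, 30, 50])
    print(ranges)  # [(None, 10), (10, 30), (30, 50), (50, None)]
    print(matches(20, ranges), matches(30, ranges))  # True False
```

A list of n distinct values always yields n + 1 intervals, each of which needs only one search for its starting point.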

5. Summary

When choosing an index, MySQL's optimizer compares costs. If it estimates that a full table scan is cheaper than scanning the index and going back to the table, it scans the whole table; if it estimates that the index is cheaper, it uses the index, and that holds for not in as well.

Copyright notice
This article was written by [Dada Liu__]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/160/202206090844147100.html