MySQL index

2022-06-12 23:52:00 InfoQ

Indexes

An index is an ordered data structure that helps MySQL access data efficiently. In addition to the data itself, the database system maintains data structures that satisfy specific search algorithms. These structures reference (point to) the data in some way, so that advanced search algorithms can be run on them. Such a data structure is an index.

Advantages and disadvantages:

Advantages:

  • Improves data retrieval efficiency and reduces the IO cost
  • Sorts data by the index column, reducing the cost of sorting and the CPU consumption

Disadvantages:

  • Indexes also take up storage space
  • Indexes greatly improve query efficiency but slow down updates to the table, such as INSERT, UPDATE, and DELETE

Index structure




B-Tree
[Figure: binary tree]
The shortcomings of a plain binary search tree (for example, it degenerates into a linked list when keys are inserted in order) can be solved with a red-black tree:

[Figure: red-black tree]
With a large amount of data, however, a red-black tree also becomes deep, which slows down retrieval.

To solve these problems, a B-Tree (multiway balanced search tree) can be used. Take a B-Tree with a maximum degree (max-degree, the maximum number of child nodes per node) of 5 (order 5) as an example: each node stores at most 4 keys and 5 pointers.

[Figure: B-Tree structure]
For an animation of the B-Tree insertion process, see https://www.bilibili.com/video/BV1Kr4y1i7ru?p=68
Demo: https://www.cs.usfca.edu/~galles/visualization/BTree.html

B+Tree
[Figure: B+Tree structure]
Demo: https://www.cs.usfca.edu/~galles/visualization/BPlusTree.html

Differences from a B-Tree:

  • All data appears in the leaf nodes
  • The leaf nodes form a singly linked list

MySQL optimizes the classic B+Tree for its index data structure. On top of the original B+Tree, it adds linked-list pointers between adjacent leaf nodes, forming a B+Tree with sequential pointers that improves the performance of range access.

[Figure: MySQL B+Tree structure]

Hash
A hash index uses a hash algorithm to convert key values into hash values, maps them to the corresponding slots, and stores them in a hash table. If two (or more) key values map to the same slot, a hash conflict (also called a hash collision) occurs, which can be resolved with a linked list.

[Figure: hash index schematic]
Characteristics:

  • A hash index can only be used for equality comparisons (=, in); range queries (between, >, <, ...) are not supported
  • The index cannot be used to complete sort operations
  • Query efficiency is high: usually only one lookup is needed, so it is usually faster than a B+Tree index
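
The characteristics above can be illustrated with a minimal Python sketch of a hash index that resolves collisions by chaining. The class `HashIndex`, its slot count, and the sample ages are hypothetical and only model the idea; this is not how the Memory engine or InnoDB's adaptive hash index is actually implemented.

```python
class HashIndex:
    """Toy hash index: key -> row ids, with collisions chained in a bucket."""

    def __init__(self, slots=8):
        self.slots = [[] for _ in range(slots)]

    def _bucket(self, key):
        # Map the key's hash value to a slot.
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, row_id):
        self._bucket(key).append((key, row_id))

    def lookup(self, key):
        # Equality lookup: hash once, then scan a single (usually short) chain.
        return [row_id for k, row_id in self._bucket(key) if k == key]

    def range_lookup(self, low, high):
        # Slot order is unrelated to key order, so a range query must
        # visit every slot: the hash index gives no benefit here.
        return sorted(
            row_id
            for bucket in self.slots
            for k, row_id in bucket
            if low <= k <= high
        )


index = HashIndex(slots=4)
for row_id, age in enumerate([23, 31, 19, 31, 45], start=1):
    index.insert(age, row_id)

print(index.lookup(31))            # equality: only one bucket is inspected
print(index.range_lookup(20, 35))  # range: every bucket is inspected
```

The range lookup still returns the right rows, but only by scanning the whole table of slots, which is why range queries and sorting cannot benefit from a hash index.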

Storage engine support:

  • Memory
  • InnoDB: has an adaptive hash feature; the storage engine builds the hash index automatically from B+Tree indexes under certain conditions

Interview question
Why does the InnoDB storage engine choose the B+Tree index structure?

  • Compared with a binary tree, it has fewer levels and higher search efficiency
  • In a B-Tree, both leaf and non-leaf nodes store data, so fewer keys fit in a page and there are correspondingly fewer pointers. To store the same large amount of data, the tree can only grow taller, which reduces performance
  • Compared with a hash index, a B+Tree supports range matching and sort operations

Index classification



In the InnoDB storage engine, indexes can be divided into two types according to their storage form: clustered indexes and secondary indexes.

Illustration:

[Figure: general principle]
[Figure: illustration]
Clustered index selection rules:

  • If a primary key exists, the primary key index is the clustered index
  • If there is no primary key, the first unique (UNIQUE) index is used as the clustered index
  • If the table has neither a primary key nor a suitable unique index, InnoDB automatically generates a rowid as a hidden clustered index
Questions to think about
1. Which of the following SQL statements is more efficient, and why?

select * from user where id = 10;
select * from user where name = 'Arm';
-- Note: id is the primary key, and the name field has an index

Answer: The first statement. The second must first find the id in the name index and then go back to the table (the clustered index) to fetch the row, which amounts to two lookups.

2. What is the height of an InnoDB primary key index B+Tree?

Answer: Suppose one row of data is 1 KB, so a 16 KB page (the InnoDB default page size) can store 16 such rows. An InnoDB pointer takes 6 bytes, and the primary key is assumed to be a bigint, which takes 8 bytes. This gives the formula:
n * 8 + (n + 1) * 6 = 16 * 1024
where 8 is the number of bytes a bigint occupies, n is the number of keys stored in the current node, and (n + 1) is the number of pointers (one more than the number of keys). Solving gives n ≈ 1170, that is, about 1171 pointers per node.

If the height of the tree is 2, it can store about:
1171 * 16 = 18736
rows. If the height of the tree is 3, it can store about:
1171 * 1171 * 16 = 21939856
rows.
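
The arithmetic above can be reproduced with a short Python sketch. The constant and function names are chosen here for illustration, and the sizes (1 KB rows, 8-byte keys, 6-byte pointers, 16 KB pages) are the assumptions stated in the answer, not measured values.

```python
PAGE_SIZE = 16 * 1024   # bytes per page (assumed 16 KB)
KEY_SIZE = 8            # bytes per bigint primary key
POINTER_SIZE = 6        # bytes per page pointer
ROWS_PER_LEAF = 16      # 1 KB rows that fit in one leaf page

# Solve n * KEY_SIZE + (n + 1) * POINTER_SIZE = PAGE_SIZE for n.
keys_per_node = (PAGE_SIZE - POINTER_SIZE) / (KEY_SIZE + POINTER_SIZE)
pointers_per_node = round(keys_per_node) + 1  # one more pointer than keys


def capacity(height):
    """Approximate number of rows in a tree with `height` levels, leaves included."""
    return pointers_per_node ** (height - 1) * ROWS_PER_LEAF


print(round(keys_per_node))  # about 1170 keys per non-leaf node
print(capacity(2))           # 1171 * 16
print(capacity(3))           # 1171 * 1171 * 16
```

Each extra level multiplies the capacity by roughly 1171, which is why a primary key index rarely needs more than three or four levels under these assumptions.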

In addition, if the data volume grows much larger than this, splitting the table should be considered, which also involves operations and maintenance knowledge.

Syntax

Create an index:
CREATE [ UNIQUE | FULLTEXT ] INDEX index_name ON table_name (index_col_name, ...);
If no index type (UNIQUE or FULLTEXT) is specified after CREATE, a regular index is created.

View indexes:
SHOW INDEX FROM table_name;

Drop an index:
DROP INDEX index_name ON table_name;

Examples:

-- name is the name field, and its values may repeat; create an index on it
create index idx_user_name on tb_user(name);
-- phone is the mobile number field, and its values are non-null and unique; create a unique index on it
create unique index idx_user_phone on tb_user (phone);
-- Create a composite index on profession, age, status
create index idx_user_pro_age_stat on tb_user(profession, age, status);
-- Create a suitable index on email to improve query efficiency
create index idx_user_email on tb_user(email);

-- Drop an index
drop index idx_user_email on tb_user;

Usage rules

The leftmost prefix rule
If an index covers multiple columns (a composite index, also called a joint index), the leftmost prefix rule applies: the query must start from the leftmost column of the index and must not skip columns in the index. If a column is skipped, the index is only partially used (the columns after the skipped one are not used).

In a composite index, when a range condition (<, >) appears, the index columns to the right of the range column are not used. Where the business logic allows it, using >= or <= instead can avoid this.
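
As a rough illustration of the two rules above, the following Python sketch walks the columns of a composite index from left to right and reports which of them a query could use. The function `usable_index_columns` is hypothetical and deliberately simplified: it encodes only the rules as stated in this article (including the treatment of >= and <=), not the behavior of MySQL's optimizer.

```python
def usable_index_columns(index_columns, predicates):
    """Return the leading index columns a query can use.

    `predicates` maps a column name to the comparison operator applied to it,
    for example {"age": ">"}. Simplified model of the leftmost prefix rule.
    """
    used = []
    for column in index_columns:
        op = predicates.get(column)
        if op is None:
            break  # column skipped: everything to its right is unusable
        used.append(column)
        if op in ("<", ">"):
            break  # strict range: columns to the right are unusable
    return used


idx = ["profession", "age", "status"]  # column order of idx_user_pro_age_stat

print(usable_index_columns(idx, {"profession": "=", "age": "=", "status": "="}))
print(usable_index_columns(idx, {"profession": "=", "status": "="}))  # age skipped
print(usable_index_columns(idx, {"age": "=", "status": "="}))         # no leftmost column
print(usable_index_columns(idx, {"profession": "=", "age": ">", "status": "="}))
print(usable_index_columns(idx, {"profession": "=", "age": ">=", "status": "="}))
```

In MySQL itself, the key_len column of explain output is the usual way to check how many columns of a composite index a query actually uses.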
Index failure
  • Performing an operation (such as a function call) on an index column invalidates the index. For example:
    explain select * from tb_user where substring(phone, 10, 2) = '15';
  • Using a string-type field without quotes invalidates the index. For example, here the value of phone is not quoted:
    explain select * from tb_user where phone = 17799990015;
  • For fuzzy queries, matching with only a trailing wildcard does not invalidate the index, but a leading wildcard does. For example:
    explain select * from tb_user where profession like '%engineering';
    A % both before and after the pattern also invalidates the index.
  • For conditions separated by or, if the column in one of the conditions has no index, none of the indexes involved are used.
  • If MySQL estimates that using the index would be slower than a full table scan, the index is not used.
SQL hints
SQL hints are an important means of optimizing the database. Simply put, they are manual hints added to a SQL statement to optimize the operation.

For example, to suggest an index:
explain select * from tb_user use index(idx_user_pro) where profession="Software Engineering";
To ignore an index:
explain select * from tb_user ignore index(idx_user_pro) where profession="Software Engineering";
To force an index:
explain select * from tb_user force index(idx_user_pro) where profession="Software Engineering";

use index is only a suggestion: MySQL still weighs the expected running speed itself and may choose a different index. force index forces MySQL to use the named index.
Covering index & back-to-table query
Try to use covering indexes (the query uses an index, and all the columns to be returned can be found in that index) and reduce the use of select *.

Meaning of the extra field in explain:
using index condition: the lookup uses an index, but a back-to-table query is needed to fetch the data
using where; using index: the lookup uses an index, and all the required data is found in the index columns, so no back-to-table query is needed

If the row can be found directly in the clustered index, the row data is returned directly with a single lookup, even for select *. If the query goes through a secondary index, such as
select id, name from xxx where name='xxx';
only the secondary index (name) needs to be searched to find the corresponding id; name and the id stored with it in the name index are returned, again with a single lookup. If other fields are requested through the secondary index, a back-to-table query is needed, such as
select id, name, gender from xxx where name='xxx';

So try not to use select *, which easily causes back-to-table queries and reduces efficiency, unless a composite index contains all the fields.
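
The difference between a covered query and a back-to-table query can be modelled with two dictionaries. This is only a conceptual Python sketch with made-up rows and hypothetical names (`clustered`, `secondary_name`, `select_by_name`); real InnoDB indexes are B+Trees, not dictionaries.

```python
# Clustered index: primary key -> full row.
clustered = {
    1: {"id": 1, "name": "Arm", "gender": "M"},
    2: {"id": 2, "name": "Bob", "gender": "M"},
}
# Secondary index on name: name -> primary key.
secondary_name = {"Arm": 1, "Bob": 2}


def select_by_name(columns, name):
    """Return (row, number_of_index_lookups) for a query filtered on name."""
    pk = secondary_name[name]             # lookup 1: secondary index
    covered = {"id": pk, "name": name}    # values available in the index itself
    if set(columns) <= set(covered):
        return {c: covered[c] for c in columns}, 1
    row = clustered[pk]                   # lookup 2: back to the clustered index
    return {c: row[c] for c in columns}, 2


print(select_by_name(["id", "name"], "Arm"))            # covered: one lookup
print(select_by_name(["id", "name", "gender"], "Arm"))  # back to table: two lookups
```

Requesting gender forces the second lookup because that column is stored only in the clustered index.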

Interview question: A table has four fields (id, username, password, status). Because of the data volume, the following SQL statement needs to be optimized. What is the best solution?
select id, username, password from tb_user where username='itcast';

Answer: Create a composite index on the username and password fields. The query is then covered by the index and needs no back-to-table query.
Prefix index
When the field type is a string (varchar, text, etc.), long strings sometimes need to be indexed. This makes the index large, wastes a lot of disk IO during queries, and hurts query efficiency. In this case, only a prefix of the string can be indexed, which greatly saves index space and improves index efficiency.

Syntax:
create index idx_xxxx on table_name(column(n));
Prefix length: it can be chosen according to the selectivity of the index. Selectivity is the ratio of the number of distinct index values (cardinality) to the total number of records in the table. The higher the selectivity, the more efficient the query. A unique index has a selectivity of 1, which is the best possible selectivity and gives the best performance. Queries for computing selectivity:

select count(distinct email) / count(*) from tb_user;
select count(distinct substring(email, 1, 5)) / count(*) from tb_user;

The sub_part column in the output of show index shows the indexed prefix length.
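
To make the selectivity calculation concrete, here is a small Python sketch that mirrors the two queries above on an in-memory list. The function `selectivity` and the sample email addresses are invented for illustration.

```python
def selectivity(values, prefix_len=None):
    """Distinct (optionally truncated) values divided by the total row count."""
    keys = values if prefix_len is None else [v[:prefix_len] for v in values]
    return len(set(keys)) / len(values)


emails = [
    "alice@example.com",
    "alina@example.com",
    "bob@example.com",
    "bobby@example.com",
]

print(selectivity(emails))     # full column
print(selectivity(emails, 3))  # 3-character prefix: "ali" and "bob" collide
print(selectivity(emails, 5))  # 5-character prefix: all values distinct again
```

A reasonable prefix length is the shortest one whose selectivity is close to that of the full column.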
Single-column index & composite index
Single-column index: an index that contains only one column.
Composite index: an index that contains multiple columns.
In business scenarios where a query has multiple conditions and indexes are being considered for the queried fields, it is recommended to create a composite index rather than several single-column indexes.

Single-column index case:
explain select id, phone, name from tb_user where phone = '17799990010' and name = 'Han Xin';
This statement uses only the index on the phone field.
Notes
  • For a query with multiple conditions, the MySQL optimizer evaluates which index is more efficient and chooses that index to complete the query

Design principles

  • Create indexes on tables that hold a large amount of data and are queried frequently
  • Create indexes on fields that are often used in query conditions (where), sorting (order by), and grouping (group by)
  • Prefer columns with high selectivity as index columns, and create unique indexes where possible; the higher the selectivity, the more efficient the index
  • For a string-type field with a long length, consider creating a prefix index based on the characteristics of the field
  • Prefer composite indexes over single-column indexes; a composite index can often cover the query, which saves storage space, avoids back-to-table queries, and improves query efficiency
  • Control the number of indexes; more is not better, because the more indexes there are, the more it costs to maintain them, which slows down inserts, deletes, and updates
  • If an index column cannot store NULL values, constrain it with NOT NULL when creating the table. When the optimizer knows whether each column contains NULL values, it can better determine which index is most effective for a query