当前位置：网站首页>"MySQL things" explains the indexing principle in detail

"MySQL things" explains the indexing principle in detail

2022-07-27 18:40:00 【Java sharing Officer】

MySQL What are the indexes of ？

 primary key ： The primary key column of the table adds an index by default , All the data recorded in this row is saved in the index 
 unique index (upique)： All values of this column must be unique 
 General index (normal)： An index of ordinary columns 
 Aggregate index ： It's one of the common indexes , But it is an index replication code composed of multiple columns

How to use index ？

Suppose we have several tables , as follows

Use navicat Design students student surface , And add the index as follows

1、 primary key

mysql An index is automatically added to the primary key of each table , In the leaf node of its index tree , It holds all the data of the primary key row , I'll talk about this later . That is to say, just find the primary key , It means that we have found the record , So using the primary key index will be faster

2、 unique index

The column corresponding to the unique index , The data is not repeatable , This is equivalent to , The difference is that the leaf node of the index tree does not store all the data of the row , It's the value of the column , The query speed is slower than the primary key

3、 General index

Ordinary index has nothing to say , Its value can be repeated , And the leaf node of the index tree holds the value of the column , Instead of the whole line of data

4、 Aggregate index

When you need to combine several fields to query , Using aggregate indexes can be faster than using multiple normal indexes , Because each index corresponds to an index tree , Multiple common indexes , Although it's all indexed , But you have to traverse several index trees , Using aggregate index only needs to traverse an index tree

If you don't know anything about the index tree , Let's take a look at the following analysis first , Let's go back to these four indexes , There must be a new feeling

What is the underlying structure of the index ？

 The underlying structure of index is divided into full-text index 、 Hash index 、B+ Tree index 
 Full-text index ： Only MyISAM Engine support , No introduction 
 Hash index ： Calculate the number of index columns hashCode, And store it in the index , If there is a conflict , It's stored in the form of a linked list , similar hashMap structure 
B+ Tree index ： Sort the values of the index column , And put it in the specified position in the index tree (Mysql The default index structure ) Copy code

The principle of hash index

hash It's a kind of key-value Data structure in form , Hash index is based on index column hashCode As the key , The address pointer of a data row is an index of values , It's a very compact address space , Think of it as an array

If we want to check 【 Liu bei 】, So first, by calculating the hash code hashCode( Liu bei )=002 obtain , Then find... In the hash index key=002 The location of , Then go to the disk address where the data is really saved 311, And find the data row .

So we looked it up twice , The first was based on hashCode Address found , The second time is to find the data row according to the address , But this kind of query speed is very fast , Because it doesn't traverse every row of data , But through hashCode Find the disk address of the data row directly .

If it happens hash What about the conflict ？ For example, Guan Yu's and Zhang Fei's hashCode All equal to 010, At this time, Zhang Fei will be connected behind Guan Yu , Form a chain structure , Then save Zhang Fei's address in Guan Yu's next address pointer .

When looking for 【 Zhang Fei 】 when , adopt hashCode( Zhang Fei )=010 Address found 45, Then find Guan Yu through the address , By judging names 【 Zhang Fei 】!=【 Guan yu 】, So through the next address pointer 46 Continue to find , I found Zhang Fei the second time .

This method of conflict resolution is called chain address method . So the structure of hash index is array + Form of linked list , And hashMap The same , But when too many conflicts lead to a long list , When you operate data, you will always traverse the linked list to find the data , This will affect performance .

You can use the hash index structure , Then according to their usual writing sql The conditions used , Think about his strengths and weaknesses , I will summarize at the end of the article .

B+ Tree index principle

B+ The evolution of trees

 Binary tree  ——>  Binary search tree  ——>  Balanced binary trees  ——> B Trees  ——> B+ Tree copy code

Here, we don't give an in-depth introduction to binary tree and other structures , The following algorithm sections will describe in detail , Brief introduction only B Trees and B+ Trees .

What is? B Trees ？

The basic definition ：

1、 The root node has at least two children

2、 The leaf nodes are all at the same height

3、 If the non leaf node has n Key words , So he has n+1 Child node , And this n The nodes are arranged incrementally

What is? B+ Trees ？

B+ By B It's evolved from trees , So it has B All the characteristics of trees , There are two other things

1、B+ The non leaf nodes of the tree only store keywords , Don't store data

2、B+ The leaf nodes of the tree are connected by pointers , It's a two-way list

MySQL Medium B+ Tree index

Let's use the student Look at the table B+ How a tree stores indexes , Suppose that student Add the following data to the table

We've added an index to it , Now? student All indexes in are as follows

id： primary key , Created by default 
code： unique index 
class_id： General index 
name_class：name Column sum class_id Aggregate index copy code of column

id Indexed B+ Tree index structure as follows ：

Thus we can see that , The leaf node of the primary key index stores the entire row of records , So using the primary key index sql Query speed is very fast .

A unique index is the same as a normal index , It's just that the value of the index is unique , There are no duplicate values

Common column class_id Index tree of as follows ：

The key is coming. ！ The key is coming. ！ The key is coming. ！

It can be seen that the leaf node holds the... In its row record id, Let's look at the following two sql What's the difference?

sql1：select * from tb_student where class_id=834

sql2：select id from tb_student where class_id=834 Copy code

It looks the same. , But it is not . Article 1 with a sql The query is the whole line of records , The row records are stored in the index tree of the primary key , So the query step is ： According to the general index class_id Find the leaf node in the index tree of , Get row records id, And then according to id Get the whole row of records from the primary key index tree

This query process is called back to table , It can be seen that the return table will reduce the query efficiency

And the second one sql The query is id,class_id The leaf node of the index tree holds id Value , So you don't need to go to the primary key index tree , Direct will id Return to , So the efficiency is higher than the former .（ Seeing this, you should be able to think of why you need aggregate index ）

Know what it is , Know why , We will win every battle .

Let's take a look at The tree structure of aggregate index ：

To make the data more intuitive , We add a new aggregate index

The index tree is as follows

The first index in the union index is classid, So the index tree will start with class_id To sort , And then sort them according to the following index columns . So the leftmost principle of union index can be embodied here .

SELECT * FROM tb_student where class_id=18 and id=834 Copy code

For the above statement , He has two indexes to go through , The first is the joint index id_class, The second is the primary key index id, Which way will you go ？ The answer is the primary key index

Be careful ： When the union index and the primary key index exist at the same time , Take the primary key index first

Why? ？ This is a mysql The optimization strategy adopted , Because the primary key index can directly find out the whole line of data , So no matter you select * still select I can satisfy all the other fields , And joint index and select * It also involves a return table operation

summary

Advantages and disadvantages of hash index

 advantage ：
1、 Fast query speed 
2、 The cost of maintaining the index is relatively low 

 shortcoming ：
1、 Can't do range query , Because it's by calculating the elements hashCode Look for , image age>50 This kind of range lookup can't use hash index 
2、 Cannot sort by index , The biggest feature of hashing is hash distribution , It's almost irregular , So you can't sort the copy code

B+ The advantages and disadvantages of tree index

 advantage ：
1、 Index trees are generally 2-4 layer , High query efficiency ,IO Less consumption 
2、 Support all kinds of range query 
3、 Support index sorting 

 shortcoming ：
1、 The cost of maintaining the index tree is high 
2、 Too many indexes will also take up more space to copy code

原网站

版权声明
本文为[Java sharing Officer]所创，转载请带上原文链接，感谢
https://yzsam.com/2022/208/202207271612462763.html