[MySQL from Beginner to Mastery] [Advanced] (8) Clustered Index, Non-Clustered Index, and Joint Index
2022-07-26 19:09:00 【Man Nong Feige】
Hello, I'm Man Nong Feige. Thank you for reading this article; likes, favorites, and shares are welcome.
1. Introduction
The previous article, [MySQL from Beginner to Mastery] [Advanced] (7) Designing an Index & the Index Scheme in InnoDB, covered how to design an index and how InnoDB builds its indexes.
This article continues with InnoDB indexes. By physical implementation, indexes fall into 2 kinds: clustered indexes and non-clustered indexes. Non-clustered indexes are also called secondary indexes or auxiliary indexes.
2. Environment
| Environment | Version |
|---|---|
| Red Hat | 4.8.5-39 |
| MySQL | 5.7 |
3. Clustered index
A clustered index is not a separate index type but a way of storing data: all user records are stored in the leaf nodes. In other words, the index is the data and the data is the index.
Terminology: "clustered" means that data rows are stored together with adjacent key values.
Characteristics:
- Records and pages are sorted by primary key value, which means three things:
  - Records within a page form a singly linked list in primary key order.
  - The pages that store user records form a doubly linked list, ordered by the primary keys of the user records they hold.
  - The pages that store directory entry records are organized into levels; pages on the same level form a doubly linked list, ordered by the primary keys of the directory entry records they hold.
- The leaf nodes of the B+ tree store complete user records.
  A complete user record is one that stores the values of all columns, including hidden columns.
A B+ tree with these two characteristics is called a clustered index, and all complete user records are stored in its leaf nodes. We do not need to create it explicitly with an INDEX statement in MySQL; the InnoDB storage engine creates the clustered index for us automatically.
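
As a rough illustration of "the index is the data", the leaf level of a clustered index can be modeled as complete rows kept in primary key order. This is only a sketch in Python, not InnoDB's actual page format; the column names, sample values, and function names are assumptions made up for the example.

```python
from bisect import bisect_left

# Toy model of a clustered index's leaf level: complete rows kept in
# primary-key order. Column names and values are made up for illustration.
ROWS = sorted(
    [
        {"id": 7, "name": "Tom", "age": 28},
        {"id": 2, "name": "Amy", "age": 35},
        {"id": 5, "name": "Bob", "age": 20},
    ],
    key=lambda row: row["id"],
)
PRIMARY_KEYS = [row["id"] for row in ROWS]


def find_by_primary_key(pk):
    """Binary-search the sorted keys and return the full row, or None."""
    pos = bisect_left(PRIMARY_KEYS, pk)
    if pos < len(PRIMARY_KEYS) and PRIMARY_KEYS[pos] == pk:
        return ROWS[pos]  # the leaf entry already is the complete record
    return None


def range_by_primary_key(low, high):
    """Return rows with low <= id <= high; they sit next to each other."""
    start = bisect_left(PRIMARY_KEYS, low)
    result = []
    for row in ROWS[start:]:
        if row["id"] > high:
            break
        result.append(row)
    return result
```

A lookup by primary key lands directly on the complete row, and a range query only has to walk forward from its starting position.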
Advantages:
- Data access is faster: the clustered index keeps the index and the data in the same B+ tree, so fetching data through it is faster than through a non-clustered index.
- Sorted lookups and range lookups on the primary key are very fast.
- When a query returns a range of data in clustered index order, the rows are stored next to each other, so the database does not have to read them from many separate data blocks, which saves a lot of I/O.
Disadvantages:
- Insert speed depends heavily on insert order. Inserting in primary key order is fastest; otherwise page splits occur and seriously hurt performance. For this reason, an InnoDB table usually defines an auto-increment ID column as its primary key.
- Updating the primary key is expensive because the updated row has to move. For this reason, the primary key of an InnoDB table is generally treated as non-updatable.
- Access through a secondary index needs two index lookups: the first finds the primary key value, and the second finds the row data by that primary key value.
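
The effect of insert order on page splits can be sketched with a much simplified model. The code below assumes tiny pages of 4 records and a naive split-in-half rule; `PAGE_CAPACITY` and `count_page_splits` are hypothetical names, and InnoDB's real split logic is more involved than this.

```python
from bisect import bisect_left, insort

PAGE_CAPACITY = 4  # records per page; tiny on purpose (hypothetical value)


def count_page_splits(keys):
    """Insert unique keys one by one into sorted pages; count page splits."""
    pages = [[]]  # each page is a sorted list of keys; pages are in key order
    splits = 0
    for key in keys:
        # Pick the first page whose largest key is >= key, else the last page.
        last_keys = [page[-1] for page in pages if page]
        idx = min(bisect_left(last_keys, key), len(pages) - 1)
        page = pages[idx]
        if len(page) < PAGE_CAPACITY:
            insort(page, key)
        elif idx == len(pages) - 1 and key > page[-1]:
            pages.append([key])  # appending at the end: just start a new page
        else:
            mid = PAGE_CAPACITY // 2
            left, right = page[:mid], page[mid:]
            pages[idx:idx + 1] = [left, right]  # half of the records move
            insort(left if key < right[0] else right, key)
            splits += 1
    return splits
```

In this model `count_page_splits(range(1, 101))` returns 0 because every key is appended at the end, while the same keys inserted in descending order keep landing in full pages and force splits.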
Limitations:
- In MySQL, only the InnoDB storage engine currently supports clustered indexes; MyISAM does not.
- Data can be physically stored in only one sort order, so each MySQL table can have only one clustered index. In general it is the table's primary key.
- If no primary key is defined, InnoDB uses a non-null unique index instead. If there is no such index either, InnoDB implicitly defines a primary key to serve as the clustered index.
- To take full advantage of clustering, the primary key column of an InnoDB table should be an ordered, sequential ID. Unordered IDs are not recommended: UUID, MD5, hash, or string columns used as the primary key cannot guarantee that the data grows sequentially.
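
The fallback rule for choosing the clustered index key can be written as a small decision function. This is a sketch of the rule as described above, not InnoDB code; the function name, the argument shapes, and the `<hidden row id>` placeholder are assumptions made for illustration.

```python
def choose_clustered_key(primary_key, unique_indexes):
    """Pick the column(s) that order the clustered index.

    primary_key: list of column names, or None if the table has none.
    unique_indexes: list of (columns, all_columns_not_null) tuples.
    """
    if primary_key:
        return primary_key  # an explicit primary key always wins
    for columns, all_not_null in unique_indexes:
        if all_not_null:
            return columns  # first unique index without nullable columns
    return ["<hidden row id>"]  # placeholder name for the implicit key
```

For example, `choose_clustered_key(None, [(["email"], False), (["phone"], True)])` skips the nullable `email` index and returns `["phone"]`.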
4. Non-clustered index (secondary index, auxiliary index)
The clustered index described above only helps when the search condition is a primary key value, because the data in its B+ tree is sorted by primary key. What if we want to search by another column? Surely we should not have to traverse the record linked list from beginning to end.
The answer is to build more B+ trees, each sorting its data by a different rule. For example, we can build another B+ tree that uses the age column as the sort rule for records and data pages. The effect is as follows:
- Records and pages are sorted by the value of the age column, which means three things:
  - Records within a page form a singly linked list in age order.
  - The pages that store user records form a doubly linked list, ordered by the age values of the records they hold.
  - The pages that store directory entry records are organized into levels; pages on the same level form a doubly linked list, ordered by the age values of the directory entry records they hold.
- Directory entry records no longer pair a primary key with a page number; they pair an age value with a page number.
Now, to find records by the value of the age column, we can use this B+ tree. Taking age = 28 as an example, the search proceeds as follows:
- Determine the directory entry record page.
  Starting from the root page, page 33, we can quickly determine that the relevant directory entry records are in page 42.
- Use the directory entry record page to determine the page that actually holds the user records.
  In page 42 we can quickly determine that the user records are in page 46 (because 20 < 28 < 30).
- Locate the specific records within the page that actually stores the user records.
- The leaf node records of this B+ tree store only two columns, age and id (the primary key), so we still have to look up the complete user record in the clustered index by its primary key value.
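
The search steps above can be sketched as a walk from the root page down to a leaf page. Page numbers 33, 42, and 46 follow the example in the text; every other page number, age, and id is invented, and the model assumes all records with the same age fit in one leaf page, which a real index does not guarantee.

```python
from bisect import bisect_right

# Page numbers 33, 42 and 46 follow the example in the text; every other page
# number, age and id below is invented for illustration.
PAGES = {
    33: {"kind": "directory", "entries": [(10, 42), (40, 43)]},  # root page
    42: {"kind": "directory", "entries": [(10, 45), (20, 46), (30, 47)]},
    43: {"kind": "directory", "entries": [(40, 48)]},
    45: {"kind": "leaf", "records": [(10, 3), (18, 8)]},
    46: {"kind": "leaf", "records": [(20, 5), (25, 9), (28, 7), (28, 12)]},
    47: {"kind": "leaf", "records": [(30, 1), (35, 2)]},
    48: {"kind": "leaf", "records": [(40, 4), (52, 6)]},
}
ROOT_PAGE = 33


def find_ids_by_age(age):
    """Walk from the root to a leaf; return (pages visited, matching ids)."""
    page_no, visited = ROOT_PAGE, [ROOT_PAGE]
    while PAGES[page_no]["kind"] == "directory":
        entries = PAGES[page_no]["entries"]
        smallest_ages = [min_age for min_age, _ in entries]
        # Follow the last entry whose smallest age is <= the target age.
        slot = max(bisect_right(smallest_ages, age) - 1, 0)
        page_no = entries[slot][1]
        visited.append(page_no)
    # Leaf records hold only (age, id), so all we can return is primary keys.
    ids = [pk for rec_age, pk in PAGES[page_no]["records"] if rec_age == age]
    return visited, ids
```

With this sample data, `find_ids_by_age(28)` visits pages 33, 42, and 46 and returns the ids 7 and 12; the complete rows still have to be fetched from the clustered index.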
4.1. Back to the table
With the B+ tree sorted by age, we can only determine the primary key value of the record we are looking for. To get the complete user record from an age value, we still have to search the clustered index once more. This process is called going back to the table. In other words, querying a complete user record by age uses 2 B+ trees.
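
A minimal sketch of the two lookups, assuming both indexes are modeled as sorted Python lists: the first search runs on the age index and yields primary keys, and the second fetches each complete row from the clustered index. The sample rows and function names are made up for the example.

```python
from bisect import bisect_left

# Clustered index: complete rows as (id, name, age), ordered by primary key.
CLUSTERED = [
    (5, "Bob", 20),
    (7, "Tom", 28),
    (9, "Eve", 25),
    (12, "Zoe", 28),
]
# Secondary index on age: only (age, id) pairs, ordered by age and then id.
AGE_INDEX = sorted((age, pk) for pk, _name, age in CLUSTERED)


def lookup_clustered(pk):
    """Second lookup: fetch the complete row from the clustered index."""
    pos = bisect_left(CLUSTERED, (pk,))
    return CLUSTERED[pos]  # pk came from the age index, so the row exists


def find_rows_by_age(age):
    """First lookup in the age index, then one table lookup per match."""
    rows = []
    pos = bisect_left(AGE_INDEX, (age,))
    while pos < len(AGE_INDEX) and AGE_INDEX[pos][0] == age:
        rows.append(lookup_clustered(AGE_INDEX[pos][1]))  # back to the table
        pos += 1
    return rows
```

Every matching entry in the age index costs one extra search in the clustered index, which is why a query that matches many rows pays for going back to the table many times.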
Question: Why is the back-to-table operation needed? Couldn't the complete user record be stored directly in the leaf nodes?
Answer: Storing the complete user record in the leaf nodes would indeed avoid going back to the table. But a table can have more than one non-clustered index, and if every B+ tree copied all user records, a great deal of storage space would be wasted.
Because a B+ tree built on a non-primary-key column needs a back-to-table operation to locate the complete user record, this kind of B+ tree is called a secondary index or auxiliary index. Since this one uses the age column as its sort rule, we also say it is the index created on the age column.
Non-clustered indexes do not affect how data is organized in the clustered index, so a table can have multiple non-clustered indexes.
5. Joint index
We can also sort by multiple columns at once, that is, build one index on several columns. Suppose we want the B+ tree to be sorted by the age and name columns. This means two things:
- Records and pages are first sorted by the age column.
- Records with the same age are then sorted by the name column.
For the index built on the age and name columns, note the following:
- Each directory entry record consists of three parts: age, name, and page number. Records are sorted by age first; records with the same age are sorted by name.
- The user records in the leaf nodes of the B+ tree consist of the age and name columns plus the primary key id column.
Note that a B+ tree sorted by the age and name columns is called a joint index, and it is essentially a secondary index. It is not the same as creating separate indexes on age and name:
- A joint index builds only one B+ tree like the one described above.
- Separate indexes on age and name build two B+ trees, one sorted by age and one sorted by name.
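
The sort rule of a joint index matches how Python compares tuples: first by age, then by name. The sketch below models only the leaf entries of such an index as `(age, name, id)` tuples; the sample rows and the `find_ids` helper are assumptions made for illustration.

```python
from bisect import bisect_left

# Sample rows as (id, name, age); the values are made up.
ROWS = [(1, "Tom", 28), (2, "Amy", 35), (3, "Bob", 28), (4, "Ann", 20)]

# Leaf entries of a joint index on (age, name): age first, then name, plus id.
JOINT_INDEX = sorted((age, name, pk) for pk, name, age in ROWS)


def find_ids(age, name):
    """Return the ids whose age and name both match, using the sort order."""
    ids = []
    pos = bisect_left(JOINT_INDEX, (age, name))
    while pos < len(JOINT_INDEX) and JOINT_INDEX[pos][:2] == (age, name):
        ids.append(JOINT_INDEX[pos][2])
        pos += 1
    return ids
```

Here the two rows with age 28 sit next to each other and are ordered by name ("Bob" before "Tom"), all within a single sorted structure rather than two separate ones.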
Summary
This article covered clustered, non-clustered, and joint indexes in detail. By default, InnoDB in MySQL creates a clustered index on the primary key column to store complete user records, while non-clustered and joint indexes store only the indexed columns and the primary key. Querying a complete user record through a non-clustered index requires going back to the table.