MySQL index failure scenarios and Solutions
2022-07-26 03:46:00 【Big programmers don't have big heads】
I. Preface
When a SQL statement fails to use an index, both its feasibility and its performance can suffer badly. This article analyzes why indexes fail, which situations cause the failure, and how to optimize when it happens. It focuses on the leftmost prefix matching principle, the MySQL logical architecture and optimizer, and the index failure scenarios and their causes.
II. Leftmost prefix matching principle
I previously wrote an article on the basic concepts and use of indexes; this one covers index failure.
First, a principle that the later explanations of index failure rely on: the leftmost prefix matching principle.
The principle: MySQL follows leftmost prefix matching when it builds a composite index. The leftmost column takes priority, and a lookup matches from the leftmost column of the composite index.
What is the leftmost prefix matching principle? To understand it, start with how indexes are stored. An index is a B+ tree, so a composite index is a B+ tree as well; the only difference is that each entry's key is made up of the values of all the indexed columns. Because a B+ tree needs a single ordering to decide where each entry goes, the database orders the entries by the leftmost column of the composite index first.
For example, create a composite index on (a, b). Its index tree holds entries such as (1,1), (1,2), (2,1), (2,4), (3,1), (3,2).
The values of a are ordered: 1, 1, 2, 2, 3, 3. The values of b, taken on their own, are not: 1, 2, 1, 4, 1, 2. Among entries with the same a, however, the b values are ordered again, so b is only ordered relative to a. This is because MySQL sorts a composite index by its leftmost column first and then, within equal values of that column, by the second column. A condition such as b = 2 on its own therefore cannot use the index to locate rows.
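The ordering described above can be tried out in a few lines of Python. This is only a sketch: a sorted list of tuples stands in for the leaf entries of the (a, b) index, and the names `index_ab`, `seek_a` and `scan_b` are made up for illustration; it is not how MySQL stores or searches a B+ tree.

```python
from bisect import bisect_left, bisect_right

# Entries of a hypothetical composite index on (a, b), kept sorted the way
# the article describes: by a first, then by b within equal a.
index_ab = sorted([(1, 1), (1, 2), (2, 1), (2, 4), (3, 1), (3, 2)])

def seek_a(a):
    """Binary-search the contiguous run of entries with the given a."""
    lo = bisect_left(index_ab, (a, float("-inf")))
    hi = bisect_right(index_ab, (a, float("inf")))
    return index_ab[lo:hi], hi - lo   # matching entries, size of the run

def scan_b(b):
    """No ordering on b alone: every entry has to be inspected."""
    hits = [entry for entry in index_ab if entry[1] == b]
    return hits, len(index_ab)        # matching entries, entries inspected

print(seek_a(2))   # ([(2, 1), (2, 4)], 2)
print(scan_b(2))   # ([(1, 2), (3, 2)], 6)
```

A condition on a lands on one contiguous run of entries, while a condition on b alone finds its matches scattered across the whole list.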
The rest of this article analyzes EXPLAIN output, so let's first look at the type and key_len columns of EXPLAIN.
1. type: the join (access) type.
- system: the table has only one row (a system table). This is a special case of const; it rarely appears and can be ignored.
- const: the row is found with a single index lookup. const is used when a primary key or unique index is compared with a constant. Only one row has to be matched, so it is very fast. If you put the primary key in the WHERE clause, MySQL can convert the query to a constant.
- eq_ref: unique index scan. For each index key, exactly one row in the table matches. Common with primary key or unique index lookups. Note: the table accessed with ALL (full table scan) is the one with the fewest rows, such as table t1.
- ref: non-unique index scan. Returns all rows that match a single value. It is still an index access, but it may find more than one matching row, so it is a mixture of lookup and scan.
- range: retrieves only the rows in a given range, using an index to select them. The key column shows which index is used. It usually appears with BETWEEN, <, >, IN and similar conditions in the WHERE clause. A range scan on an indexed column is better than a full index scan because it starts at one point and ends at another without scanning the whole index.
- index: Full Index Scan. The difference from ALL is that index traverses only the index tree. This is usually faster than ALL, because the index file is usually smaller than the data file. (Both read the whole table, but index reads from the index while ALL reads the data rows.)
- ALL: Full Table Scan. Traverses the entire table to find the matching rows.
2. key_len: the number of bytes of the index that MySQL actually uses. If key is NULL, key_len is NULL as well; otherwise it is the length of the index portion in use. From this value you can infer which index columns are used.
Calculation rules:
1. Fixed-length columns: INT takes 4 bytes, DATE takes 3 bytes, CHAR(n) takes n characters.
2. Variable-length columns: VARCHAR(n) takes n characters plus 2 bytes.
3. The number of bytes per character depends on the character set: 1 byte for latin1, 2 bytes for gbk, 3 bytes for utf8.
(My database uses latin1, so in the later calculations one character counts as one byte.)
4. For any indexed column that allows NULL, add 1 byte.
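To make the rules concrete, here is a small Python sketch that adds them up. The function names and the example index are hypothetical, and the sketch covers only the column types listed above; treat it as a way to check your arithmetic against EXPLAIN, not as a reimplementation of MySQL.

```python
# Bytes per character for the character sets the article mentions.
BYTES_PER_CHAR = {"latin1": 1, "gbk": 2, "utf8": 3}

def column_key_len(col_type, length=0, charset="latin1", nullable=False):
    """Estimate one column's contribution to key_len using the rules above."""
    if col_type == "int":
        size = 4
    elif col_type == "date":
        size = 3
    elif col_type == "char":
        size = length * BYTES_PER_CHAR[charset]
    elif col_type == "varchar":
        size = length * BYTES_PER_CHAR[charset] + 2   # 2 extra bytes (rule 2)
    else:
        raise ValueError(f"type not covered by this sketch: {col_type}")
    return size + (1 if nullable else 0)               # 1 extra byte if NULL is allowed

def key_len(columns):
    """Sum the contributions of the index columns that are actually used."""
    return sum(column_key_len(**col) for col in columns)

# Hypothetical index on (pid INT NOT NULL, name VARCHAR(20) NULL), latin1.
used = [
    {"col_type": "int"},
    {"col_type": "varchar", "length": 20, "nullable": True},
]
print(key_len(used[:1]))  # 4: only pid is used
print(key_len(used))      # 27: 4 + (20 + 2 + 1)
```

If EXPLAIN reports key_len 4 for a query on this hypothetical index, only pid is being used; 27 means both columns are.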
With the leftmost prefix matching principle in hand, we can look at the index failure scenarios and analyze why each one fails.
III. MySQL logical architecture and optimizer
MySQL logical architecture:
The MySQL architecture can be roughly divided into four layers:
1. Client: most languages provide a way to connect to MySQL, such as JDBC, PHP and Go; choose the connector or framework that fits.
2. Server layer: includes the connector, query cache, analyzer, optimizer and executor. It covers most of MySQL's core service functions and all built-in functions (such as date, time, math and encryption functions). All cross-storage-engine functionality is implemented in this layer, for example stored procedures, triggers and views.
3. Storage engine layer: responsible for storing and retrieving data; it is the component that actually deals with the underlying physical files. Data is ultimately stored on disk; a storage engine organizes it and retrieves it as the business requires. Storage engines are pluggable: InnoDB, MyISAM, Memory and others are supported. The most commonly used engine today is InnoDB, which has been the default since MySQL 5.5.5.
4. Physical file layer: stores the actual table data, logs and so on. The physical files include the redo log, undo log, binlog, error log, query log, slow log, data files and index files.
Important components of the server layer:
1. Connector
The connector accepts connections from clients, obtains the user's privileges, and maintains and manages connections.
Once a user has established a connection, changing that user's privileges with an administrator account does not affect the existing connection. Only connections created after the change use the new privilege settings.
2. Query cache
When MySQL receives a query, it first checks the query cache to see whether this statement has been executed before. Previously executed statements and their results may be cached in memory as key-value pairs: the key is the query statement and the value is the query result. If the current query finds its key in the cache, the value is returned directly to the client.
In most cases it is better not to use the query cache, because it often does more harm than good. Its entries are invalidated very easily: any update to a table clears every cached query on that table. Results that took effort to cache are likely to be cleared by an update before they are ever used, so on a database with frequent updates the hit rate is very low. The exception is a table that is essentially static and updated only rarely, such as a system configuration table; the query cache suits queries on such a table. (The query cache was removed in MySQL 8.0.)
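The key-value behavior and the "one update clears everything for that table" rule can be modeled in a few lines of Python. The class below is a toy with made-up names (`QueryCacheSketch`, `on_update`); it only illustrates why the hit rate collapses on frequently updated tables and says nothing about MySQL's real implementation.

```python
class QueryCacheSketch:
    """Toy model: statement text -> result, invalidated per table."""

    def __init__(self):
        self.results = {}          # key: SQL text, value: cached result
        self.by_table = {}         # table name -> set of cached SQL texts

    def get(self, sql):
        return self.results.get(sql)

    def put(self, sql, tables, result):
        self.results[sql] = result
        for table in tables:
            self.by_table.setdefault(table, set()).add(sql)

    def on_update(self, table):
        """Any write to a table drops every cached query that reads it."""
        for sql in self.by_table.pop(table, set()):
            self.results.pop(sql, None)

cache = QueryCacheSketch()
cache.put("SELECT * FROM example WHERE pid = 1", ["example"], [("row",)])
print(cache.get("SELECT * FROM example WHERE pid = 1"))  # [('row',)]
cache.on_update("example")
print(cache.get("SELECT * FROM example WHERE pid = 1"))  # None
```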
3. Analyzer
Lexical analysis (identifies keywords, operators, table names and column names)
Syntax analysis (checks whether the statement conforms to the grammar)
4. Optimizer
The optimizer decides which index to use when a table has several, and decides the join order when a statement joins several tables. Once the optimizer phase is complete, the execution plan of the statement is fixed and the statement moves on to the executor.
5. Executor
At the start of execution, the executor checks whether the user has permission to query table T. If not, a permission error is returned. (If the query hits the query cache, the permission check is done when the cached result is returned. The query also goes through a precheck of permissions before the optimizer.) If the user has permission, the executor opens the table and continues. When opening the table, the executor calls the interface of whichever engine the table is defined with. In some cases a single call from the executor makes the engine scan several rows internally, so the number of rows the engine scans is not exactly the same as rows_examined.
MySQL optimizer:
The MySQL optimizer is cost-based (Cost-based Optimization). It takes a SQL statement as input and uses its built-in cost model, data dictionary information and the storage engine's statistics to decide which steps to use to execute the query, that is, the query plan.
At a high level, MySQL Server has two parts: the server layer and the storage engine layer. The optimizer works in the server layer, above the storage engine API. Its work can be divided semantically into four stages:
1. Logical transformation: negation elimination, equality and constant propagation, constant expression evaluation, conversion of outer joins to inner joins, subquery transformation, view merging, and so on.
2. Optimization preparation: for example, analysis of the ref and range index access methods, analysis of the fan-out of query conditions (fan out, the number of records after filtering), and constant table detection.
3. Cost-based optimization: selection of the access method and the join order.
4. Plan refinement: for example, table condition pushdown, access method adjustment, sort avoidance and index condition pushdown.
IV. Index failure scenarios and why
1. LIKE with a leading % wildcard makes the index ineffective. As described above for leftmost prefix matching, the usual index data structure is a B+ tree, and its entries are kept in order. If the index key is an INT, the keys are arranged in ascending numeric order.
Data is stored only in the leaf nodes, and the leaves are kept in sorted order.
If the index key is a string, the keys are arranged in dictionary order.
In other words, the index is ordered by comparing the strings starting from their first letter.
In a fuzzy query with % at the front, the leftmost characters are unknown, so the ordering of the index cannot be used to locate an entry; only a full table scan can find the matching rows. (This is the leftmost prefix principle at work.)
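Here is a rough Python illustration of the difference, assuming a sorted list of strings as a stand-in for the index and two made-up helper functions. A pattern with a known prefix can jump to its starting point and stop early; a pattern with a leading wildcard has nothing to jump to.

```python
from bisect import bisect_left

# Hypothetical string index, sorted the way the article describes.
names = sorted(["adam", "alice", "bob", "bobby", "carol", "dave"])

def like_prefix(prefix):
    """LIKE 'prefix%': binary-search the start, then read while it matches."""
    start = bisect_left(names, prefix)
    hits = []
    for name in names[start:]:
        if not name.startswith(prefix):
            break                      # sorted order: no later match possible
        hits.append(name)
    return hits

def like_suffix(suffix):
    """LIKE '%suffix': ordering does not help, so check every entry."""
    return [name for name in names if name.endswith(suffix)]

print(like_prefix("bo"))   # ['bob', 'bobby']
print(like_suffix("e"))    # ['alice', 'dave']
```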
The same holds for a composite index: a query that breaks the ordering of the index cannot use it and falls back to a full table scan.
Example: table example has a composite index (A,B,C).
SELECT * FROM example WHERE A=1 and B =1 and C=1; can use the index.
SELECT A FROM example WHERE C =1 and B=1 ORDER BY A; can use the index (as a covering index).
SELECT * FROM example WHERE C =1 and B=1 ORDER BY A; cannot use the index.
Covering index: an index that contains every column the query needs is called a covering index (Covering Index).
There are two ways to optimize:
use a covering index, or move the % to the end of the pattern.
2. The column is a string type, but the value in WHERE is not quoted. If a column is a string type with an ordinary B+ tree index and the query condition passes a number, the index is not used.
Example: table example has a column pid of type varchar.
-- type is ALL: full table scan
explain SELECT * FROM example WHERE pid = 1;
-- type is ref: index lookup
explain SELECT * FROM example WHERE pid = '1';
Why does the first statement, without the single quotes, skip the index? Without quotes the comparison is between a string and a number. The types do not match, so MySQL performs an implicit type conversion and compares both sides as floating-point numbers. Many different strings convert to the same number, so the string order of the index is of no use for the lookup.
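A small Python sketch shows why converting every string to a number defeats a string-ordered index. The `to_number` helper is my own rough approximation of a string-to-number cast (parse the leading numeric part, otherwise 0); it is an assumption for illustration, not MySQL's conversion code.

```python
import re

# Optional whitespace and sign, then digits with an optional fraction/exponent.
_LEADING_NUMBER = re.compile(r"\s*[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")

def to_number(text):
    """Rough model of a string-to-number cast: parse the leading numeric
    part and ignore the rest; a string with no numeric prefix becomes 0."""
    match = _LEADING_NUMBER.match(text)
    return float(match.group(0)) if match else 0.0

# Hypothetical varchar pid values in index (string) order.
pids = sorted(["01", "1", "1.0", "10", "1abc", "2", "abc"])

# WHERE pid = 1 has to convert every value before comparing.
matches = [p for p in pids if to_number(p) == 1.0]
print(matches)   # ['01', '1', '1.0', '1abc']
```

In string order, '10' sits between '1.0' and '1abc', so the matching values are not even contiguous in the index. With pid = '1' the comparison stays string-to-string and a single seek is enough.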
3. OR: if either side of an OR is a non-indexed column, the index is not used. A query condition that contains OR may make the index ineffective.
Example: table example has columns pid (int, indexed) and score (int, not indexed).
-- type is ref: index lookup
explain SELECT * FROM example WHERE pid = 1;
-- adding an OR condition on the unindexed column score stops the index being used; type is ALL: full table scan
explain SELECT * FROM example WHERE pid = 1 OR score = 10;
With an OR on the unindexed column score, suppose MySQL did use the index on pid. It would still have to scan the whole table for the score condition, which makes three steps: full table scan + index scan + merge. A single full table scan is cheaper.
MySQL has an optimizer that weighs efficiency and cost, so it is reasonable for the index to be skipped when it meets an OR condition.
Note: if every column in the OR conditions is indexed, the index may be used.
4. Composite index: if the condition column is not the first column of the composite index, the index is not used. With a composite index, the index works normally when the query conditions satisfy the leftmost matching principle.
When we create a composite index such as (k1,k2,k3), it is equivalent to creating three indexes: (k1), (k1,k2) and (k1,k2,k3). This is the leftmost matching principle.
Example: there is a composite index idx_pid_score, with pid first and score second.
-- type is ref: index lookup using idx_pid_score
explain SELECT * FROM example WHERE pid = 1 AND score = 10;
-- type is ref: index lookup using idx_pid_score
explain SELECT * FROM example WHERE pid = 1;
-- type is ALL: full table scan
explain SELECT * FROM example WHERE score = 10;
When a composite index query does not satisfy the leftmost principle, the index generally is not used, though the final choice still depends on the MySQL optimizer.
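The "(k1), (k1,k2), (k1,k2,k3)" rule boils down to a very small loop. The helper below is a hypothetical sketch that only models equality conditions joined with AND; it ignores range conditions, covering-index scans and anything else the real optimizer may do.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that equality conditions can use."""
    used = []
    for column in index_columns:
        if column not in equality_columns:
            break                  # a gap ends the usable prefix
        used.append(column)
    return used

index = ["k1", "k2", "k3"]
print(usable_prefix(index, {"k1", "k2", "k3"}))  # ['k1', 'k2', 'k3']
print(usable_prefix(index, {"k1", "k3"}))        # ['k1']
print(usable_prefix(index, {"k2", "k3"}))        # []
```

An empty result corresponds to the score = 10 case above: the leftmost column is missing from the conditions, so there is no prefix to seek on.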
5. Calculations, functions and type conversions (automatic or manual) on an indexed column make the index ineffective; using !=, <> or NOT IN on an indexed column may do so as well.
Example: table example has an index idx_birth_time on the datetime column birthtime.
birthtime is indexed, but because the MySQL built-in function Date_ADD() is applied to it, the index is not used.
-- type is ALL: full table scan
explain SELECT * FROM example WHERE Date_ADD(birthtime,INTERVAL 1 DAY) = 6;
Arithmetic on an indexed column (such as +, -, *, /) also makes the index ineffective.
Example: table example has an index idx_score on the int column score.
-- type is ALL: full table scan
explain SELECT * FROM example WHERE score-1=5;
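The usual fix is to move the computation to the constant side, so that the bare column is compared with a value computed once (score = 6 instead of score - 1 = 5). The Python sketch below contrasts the two, again with a sorted list standing in for the index and hypothetical function names.

```python
from bisect import bisect_left, bisect_right

scores = sorted([3, 5, 6, 6, 8, 9])    # hypothetical index on score

def filter_on_expression(target):
    """WHERE score - 1 = target: the expression is evaluated for every entry."""
    checked = 0
    hits = []
    for score in scores:
        checked += 1
        if score - 1 == target:
            hits.append(score)
    return hits, checked

def seek_on_column(target):
    """WHERE score = target + 1: the constant is computed once, then a seek."""
    value = target + 1
    lo, hi = bisect_left(scores, value), bisect_right(scores, value)
    return scores[lo:hi], hi - lo

print(filter_on_expression(5))  # ([6, 6], 6)
print(seek_on_column(5))        # ([6, 6], 2)
```

Both return the same rows, but the first touches every entry while the second goes straight to the matching run.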
Not-equal comparisons (!= or <>) can also make the index ineffective.
Example: table example has an index idx_score on the int column score.
-- type is ALL: full table scan
explain SELECT * FROM example WHERE score != 2;
-- type is ALL: full table scan
explain SELECT * FROM example WHERE score <> 3;
Although score is indexed, with !=, <> or NOT IN the index might as well not exist.
6. IS NULL can use the index; IS NOT NULL often cannot. As the first statement below shows, a single IS NOT NULL condition can still get a range scan, so the outcome depends on the optimizer's estimate.
Example: table example has an index idx_name on the varchar column name and an index idx_card on the varchar column card.
-- type is range: index range scan
explain SELECT * FROM example WHERE name is not null;
-- type is ALL: full table scan
explain SELECT * FROM example WHERE name is not null OR card is not null;
7. The joined columns in a LEFT JOIN or RIGHT JOIN use different character sets. If the join column has a different character set in the two tables, the index is not used for the join.
Example: in table example, the varchar column name uses utf8mb4 and has an index idx_name.
In table example_two, the varchar column name uses utf8 and has an index idx_name.
-- example is read with type index; example_two is read with type ALL, a full table scan that does not use the index
explain SELECT e.name,et.name FROM example e LEFT JOIN example_two et on e.name = et.name;
After the character sets of the two columns are made the same:
-- example is read with type index; example_two is read with type ref
explain SELECT e.name,et.name FROM example e LEFT JOIN example_two et on e.name = et.name;
So a mismatch in the column definitions can also make an index ineffective.
8. MySQL estimates that a full table scan is faster than using the index, so the index is not used. When a table with indexes is queried, MySQL uses the best index unless the optimizer judges a full table scan to be more efficient. As a rule of thumb, the optimizer switches to a full table scan when the rows found through the best index exceed about 30% of the table. Recommendation: do not add an index on a column such as gender. If a column holds only values like "0/1" or "Y/N", that is, it has many duplicate values, an index on it does little good even if it exists, and queries may still end up as full table scans.
For efficiency and cost, MySQL estimates whether a full table scan or the index is faster; the decision belongs to its optimizer.
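As a back-of-the-envelope illustration, the 30% rule of thumb can be written as a one-line decision. This is only a sketch with a hypothetical function name; the real optimizer compares estimated costs from its cost model and statistics rather than applying a fixed percentage.

```python
def choose_access(matching_rows, total_rows, threshold=0.30):
    """Pick an access path from the estimated share of matching rows."""
    if total_rows == 0:
        return "ALL"
    selectivity = matching_rows / total_rows
    return "ALL" if selectivity > threshold else "index"

# A selective condition: 50 of 10,000 rows.
print(choose_access(50, 10_000))      # index
# A gender-like column: about half of the rows match.
print(choose_access(5_000, 10_000))   # ALL
```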
V. Summary
These are the scenarios in which MySQL indexes fail during statement execution. If you know of others, feel free to leave a comment ~