当前位置:网站首页>The cost of returning tables in MySQL
The cost of returning tables in MySQL
2022-07-07 01:08:00 【Xiu Qiang】
The price of returning the watch
For the following query statement :SELECT * FROM single_ table WHERE key1 > 'a' AND key1 < 'c';
We can choose the following two ways to execute .
- Execute the query in the way of full table scanning
That is, directly scan all clustered index records , For each clustered index record , To determine whether the search criteria are valid , If yes, send it to the client , Otherwise, skip the record . - Use idx_key1 Execute the query
According to the search criteria key1>a’ AND key1 < ‘c’ Get the corresponding scanning interval (‘a’,‘c’), Then scan the secondary index records in the scanning interval . because idx_key1 The leaf node of the index stores incomplete user records , Contains only key1、id These two columns , The query list is *, This means that we need to get the cluster index record corresponding to each secondary index record , That is, perform the operation of returning to the table , After obtaining the complete user record, it will be sent to the client .
For the use of InnoDB For the table of the storage engine , All data pages in the index must be stored on disk , Wait until you need to load it into memory to use . These data pages will be stored in one or more files on the disk , The page number of the page corresponds to the offset of the page in the disk file . With 16KB Size pages as an example , Page No 0 The page of corresponds to the offset of 0 The location of , Page No 1 The page opposite these files has an offset of 16KB The location of .
B+ The nodes of each layer of the tree will be connected using a two-way linked list , The page numbers of the previous node and the next node do not need to be adjacent . But in practice ,InnoDB Try to arrange the page numbers of leaf nodes of the same index in order .
in other words ,idx_key1 In scan interval (‘a’,‘e’) The page number of the page where the secondary index record is located in will be as adjacent as possible . Even if the page numbers of these pages are not adjacent , But at least one page can store many records , That is to say, after executing a page I/O after , You can load many secondary index records from disk into memory . To make a long story short , Reading is in the scanning range (‘a’,‘e’) When the secondary index record in , The price paid is still small . But the scan range (‘a’,‘e’) Corresponding to the secondary index record in id The size of the value is irregular , Every time we read a secondary index record , You need to record according to the secondary index id Value to cluster index to perform back to table operation . If the page of the corresponding clustered index record is not in memory , You need to load the page from disk to memory . Due to a lot of reading id Clustered index records with discontinuous values , Moreover, these clustered index records are distributed in different data pages , The page numbers of these data pages are also irregular , Therefore, it will cause a lot of random I/O.
The more records that need to perform a table back operation , The lower the performance of queries using secondary indexes , Some queries prefer to use a full table scan rather than a secondary index . such as , hypothesis key1 Values in 'a’~’c’ The number of user records between accounts for 99% above , If you use idx_key1 Indexes , Will have a 99% The above id The value needs to be returned to the table . It's better to perform a full table scan directly .
When executing the query , When to use full table scanning , When to use secondary index + How to return to the table ? This is it. Query optimizer What should be done . The query optimizer will calculate some statistics for the records in the table in advance , Then use these statistics or access a small number of records in the table to calculate the number of records that need to be rowed back to the table . If you need to perform a table back operation, the more records , The more likely you are to use full table scanning , On the contrary, they tend to use secondary indexes + Back to the table . Of course , The analysis work done by the query optimizer is not so simple , But it's basically such a process .
In general , You can specify... For the query statement LIMIT Clause to limit the number of records returned by the query , This may make the query optimizer prefer to use secondary indexes + Query by returning to the table , The reason is that there are fewer records back to the table , The higher the performance . such as , The above query statement can be rewritten as follows :SELECT * FROM single_table WHERE key1 >'a' AND key1 < 'c' LIMIT 10;
Added LIMIT 10 The query statement after clause makes it easier for the query optimizer to adopt secondary index + Back to the table .
For queries that need to sort the results , If the secondary index is used to execute the query, there are many records that need to perform the operation of returning to the table , Also prefer to use full table scanning + Execute the query in the way of file sorting . For example, the following query statement :SELECT * FROM single_table ORDER BY key1;
Because the query list is *
, If you use secondary index to sort , You need to perform a table back operation on all secondary index records . The cost of this operation is not as low as directly traversing the cluster index and then sorting the files , Therefore, the query optimizer will tend to use full table scanning to execute queries . If you add LIMIT Clause , For example, the following query statement :SELECT * FROM single_table ORDER BY key1 LIMIT 10;
This query statement requires very few records to perform the operation of returning to the table , The query optimizer will tend to use secondary indexes + Back to the table .
边栏推荐
- Levels - UE5中的暴雨效果
- LLDP兼容CDP功能配置
- 建立自己的网站(17)
- Return to blowing marshland -- travel notes of zhailidong, founder of duanzhitang
- 界面控件DevExpress WinForms皮肤编辑器的这个补丁,你了解了吗?
- BFS realizes breadth first traversal of adjacency matrix (with examples)
- 批量获取中国所有行政区域经边界纬度坐标(到县区级别)
- tensorflow 1.14指定gpu运行设置
- Informatics Orsay Ibn YBT 1172: find the factorial of n within 10000 | 1.6 14: find the factorial of n within 10000
- ARM裸板调试之JTAG调试体验
猜你喜欢
线段树(SegmentTree)
Part V: STM32 system timer and general timer programming
boot - prometheus-push gateway 使用
Configuring OSPF basic functions for Huawei devices
Anfulai embedded weekly report no. 272: 2022.06.27--2022.07.03
随时随地查看远程试验数据与记录——IPEhub2与IPEmotion APP
Come on, don't spread it out. Fashion cloud secretly takes you to collect "cloud" wool, and then secretly builds a personal website to be the king of scrolls, hehe
C9 colleges and universities, doctoral students make a statement of nature!
"Exquisite store manager" youth entrepreneurship incubation camp - the first phase of Shunde market has been successfully completed!
Js+svg love diffusion animation JS special effects
随机推荐
阿里云中mysql数据库被攻击了,最终数据找回来了
【批處理DOS-CMD命令-匯總和小結】-字符串搜索、查找、篩選命令(find、findstr),Find和findstr的區別和辨析
from . cv2 import * ImportError: libGL. so. 1: cannot open shared object file: No such file or direc
NEON优化:矩阵转置的指令优化案例
Meet the level 3 requirements of ISO 2.0 with the level B construction standard of computer room | hybrid cloud infrastructure
There is an error in the paddehub application
Maidong Internet won the bid of Beijing life insurance to boost customers' brand value
【批处理DOS-CMD命令-汇总和小结】-跳转、循环、条件命令(goto、errorlevel、if、for[读取、切分、提取字符串]、)cmd命令错误汇总,cmd错误
Atomic in golang and CAS operations
力扣1037. 有效的回旋镖
MySQL中回表的代价
省市区三级坐标边界数据csv转JSON
【案例分享】网络环路检测基本功能配置
一行代码实现地址信息解析
fastDFS数据迁移操作记录
Configuring OSPF basic functions for Huawei devices
[牛客] [NOIP2015]跳石头
golang中的WaitGroup实现原理
Tensorflow 1.14 specify GPU running settings
[software reverse - solve flag] memory acquisition, inverse transformation operation, linear transformation, constraint solving