The cost of back-to-table lookups in MySQL
2022-07-07 01:08:00 【Xiu Qiang】
The cost of back-to-table lookups
Consider the following query: SELECT * FROM single_table WHERE key1 > 'a' AND key1 < 'c';
There are two ways to execute it.
- Execute the query with a full table scan
Scan all clustered index records directly. For each record, check whether it satisfies the search condition. If it does, send it to the client; otherwise skip it.
- Execute the query using idx_key1
From the search condition key1 > 'a' AND key1 < 'c', derive the scan interval ('a', 'c'), then scan the secondary index records in that interval. The leaf nodes of idx_key1 do not store complete user records; they contain only the key1 and id columns. Because the select list is *, we have to fetch the clustered index record that corresponds to each secondary index record, that is, perform a back-to-table lookup, and only then send the complete user record to the client.
For a table that uses the InnoDB storage engine, all data pages of its indexes are stored on disk and are loaded into memory only when they are needed. These pages are kept in one or more files on disk, and a page's number corresponds to its offset within the file. Taking 16KB pages as an example, page number 0 sits at offset 0 in the file, and page number 1 sits at offset 16KB.
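The page-number-to-offset mapping described above is a single multiplication. A minimal sketch, assuming a fixed 16KB page size and a single data file; the names PAGE_SIZE and page_offset are illustrative and not part of InnoDB:

```python
PAGE_SIZE = 16 * 1024  # 16KB page size, as in the example in the text


def page_offset(page_no: int, page_size: int = PAGE_SIZE) -> int:
    """Return the byte offset of a page within its data file."""
    if page_no < 0:
        raise ValueError("page number must be non-negative")
    return page_no * page_size


# Print the offsets of the first few pages.
for page_no in (0, 1, 2):
    print(page_no, page_offset(page_no))
```

Pages with adjacent page numbers are therefore adjacent on disk, which is why reading them in page-number order is sequential I/O.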
The nodes at each level of a B+ tree are connected by a doubly linked list, so the page numbers of the previous and next nodes do not have to be adjacent. In practice, however, InnoDB tries to allocate the leaf pages of the same index in page-number order.
In other words, the pages that hold the secondary index records of idx_key1 in the scan interval ('a', 'c') have page numbers that are as adjacent as possible. Even when they are not adjacent, one page still stores many records, so a single page I/O loads many secondary index records from disk into memory. In short, reading the secondary index records in the scan interval ('a', 'c') is cheap. The id values of those secondary index records, however, follow no particular order. For every secondary index record we read, we have to take its id value and look up the clustered index to perform the back-to-table lookup. If the page that holds the corresponding clustered index record is not in memory, it must be loaded from disk. Because we read many clustered index records with non-contiguous id values, and those records are spread across different data pages whose page numbers are also irregular, the result is a large amount of random I/O.
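The difference between the two access patterns can be illustrated with a small simulation. This is only a sketch under stated assumptions: ids 0..N-1 are packed into clustered index pages in id order, 100 records fit in a page, only one page is cached at a time, and a multiplicative permutation stands in for the "irregular" id order produced by scanning idx_key1. None of these numbers come from InnoDB.

```python
ROWS_PER_PAGE = 100   # assumed number of clustered index records per page
TOTAL_ROWS = 10_000   # assumed table size


def clustered_page(row_id: int) -> int:
    """Page that holds a row, assuming ids 0..N-1 are packed in id order."""
    return row_id // ROWS_PER_PAGE


def count_page_switches(ids) -> int:
    """Count how often consecutive lookups land on a different page.

    With a cold cache that holds a single page, each switch is one page read.
    """
    switches = 0
    current = None
    for row_id in ids:
        page = clustered_page(row_id)
        if page != current:
            switches += 1
            current = page
    return switches


# Full table scan: clustered index records are visited in id order.
scan_order = range(TOTAL_ROWS)

# Back-to-table lookups: records arrive in key1 order, so their ids look
# shuffled. 7919 is coprime with 10_000, so this is a permutation of the ids.
lookup_order = [(i * 7919) % TOTAL_ROWS for i in range(TOTAL_ROWS)]

print("page reads, full scan:    ", count_page_switches(scan_order))
print("page reads, back-to-table:", count_page_switches(lookup_order))
```

Under these assumptions the full scan reads each of the 100 pages once, while the back-to-table order changes page on every single lookup. A real buffer pool caches far more than one page, so the gap in practice is smaller, but the direction of the effect is the same.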
The more records that need a back-to-table lookup, the worse a query using the secondary index performs, to the point where some queries are better served by a full table scan. For example, suppose more than 99% of the user records have a key1 value between 'a' and 'c'. Using the idx_key1 index would then require a back-to-table lookup for more than 99% of the id values, so it is better to perform a full table scan directly.
When executing a query, when is a full table scan used, and when is the secondary index plus back-to-table approach used? Making that decision is the job of the query optimizer. The optimizer computes statistics about the records in the table in advance, then uses those statistics, or accesses a small number of records in the table, to estimate how many records would need a back-to-table lookup. The more records that need one, the more the optimizer leans toward a full table scan; the fewer, the more it leans toward the secondary index plus back-to-table approach. The optimizer's actual analysis is more involved than this, but this is the general process.
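The trade-off the optimizer weighs can be sketched as a toy cost comparison. This is a deliberately simplified model and not MySQL's actual cost calculation: the two constants, the assumption that the scan interval costs one page read, and the worst-case assumption that every back-to-table lookup is one random page read are all illustrative.

```python
IO_COST_PER_PAGE = 1.0  # assumed cost of reading one page
CPU_COST_PER_ROW = 0.2  # assumed cost of evaluating one record


def full_scan_cost(total_pages: int, total_rows: int) -> float:
    """Read every clustered index page and evaluate every record."""
    return total_pages * IO_COST_PER_PAGE + total_rows * CPU_COST_PER_ROW


def index_lookup_cost(matching_rows: int, index_pages: int = 1) -> float:
    """Scan the secondary index interval, then look up each matching row."""
    index_part = index_pages * IO_COST_PER_PAGE + matching_rows * CPU_COST_PER_ROW
    # Worst case: each back-to-table lookup is one random page read.
    lookup_part = matching_rows * (IO_COST_PER_PAGE + CPU_COST_PER_ROW)
    return index_part + lookup_part


def choose_plan(total_pages: int, total_rows: int, matching_rows: int) -> str:
    """Pick the cheaper of the two plans under this toy model."""
    if index_lookup_cost(matching_rows) < full_scan_cost(total_pages, total_rows):
        return "secondary index + back to table"
    return "full table scan"


# A hypothetical table with 10,000 rows stored in 100 pages.
print(choose_plan(100, 10_000, matching_rows=10))     # few rows match
print(choose_plan(100, 10_000, matching_rows=9_900))  # 99% of rows match
```

With only a handful of matching rows the index plan is far cheaper; when 99% of the rows match, the per-row lookups dominate and the full table scan wins, which mirrors the 99% example in the text.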
In general, you can add a LIMIT clause to a query to restrict the number of records it returns. This may make the query optimizer more inclined to use the secondary index plus back-to-table approach, because the fewer records need a back-to-table lookup, the better that approach performs. For example, the query above can be rewritten as: SELECT * FROM single_table WHERE key1 > 'a' AND key1 < 'c' LIMIT 10;
With LIMIT 10 added, the query optimizer is more likely to choose the secondary index plus back-to-table approach.
For queries that have to sort their results, the optimizer likewise leans toward a full table scan plus filesort when using the secondary index would require too many back-to-table lookups. Take the following query: SELECT * FROM single_table ORDER BY key1;
Because the select list is *, sorting via the secondary index would require a back-to-table lookup for every secondary index record. That costs more than traversing the clustered index directly and then doing a filesort, so the query optimizer tends to choose a full table scan. Now add a LIMIT clause, as in the following query: SELECT * FROM single_table ORDER BY key1 LIMIT 10;
This query needs only a few back-to-table lookups, so the query optimizer tends to use the secondary index plus back-to-table approach.
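Why LIMIT changes the picture for ORDER BY can be shown with a small in-memory model. It is a sketch only: the dictionary rows stands in for the clustered index, the sorted list secondary_index stands in for idx_key1, and the generated key1 values are hypothetical data, not anything from the original table.

```python
# Hypothetical clustered index: id -> (id, key1). The key1 values are unique
# because 37 is coprime with 1000.
rows = {i: (i, f"k{(i * 37) % 1000:03d}") for i in range(1000)}

# Hypothetical idx_key1: (key1, id) pairs kept in key1 order.
secondary_index = sorted((key1, row_id) for row_id, key1 in rows.values())


def top_n_via_index(n: int):
    """Walk the index in key1 order; one back-to-table lookup per row returned."""
    result = []
    lookups = 0
    for _key1, row_id in secondary_index[:n]:
        result.append(rows[row_id])  # the back-to-table lookup
        lookups += 1
    return result, lookups


def top_n_via_full_scan(n: int):
    """Read every clustered record, sort them all, then keep the first n."""
    scanned = list(rows.values())
    scanned.sort(key=lambda r: (r[1], r[0]))  # stands in for the filesort
    return scanned[:n], len(scanned)


by_index, lookups = top_n_via_index(10)
by_scan, rows_read = top_n_via_full_scan(10)
print(by_index == by_scan, lookups, rows_read)
```

Both paths return the same ten rows, but the index path performs ten lookups while the scan path reads and sorts all 1,000 records. Without the LIMIT, the index path would need a lookup for every row, which is the case where the full scan plus filesort becomes the cheaper option.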