Example analysis of SQL query optimization principle
2022-06-10 23:19:00 【Yisu cloud】
This article walks through an example-based analysis of how a common SQL query optimization works. Many people are not very familiar with the details, so it is shared here for reference.
There is a financial statement table that is not sharded across databases or tables and currently holds 9,555,695 rows. Paginated queries use LIMIT. Before optimization the query took 16 s 938 ms (execution: 16 s 831 ms, fetching: 107 ms). After the SQL was rewritten as described below, it took 347 ms (execution: 163 ms, fetching: 184 ms).
Approach: put the query conditions in a subquery that selects only the primary key ID, then join on the primary keys returned by the subquery to fetch the other columns.
Principle:
1. It reduces lookups back to the table (fetching full rows from the clustered index).
2. See the Alibaba Java Development Manual (Taishan Edition), Chapter 5 "MySQL Database", (2) "Index Specifications", item 7:
[Recommended] Use a deferred join or a subquery to optimize scenarios with a very large number of pages.
Explanation: MySQL does not skip offset rows. It fetches offset+N rows, then discards the first offset rows and returns N rows. When offset is very large this is very inefficient. Either limit the total number of pages that can be returned, or rewrite the SQL for page numbers beyond a certain threshold.
Example: first quickly locate the range of ids to fetch, then join:
SELECT a.* FROM table1 a, (SELECT id FROM table1 WHERE condition LIMIT 100000,20) b WHERE a.id = b.id;
-- SQL before optimization
SELECT columns FROM `table_name` WHERE conditions LIMIT 0,10;
-- SQL after optimization
SELECT columns
FROM `table_name` main_table
RIGHT JOIN (
    SELECT primary_key FROM `table_name` WHERE conditions LIMIT 0,10
) temp_table ON temp_table.primary_key = main_table.primary_key;
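To check that the rewritten form returns the same page as the plain query, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The table mirrors the `test` table used later in this article; the helper names (`build_demo`, `plain_page`, `deferred_page`), the index name `idx_val`, the row count and the `ORDER BY id` clause are assumptions added so the result is deterministic. It only demonstrates that the two queries are equivalent, not the MySQL timings.

```python
import sqlite3


def build_demo(rows=2000):
    """Create an in-memory table shaped like the article's `test` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test ("
        "id INTEGER PRIMARY KEY, val INTEGER NOT NULL, source INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_val ON test (val)")
    # Every even id gets val = 4, so the filter matches rows / 2 rows.
    conn.executemany(
        "INSERT INTO test (id, val, source) VALUES (?, ?, ?)",
        [(i, 4 if i % 2 == 0 else 7, i % 10) for i in range(1, rows + 1)],
    )
    return conn


def plain_page(conn, offset, size):
    """Plain pagination: filter and paginate in a single query."""
    sql = (
        "SELECT id, val, source FROM test "
        "WHERE val = 4 ORDER BY id LIMIT ? OFFSET ?"
    )
    return conn.execute(sql, (size, offset)).fetchall()


def deferred_page(conn, offset, size):
    """Deferred join: paginate on primary keys only, then join for the rows."""
    sql = """
        SELECT a.id, a.val, a.source
        FROM test a
        INNER JOIN (
            SELECT id FROM test WHERE val = 4 ORDER BY id LIMIT ? OFFSET ?
        ) b ON a.id = b.id
        ORDER BY a.id
    """
    return conn.execute(sql, (size, offset)).fetchall()


conn = build_demo()
page_plain = plain_page(conn, offset=300, size=5)
page_deferred = deferred_page(conn, offset=300, size=5)
print(page_plain == page_deferred)
```

Both functions return the same five rows. The difference the article measures lies in how much work the engine does to produce them.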
Preface
First, the MySQL version:
mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.17    |
+-----------+
1 row in set (0.00 sec)
Table structure:
mysql> desc test;
+--------+---------------------+------+-----+---------+----------------+
| Field  | Type                | Null | Key | Default | Extra          |
+--------+---------------------+------+-----+---------+----------------+
| id     | bigint(20) unsigned | NO   | PRI | NULL    | auto_increment |
| val    | int(10) unsigned    | NO   | MUL | 0       |                |
| source | int(10) unsigned    | NO   |     | 0       |                |
+--------+---------------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)
id is the auto-increment primary key, and val has a non-unique index.
Load a large amount of data, a little over 5 million rows in total:
mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
|  5242882 |
+----------+
1 row in set (4.25 sec)
We know that when the offset in limit offset, rows is large, the query becomes inefficient:
mysql> select * from test where val=4 limit 300000,5;
+---------+-----+--------+
| id      | val | source |
+---------+-----+--------+
| 3327622 |   4 |      4 |
| 3327632 |   4 |      4 |
| 3327642 |   4 |      4 |
| 3327652 |   4 |      4 |
| 3327662 |   4 |      4 |
+---------+-----+--------+
5 rows in set (15.98 sec)
To get the same result, we usually rewrite the query as follows:
mysql> select * from test a inner join (select id from test where val=4 limit 300000,5) b on a.id=b.id;
+---------+-----+--------+---------+
| id      | val | source | id      |
+---------+-----+--------+---------+
| 3327622 |   4 |      4 | 3327622 |
| 3327632 |   4 |      4 | 3327632 |
| 3327642 |   4 |      4 | 3327642 |
| 3327652 |   4 |      4 | 3327652 |
| 3327662 |   4 |      4 | 3327662 |
+---------+-----+--------+---------+
5 rows in set (0.38 sec)
The difference in time is obvious: 15.98 sec versus 0.38 sec.
Why does this happen? Let's look at how select * from test where val=4 limit 300000,5; is executed:
1. Scan the leaf nodes of the secondary index on val.
2. Using the primary key value stored in each leaf node, look up all the required column values in the clustered index.
The process looks roughly like the figure below:

As shown above, this requires 300005 index node accesses and 300005 lookups of row data in the clustered index. The first 300000 rows are then filtered out and only the last 5 are returned. MySQL spends a great deal of random I/O fetching data from the clustered index, yet the rows fetched by 300000 of those random I/Os never appear in the result set.
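The counting argument above can be sketched as a toy model in Python. This is not how InnoDB is implemented: the secondary index is modeled as a sorted list of (val, id) pairs and the clustered index as a dictionary, and every name here (`build_indexes`, `LookupCounter`, `plain_limit`, `deferred_limit`) is hypothetical. The point is only to count clustered-index lookups for the two query shapes.

```python
from itertools import islice


def build_indexes(total_rows, match_every):
    """Return (secondary, clustered) for a toy table with columns id, val, source."""
    clustered = {
        row_id: (row_id, 4 if row_id % match_every == 0 else 0, row_id % 10)
        for row_id in range(1, total_rows + 1)
    }
    # Secondary index on val: (val, id) pairs kept in sorted order.
    secondary = sorted((row[1], row[0]) for row in clustered.values())
    return secondary, clustered


class LookupCounter:
    """Wraps the clustered index and counts every full-row lookup."""

    def __init__(self, clustered):
        self.clustered = clustered
        self.lookups = 0

    def fetch(self, row_id):
        self.lookups += 1
        return self.clustered[row_id]


def matching_ids(secondary, val):
    """Walk the secondary index and yield primary keys of matching entries."""
    return (row_id for v, row_id in secondary if v == val)


def plain_limit(secondary, counter, val, offset, size):
    """Fetch the full row for each of the first offset+size matches, then discard."""
    ids = islice(matching_ids(secondary, val), offset + size)
    rows = [counter.fetch(row_id) for row_id in ids]
    return rows[offset:]


def deferred_limit(secondary, counter, val, offset, size):
    """Skip `offset` entries inside the index, then fetch only `size` full rows."""
    ids = islice(matching_ids(secondary, val), offset, offset + size)
    return [counter.fetch(row_id) for row_id in ids]


secondary, clustered = build_indexes(total_rows=10_000, match_every=2)

plain_counter = LookupCounter(clustered)
plain_rows = plain_limit(secondary, plain_counter, val=4, offset=3000, size=5)

deferred_counter = LookupCounter(clustered)
deferred_rows = deferred_limit(secondary, deferred_counter, val=4, offset=3000, size=5)

print(plain_counter.lookups, deferred_counter.lookups)
```

In both variants the same number of secondary index entries is walked. What changes is the number of full-row lookups, which drops from offset + size to size.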
Someone is bound to ask: since the query uses the index from the start, why not first walk the index leaf nodes to the last 5 entries that are actually needed, and only then fetch the row data from the clustered index? That would need only 5 random I/Os, similar to the process shown in the figure below:

In fact, I wanted to ask this question too.
Verification
Let's confirm the reasoning above with a hands-on test:
To confirm that select * from test where val=4 limit 300000,5 scans 300005 index nodes and 300005 data nodes in the clustered index, we would need a way to count how many times a single SQL statement looks up data nodes through index nodes. I first tried the Handler_read_* status variables, but unfortunately none of them measures this.
So it can only be shown indirectly:
InnoDB has a buffer pool that holds recently accessed pages, both data pages and index pages. We can therefore run the two SQL statements and compare the number of data pages in the buffer pool after each. The prediction is that after running select * from test a inner join (select id from test where val=4 limit 300000,5) b on a.id=b.id; the buffer pool holds far fewer data pages than after running select * from test where val=4 limit 300000,5;, because the join version performs only 5 data page accesses while the plain version performs 300005.
select * from test where val=4 limit 300000,5
mysql> select index_name,count(*) from information_schema.INNODB_BUFFER_PAGE where INDEX_NAME in('val','primary') and TABLE_NAME like '%test%' group by index_name;
Empty set (0.04 sec)
As can be seen, the buffer pool currently holds no pages for the test table.
mysql> select * from test where val=4 limit 300000,5;
+---------+-----+--------+
| id      | val | source |
+---------+-----+--------+
| 3327622 |   4 |      4 |
| 3327632 |   4 |      4 |
| 3327642 |   4 |      4 |
| 3327652 |   4 |      4 |
| 3327662 |   4 |      4 |
+---------+-----+--------+
5 rows in set (26.19 sec)
mysql> select index_name,count(*) from information_schema.INNODB_BUFFER_PAGE where INDEX_NAME in('val','primary') and TABLE_NAME like '%test%' group by index_name;
+------------+----------+
| index_name | count(*) |
+------------+----------+
| PRIMARY    |     4098 |
| val        |      208 |
+------------+----------+
2 rows in set (0.04 sec)
As can be seen, the buffer pool now holds 4098 data pages and 208 index pages for the test table.
select * from test a inner join (select id from test where val=4 limit 300000,5) b on a.id=b.id
To keep the previous test from affecting this one, we need to empty the buffer pool by restarting mysql.
mysqladmin shutdown
/usr/local/bin/mysqld_safe &
mysql> select index_name,count(*) from information_schema.INNODB_BUFFER_PAGE where INDEX_NAME in('val','primary') and TABLE_NAME like '%test%' group by index_name;
Empty set (0.03 sec)
Run the SQL:
mysql> select * from test a inner join (select id from test where val=4 limit 300000,5) b on a.id=b.id;
+---------+-----+--------+---------+
| id      | val | source | id      |
+---------+-----+--------+---------+
| 3327622 |   4 |      4 | 3327622 |
| 3327632 |   4 |      4 | 3327632 |
| 3327642 |   4 |      4 | 3327642 |
| 3327652 |   4 |      4 | 3327652 |
| 3327662 |   4 |      4 | 3327662 |
+---------+-----+--------+---------+
5 rows in set (0.09 sec)
mysql> select index_name,count(*) from information_schema.INNODB_BUFFER_PAGE where INDEX_NAME in('val','primary') and TABLE_NAME like '%test%' group by index_name;
+------------+----------+
| index_name | count(*) |
+------------+----------+
| PRIMARY    |        5 |
| val        |      390 |
+------------+----------+
2 rows in set (0.03 sec)
The difference between the two is clear: the first SQL loaded 4098 data pages into the buffer pool, while the second loaded only 5. This matches the prediction. It also confirms why the first SQL is slow: it reads a large number of useless rows (300000) and then discards them.
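To put the page counts in perspective, the following sketch converts them to memory. It assumes the default innodb_page_size of 16 KiB, which the article does not state, and the function name `pages_to_mib` is made up for this illustration. The page counts are the ones reported by the two INNODB_BUFFER_PAGE queries above.

```python
PAGE_SIZE_BYTES = 16 * 1024  # assumption: default innodb_page_size (16 KiB)


def pages_to_mib(pages, page_size=PAGE_SIZE_BYTES):
    """Convert a number of buffer pool pages to MiB."""
    return pages * page_size / (1024 * 1024)


# Page counts copied from the article's INNODB_BUFFER_PAGE results.
plain_query = {"PRIMARY": 4098, "val": 208}
deferred_join = {"PRIMARY": 5, "val": 390}

for label, counts in (("plain limit", plain_query), ("deferred join", deferred_join)):
    total_pages = sum(counts.values())
    print(f"{label}: {total_pages} pages, about {pages_to_mib(total_pages):.2f} MiB")
```

Under that assumption the plain query pulls roughly ten times as much data into the buffer pool as the deferred join, almost all of it clustered index pages whose rows are thrown away.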
This also causes another problem: loading many data pages that are not very hot into the buffer pool pollutes it and takes up buffer pool space.
Problems encountered
To make sure the buffer pool is empty after every restart, we need to disable innodb_buffer_pool_dump_at_shutdown and innodb_buffer_pool_load_at_startup. These two options control whether the buffer pool contents are dumped when the database shuts down and whether the dumped contents are loaded back from disk when the database starts.