MySQL SQL statement optimization
2022-06-09 12:06:00 【Help me to the Internet cafe】
Statement optimization
That is, MySQL uses its own optimizer to optimize the SQL we write, then hands it to the InnoDB engine for execution.
Condition simplification
Remove unnecessary parentheses
select * from x where ((a = 5));
The parentheses above are unnecessary, so the optimizer removes them directly.
select * from x where a = 5;
Equality propagation
select * from x where b = a and a = 5;
Similarly, although this compares two columns, a can only take one value, so the statement can be optimized to
select * from x where b = 5 and a = 5;
Constant propagation
select * from x where a = 5 and b > a;
It can be optimized to
select * from x where a = 5 and b > 5;
Remove useless conditions
select * from x where a < 5 and b > 10 and b > a;
When the first two conditions hold, the last one necessarily holds as well (b > 10 > 5 > a), so it is redundant and the statement can be optimized to
select * from x where a < 5 and b > 10;
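The rewrites above only change the form of the WHERE clause, not the rows it selects. A minimal sketch that checks this on sample data, assuming Python's built-in sqlite3 as a stand-in engine (the table contents are made up for illustration, and this says nothing about how MySQL performs the rewrite internally):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (a INTEGER, b INTEGER)")
# Small grid of values, including NULLs, so each predicate is exercised.
values = [None, 3, 4, 5, 6, 10, 11]
conn.executemany(
    "INSERT INTO x VALUES (?, ?)", [(a, b) for a in values for b in values]
)

# (original predicate, simplified predicate) pairs taken from the text above.
REWRITES = [
    ("((a = 5))", "a = 5"),
    ("b = a AND a = 5", "b = 5 AND a = 5"),
    ("a = 5 AND b > a", "a = 5 AND b > 5"),
    ("a < 5 AND b > 10 AND b > a", "a < 5 AND b > 10"),
]

def rows(predicate):
    sql = f"SELECT a, b FROM x WHERE {predicate} ORDER BY a, b"
    return conn.execute(sql).fetchall()

results = [(rows(original), rows(simplified)) for original, simplified in REWRITES]
for (original, simplified), (before, after) in zip(REWRITES, results):
    print(f"{original!r} -> {simplified!r}: same rows = {before == after}")
```

Each pair should report the same rows, which is what makes these rewrites safe for an optimizer to apply.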
Expression evaluation
select * from x where -a > -5;
The optimizer does not simplify this, and it has a major drawback: an index on a cannot be used. So try to keep the column on its own side of the comparison rather than inside an expression.
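The effect can be observed in a query plan. The sketch below uses Python's built-in sqlite3 as a stand-in, so the plan text is SQLite's and not the output of MySQL's EXPLAIN; the index name idx_a and the data are assumptions for illustration. The same principle applies: the predicate on the bare column can use the index, the one on the expression cannot.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (a INTEGER, b INTEGER)")
conn.execute("CREATE INDEX idx_a ON x (a)")  # hypothetical index on column a
conn.executemany("INSERT INTO x VALUES (?, ?)", [(i, i * 2) for i in range(100)])

def plan(where):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    cur = conn.execute(f"EXPLAIN QUERY PLAN SELECT * FROM x WHERE {where}")
    return " | ".join(row[-1] for row in cur.fetchall())

def rows(where):
    return conn.execute(f"SELECT * FROM x WHERE {where} ORDER BY a").fetchall()

plan_expression = plan("-a > -5")   # column wrapped in an expression
plan_bare_column = plan("a < 5")    # same condition, column on its own

print("-a > -5 :", plan_expression)
print("a < 5   :", plan_bare_column)
```

Both predicates select the same rows; only the second lets the engine search the index instead of scanning the table.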
Constant table detection
When a table has zero or one row, or when the query is an equality lookup on a primary key or unique index, the MySQL optimizer treats the table as a constant table and replaces its columns in the SQL statement with constants.
select * from table1 join table2 on table1.col1 = table2.col2 where table1.primarykey = 'a';
select <columns of table1 as constants>, table2.* from table2 where <constant value of table1.col1> = table2.col2;
Outer join elimination
In an outer join the join order is fixed, so the driving table and the driven table are fixed too. The driving table therefore cannot be swapped as it can in an inner join.
There is one special case, though:
select * from table1 left join table2 on table1.col1 = table2.col2 where table2.col2 is not null;
Here we require the column of table2 to be non-NULL. What does that mean? When a row of table1 finds no match, the columns of table2 are filled with NULL, but such a row fails the search condition and is filtered out. A left-join row that failed to match is therefore discarded anyway. This statement is no different from an inner join, so the optimizer converts it directly into an inner join.
So when an outer join appears but the search condition rejects NULL values in the driven table, the outer join can be converted into an inner join, and with an inner join the optimizer can swap the driving table to reduce the cost of the query.
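A minimal sketch of this equivalence, assuming Python's built-in sqlite3 as a stand-in engine and made-up sample rows: the left join with a NULL-rejecting condition returns exactly the rows of the inner join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER);
    CREATE TABLE table2 (col2 INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (2), (3), (4);
""")

# Plain left join: the unmatched row of table1 is padded with NULL.
left_join_all = conn.execute("""
    SELECT * FROM table1 LEFT JOIN table2 ON table1.col1 = table2.col2
    ORDER BY col1
""").fetchall()

# Left join plus a condition that rejects NULL in the driven table.
left_join_reject_null = conn.execute("""
    SELECT * FROM table1 LEFT JOIN table2 ON table1.col1 = table2.col2
    WHERE table2.col2 IS NOT NULL ORDER BY col1
""").fetchall()

inner_join = conn.execute("""
    SELECT * FROM table1 INNER JOIN table2 ON table1.col1 = table2.col2
    ORDER BY col1
""").fetchall()

print("left join:              ", left_join_all)
print("left join, NULL rejected:", left_join_reject_null)
print("inner join:             ", inner_join)
```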
Subquery optimization
Subquery classification
- Scalar subquery
- Column subquery
- Row subquery
- Table subquery
By correlation with the outer query
- Correlated subquery
- Uncorrelated subquery
Scalar subquery
Uncorrelated scalar subquery
select * from x where key1 = (select y.key1 from y where y.primarykey = 1);
For an uncorrelated scalar subquery, the subquery is executed first, and then the outer query is executed with its result.
Correlated scalar subquery
select * from s1 where key1 = (select common_field from s2 where s1.key3 = s2.key3 limit 1);
For a correlated scalar subquery:
- Take one row of the outer table that satisfies its own search conditions, and pass the value of the referenced column into the subquery.
- Compute the result of the subquery.
- Check whether key1 of the outer row equals the value returned by the subquery; if it does, add the row to the result set.
- Return to the first step and repeat until every row of the outer table has been visited.
This is in fact similar to the way a join is executed.
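The steps above can be sketched as a nested loop. This is only an illustration in plain Python: the lists s1 and s2 are hypothetical in-memory stand-ins for the tables, and "limit 1" is modelled as "first match in list order".

```python
# Hypothetical in-memory stand-ins for tables s1 and s2.
s1 = [
    {"key1": "a", "key3": 1},
    {"key1": "b", "key3": 2},
    {"key1": "c", "key3": 3},
]
s2 = [
    {"common_field": "a", "key3": 1},
    {"common_field": "x", "key3": 2},
    {"common_field": "z", "key3": 1},
]

def scalar_subquery(outer_row):
    """select common_field from s2 where s1.key3 = s2.key3 limit 1"""
    for inner_row in s2:
        if inner_row["key3"] == outer_row["key3"]:
            return inner_row["common_field"]  # limit 1: stop at the first match
    return None  # no matching row: the subquery yields NULL

def run_query():
    result = []
    for outer_row in s1:                    # take one outer row
        value = scalar_subquery(outer_row)  # evaluate the subquery for that row
        # key1 = NULL is never true in SQL, so None never matches.
        if value is not None and outer_row["key1"] == value:
            result.append(outer_row)        # the row satisfies the condition
    return result                           # every outer row has been visited

matched = run_query()
print(matched)
```

The subquery runs once per outer row, which is why the process resembles a nested-loop join.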
The optimizer applies no special optimization to scalar subqueries, because a scalar subquery returns at most one value and the amount of data involved stays small.
IN subquery optimization
select * from x where key1 in (select key3 from y);
For the uncorrelated IN subquery above, if the subquery returns only a few values, they can be loaded into memory and the outer query compares each row against them.
But if the subquery result grows large, it may no longer fit in memory, or the outer query ends up with too many values to compare against. Every outer row then has to be checked against many conditions, and an index becomes impractical: each value needs its own index lookup, so a full table scan may be cheaper. Performance ends up very poor.
Materialized table optimization
When the IN subquery returns too many values, MySQL does not pass the subquery result to the outer query as a list of parameters. Instead it creates a temporary table to store the result of the subquery.
- The columns of the temporary table are the columns of the subquery result, and duplicate rows are removed.
- After deduplication the temporary table is usually not very large, so it is created with the Memory storage engine and a hash index.
Once the subquery has been materialized into materialized_table, the materialized table and the outer query can also be converted into a join.
select x.* from x inner join materialized_table m on key1 = m.key3;
Then the cost calculation covered earlier can be used to decide which table is better suited as the driving table.
Only uncorrelated subqueries can be converted into materialized tables.
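A rough sketch of the materialization idea, assuming Python's built-in sqlite3 as a stand-in engine and made-up data. The temporary table and its primary key only imitate the deduplicated, indexed table described above; they are not MySQL's internal Memory-engine table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (key1 INTEGER);
    CREATE TABLE y (key3 INTEGER);
    INSERT INTO x VALUES (1), (2), (3), (4);
    INSERT INTO y VALUES (2), (2), (3), (3), (3), (9);
""")

# The original IN subquery.
in_rows = conn.execute(
    "SELECT * FROM x WHERE key1 IN (SELECT key3 FROM y) ORDER BY key1"
).fetchall()

# Materialize the subquery result once, deduplicated. The primary key stands
# in for the index built on the materialized table.
conn.execute("CREATE TEMP TABLE materialized_table (key3 INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO materialized_table SELECT DISTINCT key3 FROM y")

# Because the materialized table has no duplicates, a plain join is equivalent.
join_rows = conn.execute("""
    SELECT x.* FROM x INNER JOIN materialized_table m ON x.key1 = m.key3
    ORDER BY x.key1
""").fetchall()

materialized_size = conn.execute(
    "SELECT COUNT(*) FROM materialized_table"
).fetchone()[0]

print("IN subquery:      ", in_rows)
print("materialized join:", join_rows)
print("materialized rows:", materialized_size)
```

Six rows of y shrink to three distinct values, and the join over them returns the same rows as the IN subquery.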
semi-join optimization
As shown above, we converted the subquery result into a materialized table and then turned the materialized table into a join.
Why not convert the subquery into a join directly? That is what semi-join optimization is about.
We can try converting it into the following statement:
select x.* from x join y on key1 = key3;
Three cases
- A row of the driven table y does not satisfy the join condition, so nothing is added to the result set.
- Exactly one row of the driven table y has a key3 equal to key1 of a row in the driving table x, so one record is added to the result set.
- Several rows of the driven table y share a key3 value that satisfies the join condition, so multiple records are added to the result set.
The join is only equivalent when key3 of table y is a primary key or a unique column. Otherwise the third case can occur and the statement is no longer equivalent to the original.
This is where the concept of a semi-join comes in. In a semi-join, for the driving table x we only care whether the driven table y contains any record that satisfies the join condition, not how many records match, and the result set keeps only the records of the driving table x.
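The difference between a plain join and semi-join semantics shows up as soon as key3 is not unique. A minimal sketch, assuming Python's built-in sqlite3 as a stand-in engine; the id column and the sample rows are added purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (id INTEGER PRIMARY KEY, key1 INTEGER);
    CREATE TABLE y (key3 INTEGER);          -- key3 is NOT unique
    INSERT INTO x VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO y VALUES (10), (20), (20), (20);
""")

# Semi-join semantics: each row of x appears at most once.
in_rows = conn.execute(
    "SELECT x.* FROM x WHERE key1 IN (SELECT key3 FROM y) ORDER BY x.id"
).fetchall()

# Plain join: a row of x is repeated once per matching row of y.
join_rows = conn.execute(
    "SELECT x.* FROM x JOIN y ON key1 = key3 ORDER BY x.id"
).fetchall()

print("IN subquery:", in_rows)
print("plain join: ", join_rows)
```

Row 3 of x has no match, row 1 has one match, and row 2 has three, which covers the three cases listed above.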
Ways to implement a semi-join. PS: a semi-join is only a concept.
- Table pullout (pull the table in the subquery up into the outer query)
  - When the column selected by the subquery is a primary key or unique column, the join described above can be used directly, because no duplicates can occur.
- DuplicateWeedout execution strategy (duplicate elimination)
  - As mentioned above, rewriting the subquery as a join can produce duplicates, because duplicate matches in the driven table cause rows of the driving table to be repeated.
  - So we create a temporary table and put the id of each outer-table row in the join result into it (think of it as the id of the data row). When the same row is added again, the temporary table raises a duplicate primary key error, so the duplicate row is not added.
- LooseScan execution strategy (loose index scan)
  - When the subquery column (key3 in the example above) has an index on the subquery table, the table can be read through the index, visiting only one row per distinct value and skipping repeated values. This prevents multiple records for the same value.
- Semi-join Materialization execution strategy (semi-join through a materialized table)
  - Materialize the uncorrelated subquery into a temporary table. The materialized table has no duplicate rows, so it can be converted directly into a join.
- FirstMatch execution strategy (first match)
  - Take one row of the outer query, then compare it with the rows of the subquery one by one until a match is found. This is the most basic method.
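Two of these strategies, DuplicateWeedout and FirstMatch, can be sketched in a few lines of plain Python. The lists below are hypothetical in-memory tables and the functions only illustrate the idea; they are not how MySQL implements the strategies internally.

```python
# Hypothetical in-memory tables: x is the outer table, y the subquery table.
x = [(1, 10), (2, 20), (3, 30)]   # (id, key1)
y = [10, 20, 20, 20]              # key3 values, not unique

def duplicate_weedout(outer, inner):
    """Plain join, with a set of row ids playing the role of the temporary table."""
    seen_ids = set()
    result = []
    for row_id, key1 in outer:
        for key3 in inner:
            if key1 == key3 and row_id not in seen_ids:
                seen_ids.add(row_id)  # a second insert would be a duplicate key
                result.append((row_id, key1))
    return result

def first_match(outer, inner):
    """For each outer row, stop scanning the inner table at the first match."""
    result = []
    for row_id, key1 in outer:
        for key3 in inner:
            if key1 == key3:
                result.append((row_id, key1))
                break
    return result

weedout_rows = duplicate_weedout(x, y)
first_match_rows = first_match(x, y)
print(weedout_rows)
print(first_match_rows)
```

Both produce each matching row of x exactly once; they differ in where the duplicates are avoided, after the join or during the inner scan.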
Conditions for using a semi-join:
- The subquery must be part of an IN expression and must appear in the WHERE or ON clause of the outer query.
- The IN subquery must be connected to the other outer search conditions with AND.
- The subquery must be a single query; it cannot contain UNION.
- The subquery cannot contain GROUP BY, HAVING, or aggregate functions.
- ...
EXISTS optimization
If neither a semi-join nor a materialized table can be used, the IN statement can also be transformed into an EXISTS statement.
The statement above becomes the following.
select * from x where exists (select 1 from y where key3 = x.key1)
If key3 of the driven table has an index, that index can be used o( ̄▽ ̄)d.
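A minimal sketch of the IN to EXISTS rewrite, assuming Python's built-in sqlite3 as a stand-in engine; the index name idx_y_key3 and the sample rows (including NULLs) are made up for illustration. In a WHERE clause a NULL result filters a row just like FALSE does, so both forms select the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (key1 INTEGER);
    CREATE TABLE y (key3 INTEGER);
    CREATE INDEX idx_y_key3 ON y (key3);   -- hypothetical index on key3
    INSERT INTO x VALUES (1), (2), (3), (NULL);
    INSERT INTO y VALUES (2), (2), (3), (NULL);
""")

in_rows = conn.execute(
    "SELECT * FROM x WHERE key1 IN (SELECT key3 FROM y) ORDER BY key1"
).fetchall()

# Correlated form: for each row of x, probe y for one matching key3.
exists_rows = conn.execute("""
    SELECT * FROM x
    WHERE EXISTS (SELECT 1 FROM y WHERE key3 = x.key1)
    ORDER BY key1
""").fetchall()

print("IN:    ", in_rows)
print("EXISTS:", exists_rows)
```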