Why does select * lead to low query efficiency?
2022-06-28 08:58:00 【000X000】
Whether at work or in an interview, "don't use SELECT * in SQL" is advice we have all heard many times. Most of us accept it, but our understanding often stays shallow, and few people dig into the underlying reasons.
Without further ado, this article takes a closer look at why SELECT * is inefficient and in which scenarios it hurts.
I. Reasons for low efficiency
First, here is what the MySQL section of the latest Alibaba Java Development Manual (Taishan Edition) says:
4 - 1. [Mandatory] In table queries, never use * as the field list; the required fields must be stated explicitly.
Explanation:
It increases the parsing cost for the query analyzer.
Adding or removing fields can easily leave the query inconsistent with the resultMap configuration.
Useless fields add network consumption, especially fields of type text.
The manual lists several reasons; let's look at them more closely:
1. Unnecessary columns increase data transfer time and network overhead
With SELECT *, the database has to resolve more objects, fields, permissions, attributes, and related metadata. When SQL statements are complex and hard parses are frequent, this puts a heavy burden on the database.
It also increases network overhead: * can inadvertently pull in useless, large text fields such as log or IconMD5, and the amount of data transferred grows many times over. If the database and the application are not on the same machine, this cost is very noticeable.
Even if the MySQL server and the client are on the same machine, the data still has to pass through the client/server protocol (often over TCP), so communication takes extra time.
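As a minimal sketch of the point above, compare the two queries below. The table `app_info` and all of its columns are hypothetical and exist only for illustration; they are not taken from the development manual.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE app_info (
    id       BIGINT PRIMARY KEY,
    name     VARCHAR(64) NOT NULL,
    version  VARCHAR(16) NOT NULL,
    icon_md5 CHAR(32),
    log_text TEXT            -- large column that most callers never need
) ENGINE = InnoDB;

-- Sends every column, including the large log_text value, to the client.
SELECT * FROM app_info WHERE id = 42;

-- Sends only the columns the caller actually uses. The result shape also
-- stays the same if columns are later added to or removed from the table,
-- which keeps a mapping such as a resultMap consistent.
SELECT id, name, version FROM app_info WHERE id = 42;
```

The explicit column list costs a little more typing, but the amount of data returned no longer depends on columns the caller never reads.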
2. Useless large fields such as varchar, blob, and text add I/O operations
More precisely, when a value is longer than 768 bytes, the excess is stored elsewhere (on separate overflow pages), so reading such a record costs at least one additional I/O operation. (MySQL InnoDB; the 768-byte prefix applies to the COMPACT and REDUNDANT row formats.)
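How InnoDB stores long values off-page depends on the table's row format, so it can help to check which format a table uses. The query below is a small sketch that assumes you are connected to the schema you want to inspect; it reads only the standard `information_schema.TABLES` view.

```sql
-- List the row format of every InnoDB table in the current schema.
SELECT TABLE_NAME, ROW_FORMAT
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND ENGINE = 'InnoDB';
```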
3. It rules out the MySQL optimizer's "covering index" strategy
SELECT * eliminates the possibility of using a covering index, yet the covering-index strategy of the MySQL optimizer is very fast and efficient, and is a query optimization widely recommended in the industry.
For example, take a table t(a,b,c,d,e,f) where a is the primary key and b has an index.
Then there are two B+ trees on disk: the clustered index and the secondary index (secondary indexes include single-column and composite indexes). They store (a,b,c,d,e,f) and (a,b) respectively. If the WHERE clause can filter out some records through the index on column b, the query goes to the secondary index first; if the user only needs columns a and b, the result can be read directly from the secondary index.
If the user writes select * and fetches data that is not needed, the rows are first filtered through the secondary index and then all columns are fetched through the clustered index. That is one more B+ tree lookup, so it is bound to be slower.

Picture (not reproduced here) taken from the blog post "I went to, why the leftmost prefix principle fails?"
A secondary index holds far less data than the clustered index. In many cases, when the secondary index covers the query (every column the user needs can be obtained from the index), there is no need to read the disk at all because the index is served from memory, whereas the clustered index data is likely to be on disk (external storage), depending on the size and hit rate of the buffer pool. In that case one path is a memory read and the other is a disk read, and the speed difference is significant, close to an order of magnitude.
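The table t(a,b,c,d,e,f) from the example can be sketched as follows. The INT column types and the index name `idx_b` are assumptions made for illustration; the article only says that a is the primary key and b is indexed.

```sql
-- Sketch of the example table: a is the primary key, b has a secondary index.
CREATE TABLE t (
    a INT PRIMARY KEY,
    b INT,
    c INT,
    d INT,
    e INT,
    f INT,
    KEY idx_b (b)
) ENGINE = InnoDB;

-- Only a and b are requested. Both live in the secondary index, so the
-- Extra column of EXPLAIN typically reports "Using index" (covering index).
EXPLAIN SELECT a, b FROM t WHERE b = 10;

-- All columns are requested. Rows found through idx_b must then be looked
-- up again in the clustered index, so "Using index" is not reported.
EXPLAIN SELECT * FROM t WHERE b = 10;
```

The exact plan depends on the MySQL version and the data in the table, so treat the comments as what to look for rather than guaranteed output.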
II. Further background: indexes
Secondary indexes were mentioned above. In MySQL they include single-column indexes and composite indexes (over multiple columns). Single-column indexes need no further explanation; here is what composite indexes do.
Composite index (a,b,c)
A composite index (a,b,c) in effect provides three indexes: (a), (a,b), and (a,b,c).
We can think of a composite index as the first-, second-, and third-level table of contents of a book. With index(a,b,c), a is the first-level entry, b is a second-level entry under the first level, and c is a third-level entry under the second level. To use an entry you must first go through its parent entry; only the first level can be used directly.
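The table-of-contents analogy can be tried out with a small sketch. The table `t_abc`, its INT columns, and the index name `idx_abc` are hypothetical; column d is deliberately left out of the index so that none of the queries is answered by the index alone.

```sql
-- Hypothetical table with a composite index on (a, b, c).
CREATE TABLE t_abc (
    id INT PRIMARY KEY,
    a  INT,
    b  INT,
    c  INT,
    d  INT,
    KEY idx_abc (a, b, c)
) ENGINE = InnoDB;

-- These conditions start at the first level (a), so idx_abc can be used
-- with one, two, or three key columns respectively.
EXPLAIN SELECT d FROM t_abc WHERE a = 1;
EXPLAIN SELECT d FROM t_abc WHERE a = 1 AND b = 2;
EXPLAIN SELECT d FROM t_abc WHERE a = 1 AND b = 2 AND c = 3;

-- This condition skips the first level, so MySQL cannot seek into idx_abc
-- by b and c alone; expect a full scan (the key column is typically NULL).
EXPLAIN SELECT d FROM t_abc WHERE b = 2 AND c = 3;
```

Comparing the `key` and `key_len` columns of the four plans shows how much of the composite index each query is able to use.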
Advantages of a composite index
1) Lower overhead
Building one composite index (a,b,c) is effectively equivalent to building three indexes: (a), (a,b), and (a,b,c). Every additional index increases the cost of writes and the disk space used, so for tables with large amounts of data, a composite index can greatly reduce that cost.
2) Covering index
Given the composite index (a,b,c), consider the following SQL:
SELECT a,b,c from table where a='xx' and b = 'xx';
MySQL can then obtain the data directly by traversing the index, without going back to the table, which avoids many random I/O operations. Reducing I/O, especially random I/O, is in fact a DBA's main optimization strategy. In real applications, therefore, covering indexes are one of the main techniques for improving performance.
3) Higher efficiency
The more columns in the index, the fewer rows remain after filtering through the composite index. For a table with 10 million rows and the following SQL:
select col1,col2,col3 from table where col1=1 and col2=2 and col3=3;
Assume that each condition matches 10% of the rows.
A. With only a single-column index, the index narrows the data down to 10,000,000 × 10% = 1,000,000 rows; MySQL then goes back to the table to find, among those 1,000,000 rows, the ones matching col2=2 and col3=3, and only then sorts, paginates, and so on.
B. With a composite index on (col1,col2,col3), the three index columns narrow the data down to 10,000,000 × 10% × 10% × 10% = 10,000 rows. The gain in efficiency is easy to imagine.
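The two estimates follow one simple formula. As a sketch, assume (this is not stated in the original) that the conditions are independent of each other, that N is the number of rows in the table, that s is the fraction of rows each condition matches, and that k is the number of leading index columns used for filtering. The number of rows left after the index lookup, R_k, is then:

```latex
R_k = N \cdot s^{k}, \qquad
R_1 = 10^{7} \times 0.1 = 10^{6}, \qquad
R_3 = 10^{7} \times 0.1^{3} = 10^{4}
```

Real data is rarely this uniform, so the figures are best read as an order-of-magnitude illustration rather than a prediction.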
Are more indexes always better?
The answer, of course, is no.
Tables with a small amount of data do not need indexes; building them only adds indexing cost.
Do not index columns that are rarely referenced; since they are seldom used, an index on them does little good.
Do not index columns that are updated frequently, because that will certainly reduce the efficiency of inserts and updates.
Columns whose values are heavily duplicated and evenly distributed gain little from an index (for example, a gender field with only two values is not suitable for indexing).
Every data change must also maintain the indexes, so the more indexes there are, the higher the maintenance cost.
More indexes also require more storage space.