Does a SQL query statement execute SELECT first? (1), a ByteDance algorithm engineer interview topic

2022-06-21 13:53:00 Programmer hyperspace

SELECT <list of returned columns>  # each returned column must appear in the GROUP BY clause, unless it is inside an aggregate function

DISTINCT  # remove duplicate rows

ORDER BY <sort conditions>  # sort

LIMIT <row limit>

  • In fact, each time the engine performs one of these steps it builds a virtual table in memory, carries out the next operation on that virtual table, frees the memory of virtual tables that are no longer needed, and so on.

Detailed explanation (note: "VT" below stands for virtual table):

  1. from: **select * from table_1, table_2; and select * from table_1 join table_2; return the same result (in MySQL, where JOIN may be written without ON): both produce the Cartesian product.** FROM directly computes the Cartesian product of the two tables to obtain the virtual table VT1. This is the first operation performed in any SELECT statement, and every later operation works on this table. That is what the FROM step accomplishes;

  2. on: filters the rows of VT1 that satisfy the join condition, forming VT2;

  3. join: adds the rows required by the join type to VT2. For example, a left join adds the left-table rows that had no match back into VT2, forming VT3. If there are more than two tables, steps 1-3 are repeated;

  4. where: filters the rows (aggregate functions cannot be used here), producing VT4;

  5. group by: groups VT4, producing VT5. In the clauses processed afterwards, such as select and having, every column used must appear in the group by condition; columns that do not appear there must be wrapped in an aggregate function;

  6. having: filters the grouped data, producing VT6;

  7. select: returns the requested columns, producing VT7;

  8. distinct: removes duplicate rows, producing VT8;

  9. order by: sorts the rows, producing VT9;

  10. limit: returns the required number of rows, producing VT10;
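
Steps 1 to 3 above can be checked with a small experiment. The following is a minimal sketch using Python's built-in sqlite3 module rather than MySQL; SQLite also accepts JOIN without ON. The table names come from the text, while the columns and rows are made up for illustration.

```python
import sqlite3

# In-memory database; the columns and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (id INTEGER, name TEXT);
    CREATE TABLE table_2 (t1_id INTEGER, score INTEGER);
    INSERT INTO table_1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO table_2 VALUES (1, 90), (1, 80);
""")


def rows(sql):
    """Run a query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Step 1 (FROM): both spellings yield the Cartesian product, 3 * 2 rows.
comma_join = rows("SELECT * FROM table_1, table_2")
plain_join = rows("SELECT * FROM table_1 JOIN table_2")

# Step 2 (ON): keep only the pairs that satisfy the join condition.
inner = rows(
    "SELECT * FROM table_1 JOIN table_2 ON table_1.id = table_2.t1_id"
)

# Step 3 (LEFT JOIN): left-table rows without a match are added back,
# with NULL (None in Python) in the right-table columns.
left = rows(
    "SELECT * FROM table_1 LEFT JOIN table_2 ON table_1.id = table_2.t1_id"
)
unmatched = [r for r in left if r[2] is None]

print(len(comma_join), len(plain_join), len(inner), len(left), len(unmatched))
```

The row counts shrink from the Cartesian product to the matching pairs, then grow again when the left join restores the unmatched left-table rows.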

Note that:

  • Every item in the group by condition must be a valid column; it cannot be an aggregate function;
  • null values are also returned as a group of their own;
  • Apart from aggregate functions, the columns in the select clause must appear in the group by condition;
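
The first two notes can be illustrated with a short sketch, again using Python's sqlite3 module with a hypothetical employee table. The third note is not shown, because SQLite is lenient about bare columns in the select clause and would not reject such a query.

```python
import sqlite3

# Hypothetical table; two rows have no department (NULL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employee VALUES
        ('ann', 'dev', 100), ('bob', 'dev', 200),
        ('cat', NULL, 50),   ('dan', NULL, 70);
""")

# Rows whose dept is NULL are collected into one group of their own.
groups = conn.execute(
    "SELECT dept, COUNT(*), SUM(salary) FROM employee GROUP BY dept"
).fetchall()
print(groups)

# An aggregate function is not a valid GROUP BY item: SQLite rejects it.
try:
    conn.execute("SELECT COUNT(*) FROM employee GROUP BY COUNT(*)")
    aggregate_in_group_by_rejected = False
except sqlite3.OperationalError as exc:
    aggregate_in_group_by_rejected = True
    print("rejected:", exc)
```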

The above shows what a query will return, and it also answers the following questions:

  • Can WHERE be used after GROUP BY? (No. GROUP BY comes after WHERE!)

  • Can I filter on the results returned by a window function? (No. Window functions are evaluated in the SELECT clause, and SELECT comes after WHERE and GROUP BY.)

  • Can ORDER BY be based on what was computed in GROUP BY? (Yes. ORDER BY runs almost at the very end, so it can sort on anything.)

  • When is LIMIT executed? (At the very end!)
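
Several of these answers can be observed directly. The sketch below uses Python's sqlite3 module and a hypothetical orders table; it is meant as an illustration of the logical order, not as a statement about how any particular engine executes the query internally.

```python
import sqlite3

# Hypothetical data: bob has one small order (5) and one large order (200).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES
        ('ann', 10), ('ann', 30), ('bob', 5), ('bob', 200), ('cat', 50);
""")

# WHERE is evaluated before GROUP BY, so it cannot refer to an aggregate.
try:
    conn.execute(
        "SELECT customer FROM orders WHERE SUM(amount) > 40 GROUP BY customer"
    )
    where_rejected = False
except sqlite3.OperationalError:
    where_rejected = True

# WHERE filters rows, GROUP BY groups what is left, HAVING filters groups,
# then ORDER BY sorts on the aggregate and LIMIT trims the result last.
top = conn.execute("""
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE amount >= 10
    GROUP BY customer
    HAVING SUM(amount) > 40
    ORDER BY total DESC
    LIMIT 1
""").fetchall()
print(where_rejected, top)
```

Bob's total is 200 rather than 205 because the order of 5 was removed by WHERE before grouping, and ann's group (total 40) was removed by HAVING after grouping.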

However, a database engine does not have to execute a SQL query strictly in this order: to run queries faster it applies some optimizations, which are explained below.

Do aliases in SQL affect the execution order?

======================

Consider the following SQL:


SELECT CONCAT(first_name, ' ', last_name) AS full_name, count(*)

FROM table

GROUP BY full_name

At first glance this looks as if GROUP BY runs after SELECT, because it references an alias defined in SELECT. But that need not be the case: the database engine can rewrite the query like this:


SELECT CONCAT(first_name, ' ', last_name) AS full_name, count(*)

FROM table

GROUP BY CONCAT(first_name, ' ', last_name)

So GROUP BY is still executed first.

In addition, the database engine performs a series of checks to make sure that what appears in SELECT and GROUP BY is valid, so the whole query is checked before the execution plan is generated.
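
A quick way to convince yourself that the two forms are equivalent is to run both. The sketch below uses Python's sqlite3 module with a hypothetical people table. Two substitutions are assumptions of this example: the table is named people because table is a reserved word, and SQLite's || operator stands in for CONCAT, which older SQLite versions lack.

```python
import sqlite3

# Hypothetical table with one duplicated full name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (first_name TEXT, last_name TEXT);
    INSERT INTO people VALUES
        ('Ada', 'Lovelace'), ('Ada', 'Lovelace'), ('Alan', 'Turing');
""")

# GROUP BY refers to the alias defined in SELECT.
by_alias = conn.execute("""
    SELECT first_name || ' ' || last_name AS full_name, COUNT(*)
    FROM people
    GROUP BY full_name
    ORDER BY full_name
""").fetchall()

# GROUP BY repeats the expression, as in the rewritten query.
by_expression = conn.execute("""
    SELECT first_name || ' ' || last_name AS full_name, COUNT(*)
    FROM people
    GROUP BY first_name || ' ' || last_name
    ORDER BY full_name
""").fetchall()

print(by_alias == by_expression, by_alias)
```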

The database may well execute queries out of the nominal order (optimization)

====================

In practice, the database does not necessarily execute a query in the order JOIN, WHERE, GROUP BY. It applies a series of optimizations that rearrange the execution order so that the query runs faster, as long as the query result does not change.

The following query illustrates why a query may need to be executed in a different order:
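
One common reordering of this kind is applying a filter before the join instead of after it. The sketch below, using Python's sqlite3 module and hypothetical owners and pets tables, writes that reordering out by hand and checks that the result is unchanged. It does not show what a real optimizer actually does; it only shows that the two orders are interchangeable for this inner join.

```python
import sqlite3

# Hypothetical tables for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owners (id INTEGER, name TEXT);
    CREATE TABLE pets (owner_id INTEGER, pet_name TEXT);
    INSERT INTO owners VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
    INSERT INTO pets VALUES (1, 'rex'), (2, 'tom'), (2, 'rex'), (3, 'kit');
""")

# Nominal order: join every owner with every matching pet, then filter.
join_then_filter = conn.execute("""
    SELECT o.name, p.pet_name
    FROM owners o JOIN pets p ON o.id = p.owner_id
    WHERE p.pet_name = 'rex'
    ORDER BY o.name
""").fetchall()

# Reordered by hand: filter pets first, then join the smaller set.
filter_then_join = conn.execute("""
    SELECT o.name, p.pet_name
    FROM owners o
    JOIN (SELECT * FROM pets WHERE pet_name = 'rex') p ON o.id = p.owner_id
    ORDER BY o.name
""").fetchall()

print(join_then_filter == filter_then_join, join_then_filter)
```

Filtering first means the join has fewer rows to process, which is why an engine may prefer that order when it can prove the result is the same.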


SELECT * FROM

dept d LEFT JOIN student s ON d.student_id = s.id



Original site

Copyright notice
This article was written by [Programmer hyperspace]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/02/202202221431134212.html