当前位置:网站首页>Summary of SQL aggregate query method for yyds dry goods inventory
Summary of SQL aggregate query method for yyds dry goods inventory
2022-07-01 19:59:00 【51CTO】
SQL Why support aggregate queries ?
This seems like a naive question , But let's think about it step by step . Data is stored at behavioral granularity , The simplest SQL The sentence is select * from test, What you get is the whole two-dimensional table details , But this alone is not enough , For two purposes , need SQL Provide aggregate function :
1. The detailed data has no statistical significance , For example, I want to know the total turnover today , And don't care much about how much a table guest spends .
2. Although you can check the data into memory first and then aggregate , But when the amount of data is very large, it is easy to burst the memory , Maybe a table has a data volume of... In one day 10TB, and 10TB Even if the data can be read into memory , Aggregation computing may also be unacceptably slow .
In addition, aggregation itself has a certain logical complexity , and SQL Provides aggregation functions and grouping aggregation capabilities , It can easily and quickly count the aggregated data with business value , This lays the foundation for SQL The analytical value of language , Therefore, most analysis software directly adopts SQL As a direct user oriented expression .
Aggregate functions
Common aggregate functions are :
- COUNT: Count .
- SUM: Sum up .
- AVG: averaging .
- MAX: For maximum .
- MIN: For the minimum .
COUNT
COUNT Used to calculate how many pieces of data there are , Let's see id How many in this column :
But we found that we actually checked any column COUNT It's all the same , The introduction id What is it ? There is no need to find a specific column to refer to , So it can also be written as :
But there are subtle differences between the two .SQL There is a very special value type NULL
, If COUNT Specific columns are specified , This column will be skipped during statistics, and the value is NULL
The line of , and COUNT(*)
Because no specific column is specified , So even if it includes NULL
, Even all columns in a row are NULL
, Will also be included . therefore COUNT(*)
The result must be greater than or equal to COUNT(c1)
.
Of course, any aggregate function can follow the query criteria WHERE, such as :
SUM
SUM Sum all terms , Therefore, it must act on the numeric field , Not for strings .
SUM encounter NULL Value when 0 Handle , Because it's equivalent to ignoring .
AVG
AVG Find the value of all terms , Therefore, it must act on the numeric field , Not for strings .
AVG encounter NULL Value is ignored in the most thorough way , namely NULL Completely not involved in the calculation of numerator and denominator , Just like this line of data does not exist .
MAX、MIN
MAX、MIN Find the maximum and minimum values respectively , When the above is different , It can also act on strings , So you can judge the size according to the letters , From big to small a-z
, But even if it can be counted , It has no practical significance and is difficult to understand , Therefore, it is not recommended to extremum the string .
Multiple aggregate fields
Although they are aggregate functions , but MAX、MIN Strictly speaking, it is not an aggregate function , Because they just look for lines that meet the conditions . You can see the comparison of the query results in the following two paragraphs :
The first query can find the row with the maximum value id, And the second query id It's meaningless , Because I don't know which line to belong to , So only the first data is returned id.
Of course , If you calculate at the same time MAX、MIN, So at this time id Only the value of the first data is returned , Because the query result corresponds to a complex number of rows :
Based on these characteristics , It's best not to mix polymerization and non polymerization , That is, once a field in a query is aggregated , Then all fields must be aggregated .
A lot now BI All custom fields of the engine have this restriction , Because there are many boundary conditions when mixing aggregation and non aggregation in user-defined memory calculation , although SQL Can support , However, business customized functions may not support .
Group aggregation
Grouping aggregation is GROUP BY, In fact, it can be regarded as an advanced conditional statement .
for instance , Query each country's GDP Total amount :
The returned results are grouped by country , At this time , The aggregation function becomes aggregation within a group .
In fact, if we just want to see 、 beauty GDP, Non grouping can also be used to check , Just divide it into two SQL:
therefore GROUP BY It can also be understood as , Find out all enumerable conditions of a field , And integrate it into a table , Each line represents an enumeration case , It doesn't need to be broken down into one by one WHERE The query .
Multi field grouping aggregation
GROUP BY You can use... For multiple dimensions , The meaning is equivalent to the row in table query / Drag columns into multiple dimensions .
It's on it BI Query tool perspective , If there is no context , You can see the following progressive description :
- Group and aggregate by multiple fields .
- Multiple fields are combined to become unique Key, namely
GROUP BY a,b
Express a,b Together, describe a group . -
GROUP BY a,b,c
You may see many duplicate results of the first column of the query a That's ok , The second column sees the repetition b That's ok , But in the same a There will be no repetition within the value ,c stay b The same is true in the line .
Here is an example :
GROUP BY + WHERE
WHERE It is filtered according to the criteria of the row . therefore GROUP BY + WHERE Not in groups , It's screening the whole .
But because of filtering by row , In fact, the results are exactly the same within or outside the group , So we can hardly perceive the difference :
However , Ignoring this difference will cause us to run into a wall in aggregation screening .
For example, we should screen out the average score greater than 60 The sum of students' grades , If you don't use subqueries , It cannot be used in ordinary query WHERE Add aggregate function to achieve , For example, the following is an example of a syntax error :
Don't fantasize about the above SQL Can be executed successfully , Not in WHERE Use aggregate functions in .
GROUP BY + HAVING
HAVING It is filtered according to the group . So you can HAVING Using aggregate functions :
In the above example, you can query normally , It means to look at the total score according to the grouping of classes , And only the average score greater than 60 Class .
So why HAVING You can use aggregation conditions ? because HAVING Filtering is a group , Therefore, you can filter out groups that do not meet the conditions after group aggregation , It makes sense . and WHERE For row granularity , After aggregation, there is only one piece of data in the whole table , It doesn't make sense whether it's filtered or not .
But here's the thing ,GROUP BY Index filtering cannot be used to generate derived tables , therefore WHERE You can optimize performance by indexing fields , and HAVING Does not work for index fields .
边栏推荐
- Set object value changes null value object
- The key to the success of digital transformation enterprises is to create value with data
- Is Dao safe? Build finance encountered a malicious governance takeover and was looted!
- Redis installation and startup in Windows environment (background startup)
- HLS4ML报错The board_part definition was not found for tul.com.tw:pynq-z2:part0:1.0.
- Image acquisition and playback of coaxpress high speed camera based on pxie interface
- Tensorflow reports an error, could not load dynamic library 'libcudnn so. eight
- 一文读懂C语言中的结构体
- DS transunet: Dual Swing transformer u-net for medical image segmentation
- 2022/6/8-2022/6/12
猜你喜欢
全国职业院校技能大赛网络安全“splunk“详细配置
振弦采集模块测量振弦传感器的流程步骤
开发那些事儿:EasyCVR平台添加播放地址鉴权功能
Example explanation: move graph explorer to jupyterlab
Solve the problem of slow or failed vscode download
漏洞复现-.Net-ueditor上传
A quietly rising domestic software, low-key and powerful!
[research materials] iResearch tide Watching: seven major trends in the clothing industry - Download attached
HLS4ML报错The board_part definition was not found for tul.com.tw:pynq-z2:part0:1.0.
uniapp使用腾讯地图选点 没有window监听回传用户的位置信息,怎么处理
随机推荐
math_利用微分算近似值
Redis installation and startup in Windows environment (background startup)
windows环境 redis安装和启动(后台启动)
Solve the problem of slow or failed vscode download
js三元表达式复杂条件判断
uniapp使用腾讯地图选点 没有window监听回传用户的位置信息,怎么处理
【let var const】
基于图的 Affinity Propagation 聚类计算公式详解和代码示例
新增订单如何防止重复提交
docker ubuntu容器中安装mysql遇到的问题
Review the collection container again
振弦采集模塊測量振弦傳感器的流程步驟
MySQL reports an error can't create table 'demo01 tb_ Student‘ (errno: 150)*
HLS4ML报错The board_part definition was not found for tul.com.tw:pynq-z2:part0:1.0.
类加载机制
Stack Overflow 2022 开发者调查:行业走向何方?
Sum the amount
[research materials] Huawei Technology ICT 2021: at the beginning of the "Yuan" year, the industry is "new" -- download attached
Leetcode 1380 lucky numbers in matrix [array] the leetcode path of heroding
New window open page -window open