Alibaba Tianchi SQL training camp task4 learning notes
2022-07-05 22:57:00 【abbert】
1. Take care to distinguish correlated subqueries, the union of two tables, and table joins, and to understand how the three relate.
2. Use the NOT predicate to subtract one set from another: find the products in the product table whose sale price is higher than 2000 but whose profit margin is lower than 30%. The result should match the table below.
This problem could be solved with a plain WHERE ... AND, but the exercise asks for the NOT predicate, so the condition after AND cannot be written as a direct comparison. Instead it is written as NOT IN plus a subquery, which here plays the same role as a comparison condition.
SELECT * FROM product WHERE sale_price > 2000 AND product_id NOT IN (SELECT product_id FROM product WHERE sale_price >= 1.3 * purchase_price);
3. NOT IN combined with UNION can be used to find the symmetric difference of two tables. The symmetric difference could also be computed as the union minus the intersection, but MySQL 8.0 (before release 8.0.31) cannot compute the intersection of two tables or query results directly, so that approach is not suitable. Fortunately the difference operation is still available. Intuitively, the symmetric difference of two sets equals (A - B) union (B - A), and in practice this is how it can be computed.
4. With the union and the symmetric difference of two tables, the intersection can be found as well: the intersection of two sets is the union of the two sets with their symmetric difference removed.
5. Writing order:
SELECT > FROM > WHERE > GROUP BY > HAVING > ORDER BY
Execution order:
FROM > WHERE > GROUP BY > HAVING > SELECT > ORDER BY
6. In an ordinary SQL statement the subquery is executed first, but in a correlated subquery the main query is executed first. This resolves a mismatch: the main query handles one row at a time, while a plain subquery would return a single aggregate result for the whole table. The correlation condition restricts what the subquery returns for each outer row, so different aggregate results can be looked up within one table, and that is why the main query has to run first. It is somewhat like nested for loops in Python.
Correlated subqueries and table joins can achieve the same result, so why do we need table joins?
If you have used Excel's VLOOKUP function, you will notice that it can do this too. In fact, a correlated subquery works much like VLOOKUP: take table A as the main table, then, for the value in the join column of each row of A, search table B row by row for rows whose join column holds an equal value.
With a small amount of data this approach has no performance problem, but with a large amount of data it incurs heavy computational overhead: for each row returned by the outer query, the value of the join column is passed to the inner subquery, which then runs a query based on that value and returns its result. If the outer main query returns 10,000 rows, for example, the subquery is executed 10,000 times, which can be extremely slow.
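The nested-loop picture can be made concrete with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL and a handful of hypothetical product rows (both are assumptions, not part of the course material). The loop version is only a conceptual model of the row-by-row evaluation described above; a real database optimizer may execute the query differently.

```python
import sqlite3

# Hypothetical sample rows: (product_name, product_type, sale_price)
rows = [
    ("T-shirt", "clothes", 1000),
    ("sports T-shirt", "clothes", 4000),
    ("kitchen knife", "kitchenware", 3000),
    ("pressure cooker", "kitchenware", 6800),
    ("fork", "kitchenware", 500),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_name TEXT, product_type TEXT, sale_price INTEGER)"
)
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", rows)

# Correlated subquery: the inner query refers to the current outer row
# through P1.product_type.
sql_result = conn.execute("""
    SELECT product_name
      FROM product AS P1
     WHERE sale_price > (SELECT AVG(sale_price)
                           FROM product AS P2
                          WHERE P1.product_type = P2.product_type)
     ORDER BY product_name
""").fetchall()
sql_names = [name for (name,) in sql_result]

# The same logic written as nested loops: one inner pass per outer row.
inner_passes = 0
loop_names = []
for name, ptype, price in rows:               # outer query, one row at a time
    same_type_prices = []
    for _, other_type, other_price in rows:   # inner query, re-run for this row
        if other_type == ptype:
            same_type_prices.append(other_price)
    inner_passes += 1
    if price > sum(same_type_prices) / len(same_type_prices):
        loop_names.append(name)
loop_names.sort()

print(sql_names)
print(loop_names, inner_passes)
```

The inner loop runs once for every outer row, which is the cost that grows with the size of the outer result.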
Why use correlated subqueries?
Sometimes you need additional information derived from the same table, and it cannot be queried directly without a subquery (view) to refer to; this is where a correlated subquery helps. Problems that can be solved with a correlated subquery can also be solved with an inner join: just apply the filter conditions inside one of the joined tables beforehand.
Regarding inner joins, the following three points need attention:
Key point: to perform a join, use multiple tables in the FROM clause.
Earlier FROM clauses contained only one table; this time we use two tables, shopproduct and product, and the keyword INNER JOIN joins them:
FROM shopproduct AS SP INNER JOIN product AS P
Using an inner join together with a WHERE clause
The first way to add a WHERE clause is to treat the join query above as a subquery, enclose it in parentheses, and add the filter conditions in the outer query.
SELECT *
FROM (-- Step 1: query results
SELECT SP.shop_id
,SP.shop_name
,SP.product_id
,P.product_name
,P.product_type
,P.sale_price
,SP.quantity
FROM shopproduct AS SP
INNER JOIN product AS P
ON SP.product_id = P.product_id) AS STEP1
WHERE shop_name = 'Tokyo'
AND product_type = 'clothes';
Remember what we learned about subqueries? The result of a subquery is really a table, but a virtual one: it does not exist in the database and is only a "view" produced by filtering, aggregating, or otherwise querying other tables in the database.
Writing the query this way clearly separates each step, which helps while we are still unfamiliar with the execution order of the clauses in a SQL query.
In fact, once we know that the WHERE clause is executed after the FROM clause, that is, WHERE runs on the new table produced by INNER JOIN ... ON, we arrive at the standard form:
SELECT SP.shop_id
,SP.shop_name
,SP.product_id
,P.product_name
,P.product_type
,P.sale_price
,SP.quantity
FROM shopproduct AS SP
INNER JOIN product AS P
ON SP.product_id = P.product_id
WHERE SP.shop_name = 'Tokyo'
AND P.product_type = 'clothes';
In short, there are two approaches: (1) join first and then filter with a WHERE clause; (2) filter inside each query first and then join the results.
Whichever approach is used, the thinking matters. When given a question, first analyze whether a join is needed, how to determine the filter conditions, and whether grouping and aggregation are needed. Think these through before choosing suitable clauses to write.
Example:
Find the name and price of the clothing products in each store. The following result is expected:
The problem can be worked through step by step in the execution order of a SQL statement:
1. FROM: where does the data come from? From the two tables product and shopproduct, so an inner join is needed:
shopproduct AS SP INNER JOIN product AS P ON SP.product_id = P.product_id
2. WHERE: what is the filter condition? According to the question, it is product_type = 'clothes'
3. SELECT: SP.shop_id, SP.shop_name, SP.product_id, P.product_name, P.product_type, P.purchase_price
(Clauses that the question does not need can simply be skipped.)
The final complete statement is:
-- Reference answer 1: without a subquery
SELECT SP.shop_id, SP.shop_name, SP.product_id, P.product_name, P.product_type, P.purchase_price
FROM shopproduct AS SP
INNER JOIN product AS P
ON SP.product_id = P.product_id
WHERE P.product_type = 'clothes';
-- Reference answer 2: with a subquery
SELECT SP.shop_id, SP.shop_name, SP.product_id, P.product_name, P.product_type, P.purchase_price
FROM shopproduct AS SP
INNER JOIN -- select only the clothing products from the product table
(SELECT product_id, product_name, product_type, purchase_price
FROM product
WHERE product_type = 'clothes') AS P
ON SP.product_id = P.product_id;
The second method applies the subquery directly to the table being joined, so the join operates on the already filtered subquery (view), and no filtering is needed after the join.
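To check that the two reference answers really return the same rows, here is a minimal sketch that runs both against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL, and the table contents are hypothetical sample rows, not the course data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id TEXT PRIMARY KEY, product_name TEXT,
                          product_type TEXT, purchase_price INTEGER);
    CREATE TABLE shopproduct (shop_id TEXT, shop_name TEXT, product_id TEXT);
    -- Hypothetical sample rows
    INSERT INTO product VALUES
        ('0001', 'T-shirt', 'clothes', 500),
        ('0002', 'hole punch', 'office supplies', 320),
        ('0003', 'sports T-shirt', 'clothes', 2800);
    INSERT INTO shopproduct VALUES
        ('000A', 'Tokyo', '0001'),
        ('000A', 'Tokyo', '0002'),
        ('000B', 'Nagoya', '0002'),
        ('000B', 'Nagoya', '0003');
""")

columns = ("SP.shop_id, SP.shop_name, SP.product_id, "
           "P.product_name, P.product_type, P.purchase_price")

# Approach 1: join first, then filter with WHERE.
join_then_filter = conn.execute(f"""
    SELECT {columns}
      FROM shopproduct AS SP
     INNER JOIN product AS P
        ON SP.product_id = P.product_id
     WHERE P.product_type = 'clothes'
     ORDER BY SP.shop_id, SP.product_id
""").fetchall()

# Approach 2: filter inside a subquery, then join the filtered result.
filter_then_join = conn.execute(f"""
    SELECT {columns}
      FROM shopproduct AS SP
     INNER JOIN (SELECT product_id, product_name, product_type, purchase_price
                   FROM product
                  WHERE product_type = 'clothes') AS P
        ON SP.product_id = P.product_id
     ORDER BY SP.shop_id, SP.product_id
""").fetchall()

for row in join_then_filter:
    print(row)
print(join_then_filter == filter_then_join)
```

The ORDER BY clauses are added only so that the two result lists can be compared position by position.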
Exercise:
In each store, what is the sale price of the most expensive product?
-- Reference answer
SELECT SP.shop_id
,SP.shop_name
,MAX(P.sale_price) AS max_price
FROM shopproduct AS SP
INNER JOIN product AS P
ON SP.product_id = P.product_id
GROUP BY SP.shop_id,SP.shop_name
1. FROM
2. GROUP BY: the grouping columns after GROUP BY depend on the actual situation; finding one maximum does not mean grouping by only one column
3. SELECT
Inner joins and correlated subqueries
Find the products in each category whose sale price is higher than the average sale price of that category.
Correlated subquery:
SELECT product_type, product_name, sale_price FROM product AS P1 WHERE sale_price > (SELECT AVG(sale_price) FROM product AS P2 WHERE P1.product_type = P2.product_type GROUP BY product_type);
Inner join:
SELECT P1.product_id ,P1.product_name ,P1.product_type ,P1.sale_price ,P2.avg_price FROM product AS P1 INNER JOIN (SELECT product_type,AVG(sale_price) AS avg_price FROM product GROUP BY product_type) AS P2 ON P1.product_type = P2.product_type WHERE P1.sale_price > P2.avg_price;
Ideas:
1. FROM: the product table; no join with another table is needed
2. WHERE: the sale price sale_price of each product must be higher than the average sale price avg_price of its category. An aggregate function is involved, so WHERE followed by GROUP BY alone cannot give the right answer. One approach is therefore to use a correlated subquery to obtain the average sale price of each category and compare each row against it.
Problems like this can also be solved with a self join, that is, joining a table with itself, much like a correlated subquery on the same table. Step 1: compute the average sale price per group. Step 2: inner join on the product type. Step 3: add a WHERE clause with the filter condition.
Summary: when individual records must be compared with aggregate results, either use a subquery to obtain the aggregate result and compare it with each record, or aggregate first and then inner join the result with the other table.
SELECT P1.product_id
,P1.product_name
,P1.product_type
,P1.sale_price
,P2.avg_price
FROM product AS P1
INNER JOIN
(SELECT product_type,AVG(sale_price) AS avg_price
FROM product
GROUP BY product_type) AS P2
ON P1.product_type = P2.product_type
WHERE P1.sale_price > P2.avg_price;
(task4-4.2.1.5) Background: find the products in each category whose sale price is higher than the average sale price of that category. When solving this with an inner join, a question arises: table P2 has been aggregated by product_type with GROUP BY, so why can it be inner joined on product_type with table P1, which has not been aggregated? Is it because of "ON P1.product_type = P2.product_type", so that the tables can be joined even though they have different numbers of rows?
There is a misconception here: a join does not require the common column to have the same number of rows in both tables, nor the same name; it only requires that the values match. So the aggregated P2 can be joined directly with the non-aggregated P1.
Also, do not try to force a connection between the "aggregate first, then inner join with another table" method and the two approaches of the previous example, "with a subquery" and "without a subquery". This question joins the records of P1 with the per-group averages computed in P2, which is fundamentally different from those two cases, so the two are unrelated.
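The point about row counts can be checked with a small sketch, again using Python's sqlite3 module in place of MySQL and hypothetical sample rows. The aggregated side has one row per product_type, the raw side has one row per product, and the join simply attaches the matching average to every product row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id TEXT PRIMARY KEY, product_name TEXT,
                          product_type TEXT, sale_price INTEGER);
    -- Hypothetical sample rows
    INSERT INTO product VALUES
        ('0001', 'T-shirt', 'clothes', 1000),
        ('0003', 'sports T-shirt', 'clothes', 4000),
        ('0004', 'kitchen knife', 'kitchenware', 3000),
        ('0005', 'pressure cooker', 'kitchenware', 6800),
        ('0006', 'fork', 'kitchenware', 500);
""")

# Number of rows on the aggregated side (one per product_type).
group_count = conn.execute("""
    SELECT COUNT(*)
      FROM (SELECT product_type FROM product GROUP BY product_type)
""").fetchone()[0]

# Join the raw rows (P1) with the per-type averages (P2), no WHERE yet.
joined = conn.execute("""
    SELECT P1.product_name, P1.sale_price, P2.avg_price
      FROM product AS P1
     INNER JOIN (SELECT product_type, AVG(sale_price) AS avg_price
                   FROM product
                  GROUP BY product_type) AS P2
        ON P1.product_type = P2.product_type
     ORDER BY P1.product_id
""").fetchall()

# Adding the WHERE clause keeps only the above-average products.
above_avg = [name for (name,) in conn.execute("""
    SELECT P1.product_name
      FROM product AS P1
     INNER JOIN (SELECT product_type, AVG(sale_price) AS avg_price
                   FROM product
                  GROUP BY product_type) AS P2
        ON P1.product_type = P2.product_type
     WHERE P1.sale_price > P2.avg_price
     ORDER BY P1.product_id
""")]

print(group_count, len(joined))
print(above_avg)
```

Each of the five product rows finds exactly one matching row among the two group rows, so the joined result has five rows before filtering.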
-------------------------------------------
This result does not match the one given in the book: the sports T-shirt row is missing. The reason is that the regist_date field of the sports T-shirt is empty. In a natural join, the sports T-shirt rows from product and product2 are compared with an equality test field by field. From the comparison of missing values covered in section 6.2 (IS NULL, IS NOT NULL), we know that comparing two missing values with an equals sign does not evaluate to true, and a join only returns the rows for which the join condition is true.
Exercise: use an inner join to find the intersection of the product and product2 tables.
SELECT P1.*
FROM product AS P1
INNER JOIN product2 AS P2
ON (P1.product_id = P2.product_id
AND P1.product_name = P2.product_name
AND P1.product_type = P2.product_type
AND P1.sale_price = P2.sale_price
AND P1.regist_date = P2.regist_date)
The results are as follows
Note that this result does not match the one on P230: the row with product_id='0001' is missing. Looking at the source data shows that regist_date is a missing value in that row. As we learned with the IS NULL predicate in Chapter 6, missing values cannot be compared with an equals sign.
If we join on product_id only:
SELECT P1.*
FROM product AS P1
INNER JOIN product2 AS P2
ON P1.product_id = P2.product_id
Query results:
This time the result matches. An inner join keeps only the records common to both tables.
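The effect of a missing regist_date on the join condition can be reproduced with a minimal sketch. It runs on SQLite through Python's sqlite3 module rather than MySQL, and the two rows per table are hypothetical, chosen so that one of them has a NULL regist_date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product  (product_id TEXT, product_name TEXT, regist_date TEXT);
    CREATE TABLE product2 (product_id TEXT, product_name TEXT, regist_date TEXT);
    -- Hypothetical rows: the second product has no registration date
    INSERT INTO product  VALUES ('0001', 'T-shirt', '2009-09-20'),
                                ('0003', 'sports T-shirt', NULL);
    INSERT INTO product2 VALUES ('0001', 'T-shirt', '2009-09-20'),
                                ('0003', 'sports T-shirt', NULL);
""")

# NULL = NULL evaluates to NULL (unknown), not true; sqlite3 returns it as None.
null_eq = conn.execute("SELECT NULL = NULL").fetchone()[0]

# Joining on every column, including regist_date, drops the NULL row.
all_columns = [pid for (pid,) in conn.execute("""
    SELECT P1.product_id
      FROM product AS P1
     INNER JOIN product2 AS P2
        ON P1.product_id = P2.product_id
       AND P1.product_name = P2.product_name
       AND P1.regist_date = P2.regist_date
     ORDER BY P1.product_id
""")]

# Joining on product_id alone keeps both rows.
id_only = [pid for (pid,) in conn.execute("""
    SELECT P1.product_id
      FROM product AS P1
     INNER JOIN product2 AS P2
        ON P1.product_id = P2.product_id
     ORDER BY P1.product_id
""")]

print(null_eq)
print(all_columns)
print(id_only)
```

The row with the missing date is identical in both tables, yet it only survives the join when regist_date is left out of the ON condition.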
An inner join discards the rows of both tables that do not satisfy the ON condition. Its counterpart is the outer join, which selectively keeps unmatched rows depending on the type of outer join.
Depending on which table the kept rows come from, there are three forms of outer join: the left join, the right join, and the full outer join.
A left join keeps the rows of the left table that cannot be matched by the ON clause, and the corresponding columns from the right table are all missing values. A right join keeps the rows of the right table that cannot be matched by the ON clause, and the corresponding columns from the left table are all missing values. A full outer join keeps the unmatched rows of both tables, filling the columns of the other table with missing values.
The syntax of the three outer joins is:
-- Left join
FROM <tb_1> LEFT OUTER JOIN <tb_2> ON <condition(s)>
-- Right join
FROM <tb_1> RIGHT OUTER JOIN <tb_2> ON <condition(s)>
-- Full outer join
FROM <tb_1> FULL OUTER JOIN <tb_2> ON <condition(s)>
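As a rough illustration of how an outer join keeps unmatched rows, the sketch below compares INNER JOIN with LEFT OUTER JOIN on two tiny hypothetical tables, using Python's sqlite3 module as a stand-in for MySQL. Only the left join is shown, since MySQL itself does not support FULL OUTER JOIN and older SQLite builds lack RIGHT and FULL joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id TEXT PRIMARY KEY, product_name TEXT);
    CREATE TABLE shopproduct (shop_name TEXT, product_id TEXT);
    -- Hypothetical rows: product '0008' is not sold in any shop
    INSERT INTO product VALUES ('0001', 'T-shirt'),
                               ('0002', 'hole punch'),
                               ('0008', 'ballpoint pen');
    INSERT INTO shopproduct VALUES ('Tokyo', '0001'),
                                   ('Nagoya', '0002');
""")

inner_rows = conn.execute("""
    SELECT P.product_id, SP.shop_name
      FROM product AS P
     INNER JOIN shopproduct AS SP
        ON P.product_id = SP.product_id
     ORDER BY P.product_id
""").fetchall()

left_rows = conn.execute("""
    SELECT P.product_id, SP.shop_name
      FROM product AS P
      LEFT OUTER JOIN shopproduct AS SP
        ON P.product_id = SP.product_id
     ORDER BY P.product_id
""").fetchall()

print(inner_rows)  # unmatched product is dropped
print(left_rows)   # unmatched product is kept, shop_name is NULL (None)
```

The extra row in the left join carries None for shop_name, which is how sqlite3 reports the missing values filled in for the right table.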
Whether outer or inner, every join so far has had one prerequisite in common: the join condition, the ON clause, which specifies how rows are matched. If you have tried a join query without this condition, you may have noticed that the result contains a great many rows. A join without the ON clause is called a cross join (CROSS JOIN), also known as the Cartesian product, a term from mathematics. The Cartesian product of two sets pairs every element of set A with every element of set B to form ordered pairs. The union, intersection, and difference of tables (or subqueries) expand or filter a table vertically, which requires the tables to have the same number of columns and compatible data types in corresponding positions, so these operations never add columns. A cross join (Cartesian product) expands the table horizontally, that is, it adds columns, just as other joins do. But with no ON clause to restrict it, every row of the left table is combined with every row of the right table, which often yields many meaningless rows in the result. Still, cross joins are useful for some queries.
Likewise, in a correlated subquery, each row of the main query is combined with the subquery.
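The row explosion of a cross join is easy to see on a toy example. The sketch below, which assumes two hypothetical single-column tables and uses Python's sqlite3 module instead of MySQL, compares CROSS JOIN with the Cartesian product computed by itertools.product.

```python
import itertools
import sqlite3

# Hypothetical values for the two tables
shops = ["Tokyo", "Nagoya"]
products = ["T-shirt", "fork", "ballpoint pen"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shop (shop_name TEXT)")
conn.execute("CREATE TABLE product (product_name TEXT)")
conn.executemany("INSERT INTO shop VALUES (?)", [(s,) for s in shops])
conn.executemany("INSERT INTO product VALUES (?)", [(p,) for p in products])

# No ON clause: every shop row is paired with every product row.
pairs = conn.execute("""
    SELECT S.shop_name, P.product_name
      FROM shop AS S
     CROSS JOIN product AS P
""").fetchall()

# The same pairs built in Python as a Cartesian product.
expected = list(itertools.product(shops, products))

print(len(pairs))                          # rows in shop * rows in product
print(sorted(pairs) == sorted(expected))
```

With m rows on one side and n on the other, the result always has m * n rows, which is why an accidental cross join on large tables becomes expensive quickly.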
Exercises:
4.1
select * from product where sale_price > 5000
union
select * from product2 where sale_price > 5000;
Intersection: the union of two sets with their symmetric difference removed
Symmetric difference: (A - B) union (B - A), the union of the two difference sets
Difference:
A-B: select * from product where product_id not in (select product_id from product2);
B-A: select * from product2 where product_id not in (select product_id from product);
Symmetric difference:
select * from product where product_id not in (select product_id from product2)
union
select * from product2 where product_id not in (select product_id from product);
4.2
select * from
(select * from product
union
select * from product2) as p1
where product_id not in
(select product_id from product where product_id not in (select product_id from product2)
union
select product_id from product2 where product_id not in (select product_id from product));
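The set identities used above (symmetric difference as the union of the two differences, intersection as the union minus the symmetric difference) can be sanity-checked against Python's own set operators. This is a sketch under assumptions: it uses SQLite via the sqlite3 module instead of MySQL, hypothetical product_id values, and compares only the product_id column rather than whole rows.

```python
import sqlite3

# Hypothetical key sets for the two tables
ids_a = {"0001", "0002", "0003", "0004"}
ids_b = {"0001", "0002", "0003", "0009", "0010"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE product2 (product_id TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO product VALUES (?)", [(i,) for i in ids_a])
conn.executemany("INSERT INTO product2 VALUES (?)", [(i,) for i in ids_b])

# Symmetric difference: (A - B) UNION (B - A), each difference via NOT IN.
sym_diff_sql = """
    SELECT product_id FROM product
     WHERE product_id NOT IN (SELECT product_id FROM product2)
    UNION
    SELECT product_id FROM product2
     WHERE product_id NOT IN (SELECT product_id FROM product)
"""
sym_diff = {pid for (pid,) in conn.execute(sym_diff_sql)}

# Intersection: the union of both tables minus the symmetric difference.
intersection_sql = f"""
    SELECT product_id
      FROM (SELECT product_id FROM product
            UNION
            SELECT product_id FROM product2) AS U
     WHERE product_id NOT IN ({sym_diff_sql})
"""
intersection = {pid for (pid,) in conn.execute(intersection_sql)}

print(sorted(sym_diff))
print(sorted(intersection))
```

Python's `^` and `&` operators on the two key sets give the reference answers for the symmetric difference and the intersection respectively.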
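4.3
select p2.shop_name, p1.product_id, p1.product_type, p1.sale_price
from product as p1 inner join shopproduct as p2 on p1.product_id = p2.product_id
inner join (select product_type, max(sale_price) as max_price from product group by product_type) as p3
on p1.product_type = p3.product_type and p1.sale_price = p3.max_price;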
4.4
select product_type,product_id,sale_price
from product as p1
where sale_price = (select max(sale_price)
from product as p2
where p1.product_type = p2.product_type
group by p2.product_type);
select p1.product_type,p1.product_id,p1.sale_price
from product as p1 inner join (select product_type,max(sale_price) as max_price
from product
group by product_type) as p2
on p1.product_type = p2.product_type
and p1.sale_price = p2.max_price;
4.5
select product_id, product_name, sale_price
from product
order by sale_price;