Alibaba Tianchi SQL training camp task4 learning notes
2022-07-05 22:57:00 【abbert】
1. Take care to distinguish three things and how they relate: correlated subqueries, the union of two tables, and table joins.
2. Use the NOT predicate to perform set subtraction: find the products in the product table whose sale price is higher than 2000 but whose profit is lower than 30%. The results should be shown in the table below.
A plain WHERE ... AND would solve this, but the exercise asks for the NOT predicate. So the second condition is not written directly after AND. Instead it is expressed with NOT IN plus a subquery, which removes the products whose profit is 30% or more from the result. Written this way, NOT IN is equivalent to a comparison condition.
SELECT *
  FROM product
 WHERE sale_price > 2000
   AND product_id NOT IN (SELECT product_id
                            FROM product
                           WHERE sale_price >= 1.3 * purchase_price);
3. NOT IN combined with UNION can be used to find the symmetric difference of two tables. The usual idea of taking the union and removing the intersection is not convenient in MySQL 8.0, because the intersection of two tables or query results cannot be obtained directly there (INTERSECT is only available from MySQL 8.0.31). Fortunately, the difference operation can be used. Intuitively, the symmetric difference of two sets equals (A - B) united with (B - A), so in practice this idea can be used to compute it.
4. With the union and the symmetric difference of two tables, the intersection can be obtained as well: the intersection of two sets is their union with the symmetric difference removed.
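Points 3 and 4 can be tried out on toy data. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the two miniature tables and the helper name `ids` are made up for illustration.

```python
import sqlite3

# Hypothetical miniature tables; only product_id matters for the set logic.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product  (product_id TEXT PRIMARY KEY);
    CREATE TABLE product2 (product_id TEXT PRIMARY KEY);
    INSERT INTO product  VALUES ('0001'), ('0002'), ('0003');
    INSERT INTO product2 VALUES ('0002'), ('0003'), ('0009');
""")

def ids(sql):
    """Run a query and return the sorted values of its first column."""
    return sorted(row[0] for row in conn.execute(sql))

# Symmetric difference = (A - B) UNION (B - A)
SYM_DIFF = """
    SELECT product_id FROM product
     WHERE product_id NOT IN (SELECT product_id FROM product2)
    UNION
    SELECT product_id FROM product2
     WHERE product_id NOT IN (SELECT product_id FROM product)
"""
sym_diff = ids(SYM_DIFF)

# Intersection = (A UNION B) with the symmetric difference removed
intersection = ids(f"""
    SELECT product_id
      FROM (SELECT product_id FROM product
            UNION
            SELECT product_id FROM product2) AS u
     WHERE product_id NOT IN ({SYM_DIFF})
""")

print(sym_diff)      # ['0001', '0009']
print(intersection)  # ['0002', '0003']
```

Note that NOT IN behaves unexpectedly if the subquery can return NULL, so this pattern assumes a non-null key column.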
5. Writing order:
SELECT > FROM > WHERE > GROUP BY > HAVING > ORDER BY
Execution order:
FROM > WHERE > GROUP BY > HAVING > SELECT > ORDER BY
6. Normally a subquery is executed first and the outer query then uses its result. A correlated subquery works the other way round: the main query goes first. This resolves a mismatch: the main query handles one row at a time, while an ordinary aggregate subquery outputs a single result for the whole table. The correlation condition restricts what the subquery sees for each outer row, so a different aggregate result (for example, one per category) can be obtained from the same table. That is why the main query has to run first. It is somewhat like nested for loops in Python.
Correlated subqueries and table joins can achieve the same effect, so why have table joins at all?
If you have used Excel's VLOOKUP function, you will find that it can do this too. In terms of the way of thinking, a correlated subquery is much like VLOOKUP: take table A as the main table, then, for the value in the join column of each row of A, look through table B one by one for rows with an equal value in its join column.
With a small amount of data this causes no performance problem, but with a large amount it can lead to heavy computational overhead: for each row returned by the outer query, the value of the correlated column is passed to the inner subquery, which then runs a query based on that value and returns its result. If the outer query returns 10,000 rows, the subquery is executed 10,000 times, which can take a very long time.
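The nested-loop picture can be made concrete. This is a conceptual sketch only, with hypothetical rows and sqlite3 standing in for MySQL; a real optimizer may execute a correlated subquery differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_name TEXT, product_type TEXT, sale_price INTEGER);
    INSERT INTO product VALUES
        ('shirt', 'clothes', 1000), ('coat', 'clothes', 4000),
        ('knife', 'kitchen', 3000), ('fork',  'kitchen', 500);
""")

# Correlated subquery: the inner query depends on the current outer row P1.
sql_rows = conn.execute("""
    SELECT product_name
      FROM product AS P1
     WHERE sale_price > (SELECT AVG(sale_price)
                           FROM product AS P2
                          WHERE P2.product_type = P1.product_type)
     ORDER BY product_name
""").fetchall()

# The same logic written as nested loops: one inner pass per outer row.
rows = conn.execute(
    "SELECT product_name, product_type, sale_price FROM product"
).fetchall()
loop_rows = []
for name, ptype, price in rows:                        # outer (main) query
    same_type = [p for _, t, p in rows if t == ptype]  # inner (sub) query
    if price > sum(same_type) / len(same_type):
        loop_rows.append((name,))
loop_rows.sort()

print(sql_rows)   # [('coat',), ('knife',)]
print(loop_rows)  # [('coat',), ('knife',)]
```

With n outer rows the inner pass runs n times, which is the cost described above.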
Why use correlated subqueries?
Sometimes you need derived information from the same table (such as a per-category aggregate), and it cannot be queried directly without a subquery (view) as a reference. A correlated subquery fits this case. Any problem that can be solved with a correlated subquery can also be solved with an inner join: just apply the filter conditions to one of the joined tables in advance.
About inner joins, the following points need attention:
Key point: to join tables, use multiple tables in the FROM clause.
Previously the FROM clause contained only one table. This time two tables are used, shopproduct and product, and the keyword INNER JOIN links them together:
FROM shopproduct AS SP INNER JOIN product AS P
Using an inner join together with a WHERE clause
The first way to add a WHERE clause: treat the join query above as a subquery, enclose it in parentheses, and add the filter conditions in the outer query.
SELECT *
  FROM (-- Step 1: query results
        SELECT SP.shop_id
              ,SP.shop_name
              ,SP.product_id
              ,P.product_name
              ,P.product_type
              ,P.sale_price
              ,SP.quantity
          FROM shopproduct AS SP
         INNER JOIN product AS P
            ON SP.product_id = P.product_id) AS STEP1
 WHERE shop_name = 'Tokyo'
   AND product_type = 'clothes';
Remember what we learned about subqueries? The result of a subquery is itself a table, just a virtual one. It does not actually exist in the database; it is a "view" produced by filtering, aggregating, and otherwise querying the other tables in the database.
Writing the query this way clearly separates each step, which helps while we are not yet familiar with the execution order of SQL clauses.
In fact, once we know that the WHERE clause is executed after the FROM clause, that is, WHERE runs on the new table produced by INNER JOIN ... ON, we arrive at the standard form:
SELECT SP.shop_id
      ,SP.shop_name
      ,SP.product_id
      ,P.product_name
      ,P.product_type
      ,P.sale_price
      ,SP.quantity
  FROM shopproduct AS SP
 INNER JOIN product AS P
    ON SP.product_id = P.product_id
 WHERE SP.shop_name = 'Tokyo'
   AND P.product_type = 'clothes';
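A quick way to convince yourself that the subquery form and the standard form are equivalent is to run both on the same data. The sketch below assumes a few made-up rows (including a hypothetical 'Nagoya' shop) and uses sqlite3 in place of MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shopproduct (shop_id TEXT, shop_name TEXT, product_id TEXT, quantity INTEGER);
    CREATE TABLE product (product_id TEXT, product_name TEXT, product_type TEXT, sale_price INTEGER);
    INSERT INTO shopproduct VALUES
        ('000A', 'Tokyo',  '0001', 30), ('000A', 'Tokyo', '0002', 50),
        ('000B', 'Nagoya', '0001', 10);
    INSERT INTO product VALUES
        ('0001', 'T-shirt', 'clothes', 1000), ('0002', 'punch', 'office', 500);
""")

# The join shared by both forms.
JOINED = """
    SELECT SP.shop_id, SP.shop_name, SP.product_id,
           P.product_name, P.product_type, P.sale_price, SP.quantity
      FROM shopproduct AS SP
     INNER JOIN product AS P
        ON SP.product_id = P.product_id
"""

# Form 1: wrap the join in a subquery, then filter in the outer query.
nested = conn.execute(f"""
    SELECT * FROM ({JOINED}) AS STEP1
     WHERE shop_name = 'Tokyo' AND product_type = 'clothes'
""").fetchall()

# Form 2: put the WHERE clause directly after the join.
flat = conn.execute(f"""
    {JOINED}
     WHERE SP.shop_name = 'Tokyo' AND P.product_type = 'clothes'
""").fetchall()

print(nested == flat)  # True
print(flat)            # [('000A', 'Tokyo', '0001', 'T-shirt', 'clothes', 1000, 30)]
```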
The filter can be placed in two ways: (1) write the WHERE filter after the join; (2) filter inside each query first, then join the filtered results.
Whichever way you choose, the thinking matters most. When you get a question, first work out whether a join is needed, what the filter conditions are, and whether grouping and aggregation are needed. Only after these points are clear should you choose suitable clauses and write the statement.
Example:
Find the name and price of the clothing products in each store. The following results are expected:
The problem can be analyzed step by step by following the execution order of an SQL statement.
1. FROM: where does the data come from? From the two tables product and shopproduct, so an inner join is required:
shopproduct AS SP INNER JOIN product AS P ON SP.product_id = P.product_id
2. WHERE: what is the filter condition? According to the question, it is product_type = 'clothes'
3. SELECT: SP.shop_id, SP.shop_name, SP.product_id, P.product_name, P.product_type, P.purchase_price
(Clauses that the question does not need can simply be skipped.)
The final complete statement is:
-- Reference answer 1: without a subquery
SELECT SP.shop_id, SP.shop_name, SP.product_id, P.product_name, P.product_type, P.purchase_price
  FROM shopproduct AS SP
 INNER JOIN product AS P
    ON SP.product_id = P.product_id
 WHERE P.product_type = 'clothes';
-- Reference answer 2: with a subquery
SELECT SP.shop_id, SP.shop_name, SP.product_id, P.product_name, P.product_type, P.purchase_price
  FROM shopproduct AS SP
 INNER JOIN -- select the clothing products from the product table
       (SELECT product_id, product_name, product_type, purchase_price
          FROM product
         WHERE product_type = 'clothes') AS P
    ON SP.product_id = P.product_id;
The second method applies a subquery to one of the tables being joined, so what gets joined is the already filtered subquery (view), and no filtering is needed after the join.
Exercise:
In each store, what is the sale price of the most expensive product?
-- Reference answer
SELECT SP.shop_id
      ,SP.shop_name
      ,MAX(P.sale_price) AS max_price
  FROM shopproduct AS SP
 INNER JOIN product AS P
    ON SP.product_id = P.product_id
 GROUP BY SP.shop_id, SP.shop_name;
1. FROM
2. GROUP BY: the grouping columns after GROUP BY depend on the actual situation. Finding one maximum does not mean grouping by only one column.
3. SELECT
Inner joins and correlated subqueries
Find the products in each category whose sale price is higher than the average sale price of that category.
Correlated subquery:
SELECT product_type, product_name, sale_price
  FROM product AS P1
 WHERE sale_price > (SELECT AVG(sale_price)
                       FROM product AS P2
                      WHERE P1.product_type = P2.product_type
                      GROUP BY product_type);
Inner join:
SELECT P1.product_id, P1.product_name, P1.product_type, P1.sale_price, P2.avg_price
  FROM product AS P1
 INNER JOIN (SELECT product_type, AVG(sale_price) AS avg_price
               FROM product
              GROUP BY product_type) AS P2
    ON P1.product_type = P2.product_type
 WHERE P1.sale_price > P2.avg_price;
Ideas:
1. FROM: only the product table; no join with another table is needed.
2. WHERE: the sale_price of each product must be higher than avg_price, the average sale price of its category. The average is an aggregate, and simply following WHERE with GROUP BY cannot produce it, because WHERE is evaluated before grouping. So one way of thinking is to use a correlated subquery that computes the average sale price of the current row's category and compare against it.
Problems like this can also be solved with a self join, that is, joining a table to itself, which is similar to a correlated subquery on the same table. Step 1: compute the average sale price per group. Step 2: inner join on the product type. Step 3: add a WHERE clause to define the filter condition.
Summary: when records need to be compared with an aggregate result, either use a subquery to get the aggregate and compare it with each record, or aggregate first and then inner join the result with the other table.
SELECT P1.product_id
,P1.product_name
,P1.product_type
,P1.sale_price
,P2.avg_price
FROM product AS P1
INNER JOIN
(SELECT product_type,AVG(sale_price) AS avg_price
FROM product
GROUP BY product_type) AS P2
ON P1.product_type = P2.product_type
WHERE P1.sale_price > P2.avg_price;
(task4-4.2.1.5) Background: find the products in each category whose sale price is higher than the average sale price of that category, this time using an inner join. My question was: table P2 has been aggregated by product_type with GROUP BY, so why can it be inner joined on product_type with table P1, which has not been aggregated? Is it because of "ON P1.product_type = P2.product_type", so that the tables can be joined even though their row counts differ?
There is a misconception here: a join does not require the shared column to have the same number of rows in both tables, nor the same column name. It only requires that the values match. So the aggregated P2 can be joined directly with the non-aggregated P1.
Also, do not force a connection between the "aggregate first, then inner join with another table" method and the two earlier variants, "with a subquery" and "without a subquery". This problem joins the records of P1 with the P2 records that hold per-group averages, which is fundamentally different from those two cases, so the two are not related.
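The answer above can be checked on a small example: each row of P1 finds exactly one partner in the aggregated P2, so the join returns as many rows as P1 has. The data is hypothetical and sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id TEXT, product_type TEXT, sale_price INTEGER);
    INSERT INTO product VALUES
        ('0001', 'clothes', 1000), ('0002', 'clothes', 4000),
        ('0003', 'kitchen', 3000), ('0004', 'kitchen', 500),
        ('0005', 'kitchen', 2500);
""")

# P2 has one row per product_type; P1 has one row per product.
# The row counts differ, but every P1 row matches one P2 row by value.
joined = conn.execute("""
    SELECT P1.product_id, P1.sale_price, P2.avg_price
      FROM product AS P1
     INNER JOIN (SELECT product_type, AVG(sale_price) AS avg_price
                   FROM product
                  GROUP BY product_type) AS P2
        ON P1.product_type = P2.product_type
     ORDER BY P1.product_id
""").fetchall()

# The WHERE P1.sale_price > P2.avg_price step, done here in Python.
above_avg = [pid for pid, price, avg in joined if price > avg]

print(len(joined))  # 5
print(above_avg)    # ['0002', '0003', '0005']
```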
-------------------------------------------
This result does not match the result given in the book: the sports T-shirt row is missing. The reason is that the regist_date field of the sports T-shirt is empty. In a natural join, the sports T-shirt rows from product and product2 are in effect compared field by field with equality. From the comparison rules for missing values learned in section 6.2 (IS NULL, IS NOT NULL), we know that comparing two missing values with an equals sign does not yield true, and a join only returns the rows for which the join condition is true.
Exercise: use an inner join to find the intersection of the product and product2 tables.
SELECT P1.*
  FROM product AS P1
 INNER JOIN product2 AS P2
    ON (P1.product_id = P2.product_id
   AND P1.product_name = P2.product_name
   AND P1.product_type = P2.product_type
   AND P1.sale_price = P2.sale_price
   AND P1.regist_date = P2.regist_date)
The results are as follows
Note that this result does not match the one on P230: the row with product_id='0001' is missing. Looking at the source data, the regist_date of this row is a missing value. Recalling the IS NULL predicate from Chapter 6, this is because missing values cannot be compared with an equals sign.
If we join on product_id only:
SELECT P1.*
  FROM product AS P1
 INNER JOIN product2 AS P2
    ON P1.product_id = P2.product_id
Query results:
This time the result matches. An inner join only keeps the records the two tables have in common.
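The effect of a NULL regist_date on the two join conditions can be reproduced in a few lines. This is a sketch with two made-up rows per table (the date value is illustrative) and sqlite3 in place of MySQL; the helper `joined_ids` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product  (product_id TEXT, regist_date TEXT);
    CREATE TABLE product2 (product_id TEXT, regist_date TEXT);
    INSERT INTO product  VALUES ('0001', NULL), ('0002', '2009-09-11');
    INSERT INTO product2 VALUES ('0001', NULL), ('0002', '2009-09-11');
""")

def joined_ids(condition):
    """Inner join product with product2 on the given condition; return the ids."""
    sql = f"""SELECT P1.product_id
                FROM product AS P1
               INNER JOIN product2 AS P2 ON {condition}
               ORDER BY P1.product_id"""
    return [row[0] for row in conn.execute(sql)]

# NULL = NULL is not true, so row '0001' is dropped when regist_date is compared.
all_columns = joined_ids(
    "P1.product_id = P2.product_id AND P1.regist_date = P2.regist_date"
)
# Joining on the key alone keeps both rows.
id_only = joined_ids("P1.product_id = P2.product_id")

print(all_columns)  # ['0002']
print(id_only)      # ['0001', '0002']
```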
An inner join discards the rows in both tables that do not satisfy the ON condition. Its counterpart is the outer join, which selectively keeps the unmatched rows depending on the type of outer join.
Depending on which table the preserved rows come from, there are three forms of outer join: left join, right join, and full outer join.
A left join keeps the rows of the left table that the ON clause cannot match, and the corresponding right-table columns are all missing values. A right join keeps the rows of the right table that the ON clause cannot match, and the corresponding left-table columns are all missing values. A full outer join keeps the unmatched rows of both tables, and the columns of the other table are filled with missing values.
The syntax of the three outer joins is:
-- Left join
FROM <tb_1> LEFT OUTER JOIN <tb_2> ON <condition(s)>
-- Right join
FROM <tb_1> RIGHT OUTER JOIN <tb_2> ON <condition(s)>
-- Full outer join
FROM <tb_1> FULL OUTER JOIN <tb_2> ON <condition(s)>
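To see what "keeping the unmatched rows" means, compare an inner join with a left outer join on the same toy data. The rows below are hypothetical, and sqlite3 is used in place of MySQL; only the left join is shown because it behaves the same way in both systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id TEXT, product_name TEXT);
    CREATE TABLE shopproduct (shop_name TEXT, product_id TEXT);
    INSERT INTO product VALUES ('0001', 'T-shirt'), ('0002', 'punch'), ('0003', 'fork');
    INSERT INTO shopproduct VALUES ('Tokyo', '0001'), ('Tokyo', '0002');
""")

# Same query text; only the join type changes.
QUERY = """
    SELECT P.product_id, SP.shop_name
      FROM product AS P
      {kind} JOIN shopproduct AS SP
        ON P.product_id = SP.product_id
     ORDER BY P.product_id
"""
inner = conn.execute(QUERY.format(kind="INNER")).fetchall()
left = conn.execute(QUERY.format(kind="LEFT OUTER")).fetchall()

print(inner)  # [('0001', 'Tokyo'), ('0002', 'Tokyo')]
print(left)   # [('0001', 'Tokyo'), ('0002', 'Tokyo'), ('0003', None)]
```

Product '0003' is sold in no shop, so the inner join drops it, while the left join keeps it with a missing value (None in Python) for shop_name.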
Whether the join was an outer join or an inner join, there has so far been a common prerequisite: the join condition in the ON clause. If you have tried a join query without this condition, you may have noticed that the result has a great many rows. A join with the ON clause removed is called a cross join (CROSS JOIN), also known as the Cartesian product, which is a mathematical term. The Cartesian product of two sets pairs every element of set A with every element of set B to form ordered pairs.
Union, intersection, and difference of database tables (or subqueries) expand or filter a table vertically. They require the tables to have the same number of columns and compatible data types in corresponding positions, so these operations never add columns. A cross join (Cartesian product) expands the table horizontally, that is, it adds columns, which is what joins do in general. But without the restriction of an ON clause, every row of the left table is combined with every row of the right table, which often puts many meaningless rows in the result. Still, cross joins are useful for some query requirements.
Similarly, in a correlated subquery, each row of the main query is paired with the subquery.
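The row explosion of a cross join is easy to see on two tiny tables. A minimal sketch, with hypothetical tables a and b and sqlite3 in place of MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (y TEXT);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES ('p'), ('q');
""")

# No ON clause: every row of a is paired with every row of b.
pairs = conn.execute(
    "SELECT x, y FROM a CROSS JOIN b ORDER BY x, y"
).fetchall()

print(len(pairs))  # 6, i.e. 3 rows * 2 rows
print(pairs)       # [(1, 'p'), (1, 'q'), (2, 'p'), (2, 'q'), (3, 'p'), (3, 'q')]
```

With m and n rows the result has m * n rows, which is why a forgotten join condition on large tables is so expensive.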
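Exercises:
4.1
-- MySQL has no FULL OUTER JOIN, so the rows of both tables are combined with UNION
select * from product where sale_price > 5000
union
select * from product2 where sale_price > 5000;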
Intersection: the union of two sets with their symmetric difference removed
Symmetric difference: (A - B) united with (B - A), the union of the two differences
Difference:
A-B: select * from product where product_id not in (select product_id from product2);
B-A: select * from product2 where product_id not in (select product_id from product);
Symmetric difference:
select * from product where product_id not in (select product_id from product2)
union
select * from product2 where product_id not in (select product_id from product);
4.2
select * from
(select * from product
union
select * from product2) as p1
where product_id not in
(select product_id from product where product_id not in (select product_id from product2)
union
select product_id from product2 where product_id not in (select product_id from product));
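4.3
select sp.shop_name, p1.product_id, p1.product_type, p1.sale_price
from shopproduct as sp
inner join product as p1 on sp.product_id = p1.product_id
inner join (select product_type, max(sale_price) as max_price
from product group by product_type) as p2
on p1.product_type = p2.product_type and p1.sale_price = p2.max_price;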
4.4
-- Correlated subquery
select product_type, product_id, sale_price
from product as p1
where sale_price = (select max(sale_price)
from product as p2
where p1.product_type = p2.product_type
group by p2.product_type);
-- Inner join
select p1.product_type, p1.product_id, p1.sale_price
from product as p1 inner join (select product_type, max(sale_price) as max_price
from product
group by product_type) as p2
on p1.product_type = p2.product_type
and p1.sale_price = p2.max_price;
4.5
select product_id, product_name, sale_price
from product
order by sale_price;