
Task05: SQL advanced processing

2022-06-10 08:49:00 JxWang05

Tutorial address

https://github.com/datawhalechina/wonderful-sql
https://gitee.com/datawhalechina/wonderful-sql

1. Window functions

1.1 Brief introduction

Window functions are also known as OLAP functions, where OLAP stands for OnLine Analytical Processing.

The general syntax is as follows:

<window function> OVER (
  [PARTITION BY <column list>]
  ORDER BY <column list for sorting>
)

The part in square brackets can be omitted.

PARTITION BY feels a bit like GROUP BY: it picks the columns used to split the rows into groups, and each group is processed separately.

ORDER BY then sorts the rows inside each window; there is not much more to say about it.

An example:

mysql> SELECT product_name
    ->        ,product_type
    ->        ,sale_price
    ->        ,RANK() OVER (
    ->          PARTITION BY product_type
    ->          ORDER BY sale_price
    ->         ) AS ranking
    -> FROM product;
+------------------+-----------------+------------+---------+
| product_name     | product_type    | sale_price | ranking |
+------------------+-----------------+------------+---------+
| Punch            | Office Supplies |        500 |       1 |
| Ball pen         | Office Supplies |       5000 |       2 |
| Fork             | Kitchenware     |        500 |       1 |
| Clean the board  | Kitchenware     |        880 |       2 |
| kitchen knife    | Kitchenware     |       3000 |       3 |
| pressure cooker  | Kitchenware     |       6800 |       4 |
| T T-shirt        | clothes         |       1000 |       1 |
| motion T T-shirt | clothes         |       4000 |       2 |
+------------------+-----------------+------------+---------+
8 rows in set (0.00 sec)

We can see that the rows are first grouped by the product_type column.

Within each group the records are then sorted by sale_price, and the rank is written to a new column.

1.2 Dedicated window functions

RANK function
When computing the ranking, records with the same value share the same rank, and the ranks that follow are skipped.

Example) With 3 records tied for 1st place: 1st, 1st, 1st, 4th, ...

DENSE_RANK function
Also computes a ranking, but the ranks that follow are not skipped even when records tie.

Example) With 3 records tied for 1st place: 1st, 1st, 1st, 2nd, ...

ROW_NUMBER function
Assigns unique, consecutive numbers.

Example) With 3 records tied for 1st place: 1st, 2nd, 3rd, 4th

Let's run an example to see:

mysql> SELECT  product_name
    ->        ,product_type
    ->        ,sale_price
    ->        ,RANK() OVER (ORDER BY sale_price) AS ranking
    ->        ,DENSE_RANK() OVER (ORDER BY sale_price) AS dense_ranking
    ->        ,ROW_NUMBER() OVER (ORDER BY sale_price) AS row_num
    -> FROM product; 
+------------------+-----------------+------------+---------+---------------+---------+
| product_name     | product_type    | sale_price | ranking | dense_ranking | row_num |
+------------------+-----------------+------------+---------+---------------+---------+
| Punch            | Office Supplies |        500 |       1 |             1 |       1 |
| Fork             | Kitchenware     |        500 |       1 |             1 |       2 |
| Clean the board  | Kitchenware     |        880 |       3 |             2 |       3 |
| T T-shirt        | clothes         |       1000 |       4 |             3 |       4 |
| kitchen knife    | Kitchenware     |       3000 |       5 |             4 |       5 |
| motion T T-shirt | clothes         |       4000 |       6 |             5 |       6 |
| Ball pen         | Office Supplies |       5000 |       7 |             6 |       7 |
| pressure cooker  | Kitchenware     |       6800 |       8 |             7 |       8 |
+------------------+-----------------+------------+---------+---------------+---------+
8 rows in set (0.00 sec)

RANK() gives records with the same value the same rank. For example, Fork is in row 2, but because its price ties with Punch, its ranking is 1.

After a tie, however, RANK() continues from the row number: Clean the board is in row 3, so its rank is also 3.

So in general, RANK() only changes the tied records, which share one rank; every other record is ranked by its row number.

DENSE_RANK() also gives tied records the same rank: Fork is in row 2 and its rank is again 1.

After a tie, however, DENSE_RANK() continues from the previous rank rather than from the row number, so Clean the board gets rank 2.

ROW_NUMBER() simply numbers the rows in order and ignores ties.
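
The product table only contains a two-way tie, so here is a minimal sketch of the three-way tie described in the examples above. It assumes MySQL 8.0 or later (needed for window functions and common table expressions); the scores data and its column names are made up for illustration and are not part of the tutorial.

```sql
-- Three records tied for first place, followed by one lower score.
-- "scores" is a hypothetical inline data set, not a tutorial table.
WITH scores AS (
    SELECT 'a' AS player, 100 AS score
    UNION ALL SELECT 'b', 100
    UNION ALL SELECT 'c', 100
    UNION ALL SELECT 'd', 90
)
SELECT player
      ,score
      ,RANK()       OVER (ORDER BY score DESC) AS ranking        -- a, b, c get 1; d gets 4
      ,DENSE_RANK() OVER (ORDER BY score DESC) AS dense_ranking  -- a, b, c get 1; d gets 2
      ,ROW_NUMBER() OVER (ORDER BY score DESC) AS row_num        -- a, b, c get 1 to 3 in some order; d gets 4
FROM scores;
```

Which of the tied rows receives 1, 2 or 3 from ROW_NUMBER() is not determined by the query, because the ORDER BY inside the window cannot tell them apart.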

1.3 Aggregate functions as window functions

Straight to an example. Roughly speaking, the start of the window is fixed and the window grows by one row at a time.

mysql> SELECT  product_id
    ->        ,product_name
    ->        ,sale_price
    ->        ,SUM(sale_price) OVER (ORDER BY product_id) AS current_sum
    ->        ,AVG(sale_price) OVER (ORDER BY product_id) AS current_avg
    -> FROM product;  
+------------+------------------+------------+-------------+-------------+
| product_id | product_name     | sale_price | current_sum | current_avg |
+------------+------------------+------------+-------------+-------------+
| 0001       | T T-shirt        |       1000 |        1000 |   1000.0000 |
| 0002       | Punch            |        500 |        1500 |    750.0000 |
| 0003       | motion T T-shirt |       4000 |        5500 |   1833.3333 |
| 0004       | kitchen knife    |       3000 |        8500 |   2125.0000 |
| 0005       | pressure cooker  |       6800 |       15300 |   3060.0000 |
| 0006       | Fork             |        500 |       15800 |   2633.3333 |
| 0007       | Clean the board  |        880 |       16680 |   2382.8571 |
| 0008       | Ball pen         |       5000 |       21680 |   2710.0000 |
+------------+------------------+------------+-------------+-------------+
8 rows in set (0.02 sec)

This is neat too: the result is a running total and a running average, because the window gets longer with every record.

For each row, the two columns hold the sum and the average of all rows from the first one up to and including the current row.
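
The growing window can be written out with a frame clause. When ORDER BY is given without a frame, MySQL uses RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW by default; because product_id is unique here, the ROWS form in this sketch should return the same values as the query above. It assumes the tutorial's product table and MySQL 8.0 or later.

```sql
-- Running total and running average with the frame made explicit.
SELECT product_id
      ,product_name
      ,sale_price
      ,SUM(sale_price) OVER (
         ORDER BY product_id
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS current_sum
      ,AVG(sale_price) OVER (
         ORDER BY product_id
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS current_avg
FROM product;
```

The two forms only differ when the ORDER BY column has duplicates: RANGE treats rows with equal sort values as one unit, while ROWS counts physical rows.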

1.4 Moving average

This is a variation on the aggregate functions above: the window size is fixed and the window slides along the rows.

mysql> SELECT product_id
    ->        ,product_name
    ->        ,sale_price
    ->        
    ->        ,AVG(sale_price) OVER (
    ->          ORDER BY product_id
    ->          ROWS 2 PRECEDING
    ->         ) AS moving_avg_1
    ->         
    ->        ,AVG(sale_price) OVER (
    ->          ORDER BY product_id
    ->          ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
    ->         ) AS moving_avg_2
    ->         
    -> FROM product;  
+------------+------------------+------------+--------------+--------------+
| product_id | product_name     | sale_price | moving_avg_1 | moving_avg_2 |
+------------+------------------+------------+--------------+--------------+
| 0001       | T T-shirt        |       1000 |    1000.0000 |     750.0000 |
| 0002       | Punch            |        500 |     750.0000 |    1833.3333 |
| 0003       | motion T T-shirt |       4000 |    1833.3333 |    2500.0000 |
| 0004       | kitchen knife    |       3000 |    2500.0000 |    4600.0000 |
| 0005       | pressure cooker  |       6800 |    4600.0000 |    3433.3333 |
| 0006       | Fork             |        500 |    3433.3333 |    2726.6667 |
| 0007       | Clean the board  |        880 |    2726.6667 |    2126.6667 |
| 0008       | Ball pen         |       5000 |    2126.6667 |    2940.0000 |
+------------+------------------+------------+--------------+--------------+
8 rows in set (0.00 sec)

In the first moving average, the window is the 2 rows before the current row plus the current row itself, 3 rows in total.

In the second moving average, the window is 1 row before the current row, 1 row after it, and the current row itself, again 3 rows in total.

The difference shows most clearly in the second row of the result.

For the first column, only one preceding row exists there, so it averages the previous row and the current row, 2 rows in total.

For the second column, it averages the previous row, the next row, and the current row, 3 rows in total.
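
One way to check how many rows each frame really contains is to swap AVG for COUNT(*). A sketch against the same product table, assuming MySQL 8.0 or later; the column aliases are made up for this example.

```sql
-- Count the rows inside each frame instead of averaging them.
SELECT product_id
      ,sale_price
      ,COUNT(*) OVER (
         ORDER BY product_id
         ROWS 2 PRECEDING
       ) AS rows_in_frame_1
      ,COUNT(*) OVER (
         ORDER BY product_id
         ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
       ) AS rows_in_frame_2
FROM product;
```

At the edges of the table the frames are cut short: the first frame holds 1, 2 and then 3 rows, while the second holds 2 rows on the first and last record and 3 everywhere else.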

2. ROLLUP

ROLLUP computes totals: a subtotal for each category plus a grand total.

(Why wasn't this covered together with GROUP BY?)

mysql> SELECT  product_type
    ->        ,regist_date
    ->        ,SUM(sale_price) AS sum_price
    -> FROM product
    -> GROUP BY product_type, regist_date WITH ROLLUP;  
+-----------------+-------------+-----------+
| product_type    | regist_date | sum_price |
+-----------------+-------------+-----------+
| Office Supplies | 2009-09-11  |       500 |
| Office Supplies | 2009-11-11  |      5000 |
| Office Supplies | NULL        |      5500 |
| Kitchenware     | 2008-04-28  |       880 |
| Kitchenware     | 2009-01-15  |      6800 |
| Kitchenware     | 2009-09-20  |      3500 |
| Kitchenware     | NULL        |     11180 |
| clothes         | NULL        |      4000 |
| clothes         | 2009-09-20  |      1000 |
| clothes         | NULL        |      5000 |
| NULL            | NULL        |     21680 |
+-----------------+-------------+-----------+
11 rows in set (0.00 sec)

Roughly: it computes the subtotal for each category, and then the grand total.

The regist_date column looks odd here, so let's try removing it:

mysql> SELECT  product_type
    ->        ,SUM(sale_price) AS sum_price
    -> FROM product
    -> GROUP BY product_type WITH ROLLUP;  
+-----------------+-----------+
| product_type    | sum_price |
+-----------------+-----------+
| Office Supplies |      5500 |
| Kitchenware     |     11180 |
| clothes         |      5000 |
| NULL            |     21680 |
+-----------------+-----------+
4 rows in set (0.00 sec)

Well, once a column is removed from GROUP BY, it of course has to be removed from SELECT too.

This result has only the subtotal for each category, with no detail rows.

Let's put another column in place of regist_date and see:

mysql> SELECT  product_id
    ->        ,product_type
    ->        ,SUM(sale_price) AS sum_price
    -> FROM product
    -> GROUP BY product_type,product_id WITH ROLLUP;  
+------------+-----------------+-----------+
| product_id | product_type    | sum_price |
+------------+-----------------+-----------+
| 0002       | Office Supplies |       500 |
| 0008       | Office Supplies |      5000 |
| NULL       | Office Supplies |      5500 |
| 0004       | Kitchenware     |      3000 |
| 0005       | Kitchenware     |      6800 |
| 0006       | Kitchenware     |       500 |
| 0007       | Kitchenware     |       880 |
| NULL       | Kitchenware     |     11180 |
| 0001       | clothes         |      1000 |
| 0003       | clothes         |      4000 |
| NULL       | clothes         |      5000 |
| NULL       | NULL            |     21680 |
+------------+-----------------+-----------+
12 rows in set (0.00 sec)

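Right: product_id is unique for every record, and regist_date is nearly so (only two Kitchenware products share 2009-09-20, which is why they were summed to 3500 earlier).

So after grouping, the detail records of each category are still displayed.

Ah, so that works; I had not dared to try it before.

What I mean is: after a GROUP BY, we sometimes need to select other columns as well,

that is, columns that are not among the grouping columns.

Now I see that those columns can simply be added to the GROUP BY.

They may not seem to do any real grouping work,

but once they are in the GROUP BY they can be selected. Keep in mind that an added column does split the groups further whenever its values differ within a group, as product_id does here.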
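
In the first ROLLUP result above, the clothes group has two rows whose regist_date is NULL: one is the product that really has no registration date (4000), the other is the subtotal (5000). MySQL 8.0 provides the GROUPING() function to tell such rows apart. A sketch against the tutorial's product table; the two alias names are made up for this example.

```sql
-- Flag the rows that ROLLUP generated.
SELECT product_type
      ,regist_date
      ,SUM(sale_price)        AS sum_price
      ,GROUPING(product_type) AS is_type_total  -- 1 only on the grand-total row
      ,GROUPING(regist_date)  AS is_date_total  -- 1 on subtotal and grand-total rows
FROM product
GROUP BY product_type, regist_date WITH ROLLUP;
```

GROUPING(regist_date) returns 1 where the NULL was produced by ROLLUP and 0 where the NULL comes from the data itself.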

3. Stored procedures

My understanding is that a stored procedure is a function written in advance.

IN parameters pass values in and OUT parameters pass values back out. The syntax is as follows:

[DELIMITER //]  (// can also be $$ or other special characters)
CREATE
    [DEFINER = user]
    PROCEDURE sp_name ([proc_parameter[,...]])
    [characteristic ...] 
[BEGIN]
  routine_body
[END //]  (// can also be $$ or other special characters)

Let's go straight to the example:

mysql> DROP PROCEDURE IF EXISTS pricecount;
Query OK, 0 rows affected (0.02 sec)

mysql> 
mysql> DELIMITER //
mysql> CREATE PROCEDURE pricecount (IN input_price INT, OUT counts INT)
    -> BEGIN
    ->   SELECT COUNT(*) INTO counts FROM shop.product
    ->   WHERE input_price = sale_price;
    -> END //
Query OK, 0 rows affected (0.01 sec)

mysql> 
mysql> DELIMITER ;
mysql> 
mysql> call pricecount(1000, @counts);
Query OK, 1 row affected (0.00 sec)

mysql> 
mysql> select @counts;
+---------+
| @counts |
+---------+
|       1 |
+---------+
1 row in set (0.00 sec)

mysql> 
mysql> select * from product;
+------------+------------------+-----------------+------------+----------------+-------------+
| product_id | product_name     | product_type    | sale_price | purchase_price | regist_date |
+------------+------------------+-----------------+------------+----------------+-------------+
| 0001       | T T-shirt        | clothes         |       1000 |            500 | 2009-09-20  |
| 0002       | Punch            | Office Supplies |        500 |            320 | 2009-09-11  |
| 0003       | motion T T-shirt | clothes         |       4000 |           2800 | NULL        |
| 0004       | kitchen knife    | Kitchenware     |       3000 |           2800 | 2009-09-20  |
| 0005       | pressure cooker  | Kitchenware     |       6800 |           5000 | 2009-01-15  |
| 0006       | Fork             | Kitchenware     |        500 |           NULL | 2009-09-20  |
| 0007       | Clean the board  | Kitchenware     |        880 |            790 | 2008-04-28  |
| 0008       | Ball pen         | Office Supplies |       5000 |           NULL | 2009-11-11  |
+------------+------------------+-----------------+------------+----------------+-------------+
8 rows in set (0.00 sec)

First we drop pricecount in case it already exists, to avoid a name clash.

Then comes the definition itself, with // as the statement delimiter.

As for DELIMITER ; I first took it for something that activates the procedure, but that is not what it does.

The material I found said that it tells MySQL the stored procedure definition is complete.

But doesn't END // already mark that?

What actually happens: DELIMITER is a command of the mysql client. DELIMITER // makes the client send everything up to // as one statement, so the semicolons inside the procedure body do not end CREATE PROCEDURE early. DELIMITER ; only switches the delimiter back. Without it, the client keeps waiting for // and never treats a statement ending in ; as finished.

The tutorial has another example that creates tables; I skip it here because it is essentially the same.
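
To make the role of DELIMITER concrete, here is a sketch of the same pattern using $$ instead of //. It is meant to be run in the mysql command-line client against the tutorial's shop database; the procedure name pricecount_above and its parameter names are made up for illustration.

```sql
-- Switch the client delimiter so that ';' inside the body does not end the statement.
DELIMITER $$

DROP PROCEDURE IF EXISTS pricecount_above $$

-- Hypothetical procedure: counts the products priced at min_price or higher.
CREATE PROCEDURE pricecount_above (IN min_price INT, OUT counts INT)
BEGIN
    SELECT COUNT(*) INTO counts
    FROM shop.product
    WHERE sale_price >= min_price;
END $$

-- Switch back; otherwise every later statement would also have to end with $$.
DELIMITER ;

CALL pricecount_above(3000, @counts);
SELECT @counts;
```

While the delimiter is $$, even a one-line statement such as DROP PROCEDURE has to end with $$, which is why the switch back to ; comes right after the definition.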

4. Prepared statements

In a nutshell: put a placeholder in the query so that the full query does not have to be written out every time; only the value for the placeholder is supplied on each execution.

An example:

mysql> PREPARE stmt1 FROM 
    ->   'SELECT product_id,
    '>           product_name
    '>    FROM product
    '>    WHERE product_id = ?';
Query OK, 0 rows affected (0.01 sec)
Statement prepared

mysql> 
mysql> SET @pcid = '0005';
Query OK, 0 rows affected (0.00 sec)

mysql> 
mysql> EXECUTE stmt1 USING @pcid;
+------------+-----------------+
| product_id | product_name    |
+------------+-----------------+
| 0005       | pressure cooker |
+------------+-----------------+
1 row in set (0.00 sec)

mysql> DEALLOCATE PREPARE stmt1;
Query OK, 0 rows affected (0.00 sec)

First define the statement structure, then set the variable, then execute the statement with the variable passed in, and finally release the statement.
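
The benefit shows when the same statement is executed more than once. A short sketch against the tutorial's product table, reusing the names stmt1 and @pcid from the session above:

```sql
-- Prepare once.
PREPARE stmt1 FROM
  'SELECT product_id, product_name FROM product WHERE product_id = ?';

-- Execute with one value ...
SET @pcid = '0005';
EXECUTE stmt1 USING @pcid;

-- ... and again with another; only the placeholder value changes.
SET @pcid = '0003';
EXECUTE stmt1 USING @pcid;

-- Release the statement when it is no longer needed.
DEALLOCATE PREPARE stmt1;
```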

Combined with the stored procedures from section 3:

mysql> DROP TABLE IF EXISTS product_test;
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> 
mysql> CREATE TABLE IF NOT EXISTS product_test LIKE product;
Query OK, 0 rows affected (0.04 sec)

mysql> 
mysql> DROP PROCEDURE IF EXISTS insert_product_test;
Query OK, 0 rows affected (0.00 sec)

mysql> 
mysql> DELIMITER //
mysql> CREATE DEFINER=`root`@`localhost` PROCEDURE `insert_product_test`()
    -> BEGIN
    ->     declare i int;
    ->     set i=1;
    ->     while i<9 do
    ->         set @pcid = CONCAT('000', i);
    ->         PREPARE stmt FROM 'INSERT INTO product_test SELECT * FROM shop.product where product_id= ?';
    ->         EXECUTE stmt USING @pcid;
    ->         set i=i+1;
    ->     end while;
    -> END //
Query OK, 0 rows affected (0.01 sec)

mysql> 
mysql> DELIMITER ;
mysql> 
mysql> call insert_product_test();
Query OK, 1 row affected (0.05 sec)

mysql> 
mysql> select * from product_test;
+------------+------------------+-----------------+------------+----------------+-------------+
| product_id | product_name     | product_type    | sale_price | purchase_price | regist_date |
+------------+------------------+-----------------+------------+----------------+-------------+
| 0001       | T T-shirt        | clothes         |       1000 |            500 | 2009-09-20  |
| 0002       | Punch            | Office Supplies |        500 |            320 | 2009-09-11  |
| 0003       | motion T T-shirt | clothes         |       4000 |           2800 | NULL        |
| 0004       | kitchen knife    | Kitchenware     |       3000 |           2800 | 2009-09-20  |
| 0005       | pressure cooker  | Kitchenware     |       6800 |           5000 | 2009-01-15  |
| 0006       | Fork             | Kitchenware     |        500 |           NULL | 2009-09-20  |
| 0007       | Clean the board  | Kitchenware     |        880 |            790 | 2008-04-28  |
| 0008       | Ball pen         | Office Supplies |       5000 |           NULL | 2009-11-11  |
+------------+------------------+-----------------+------------+----------------+-------------+
8 rows in set (0.00 sec)

First drop and recreate the table, then build a loop that uses a prepared statement to select and insert the rows one at a time.

However, the PREPARE should sit outside the loop; preparing the statement once is enough.

DEALLOCATE PREPARE stmt; should also be added at the end to release the statement.
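
With those two changes the procedure could look like the sketch below. It assumes the same shop.product and product_test tables as above and that product_test is empty before the call; the DEFINER clause is left out for brevity.

```sql
DROP PROCEDURE IF EXISTS insert_product_test;

DELIMITER //
CREATE PROCEDURE insert_product_test()
BEGIN
    -- DECLARE has to come before any other statement in the block.
    DECLARE i INT DEFAULT 1;

    -- Prepare once, outside the loop.
    PREPARE stmt FROM
        'INSERT INTO product_test SELECT * FROM shop.product WHERE product_id = ?';

    WHILE i < 9 DO
        SET @pcid = CONCAT('000', i);
        EXECUTE stmt USING @pcid;
        SET i = i + 1;
    END WHILE;

    -- Release the statement after the last execution.
    DEALLOCATE PREPARE stmt;
END //
DELIMITER ;

CALL insert_product_test();
```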

A. Exercises

A.1

For the product table used in this chapter, describe the result of executing the following SELECT statement.
SELECT product_id
,product_name
,sale_price
,MAX(sale_price) OVER (ORDER BY product_id) AS Current_max_price
FROM product;

The first few columns need no comment. The new column should be a window with a fixed start that keeps growing.

The aggregate function then takes the maximum within that window, so the result should be the highest price seen so far.

mysql> SELECT  product_id
    ->        ,product_name
    ->        ,sale_price
    ->        ,MAX(sale_price) OVER (ORDER BY product_id) AS Current_max_price
    -> FROM product;
+------------+------------------+------------+-------------------+
| product_id | product_name     | sale_price | Current_max_price |
+------------+------------------+------------+-------------------+
| 0001       | T T-shirt        |       1000 |              1000 |
| 0002       | Punch            |        500 |              1000 |
| 0003       | motion T T-shirt |       4000 |              4000 |
| 0004       | kitchen knife    |       3000 |              4000 |
| 0005       | pressure cooker  |       6800 |              6800 |
| 0006       | Fork             |        500 |              6800 |
| 0007       | Clean the board  |        880 |              6800 |
| 0008       | Ball pen         |       5000 |              6800 |
+------------+------------------+------------+-------------------+
8 rows in set (0.00 sec)

A.2

Continuing with the product table, calculate the total of the sales unit price (sale_price) for each date, in ascending order of registration date (regist_date). The sort must put the "motion T T-shirt" record, whose registration date is NULL, in 1st place (that is, treat it as earlier than every other date).

In MySQL, NULL sorts before all other values in ascending order anyway, so a plain ascending sort is enough.

The key problem is computing the totals by date; either a window function or GROUP BY will do.

The catch is that the grand-total row produced by ROLLUP also has a NULL regist_date, so after sorting it jumps up to the second row.

So I gave up on ROLLUP and used a subquery instead, showing the grand total in a separate column.

mysql> SELECT  regist_date
    ->         ,SUM(sale_price) OVER (PARTITION BY regist_date ORDER BY regist_date) AS date_price
    ->         ,(select sum(sale_price) from product) AS sum_price
    -> FROM product
    -> ;
+-------------+------------+-----------+
| regist_date | date_price | sum_price |
+-------------+------------+-----------+
| NULL        |       4000 |     21680 |
| 2008-04-28  |        880 |     21680 |
| 2009-01-15  |       6800 |     21680 |
| 2009-09-11  |        500 |     21680 |
| 2009-09-20  |       4500 |     21680 |
| 2009-09-20  |       4500 |     21680 |
| 2009-09-20  |       4500 |     21680 |
| 2009-11-11  |       5000 |     21680 |
+-------------+------------+-----------+
8 rows in set (0.00 sec)

But I then found that even without an explicit sort, the ROLLUP result already seems to come out in order (MySQL does not promise this without an ORDER BY):

mysql> SELECT  regist_date
    ->         ,SUM(sale_price)
    -> FROM product
    -> GROUP BY regist_date WITH ROLLUP
    -> -- ORDER BY regist_date
    -> ;
+-------------+-----------------+
| regist_date | SUM(sale_price) |
+-------------+-----------------+
| NULL        |            4000 |
| 2008-04-28  |             880 |
| 2009-01-15  |            6800 |
| 2009-09-11  |             500 |
| 2009-09-20  |            4500 |
| 2009-11-11  |            5000 |
| NULL        |           21680 |
+-------------+-----------------+
7 rows in set (0.00 sec)
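
If the exercise is read as asking for a running total in date order rather than a total per date, a window function with only ORDER BY would be one way to get it. This is a sketch of that reading, not necessarily the tutorial's reference answer; it assumes MySQL 8.0 or later, where NULL sorts first in ascending order, and the alias current_sum_price is made up.

```sql
-- Running total of sale_price in ascending order of regist_date.
-- The NULL date sorts first, so "motion T T-shirt" is in 1st place.
SELECT regist_date
      ,product_name
      ,sale_price
      ,SUM(sale_price) OVER (ORDER BY regist_date) AS current_sum_price
FROM product
ORDER BY regist_date;
```

Because the default frame is based on RANGE, the three products registered on 2009-09-20 count as one unit and all show the same running total.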


A.3

① What is the effect of not specifying PARTITION BY in a window function?

The whole table is treated as a single window. With only ORDER BY, that window starts at the first row and its length grows with each record.

② Why can window functions only be used in the SELECT clause? (In fact, using them in the ORDER BY clause does not raise an error either.)

It comes down to execution order:

FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY

Window functions work on the rows that remain after WHERE, GROUP BY and HAVING have been applied, so they cannot appear in those earlier clauses.
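
Two short sketches of what this means in practice, assuming the tutorial's product table and MySQL 8.0 or later; the derived-table alias ranked is made up.

```sql
-- Allowed: a window function in ORDER BY, which runs after SELECT.
SELECT product_name
      ,sale_price
FROM product
ORDER BY RANK() OVER (ORDER BY sale_price);

-- Not allowed directly in WHERE, so compute the rank in a derived table
-- first and filter on it in the outer query.
SELECT product_name
      ,product_type
      ,sale_price
FROM (
    SELECT product_name
          ,product_type
          ,sale_price
          ,RANK() OVER (PARTITION BY product_type ORDER BY sale_price) AS ranking
    FROM product
) AS ranked
WHERE ranking = 1;
```

The second query returns the cheapest product of each type, including every product that ties for the lowest price.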

A.4

Create 20 tables with the same structure as shop.product in a simple way:

Use a prepared statement for the CREATE TABLE statement and execute it in a loop inside a stored procedure. A ? placeholder can only stand for a value, not for a table name, so the statement text is built with CONCAT and prepared again on each pass.

DROP PROCEDURE IF EXISTS insert_product_test;

DELIMITER //
CREATE DEFINER=`root`@`localhost` PROCEDURE `insert_product_test`()
BEGIN
    -- DECLARE must come before any other statement in the block
    DECLARE i INT DEFAULT 1;
    WHILE i <= 20 DO
        -- Build names table01 ... table20; LPAD adds the leading zero
        SET @sql_text = CONCAT('CREATE TABLE IF NOT EXISTS table',
                               LPAD(i, 2, '0'),
                               ' LIKE shop.product');
        PREPARE stmt FROM @sql_text;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;
        SET i = i + 1;
    END WHILE;
END //

DELIMITER ;

CALL insert_product_test();

Copyright notice
This article was written by [JxWang05]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/161/202206100834331022.html