Getting the top N records of each group after a SQL GROUP BY query

2020-11-09 10:51:00 osc A 3gbrdm


1. Background

2. Practical analysis

3. Summary

1. Background

Recently a feature request came up during development. The system has an information query module, and the information has to be presented as cards, as in the following picture:

The cards are grouped by project team, and each project team shows its two most-read items (TOP 2).

Requirement analysis: group by project team, then take the 2 records with the most page views in each group.

2. Practical analysis

The examples are based on a MySQL database.

Table definitions (column names are given where the queries below use them)

1. Project team table: team

   id: primary key
   name: project team name

2. Information table: info

   id: primary key
   team_id: project team id
   information name
   pageviews: page views
   information content

The data in the info table is shown in the figure below.

First, let's review the basics of SELECT.

Writing order:

select *columns* 
from *tables* 
where *predicate1* 
group by *columns* 
having *predicate2* 
order by *columns* 
limit *offset*, *row_count*;

Execution order:

from *tables* 
where *predicate1* 
group by *columns* 
having *predicate2* 
select *columns* 
order by *columns* 
limit *offset*, *row_count*;
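To make the execution order concrete, here is a minimal sketch that runs one query with every clause present. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, chosen so the example is self-contained), and the table contents are hypothetical. WHERE removes rows before grouping, while HAVING removes whole groups afterwards.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE info (id INTEGER PRIMARY KEY, team_id INTEGER, pageviews INTEGER)"
)
conn.executemany(
    "INSERT INTO info (team_id, pageviews) VALUES (?, ?)",
    [(1, 5), (1, 50), (1, 60), (2, 5), (2, 7)],
)

rows = conn.execute(
    """
    SELECT team_id, count(*) AS n
      FROM info
     WHERE pageviews > 6        -- runs before grouping: drops both rows with 5
     GROUP BY team_id           -- team 1 keeps 2 rows, team 2 keeps 1
    HAVING count(*) >= 2        -- runs after grouping: drops team 2
     ORDER BY team_id
     LIMIT 0, 10                -- offset 0, at most 10 rows
    """
).fetchall()
print(rows)  # [(1, 2)]
```

Moving the `pageviews > 6` condition into HAVING would not work the same way, because by then the individual rows have already been collapsed into groups.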


count(column) # Returns the number of rows in which this column is not NULL

DISTINCT column # Removes duplicate values of the column
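The difference between these forms is easy to check on a tiny table. The sketch below again assumes SQLite (via Python's sqlite3) in place of MySQL, with hypothetical data containing one duplicate and one NULL.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE info (id INTEGER PRIMARY KEY, team_id INTEGER, pageviews INTEGER)"
)
conn.executemany(
    "INSERT INTO info (team_id, pageviews) VALUES (?, ?)",
    [(1, 10), (1, 20), (1, 20), (1, None)],
)

# count(*) counts rows, count(col) skips NULLs, count(DISTINCT col) also skips duplicates.
total_rows, non_null, distinct_values = conn.execute(
    "SELECT count(*), count(pageviews), count(DISTINCT pageviews) FROM info"
).fetchone()
print(total_rows, non_null, distinct_values)  # 4 3 2
```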


Step 1: find the records with the top two page view values in each project team of the info table

Correlate the info table with itself (a correlated subquery on the same table):

SELECT a.id, a.team_id, a.pageviews 
  FROM info a 
 WHERE ( 
        SELECT count(DISTINCT b.pageviews) 
              FROM info b 
                   WHERE a.pageviews < b.pageviews AND a.team_id= b.team_id
      ) < 2 ;

At first glance this is hard to understand, so here is an example:

Suppose the page views in a and b are both [1,2,3,4] (within one project team):

a.pageviews = 1, b.pageviews can be [2,3,4], count(DISTINCT b.pageviews) = 3 
a.pageviews = 2, b.pageviews can be [3,4], count(DISTINCT b.pageviews) = 2 #  2 larger values, so this is third place 
a.pageviews = 3, b.pageviews can be [4], count(DISTINCT b.pageviews) = 1 #  1 larger value, so this is second place 
a.pageviews = 4, b.pageviews can be [], count(DISTINCT b.pageviews) = 0 #  0 larger values, so this is the maximum, first place

count(DISTINCT b.pageviews) is the number of distinct values larger than the current value.

a.team_id = b.team_id is the self-correlation condition; it plays roughly the role of grouping, because each row is only compared with rows of the same project team.

So "top two" is equivalent to count(DISTINCT b.pageviews) < 2. In the example, a.pageviews can therefore be 3 or 4, the two highest values in the set.
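The counting rule can be replayed outside the database. The following sketch (plain Python, with a hypothetical helper name `distinct_larger_counts`) computes, for each value, how many distinct values are strictly larger, and then keeps the values for which that count is below 2.

```python
def distinct_larger_counts(values):
    """For each value, count the distinct values strictly larger than it."""
    return [(v, len({w for w in values if w > v})) for v in values]


counts = distinct_larger_counts([1, 2, 3, 4])
top_two = [v for v, larger in counts if larger < 2]

print(counts)   # [(1, 3), (2, 2), (3, 1), (4, 0)]
print(top_two)  # [3, 4]
```

This mirrors what the correlated subquery does for every row of a within a single project team.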

Step 2: join the team table with the info table

SELECT a.id, t.NAME, a.team_id, a.pageviews 
  FROM info a 
    LEFT JOIN team t ON a.team_id = t.id 
 WHERE ( 
        SELECT count(DISTINCT b.pageviews) 
               FROM info b 
                 WHERE a.pageviews < b.pageviews AND a.team_id= b.team_id) < 2 
ORDER BY a.team_id, a.pageviews desc

The results are as follows:
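As a way to try the joined query without a MySQL server, the sketch below runs it against SQLite through Python's sqlite3 module. The team names, the rows, and the reduced column set of info are all hypothetical; only the query shape follows the text above.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE team (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE info (id INTEGER PRIMARY KEY, team_id INTEGER, pageviews INTEGER)"
)
conn.executemany("INSERT INTO team (id, name) VALUES (?, ?)", [(1, "Alpha"), (2, "Beta")])
conn.executemany(
    "INSERT INTO info (team_id, pageviews) VALUES (?, ?)",
    [(1, 10), (1, 30), (1, 20), (2, 5), (2, 15), (2, 25), (2, 35)],
)

top2 = conn.execute(
    """
    SELECT a.id, t.name, a.team_id, a.pageviews
      FROM info a
      LEFT JOIN team t ON a.team_id = t.id
     WHERE (
            -- number of distinct larger pageviews values in the same team
            SELECT count(DISTINCT b.pageviews)
              FROM info b
             WHERE a.pageviews < b.pageviews AND a.team_id = b.team_id
           ) < 2
     ORDER BY a.team_id, a.pageviews DESC
    """
).fetchall()

for row in top2:
    print(row)
# (2, 'Alpha', 1, 30)
# (3, 'Alpha', 1, 20)
# (7, 'Beta', 2, 35)
# (6, 'Beta', 2, 25)
```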

There is another way to look at it:

Grouping with GROUP BY + HAVING. With this form the result can be debugged step by step, because the count is visible as a column.

SELECT a.id, t.NAME, a.team_id, a.pageviews, COUNT( DISTINCT b.pageviews ) 
  FROM info a 
    LEFT JOIN info b ON ( a.pageviews < b.pageviews AND a.team_id = b.team_id ) 
    LEFT JOIN team t ON a.team_id = t.id 
GROUP BY a.id, t.NAME, a.team_id, a.pageviews 
HAVING COUNT( DISTINCT b.pageviews ) < 2 
ORDER BY a.team_id, a.pageviews DESC
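One way to debug this form step by step is to run it first without the HAVING clause and inspect the count for every row. The sketch below does that on hypothetical data for a single team, using SQLite through Python's sqlite3 module as a stand-in for MySQL and omitting the team join for brevity.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE info (id INTEGER PRIMARY KEY, team_id INTEGER, pageviews INTEGER)"
)
conn.executemany(
    "INSERT INTO info (team_id, pageviews) VALUES (?, ?)",
    [(1, 10), (1, 30), (1, 20)],
)

base_query = """
    SELECT a.pageviews, count(DISTINCT b.pageviews) AS larger
      FROM info a
      LEFT JOIN info b ON (a.pageviews < b.pageviews AND a.team_id = b.team_id)
     GROUP BY a.id, a.team_id, a.pageviews
     {having}
     ORDER BY a.team_id, a.pageviews DESC
"""

# Step 1: no HAVING, so every row is shown together with its count.
all_counts = conn.execute(base_query.format(having="")).fetchall()
# Step 2: add the HAVING filter once the counts look right.
filtered = conn.execute(
    base_query.format(having="HAVING count(DISTINCT b.pageviews) < 2")
).fetchall()

print(all_counts)  # [(30, 0), (20, 1), (10, 2)]
print(filtered)    # [(30, 0), (20, 1)]
```

The row with the highest value has no match in b, so the LEFT JOIN yields NULL for b.pageviews and the count is 0.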

Problem: if several records have the same page views, this approach breaks down.

An example:

Suppose the page views in a and b are both [1,2,2,4]:

a.pageviews = 1, b.pageviews can be [2,2,4], count(DISTINCT b.pageviews) = 2 #  2 distinct larger values (2 and 4), so this is third place 
a.pageviews = 2, b.pageviews can be [4], count(DISTINCT b.pageviews) = 1 #  1 larger value, so this is tied for second place 
a.pageviews = 2, b.pageviews can be [4], count(DISTINCT b.pageviews) = 1 #  1 larger value, so this is also second place 
a.pageviews = 4, b.pageviews can be [], count(DISTINCT b.pageviews) = 0 #  0 larger values, so this is the maximum, first place

With count(DISTINCT b.pageviews) < 2, a.pageviews can be 2, 2 or 4. These are the two highest values in the set, but three rows are returned.
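If exactly two rows per group are required even when values tie, one possible workaround (not the query from this article, but a variation on it) is to count rows instead of distinct values and to break ties with the primary key. The sketch below compares both queries on the [1,2,2,4] example; SQLite via Python's sqlite3 stands in for MySQL, the data is hypothetical, and preferring the smaller id on a tie is an arbitrary choice.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE info (id INTEGER PRIMARY KEY, team_id INTEGER, pageviews INTEGER)"
)
conn.executemany(
    "INSERT INTO info (team_id, pageviews) VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 2), (1, 4)],
)

# Query from the article: ties share a rank, so three rows come back.
original = conn.execute(
    """
    SELECT a.id, a.pageviews
      FROM info a
     WHERE (SELECT count(DISTINCT b.pageviews)
              FROM info b
             WHERE a.pageviews < b.pageviews AND a.team_id = b.team_id) < 2
     ORDER BY a.pageviews DESC, a.id
    """
).fetchall()

# Variation: count the rows that rank ahead, using id as the tie-breaker.
tie_broken = conn.execute(
    """
    SELECT a.id, a.pageviews
      FROM info a
     WHERE (SELECT count(*)
              FROM info b
             WHERE b.team_id = a.team_id
               AND (b.pageviews > a.pageviews
                    OR (b.pageviews = a.pageviews AND b.id < a.id))) < 2
     ORDER BY a.pageviews DESC, a.id
    """
).fetchall()

print(original)    # [(4, 4), (2, 2), (3, 2)]
print(tie_broken)  # [(4, 4), (2, 2)]
```

Whether tied records should all be shown or cut off at two is a product decision; the variation only applies when a hard limit per group is wanted.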

3. Summary

Restating the requirement: finding the top few records of each group becomes a self-correlation on the table, where for each row we count how many values are larger than it.

This is in fact similar to a LeetCode database problem rated hard:

185. Department Top Three Salaries

This article was written by [osc A 3gbrdm]. Please include a link to the original when reposting. Thanks.