Five common SQL interview questions

2022-06-22 06:29:00 Begin to change

Contents

1. Find users who logged in on 7 consecutive days or 30 consecutive days (Xiaohongshu written test, Telecom Cloud interview), and the maximum number of consecutive login days -- window functions

2. Find users who clicked three times in a row with no other user's click in between

3. Calculate each department's average salary excluding the highest and lowest salaries (ByteDance interview) -- window functions

4. Retention and cumulative-sum calculations -- window functions, self-join (PDD interview)


1. Find users who logged in on 7 consecutive days or 30 consecutive days (Xiaohongshu written test, Telecom Cloud interview), and the maximum number of consecutive login days -- window functions

        1. First, create the table and insert the data

create table login(
user_id int comment 'user id',
access_time datetime comment 'access time',
page_id int comment 'page id',
dt date comment 'login date'
);

insert into login values
(1, '2021-06-01 11:13:15', 10, '2021-06-01'),
(1, '2021-06-02 11:13:15', 10, '2021-06-02'),
(1, '2021-06-03 11:13:15', 10, '2021-06-03'),
(1, '2021-06-04 11:13:15', 10, '2021-06-04'),
(1, '2021-06-05 11:13:15', 10, '2021-06-05'),
(1, '2021-06-06 11:13:15', 10, '2021-06-06'),
(1, '2021-06-07 11:13:15', 10, '2021-06-07'),
(2, '2021-06-01 11:13:15', 10, '2021-06-01'),
(2, '2021-06-03 11:13:15', 10, '2021-06-03'),
(2, '2021-06-04 11:13:15', 10, '2021-06-04'),
(2, '2021-06-05 11:13:15', 10, '2021-06-05'),
(3, '2021-06-01 11:13:15', 10, '2021-06-01'),
(3, '2021-06-07 11:13:15', 10, '2021-06-07'),
(3, '2021-06-08 11:13:15', 10, '2021-06-08'),
(3, '2021-06-09 11:13:15', 10, '2021-06-09'),
(3, '2021-06-10 11:13:15', 10, '2021-06-10'),
(3, '2021-06-11 11:13:15', 10, '2021-06-11'),
(3, '2021-06-12 11:13:15', 10, '2021-06-12'),
(3, '2021-06-13 11:13:15', 10, '2021-06-13'),
(4, '2021-06-01 11:13:15', 10, '2021-06-01'),
(4, '2021-06-03 11:13:15', 10, '2021-06-03'),
(4, '2021-06-05 11:13:15', 10, '2021-06-05'),
(4, '2021-06-07 11:13:15', 10, '2021-06-07'),
(4, '2021-06-09 11:13:15', 10, '2021-06-09'),
(4, '2021-06-11 11:13:15', 10, '2021-06-11'),
(5, '2021-06-01 11:13:15', 10, '2021-06-01'),
(5, '2021-06-07 11:13:15', 10, '2021-06-07'),
(5, '2021-06-08 11:13:15', 10, '2021-06-08'),
(5, '2021-06-09 11:13:15', 10, '2021-06-09'),
(5, '2021-06-11 11:13:15', 10, '2021-06-11'),
(5, '2021-06-12 11:13:15', 10, '2021-06-12'),
(5, '2021-06-13 11:13:15', 10, '2021-06-13');

        2. Approach

                ① First, partition the rows by user id, then use a window function to number each user's logins in date order

SELECT *, ROW_NUMBER() over (PARTITION by user_id order by dt) ranking FROM login where month(dt) = 6

                ② Subtract the ranking (in days) from the date. If a user logs in on consecutive days, the date and the ranking both advance by one each day, so the resulting date is the same for every row in that run

select *,date_sub(dt,INTERVAL ranking day) diff from (SELECT *,ROW_NUMBER() over (PARTITION by user_id order by dt) ranking FROM login where month(dt) = 6) t

               

                 ③ Group by user and by the difference date, then count the rows in each group. A group with 7 or more rows is a user who logged in on at least 7 consecutive days, which is the desired result

select user_id ,count(*) from
(select *, date_sub(dt, interval ranking day) diff from
(select *, row_number() over(partition by user_id order by dt) ranking from login where month(dt)=6) as t) as t1
group by user_id, diff having count(*) >= 7;
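The three steps above can be checked end to end with a small script. This is a minimal sketch, not the author's setup: it assumes Python's built-in sqlite3 module with SQLite 3.25 or later (needed for window functions), swaps MySQL's date_sub(dt, interval ranking day) for SQLite's date(dt, '-N days'), and stores only the user_id and dt columns of the sample data. The select distinct is an added guard for users with several logins on one day. The same grouped result also gives each user's longest streak, which the section title mentions.

```python
import sqlite3

# Day-of-month values per user, copied from the sample insert above (June 2021).
LOGIN_DAYS = {
    1: [1, 2, 3, 4, 5, 6, 7],
    2: [1, 3, 4, 5],
    3: [1, 7, 8, 9, 10, 11, 12, 13],
    4: [1, 3, 5, 7, 9, 11],
    5: [1, 7, 8, 9, 11, 12, 13],
}

conn = sqlite3.connect(":memory:")
conn.execute("create table login (user_id int, dt text)")
conn.executemany(
    "insert into login values (?, ?)",
    [(uid, f"2021-06-{d:02d}") for uid, days in LOGIN_DAYS.items() for d in days],
)

# One output row per run of consecutive login days.
STREAKS_SQL = """
select user_id, min(dt) as start_dt, count(*) as streak_len
from (
    select user_id, dt,
           -- SQLite spelling of MySQL's date_sub(dt, interval ranking day)
           date(dt, '-' || ranking || ' days') as diff
    from (
        select user_id, dt,
               row_number() over (partition by user_id order by dt) as ranking
        from (select distinct user_id, dt from login)  -- one row per user per day
    )
)
group by user_id, diff
"""

streaks = conn.execute(STREAKS_SQL).fetchall()

# Users with at least one run of 7 or more consecutive days.
seven_day_users = sorted({uid for uid, _, n in streaks if n >= 7})

# Longest run of consecutive login days per user.
longest = {}
for uid, _, n in streaks:
    longest[uid] = max(longest.get(uid, 0), n)

print(seven_day_users)
print(longest)
```

Changing the threshold from 7 to 30 answers the 30-day variant of the question, provided the month(dt) = 6 style filter is widened to cover the period being checked.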

2. Find users who clicked three times in a row with no other user's click in between

        Table a records the clickstream: the user id (usr_id) and the click time (click_time)

row_number() over(order by click_time) as rank_1  gives rank_1 = 1 2 3 4 5 6 7

row_number() over(partition by usr_id order by click_time) as rank_2  gives rank_2 = 1 2 1 3 4 5 6

rank_1 - rank_2  gives diff = 0 0 2 1 1 1 1

         Within an uninterrupted run of one user's clicks, both row numbers increase by 1 per click, so diff stays constant. A click by another user increases rank_1 only, which changes diff for the clicks that follow. So we only need to group by usr_id and diff and keep the groups with a count of at least 3: these are the users who clicked three or more times in a row with no other user's click in between

select distinct usr_id
from
(
   select *, rank_1 - rank_2 as diff
   from
  (
      select *,
      row_number() over(order by click_time) as rank_1,
      row_number() over(partition by usr_id order by click_time) as rank_2
      from a
   ) b
) c
group by diff, usr_id
having count(diff) >= 3
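The row-number difference trick can be tried on the seven-click sequence used in the explanation. The sketch below is illustrative only: it assumes Python's sqlite3 module with SQLite 3.25 or later, uses single letters as stand-in user ids, and uses integers as stand-in click times. The helper name consecutive_clickers is hypothetical.

```python
import sqlite3

# Same logic as the query above, with the two row numbers subtracted inline.
QUERY = """
select distinct usr_id
from (
    select usr_id,
           row_number() over (order by click_time)
         - row_number() over (partition by usr_id order by click_time) as diff
    from a
)
group by usr_id, diff
having count(*) >= ?
"""


def consecutive_clickers(clicks, n=3):
    """Return users with at least n clicks in a row and nobody else in between.

    clicks is a list of (usr_id, click_time) pairs; click times must be unique,
    otherwise row_number() has no well-defined order.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("create table a (usr_id text, click_time int)")
    conn.executemany("insert into a values (?, ?)", clicks)
    rows = conn.execute(QUERY, (n,)).fetchall()
    conn.close()
    return sorted(r[0] for r in rows)


# Click order A A B A A A A reproduces rank_1, rank_2 and diff from the text.
sample = [(user, t) for t, user in enumerate("AABAAAA", start=1)]
print(consecutive_clickers(sample))

# Strictly alternating clicks never form a run of three.
alternating = [(user, t) for t, user in enumerate("ABABA", start=1)]
print(consecutive_clickers(alternating))
```

In the first sequence user A has one run of two clicks (diff 0) and one run of four (diff 1); only the second run passes the threshold.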

3. Calculate each department's average salary excluding the highest and lowest salaries (ByteDance interview) -- window functions

        Table emp

  id: employee id, deptno: department number, salary: salary

         The key is to use window functions to rank the salaries within each department in both ascending and descending order, then drop the rows ranked first in either order (the lowest and the highest) before averaging.

select a.deptno, avg(a.salary)
from
 (
 select *, rank() over( partition by deptno order by salary ) as rank_1
 , rank() over( partition by deptno order by salary desc) as rank_2
 from emp
 )  a
where a.rank_1 > 1 and a.rank_2 > 1
group by a.deptno
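A quick way to see what the double ranking does, especially with tied salaries, is to run it on a few made-up rows. This is a sketch under assumptions: the emp rows below are invented for illustration, and the query runs on Python's sqlite3 module (SQLite 3.25 or later) rather than MySQL.

```python
import sqlite3

# (id, deptno, salary): invented sample rows, not from the article.
EMP = [
    (1, 10, 1000), (2, 10, 2000), (3, 10, 3000), (4, 10, 4000),
    (5, 20, 5000), (6, 20, 5000), (7, 20, 7000), (8, 20, 9000), (9, 20, 9000),
    (10, 30, 6000), (11, 30, 8000),
]

TRIMMED_AVG_SQL = """
select deptno, avg(salary) as trimmed_avg
from (
    select deptno, salary,
           rank() over (partition by deptno order by salary)      as rank_1,
           rank() over (partition by deptno order by salary desc) as rank_2
    from emp
)
where rank_1 > 1 and rank_2 > 1   -- drop the lowest and the highest salary
group by deptno
order by deptno
"""

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (id int, deptno int, salary int)")
conn.executemany("insert into emp values (?, ?, ?)", EMP)

# Maps deptno to the average salary after trimming both extremes.
trimmed = dict(conn.execute(TRIMMED_AVG_SQL).fetchall())
print(trimmed)
```

Because rank() gives tied rows the same rank, every employee sharing the lowest or highest salary is excluded (department 20 keeps only the 7000 row). A department with no salary strictly between its extremes, such as department 30 here, produces no row at all. Using row_number() instead would drop exactly one row at each end.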

4. Retention and cumulative-sum calculations -- window functions, self-join (PDD interview)

         The camera is one of the most popular apps on a phone. The figure below is a screenshot of part of the user behavior table in a phone manufacturer's database (figure not reproduced)

         The manufacturer now wants to analyze how active the camera app is. For each login time it needs the number of users retained on the next day, after 3 days, and after 7 days, together with the corresponding retention rates.

The query below produces the data in that format:

select d.a_t,
       count(distinct case when d.day_gap = 1 then d.user_id end) as next_day_retained,
       count(distinct case when d.day_gap = 1 then d.user_id end)
           / count(distinct d.user_id) as next_day_retention_rate,
       count(distinct case when d.day_gap = 3 then d.user_id end) as day_3_retained,
       count(distinct case when d.day_gap = 3 then d.user_id end)
           / count(distinct d.user_id) as day_3_retention_rate,
       count(distinct case when d.day_gap = 7 then d.user_id end) as day_7_retained,
       count(distinct case when d.day_gap = 7 then d.user_id end)
           / count(distinct d.user_id) as day_7_retention_rate
from
(select c.*, timestampdiff(day, c.a_t, c.b_t) as day_gap
from (select a.user_id, a.login_time as a_t, b.login_time as b_t
from login_info as a
left join login_info as b
on a.user_id = b.user_id
where a.app_name = 'camera' and b.app_name = 'camera') as c) as d
group by d.a_t;
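The self-join retention logic can be exercised on a handful of rows. The following is a sketch with assumptions stated up front: the table and column names (login_info, user_id, app_name, login_day) and all rows are hypothetical, logins are stored as plain dates, and the query runs on Python's sqlite3 module, so MySQL's timestampdiff(day, ...) is replaced by a julianday() difference. The rate is computed in Python because SQLite's / on two integers performs integer division.

```python
import sqlite3

# (user_id, app_name, login_day): invented rows for illustration.
ROWS = [
    (1, "camera", "2022-01-01"), (1, "camera", "2022-01-02"), (1, "camera", "2022-01-04"),
    (2, "camera", "2022-01-01"), (2, "camera", "2022-01-08"),
    (3, "camera", "2022-01-01"), (3, "music", "2022-01-02"),  # music must not count
    (4, "camera", "2022-01-02"), (4, "camera", "2022-01-03"),
]

RETENTION_SQL = """
select a_day,
       count(distinct user_id)                                as active_users,
       count(distinct case when day_gap = 1 then user_id end) as retained_1,
       count(distinct case when day_gap = 3 then user_id end) as retained_3,
       count(distinct case when day_gap = 7 then user_id end) as retained_7
from (
    -- Pair every camera login with every camera login of the same user.
    select a.user_id, a.login_day as a_day,
           cast(julianday(b.login_day) - julianday(a.login_day) as integer) as day_gap
    from login_info as a
    join login_info as b on a.user_id = b.user_id
    where a.app_name = 'camera' and b.app_name = 'camera'
)
group by a_day
order by a_day
"""

conn = sqlite3.connect(":memory:")
conn.execute("create table login_info (user_id int, app_name text, login_day text)")
conn.executemany("insert into login_info values (?, ?, ?)", ROWS)

# day -> (active_users, retained_1, retained_3, retained_7)
retention = {row[0]: row[1:] for row in conn.execute(RETENTION_SQL)}

# Next-day retention rate per day, using true division.
next_day_rate = {day: counts[1] / counts[0] for day, counts in retention.items()}

for day, counts in retention.items():
    print(day, counts, round(next_day_rate[day], 3))
```

Each login always pairs with itself (gap 0), so every day with at least one camera login appears in the result even when nobody returns.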

Copyright notice
This article was written by [Begin to change]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/173/202206220545477093.html