[Nowcoder problem set: real SQL interview questions from big tech companies] No. 2: User growth scenario (an information-feed product)
2022-07-01 12:52:00 【It bond】

To study SQL systematically, head to Nowcoder's bank of classic, frequently asked interview questions, work through the hands-on exercises, and sharpen your SQL skills.
https://www.nowcoder.com/link/pc_csdncpt_itbd_sql
Table of contents
- Preface
- SQL162 Average per-user article viewing time for each day of November 2021
- SQL163 Maximum number of concurrent readers per article
- SQL164 Next-day retention rate of new users for each day of November 2021
- SQL165 Grade users by activity interval
- SQL166 Daily active users and share of new users per day
- SQL167 Consecutive check-ins and gold coin rewards
Preface
Everyone has to use SQL, but your output is not measured by SQL itself. You need to use this tool to create other value.
SQL162 Average per-user article viewing time for each day of November 2021
Create table statement
DROP TABLE IF EXISTS tb_user_log;
CREATE TABLE tb_user_log (
    id INT PRIMARY KEY AUTO_INCREMENT COMMENT 'Auto-increment ID',
    uid INT NOT NULL COMMENT 'User ID',
    artical_id INT NOT NULL COMMENT 'Article ID',
    in_time datetime COMMENT 'Entry time',
    out_time datetime COMMENT 'Exit time',
    sign_in TINYINT DEFAULT 0 COMMENT 'Signed in or not'
) CHARACTER SET utf8 COLLATE utf8_bin;
INSERT INTO tb_user_log(uid, artical_id, in_time, out_time, sign_in) VALUES
(101, 9001, '2021-11-01 10:00:00', '2021-11-01 10:00:31', 0),
(102, 9001, '2021-11-01 10:00:00', '2021-11-01 10:00:24', 0),
(102, 9002, '2021-11-01 11:00:00', '2021-11-01 11:00:11', 0),
(101, 9001, '2021-11-02 10:00:00', '2021-11-02 10:00:50', 0),
(102, 9002, '2021-11-02 11:00:01', '2021-11-02 11:00:24', 0);
Requirements
Scenario: artical_id is the ID of the article the user viewed.
An artical_id of 0 means the user is on a non-article page (such as the list page or an activity page in the app).
Problem: compute the average article viewing time per user (in seconds) for each day of November 2021.
Round the result to 1 decimal place and sort by viewing time from shortest to longest.
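Answer
select date_format(in_time, '%Y-%m-%d') as dt,
       round(sum(timestampdiff(second, in_time, out_time))
             / count(distinct uid), 1) as avg_view_len_sec  -- total seconds / distinct readers
from tb_user_log
where date_format(in_time, '%Y-%m') = '2021-11'  -- restrict the year as well as the month
  and artical_id != 0                            -- skip non-article pages
group by dt
order by avg_view_len_sec;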
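
The calculation can be sanity-checked against the sample rows. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module rather than MySQL: `timestampdiff` is replaced by a difference of `strftime('%s', ...)` values, the November 2021 filter is written as a range on `in_time`, and the table is reduced to the four columns the query needs. The row for user 103 is hypothetical (it is not in the original data) and is there only to show why the filter should pin the year and not just the month.

```python
import sqlite3

# Sample rows from the problem: (uid, artical_id, in_time, out_time).
rows = [
    (101, 9001, "2021-11-01 10:00:00", "2021-11-01 10:00:31"),
    (102, 9001, "2021-11-01 10:00:00", "2021-11-01 10:00:24"),
    (102, 9002, "2021-11-01 11:00:00", "2021-11-01 11:00:11"),
    (101, 9001, "2021-11-02 10:00:00", "2021-11-02 10:00:50"),
    (102, 9002, "2021-11-02 11:00:01", "2021-11-02 11:00:24"),
    # Hypothetical extra row (not in the original data): November of another year.
    (103, 9001, "2020-11-05 10:00:00", "2020-11-05 10:00:10"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tb_user_log (uid INT, artical_id INT, in_time TEXT, out_time TEXT)"
)
conn.executemany("INSERT INTO tb_user_log VALUES (?, ?, ?, ?)", rows)

# SQLite dialect: seconds viewed = difference of Unix timestamps.
# The 1.0 factor forces floating-point division.
query = """
SELECT date(in_time) AS dt,
       round(1.0 * sum(strftime('%s', out_time) - strftime('%s', in_time))
             / count(DISTINCT uid), 1) AS avg_view_len_sec
FROM tb_user_log
WHERE in_time >= '2021-11-01' AND in_time < '2021-12-01'
  AND artical_id != 0
GROUP BY date(in_time)
ORDER BY avg_view_len_sec
"""
result = conn.execute(query).fetchall()
for dt, avg_sec in result:
    print(dt, avg_sec)
```

On 2021-11-01 two users read for 31 + 24 + 11 = 66 seconds in total, giving 33.0 per user. On 2021-11-02 the total is 50 + 23 = 73 seconds, giving 36.5.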

SQL163 Maximum number of concurrent readers per article
Create table statement
DROP TABLE IF EXISTS tb_user_log;
CREATE TABLE tb_user_log (
    id INT PRIMARY KEY AUTO_INCREMENT COMMENT 'Auto-increment ID',
    uid INT NOT NULL COMMENT 'User ID',
    artical_id INT NOT NULL COMMENT 'Article ID',
    in_time datetime COMMENT 'Entry time',
    out_time datetime COMMENT 'Exit time',
    sign_in TINYINT DEFAULT 0 COMMENT 'Signed in or not'
) CHARACTER SET utf8 COLLATE utf8_bin;
INSERT INTO tb_user_log(uid, artical_id, in_time, out_time, sign_in) VALUES
(101, 9001, '2021-11-01 10:00:00', '2021-11-01 10:00:11', 0),
(102, 9001, '2021-11-01 10:00:09', '2021-11-01 10:00:38', 0),
(103, 9001, '2021-11-01 10:00:28', '2021-11-01 10:00:58', 0),
(104, 9002, '2021-11-01 11:00:45', '2021-11-01 11:01:11', 0),
(105, 9001, '2021-11-01 10:00:51', '2021-11-01 10:00:59', 0),
(106, 9002, '2021-11-01 11:00:55', '2021-11-01 11:01:24', 0),
(107, 9001, '2021-11-01 10:00:01', '2021-11-01 10:01:50', 0);
Requirements
Scenario: artical_id is the ID of the article the user viewed.
An artical_id of 0 means the user is on a non-article page (such as the list page or an activity page in the app).
Problem: find the maximum number of people reading each article at the same moment. If one user enters and another leaves at the same instant,
count the entry first and the exit afterward. Sort the results by the maximum count in descending order.
Answer
select artical_id, max(uv) as max_uv
from (
    -- running total of +1 (enter) and -1 (leave) events per article;
    -- "m desc" counts entries before exits that share a timestamp
    select artical_id, t,
           sum(m) over (partition by artical_id order by t, m desc) as uv
    from (
        select artical_id, in_time as t, 1 as m
        from tb_user_log
        where artical_id <> 0
        union all
        select artical_id, out_time as t, -1 as m
        from tb_user_log
        where artical_id <> 0
    ) a
) b
group by artical_id
order by max_uv desc;
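
The tie-break in the window's ORDER BY is the subtle part of this problem, so here is a small experiment with it. This is a sketch under assumptions: it runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module instead of MySQL, the table is reduced to the columns used, and article 9003 with users 201 and 202 is hypothetical data added to create a simultaneous exit and entry. The helper name `peak_readers` is also made up for this illustration.

```python
import sqlite3

# (uid, artical_id, in_time, out_time)
rows = [
    (101, 9001, "2021-11-01 10:00:00", "2021-11-01 10:00:11"),
    (102, 9001, "2021-11-01 10:00:09", "2021-11-01 10:00:38"),
    (103, 9001, "2021-11-01 10:00:28", "2021-11-01 10:00:58"),
    (104, 9002, "2021-11-01 11:00:45", "2021-11-01 11:01:11"),
    (105, 9001, "2021-11-01 10:00:51", "2021-11-01 10:00:59"),
    (106, 9002, "2021-11-01 11:00:55", "2021-11-01 11:01:24"),
    (107, 9001, "2021-11-01 10:00:01", "2021-11-01 10:01:50"),
    # Hypothetical rows (not in the original data): user 202 enters
    # article 9003 at the exact second user 201 leaves it.
    (201, 9003, "2021-11-01 12:00:00", "2021-11-01 12:00:30"),
    (202, 9003, "2021-11-01 12:00:30", "2021-11-01 12:00:40"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tb_user_log (uid INT, artical_id INT, in_time TEXT, out_time TEXT)"
)
conn.executemany("INSERT INTO tb_user_log VALUES (?, ?, ?, ?)", rows)


def peak_readers(order_by):
    """Peak concurrent readers per article for a given window ORDER BY."""
    query = f"""
    SELECT artical_id, max(uv) AS max_uv
    FROM (
        SELECT artical_id,
               sum(m) OVER (PARTITION BY artical_id ORDER BY {order_by}) AS uv
        FROM (
            SELECT artical_id, in_time AS t, 1 AS m
            FROM tb_user_log WHERE artical_id <> 0
            UNION ALL
            SELECT artical_id, out_time AS t, -1 AS m
            FROM tb_user_log WHERE artical_id <> 0
        )
    )
    GROUP BY artical_id
    ORDER BY max_uv DESC, artical_id
    """
    return conn.execute(query).fetchall()


with_tiebreak = peak_readers("t, m DESC")  # entries before exits on ties
without_tiebreak = peak_readers("t")       # tied events summed together
print(with_tiebreak)
print(without_tiebreak)
```

With `t, m DESC` the hypothetical article 9003 peaks at 2 readers, as the problem statement requires. Ordering by `t` alone makes the simultaneous +1 and -1 rows peers in the default window frame, so they cancel out and the peak drops to 1.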

SQL164 Next-day retention rate of new users for each day of November 2021
Create table statement
DROP TABLE IF EXISTS tb_user_log;
CREATE TABLE tb_user_log (
    id INT PRIMARY KEY AUTO_INCREMENT COMMENT 'Auto-increment ID',
    uid INT NOT NULL COMMENT 'User ID',
    artical_id INT NOT NULL COMMENT 'Article ID',
    in_time datetime COMMENT 'Entry time',
    out_time datetime COMMENT 'Exit time',
    sign_in TINYINT DEFAULT 0 COMMENT 'Signed in or not'
) CHARACTER SET utf8 COLLATE utf8_bin;
INSERT INTO tb_user_log(uid, artical_id, in_time, out_time, sign_in) VALUES
(101, 0, '2021-11-01 10:00:00', '2021-11-01 10:00:42', 1),
(102, 9001, '2021-11-01 10:00:00', '2021-11-01 10:00:09', 0),
(103, 9001, '2021-11-01 10:00:01', '2021-11-01 10:01:50', 0),
(101, 9002, '2021-11-02 10:00:09', '2021-11-02 10:00:28', 0),
(103, 9002, '2021-11-02 10:00:51', '2021-11-02 10:00:59', 0),
(104, 9001, '2021-11-02 10:00:28', '2021-11-02 10:00:50', 0),
(101, 9003, '2021-11-03 11:00:55', '2021-11-03 11:01:24', 0),
(104, 9003, '2021-11-03 11:00:45', '2021-11-03 11:00:55', 0),
(105, 9003, '2021-11-03 11:00:53', '2021-11-03 11:00:59', 0),
(101, 9002, '2021-11-04 11:00:55', '2021-11-04 11:00:59', 0);
Requirements
Problem: compute the next-day retention rate of new users for each day of November 2021 (rounded to 2 decimal places).
Notes:
The next-day retention rate is the share of users who were new on a given day and were active again on the following day.
If in_time and out_time fall on different days (the session spans midnight),
the user counts as active on both days. Sort the results by date in ascending order.
Answer
select a.first_day, round(count(distinct b.uid) / count(distinct a.uid), 2) as rate
from (
    -- first active day per user over the whole log; filtering on November in
    -- WHERE instead would treat users first seen before November as new
    select uid, date(min(in_time)) as first_day
    from tb_user_log
    group by uid
    having date_format(min(in_time), '%Y-%m') = '2021-11'
) a
left join (
    -- every day a user was active, counting both ends of a session
    select uid, date(in_time) as dt
    from tb_user_log
    union
    select uid, date(out_time) as dt
    from tb_user_log
) b
on a.uid = b.uid and datediff(b.dt, a.first_day) = 1
group by a.first_day
order by a.first_day;
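
The rule that a session spanning midnight counts as activity on both days is easy to overlook, so the sketch below exercises it. It assumes SQLite through Python's sqlite3 module rather than MySQL: `datediff(...) = 1` becomes a comparison with `date(first_day, '+1 day')`, and the table keeps only the columns the query reads. User 106 is hypothetical (not in the original data): the session starts just before midnight on 2021-11-03 and ends on 2021-11-04.

```python
import sqlite3

# (uid, in_time, out_time) from the sample data
rows = [
    (101, "2021-11-01 10:00:00", "2021-11-01 10:00:42"),
    (102, "2021-11-01 10:00:00", "2021-11-01 10:00:09"),
    (103, "2021-11-01 10:00:01", "2021-11-01 10:01:50"),
    (101, "2021-11-02 10:00:09", "2021-11-02 10:00:28"),
    (103, "2021-11-02 10:00:51", "2021-11-02 10:00:59"),
    (104, "2021-11-02 10:00:28", "2021-11-02 10:00:50"),
    (101, "2021-11-03 11:00:55", "2021-11-03 11:01:24"),
    (104, "2021-11-03 11:00:45", "2021-11-03 11:00:55"),
    (105, "2021-11-03 11:00:53", "2021-11-03 11:00:59"),
    (101, "2021-11-04 11:00:55", "2021-11-04 11:00:59"),
    # Hypothetical row (not in the original data): a session spanning midnight.
    (106, "2021-11-03 23:59:50", "2021-11-04 00:00:10"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb_user_log (uid INT, in_time TEXT, out_time TEXT)")
conn.executemany("INSERT INTO tb_user_log VALUES (?, ?, ?)", rows)

query = """
SELECT a.first_day,
       round(1.0 * count(DISTINCT b.uid) / count(DISTINCT a.uid), 2) AS rate
FROM (
    SELECT uid, date(min(in_time)) AS first_day
    FROM tb_user_log
    GROUP BY uid
    HAVING strftime('%Y-%m', min(in_time)) = '2021-11'
) a
LEFT JOIN (
    SELECT uid, date(in_time) AS dt FROM tb_user_log
    UNION
    SELECT uid, date(out_time) AS dt FROM tb_user_log
) b ON a.uid = b.uid AND b.dt = date(a.first_day, '+1 day')
GROUP BY a.first_day
ORDER BY a.first_day
"""
result = conn.execute(query).fetchall()
for first_day, rate in result:
    print(first_day, rate)
```

On 2021-11-03 the new users are 105 and the hypothetical 106. Only 106 is active on 2021-11-04, and only because the exit time is included in the activity set, so the rate for that day is 0.5. Without user 106 it would be 0.0.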

SQL165 Grade users by activity interval
Create table statement
DROP TABLE IF EXISTS tb_user_log;
CREATE TABLE tb_user_log (
    id INT PRIMARY KEY AUTO_INCREMENT COMMENT 'Auto-increment ID',
    uid INT NOT NULL COMMENT 'User ID',
    artical_id INT NOT NULL COMMENT 'Article ID',
    in_time datetime COMMENT 'Entry time',
    out_time datetime COMMENT 'Exit time',
    sign_in TINYINT DEFAULT 0 COMMENT 'Signed in or not'
) CHARACTER SET utf8 COLLATE utf8_bin;
INSERT INTO tb_user_log(uid, artical_id, in_time, out_time, sign_in) VALUES
(109, 9001, '2021-08-31 10:00:00', '2021-08-31 10:00:09', 0),
(109, 9002, '2021-11-04 11:00:55', '2021-11-04 11:00:59', 0),
(108, 9001, '2021-09-01 10:00:01', '2021-09-01 10:01:50', 0),
(108, 9001, '2021-11-03 10:00:01', '2021-11-03 10:01:50', 0),
(104, 9001, '2021-11-02 10:00:28', '2021-11-02 10:00:50', 0),
(104, 9003, '2021-09-03 11:00:45', '2021-09-03 11:00:55', 0),
(105, 9003, '2021-11-03 11:00:53', '2021-11-03 11:00:59', 0),
(102, 9001, '2021-10-30 10:00:00', '2021-10-30 10:00:09', 0),
(103, 9001, '2021-10-21 10:00:00', '2021-10-21 10:00:09', 0),
(101, 0, '2021-10-01 10:00:00', '2021-10-01 10:00:42', 1);
Requirements
Problem: grade users by their activity interval and compute the share of users at each grade. Round the result to two decimal places and sort by share in descending order.
Notes:
The grading standard is simplified to: loyal users (active in the last 7 days and not new), new users (first seen in the last 7 days),
sleeping users (not active in the last 7 days but active before that), lost users (not active in the last 30 days but active before that).
Assume that today is the latest date in the data.
"The last 7 days" means the 7 days up to and including today T, i.e. the closed interval [T-6, T].
Answer
select user_grade,
       round(count(uid) / (select count(distinct uid) from tb_user_log), 2) as ratio
from (
    -- "today" is the latest in_time in the log; gaps are measured in whole days
    select uid,
           case
               when datediff((select max(in_time) from tb_user_log), max(in_time)) <= 6
                    and datediff((select max(in_time) from tb_user_log), min(in_time)) > 6
                   then 'Loyal users'
               when datediff((select max(in_time) from tb_user_log), max(in_time)) <= 6
                   then 'New users'
               -- sleeping vs. lost depends on the most recent activity, not the first
               when datediff((select max(in_time) from tb_user_log), max(in_time)) <= 29
                   then 'Sleeping users'
               else 'Lost users'
           end as user_grade
    from tb_user_log
    group by uid
) f1
group by user_grade
order by ratio desc;
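
To make the grading rules concrete, the sketch below lists the grade assigned to each user before the shares are aggregated. It assumes SQLite through Python's sqlite3 module rather than MySQL, so `datediff` is written as a difference of `julianday` values, and only `uid` and `in_time` are loaded. User 110 is hypothetical (not in the original data): first seen 50 days before "today" and last seen 10 days before it. Going by the definitions in the notes, whether a user is sleeping or lost depends on the gap to the most recent activity, so that is the gap compared against 29 days here.

```python
import sqlite3

# (uid, in_time) pairs from the sample data
rows = [
    (109, "2021-08-31 10:00:00"), (109, "2021-11-04 11:00:55"),
    (108, "2021-09-01 10:00:01"), (108, "2021-11-03 10:00:01"),
    (104, "2021-11-02 10:00:28"), (104, "2021-09-03 11:00:45"),
    (105, "2021-11-03 11:00:53"), (102, "2021-10-30 10:00:00"),
    (103, "2021-10-21 10:00:00"), (101, "2021-10-01 10:00:00"),
    # Hypothetical user (not in the original data): first seen long ago,
    # last seen 10 days before "today" (2021-11-04).
    (110, "2021-09-15 10:00:00"), (110, "2021-10-25 10:00:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb_user_log (uid INT, in_time TEXT)")
conn.executemany("INSERT INTO tb_user_log VALUES (?, ?)", rows)

query = """
WITH today AS (
    SELECT date(max(in_time)) AS d FROM tb_user_log
),
gaps AS (
    SELECT uid,
           CAST(julianday((SELECT d FROM today))
                - julianday(date(max(in_time))) AS INTEGER) AS last_gap,
           CAST(julianday((SELECT d FROM today))
                - julianday(date(min(in_time))) AS INTEGER) AS first_gap
    FROM tb_user_log
    GROUP BY uid
)
SELECT uid,
       CASE
           WHEN last_gap <= 6 AND first_gap > 6 THEN 'Loyal users'
           WHEN last_gap <= 6 THEN 'New users'
           WHEN last_gap <= 29 THEN 'Sleeping users'
           ELSE 'Lost users'
       END AS user_grade
FROM gaps
ORDER BY uid
"""
grades = dict(conn.execute(query).fetchall())
for uid, grade in grades.items():
    print(uid, grade)
```

User 110 comes out as a sleeping user. If the 29-day comparison were made on the gap to the first activity (50 days), the same user would be graded as lost despite having been active 10 days ago.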

SQL166 Daily active users and share of new users per day
Create table statement
DROP TABLE IF EXISTS tb_user_log;
CREATE TABLE tb_user_log (
    id INT PRIMARY KEY AUTO_INCREMENT COMMENT 'Auto-increment ID',
    uid INT NOT NULL COMMENT 'User ID',
    artical_id INT NOT NULL COMMENT 'Article ID',
    in_time datetime COMMENT 'Entry time',
    out_time datetime COMMENT 'Exit time',
    sign_in TINYINT DEFAULT 0 COMMENT 'Signed in or not'
) CHARACTER SET utf8 COLLATE utf8_bin;
INSERT INTO tb_user_log(uid, artical_id, in_time, out_time, sign_in) VALUES
(101, 9001, '2021-10-31 10:00:00', '2021-10-31 10:00:09', 0),
(102, 9001, '2021-10-31 10:00:00', '2021-10-31 10:00:09', 0),
(101, 0, '2021-11-01 10:00:00', '2021-11-01 10:00:42', 1),
(102, 9001, '2021-11-01 10:00:00', '2021-11-01 10:00:09', 0),
(108, 9001, '2021-11-01 10:00:01', '2021-11-01 10:01:50', 0),
(108, 9001, '2021-11-02 10:00:01', '2021-11-02 10:01:50', 0),
(104, 9001, '2021-11-02 10:00:28', '2021-11-02 10:00:50', 0),
(106, 9001, '2021-11-02 10:00:28', '2021-11-02 10:00:50', 0),
(108, 9001, '2021-11-03 10:00:01', '2021-11-03 10:01:50', 0),
(109, 9002, '2021-11-03 11:00:55', '2021-11-03 11:00:59', 0),
(104, 9003, '2021-11-03 11:00:45', '2021-11-03 11:00:55', 0),
(105, 9003, '2021-11-03 11:00:53', '2021-11-03 11:00:59', 0),
(106, 9003, '2021-11-03 11:00:45', '2021-11-03 11:00:55', 0);
Requirements
Problem: compute the number of daily active users and the share of new users for each day.
Notes:
Share of new users = number of new users that day ÷ number of active users that day (DAU).
If in_time and out_time fall on different days (the session spans midnight), the user counts as active on both days.
Round the share of new users to 2 decimal places and sort the results by date in ascending order.
Answer
select dt, count(*) as dau, round(sum(is_new) / count(*), 2) as uv_new_ratio
from (
    select uid, dt, case when dt = first_dt then 1 else 0 end as is_new
    from (
        -- one row per user per active day; union removes duplicates
        select uid, date(in_time) as dt
        from tb_user_log
        union
        select uid, date(out_time) as dt
        from tb_user_log
    ) t1
    left join (
        -- the day each user was first seen
        select uid, min(date(in_time)) as first_dt
        from tb_user_log
        group by uid
    ) t2
    using (uid)
) t
group by dt
order by dt;
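
The expected figures for the sample data can be reproduced with a short script. This is a sketch assuming SQLite through Python's sqlite3 module rather than MySQL. The query keeps the same shape (a deduplicated user-day set joined to each user's first day), the 1.0 factor is there because SQLite would otherwise do integer division, and the table is reduced to the columns used.

```python
import sqlite3

# (uid, in_time, out_time) from the sample data
rows = [
    (101, "2021-10-31 10:00:00", "2021-10-31 10:00:09"),
    (102, "2021-10-31 10:00:00", "2021-10-31 10:00:09"),
    (101, "2021-11-01 10:00:00", "2021-11-01 10:00:42"),
    (102, "2021-11-01 10:00:00", "2021-11-01 10:00:09"),
    (108, "2021-11-01 10:00:01", "2021-11-01 10:01:50"),
    (108, "2021-11-02 10:00:01", "2021-11-02 10:01:50"),
    (104, "2021-11-02 10:00:28", "2021-11-02 10:00:50"),
    (106, "2021-11-02 10:00:28", "2021-11-02 10:00:50"),
    (108, "2021-11-03 10:00:01", "2021-11-03 10:01:50"),
    (109, "2021-11-03 11:00:55", "2021-11-03 11:00:59"),
    (104, "2021-11-03 11:00:45", "2021-11-03 11:00:55"),
    (105, "2021-11-03 11:00:53", "2021-11-03 11:00:59"),
    (106, "2021-11-03 11:00:45", "2021-11-03 11:00:55"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb_user_log (uid INT, in_time TEXT, out_time TEXT)")
conn.executemany("INSERT INTO tb_user_log VALUES (?, ?, ?)", rows)

query = """
SELECT dt,
       count(*) AS dau,
       round(1.0 * sum(CASE WHEN dt = first_dt THEN 1 ELSE 0 END) / count(*), 2)
           AS uv_new_ratio
FROM (
    SELECT uid, date(in_time) AS dt FROM tb_user_log
    UNION
    SELECT uid, date(out_time) AS dt FROM tb_user_log
) t1
JOIN (
    SELECT uid, min(date(in_time)) AS first_dt
    FROM tb_user_log
    GROUP BY uid
) t2 USING (uid)
GROUP BY dt
ORDER BY dt
"""
result = conn.execute(query).fetchall()
for dt, dau, ratio in result:
    print(dt, dau, ratio)
```

On 2021-11-03, for example, five users are active (104, 105, 106, 108, 109) and two of them (105 and 109) are seen for the first time, which gives 2 / 5 = 0.4.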

SQL167 Consecutive check-ins and gold coin rewards
Create table statement
DROP TABLE IF EXISTS tb_user_log;
CREATE TABLE tb_user_log (
    id INT PRIMARY KEY AUTO_INCREMENT COMMENT 'Auto-increment ID',
    uid INT NOT NULL COMMENT 'User ID',
    artical_id INT NOT NULL COMMENT 'Article ID',
    in_time datetime COMMENT 'Entry time',
    out_time datetime COMMENT 'Exit time',
    sign_in TINYINT DEFAULT 0 COMMENT 'Signed in or not'
) CHARACTER SET utf8 COLLATE utf8_bin;
INSERT INTO tb_user_log(uid, artical_id, in_time, out_time, sign_in) VALUES
(101, 0, '2021-07-07 10:00:00', '2021-07-07 10:00:09', 1),
(101, 0, '2021-07-08 10:00:00', '2021-07-08 10:00:09', 1),
(101, 0, '2021-07-09 10:00:00', '2021-07-09 10:00:42', 1),
(101, 0, '2021-07-10 10:00:00', '2021-07-10 10:00:09', 1),
(101, 0, '2021-07-11 23:59:55', '2021-07-11 23:59:59', 1),
(101, 0, '2021-07-12 10:00:28', '2021-07-12 10:00:50', 1),
(101, 0, '2021-07-13 10:00:28', '2021-07-13 10:00:50', 1),
(102, 0, '2021-10-01 10:00:28', '2021-10-01 10:00:50', 1),
(102, 0, '2021-10-02 10:00:01', '2021-10-02 10:01:50', 1),
(102, 0, '2021-10-03 11:00:55', '2021-10-03 11:00:59', 1),
(102, 0, '2021-10-04 11:00:45', '2021-10-04 11:00:55', 0),
(102, 0, '2021-10-05 11:00:53', '2021-10-05 11:00:59', 1),
(102, 0, '2021-10-06 11:00:45', '2021-10-06 11:00:55', 1);
Requirements
Scenario:
artical_id is the ID of the article the user viewed.
As a special case, an artical_id of 0 means the user is on a non-article page (such as the list page or an activity page in the app).
Note: sign_in is only meaningful when artical_id is 0.
Starting at 00:00 on July 7, 2021, users can check in once a day to receive 1 gold coin and start accumulating consecutive check-in days.
On the 3rd and 7th consecutive check-in days they receive an extra 2 and 6 gold coins respectively.
After every 7 consecutive check-in days the count resets
(that is, the 8th day counts as the first day of a new round and earns 1 gold coin).
Problem: compute the number of gold coins each user earned in each month since July 2021 (the event ends at the end of October;
check-ins from November 1 onward earn no coins). Sort the results by month, then by user ID, in ascending order.
Notes: if a check-in record's in_time and out_time fall on different days,
it counts as a check-in only on the date of in_time.
Answer
select uid, date_format(sign_dt, '%Y%m') as month, sum(coin) as coin
from (
    -- sign_dt minus (diff - 1) days is constant within a run of consecutive days,
    -- so it identifies the streak; the rank inside the streak decides the reward
    select uid, sign_dt,
           case (dense_rank() over (partition by uid, timestampadd(day, -diff+1, sign_dt)
                                    order by sign_dt)) % 7
               when 3 then 3  -- 3rd day: 1 + 2 extra
               when 0 then 7  -- 7th day: 1 + 6 extra
               else 1
           end as coin
    from (
        select uid, date(in_time) as sign_dt,
               dense_rank() over (partition by uid order by in_time) as diff
        from tb_user_log
        where date(in_time) between '2021-07-07' and '2021-10-31'
          and artical_id = 0 and sign_in = 1
    ) t1
) t2
group by uid, date_format(sign_dt, '%Y%m')
order by date_format(sign_dt, '%Y%m'), uid;
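
The streak logic is a gaps-and-islands pattern: within a run of consecutive dates, the date minus its row number is constant, so that value can serve as a streak identifier. The sketch below illustrates the idea under assumptions. It runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module instead of MySQL, uses `row_number` over distinct check-in dates, and drops `out_time` and the time of day, since only the date of `in_time` matters. User 103 is hypothetical (not in the original data): eight consecutive check-ins from 2021-07-29 to 2021-08-05, which shows both the reset after day 7 and the split across months.

```python
import sqlite3

# (uid, artical_id, in_time, sign_in); check-in dates from the sample data,
# with the time of day simplified to 10:00:00.
rows = [(101, 0, f"2021-07-{d:02d} 10:00:00", 1) for d in range(7, 14)]
rows += [
    (102, 0, "2021-10-01 10:00:00", 1),
    (102, 0, "2021-10-02 10:00:00", 1),
    (102, 0, "2021-10-03 10:00:00", 1),
    (102, 0, "2021-10-04 10:00:00", 0),  # visited but did not check in
    (102, 0, "2021-10-05 10:00:00", 1),
    (102, 0, "2021-10-06 10:00:00", 1),
]
# Hypothetical user (not in the original data): an 8-day streak across two months.
rows += [
    (103, 0, f"2021-{d} 09:00:00", 1)
    for d in ["07-29", "07-30", "07-31", "08-01", "08-02", "08-03", "08-04", "08-05"]
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tb_user_log (uid INT, artical_id INT, in_time TEXT, sign_in INT)"
)
conn.executemany("INSERT INTO tb_user_log VALUES (?, ?, ?, ?)", rows)

query = """
WITH signs AS (
    SELECT DISTINCT uid, date(in_time) AS sign_dt
    FROM tb_user_log
    WHERE artical_id = 0 AND sign_in = 1
      AND date(in_time) BETWEEN '2021-07-07' AND '2021-10-31'
),
numbered AS (
    SELECT uid, sign_dt,
           row_number() OVER (PARTITION BY uid ORDER BY sign_dt) AS rn
    FROM signs
),
streaks AS (
    -- date minus row number is the same for every day of one streak
    SELECT uid, sign_dt,
           row_number() OVER (
               PARTITION BY uid, date(sign_dt, '-' || rn || ' days')
               ORDER BY sign_dt
           ) AS streak_day
    FROM numbered
)
SELECT uid,
       strftime('%Y%m', sign_dt) AS month,
       sum(CASE streak_day % 7 WHEN 3 THEN 3 WHEN 0 THEN 7 ELSE 1 END) AS coin
FROM streaks
GROUP BY uid, strftime('%Y%m', sign_dt)
ORDER BY month, uid
"""
result = conn.execute(query).fetchall()
for uid, month, coin in result:
    print(uid, month, coin)
```

User 101 earns 1 + 1 + 3 + 1 + 1 + 1 + 7 = 15 coins in July. User 102's streak is broken on October 4, so the total is (1 + 1 + 3) + (1 + 1) = 7. The hypothetical user 103 earns 1 + 1 + 3 = 5 in July and 1 + 1 + 1 + 7 + 1 = 11 in August, where the final 1 is the first day of a new round.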

