[Nowcoder question bank: real SQL interview questions from big tech companies] No. 2: User growth scenario (a certain news feed product)
2022-07-01 12:52:00 【It bond】

To learn SQL systematically, work through the classic high-frequency interview questions in the Nowcoder question bank and practice hands-on to improve your SQL skills:
https://www.nowcoder.com/link/pc_csdncpt_itbd_sql
Table of contents
- Preface
- SQL162 Average article viewing time per user for each day of November 2021
- SQL163 Maximum number of simultaneous readers of each article
- SQL164 Next-day retention rate of each day's new users in November 2021
- SQL165 User grading by activity interval
- SQL166 Daily active users and share of new users per day
- SQL167 Gold coins for consecutive sign-ins
Preface
Everyone has to use SQL, but output is not measured by the SQL itself. You need to use this tool to create other value.
SQL162 Average article viewing time per user for each day of November 2021
Table creation statement
DROP TABLE IF EXISTS tb_user_log;
CREATE TABLE tb_user_log (
    id INT PRIMARY KEY AUTO_INCREMENT COMMENT 'Auto-increment ID',
    uid INT NOT NULL COMMENT 'User ID',
    artical_id INT NOT NULL COMMENT 'Article ID',
    in_time datetime COMMENT 'Entry time',
    out_time datetime COMMENT 'Exit time',
    sign_in TINYINT DEFAULT 0 COMMENT 'Whether the user signed in'
) CHARACTER SET utf8 COLLATE utf8_bin;
INSERT INTO tb_user_log(uid, artical_id, in_time, out_time, sign_in) VALUES
(101, 9001, '2021-11-01 10:00:00', '2021-11-01 10:00:31', 0),
(102, 9001, '2021-11-01 10:00:00', '2021-11-01 10:00:24', 0),
(102, 9002, '2021-11-01 11:00:00', '2021-11-01 11:00:11', 0),
(101, 9001, '2021-11-02 10:00:00', '2021-11-02 10:00:50', 0),
(102, 9002, '2021-11-02 11:00:01', '2021-11-02 11:00:24', 0);
Requirement
Scenario logic: artical_id (article ID) is the ID of the article the user viewed.
An artical_id of 0 means the user is on a non-article page (such as a list page or an activity page in the app).
Problem: Compute the average article viewing time per user (in seconds) for each day of November 2021.
Round the result to 1 decimal place and sort by viewing time from shortest to longest.
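Answer
select date_format(in_time, '%Y-%m-%d') as dt,
       -- total viewing seconds divided by the number of distinct viewers that day
       round(sum(timestampdiff(second, in_time, out_time))
             / count(distinct uid), 1) as avg_view_len_sec
from tb_user_log
-- match year and month together so that November of other years is excluded
where date_format(in_time, '%Y-%m') = '2021-11' and artical_id != 0
group by dt
order by avg_view_len_sec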
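
As a rough cross-check of the query above, the same per-user average can be computed by hand in Python on the sample rows. This is only a sketch: the function name avg_view_seconds and the ROWS list are made up for illustration, and Python's round() can differ from MySQL's ROUND() on exact .x5 ties, which do not occur in this sample.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Sample rows from the INSERT statement: (uid, artical_id, in_time, out_time).
ROWS = [
    (101, 9001, "2021-11-01 10:00:00", "2021-11-01 10:00:31"),
    (102, 9001, "2021-11-01 10:00:00", "2021-11-01 10:00:24"),
    (102, 9002, "2021-11-01 11:00:00", "2021-11-01 11:00:11"),
    (101, 9001, "2021-11-02 10:00:00", "2021-11-02 10:00:50"),
    (102, 9002, "2021-11-02 11:00:01", "2021-11-02 11:00:24"),
]


def avg_view_seconds(rows, year=2021, month=11):
    """Return [(date, average seconds per distinct user)], shortest first."""
    total = defaultdict(float)  # date -> total viewing seconds
    users = defaultdict(set)    # date -> distinct uids seen that day
    for uid, artical_id, in_s, out_s in rows:
        t_in = datetime.strptime(in_s, FMT)
        t_out = datetime.strptime(out_s, FMT)
        # Skip non-article pages and rows outside the requested month.
        if artical_id == 0 or (t_in.year, t_in.month) != (year, month):
            continue
        day = t_in.date().isoformat()
        total[day] += (t_out - t_in).total_seconds()
        users[day].add(uid)
    result = [(day, round(total[day] / len(users[day]), 1)) for day in total]
    return sorted(result, key=lambda r: r[1])


print(avg_view_seconds(ROWS))
# [('2021-11-01', 33.0), ('2021-11-02', 36.5)]
```

On 2021-11-01 the total is 31 + 24 + 11 = 66 seconds over 2 users, and on 2021-11-02 it is 50 + 23 = 73 seconds over 2 users.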

SQL163 Maximum number of simultaneous readers of each article
Table creation statement
DROP TABLE IF EXISTS tb_user_log;
CREATE TABLE tb_user_log (
    id INT PRIMARY KEY AUTO_INCREMENT COMMENT 'Auto-increment ID',
    uid INT NOT NULL COMMENT 'User ID',
    artical_id INT NOT NULL COMMENT 'Article ID',
    in_time datetime COMMENT 'Entry time',
    out_time datetime COMMENT 'Exit time',
    sign_in TINYINT DEFAULT 0 COMMENT 'Whether the user signed in'
) CHARACTER SET utf8 COLLATE utf8_bin;
INSERT INTO tb_user_log(uid, artical_id, in_time, out_time, sign_in) VALUES
(101, 9001, '2021-11-01 10:00:00', '2021-11-01 10:00:11', 0),
(102, 9001, '2021-11-01 10:00:09', '2021-11-01 10:00:38', 0),
(103, 9001, '2021-11-01 10:00:28', '2021-11-01 10:00:58', 0),
(104, 9002, '2021-11-01 11:00:45', '2021-11-01 11:01:11', 0),
(105, 9001, '2021-11-01 10:00:51', '2021-11-01 10:00:59', 0),
(106, 9002, '2021-11-01 11:00:55', '2021-11-01 11:01:24', 0),
(107, 9001, '2021-11-01 10:00:01', '2021-11-01 10:01:50', 0);
Requirement
Scenario logic: artical_id (article ID) is the ID of the article the user viewed.
An artical_id of 0 means the user is on a non-article page (such as a list page or an activity page in the app).
Problem: For each article, find the maximum number of people reading it at the same moment. If an entry and an exit happen at the same instant,
count the entry first and the exit second. Sort the results by the maximum number of readers in descending order.
Answer
select artical_id, max(uv) as max_uv
from (
    -- running reader count; at equal timestamps entries (+1) are ordered before exits (-1)
    select artical_id, t,
           sum(m) over (partition by artical_id order by t, m desc) as uv
    from (
        -- one +1 event per entry and one -1 event per exit
        select artical_id, in_time as t, 1 as m
        from tb_user_log
        where artical_id <> 0
        union all
        select artical_id, out_time as t, -1 as m
        from tb_user_log
        where artical_id <> 0
    ) a
) b
group by artical_id
order by max_uv desc
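
The query above is a sweep over entry and exit events. The same idea can be sketched in Python to check the sample data by hand; max_concurrent_readers and ROWS are illustrative names, not part of the original answer.

```python
from collections import defaultdict

# Sample rows from the INSERT statement: (uid, artical_id, in_time, out_time).
# Fixed-width 'YYYY-MM-DD HH:MM:SS' strings sort chronologically as plain text.
ROWS = [
    (101, 9001, "2021-11-01 10:00:00", "2021-11-01 10:00:11"),
    (102, 9001, "2021-11-01 10:00:09", "2021-11-01 10:00:38"),
    (103, 9001, "2021-11-01 10:00:28", "2021-11-01 10:00:58"),
    (104, 9002, "2021-11-01 11:00:45", "2021-11-01 11:01:11"),
    (105, 9001, "2021-11-01 10:00:51", "2021-11-01 10:00:59"),
    (106, 9002, "2021-11-01 11:00:55", "2021-11-01 11:01:24"),
    (107, 9001, "2021-11-01 10:00:01", "2021-11-01 10:01:50"),
]


def max_concurrent_readers(rows):
    """Return [(artical_id, peak readers)], largest peak first."""
    events = defaultdict(list)  # artical_id -> [(timestamp, +1 or -1)]
    for _uid, artical_id, t_in, t_out in rows:
        if artical_id == 0:
            continue  # non-article pages are ignored
        events[artical_id].append((t_in, 1))
        events[artical_id].append((t_out, -1))

    result = []
    for artical_id, evs in events.items():
        # Sort by time; on ties put +1 before -1, like ORDER BY t, m DESC.
        evs.sort(key=lambda e: (e[0], -e[1]))
        current = peak = 0
        for _t, delta in evs:
            current += delta
            peak = max(peak, current)
        result.append((artical_id, peak))
    return sorted(result, key=lambda r: -r[1])


print(max_concurrent_readers(ROWS))
# [(9001, 3), (9002, 2)]
```

The tie-breaking rule matters only when one reader leaves at the exact second another arrives: counting the arrival first makes the two sessions overlap.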

SQL164 Next-day retention rate of each day's new users in November 2021
Table creation statement
DROP TABLE IF EXISTS tb_user_log;
CREATE TABLE tb_user_log (
    id INT PRIMARY KEY AUTO_INCREMENT COMMENT 'Auto-increment ID',
    uid INT NOT NULL COMMENT 'User ID',
    artical_id INT NOT NULL COMMENT 'Article ID',
    in_time datetime COMMENT 'Entry time',
    out_time datetime COMMENT 'Exit time',
    sign_in TINYINT DEFAULT 0 COMMENT 'Whether the user signed in'
) CHARACTER SET utf8 COLLATE utf8_bin;
INSERT INTO tb_user_log(uid, artical_id, in_time, out_time, sign_in) VALUES
(101, 0, '2021-11-01 10:00:00', '2021-11-01 10:00:42', 1),
(102, 9001, '2021-11-01 10:00:00', '2021-11-01 10:00:09', 0),
(103, 9001, '2021-11-01 10:00:01', '2021-11-01 10:01:50', 0),
(101, 9002, '2021-11-02 10:00:09', '2021-11-02 10:00:28', 0),
(103, 9002, '2021-11-02 10:00:51', '2021-11-02 10:00:59', 0),
(104, 9001, '2021-11-02 10:00:28', '2021-11-02 10:00:50', 0),
(101, 9003, '2021-11-03 11:00:55', '2021-11-03 11:01:24', 0),
(104, 9003, '2021-11-03 11:00:45', '2021-11-03 11:00:55', 0),
(105, 9003, '2021-11-03 11:00:53', '2021-11-03 11:00:59', 0),
(101, 9002, '2021-11-04 11:00:55', '2021-11-04 11:00:59', 0);
Requirement
Problem: Compute the next-day retention rate of the new users of each day in November 2021 (rounded to 2 decimal places).
Notes:
The next-day retention rate is the share of a day's new users who are active again on the following day.
If in_time and out_time fall on different days (the session spans midnight),
the user counts as active on both days. Sort the results by date in ascending order.
Answer
select a.first_day,
       round(count(distinct b.uid) / count(distinct a.uid), 2) as rate
from (
    -- first active day per user over the whole log, keeping only November 2021 newcomers;
    -- filtering rows in WHERE instead would treat returning users as new
    select uid, date(min(in_time)) as first_day
    from tb_user_log
    group by uid
    having date_format(min(in_time), '%Y-%m') = '2021-11'
) a
left join (
    -- every (user, day) pair on which the user was active
    select uid, date(in_time) as dt
    from tb_user_log
    union
    select uid, date(out_time) as dt
    from tb_user_log
) b
on a.uid = b.uid and datediff(b.dt, a.first_day) = 1
group by a.first_day
order by a.first_day
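
To sanity-check the retention logic, the same steps (first day per user, set of active days, lookup of the following day) can be sketched in Python on the sample rows. The names next_day_retention and ROWS are hypothetical, and the article_id column is omitted because the query does not use it.

```python
from collections import defaultdict
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# Sample rows from the INSERT statement: (uid, in_time, out_time).
ROWS = [
    (101, "2021-11-01 10:00:00", "2021-11-01 10:00:42"),
    (102, "2021-11-01 10:00:00", "2021-11-01 10:00:09"),
    (103, "2021-11-01 10:00:01", "2021-11-01 10:01:50"),
    (101, "2021-11-02 10:00:09", "2021-11-02 10:00:28"),
    (103, "2021-11-02 10:00:51", "2021-11-02 10:00:59"),
    (104, "2021-11-02 10:00:28", "2021-11-02 10:00:50"),
    (101, "2021-11-03 11:00:55", "2021-11-03 11:01:24"),
    (104, "2021-11-03 11:00:45", "2021-11-03 11:00:55"),
    (105, "2021-11-03 11:00:53", "2021-11-03 11:00:59"),
    (101, "2021-11-04 11:00:55", "2021-11-04 11:00:59"),
]


def next_day_retention(rows, year=2021, month=11):
    """Return [(first_day, retention rate)] in ascending date order."""
    first_day = {}   # uid -> earliest in_time date
    active = set()   # (uid, date) pairs; both in and out dates count
    for uid, in_s, out_s in rows:
        d_in = datetime.strptime(in_s, FMT).date()
        d_out = datetime.strptime(out_s, FMT).date()
        active.add((uid, d_in))
        active.add((uid, d_out))
        first_day[uid] = min(first_day.get(uid, d_in), d_in)

    # Group new users by first day, keeping only the requested month.
    new_users = defaultdict(list)
    for uid, d in first_day.items():
        if (d.year, d.month) == (year, month):
            new_users[d].append(uid)

    result = []
    for d in sorted(new_users):
        uids = new_users[d]
        next_day = d + timedelta(days=1)
        retained = sum(1 for uid in uids if (uid, next_day) in active)
        result.append((d.isoformat(), round(retained / len(uids), 2)))
    return result


print(next_day_retention(ROWS))
# [('2021-11-01', 0.67), ('2021-11-02', 1.0), ('2021-11-03', 0.0)]
```

Of the three users who first appear on 2021-11-01, users 101 and 103 return on 2021-11-02, which gives 2/3 ≈ 0.67.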

SQL165 User grading by activity interval
Table creation statement
DROP TABLE IF EXISTS tb_user_log;
CREATE TABLE tb_user_log (
    id INT PRIMARY KEY AUTO_INCREMENT COMMENT 'Auto-increment ID',
    uid INT NOT NULL COMMENT 'User ID',
    artical_id INT NOT NULL COMMENT 'Article ID',
    in_time datetime COMMENT 'Entry time',
    out_time datetime COMMENT 'Exit time',
    sign_in TINYINT DEFAULT 0 COMMENT 'Whether the user signed in'
) CHARACTER SET utf8 COLLATE utf8_bin;
INSERT INTO tb_user_log(uid, artical_id, in_time, out_time, sign_in) VALUES
(109, 9001, '2021-08-31 10:00:00', '2021-08-31 10:00:09', 0),
(109, 9002, '2021-11-04 11:00:55', '2021-11-04 11:00:59', 0),
(108, 9001, '2021-09-01 10:00:01', '2021-09-01 10:01:50', 0),
(108, 9001, '2021-11-03 10:00:01', '2021-11-03 10:01:50', 0),
(104, 9001, '2021-11-02 10:00:28', '2021-11-02 10:00:50', 0),
(104, 9003, '2021-09-03 11:00:45', '2021-09-03 11:00:55', 0),
(105, 9003, '2021-11-03 11:00:53', '2021-11-03 11:00:59', 0),
(102, 9001, '2021-10-30 10:00:00', '2021-10-30 10:00:09', 0),
(103, 9001, '2021-10-21 10:00:00', '2021-10-21 10:00:09', 0),
(101, 0, '2021-10-01 10:00:00', '2021-10-01 10:00:42', 1);
Requirement
Problem: Grade users by how recently they were active and compute the share of users in each grade. Round the result to two decimal places and sort by share in descending order.
Notes:
The grading standard is simplified to: loyal users (active in the last 7 days and not new), new users (first active in the last 7 days),
sleeping users (not active in the last 7 days but active earlier), and lost users (not active in the last 30 days but active earlier).
Assume that today is the latest date in the data.
"The last 7 days" means the 7 days up to and including today T, that is, the closed interval [T-6, T].
Answer
select user_grade,
       round(count(uid) / (select count(distinct uid) from tb_user_log), 2) as ratio
from (
    select uid,
           case
               -- last active within [T-6, T], first active before that window
               when datediff((select max(in_time) from tb_user_log), max(in_time)) <= 6
                and datediff((select max(in_time) from tb_user_log), min(in_time)) > 6
                   then 'Loyal users'
               -- first active within [T-6, T]
               when datediff((select max(in_time) from tb_user_log), max(in_time)) <= 6
                and datediff((select max(in_time) from tb_user_log), min(in_time)) <= 6
                   then 'New users'
               -- last active 7 to 29 days ago; the test is on the last activity, not the first
               when datediff((select max(in_time) from tb_user_log), max(in_time)) between 7 and 29
                   then 'Sleeping users'
               -- last active 30 or more days ago
               else 'Lost users'
           end as user_grade
    from tb_user_log
    group by uid
) f1
group by user_grade
order by ratio desc
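
The grading rules can be checked by hand with a short Python sketch on the sample rows. It assumes, like the answer above, that only in_time is used to decide when a user was active; the names grade_users and ROWS are made up for illustration.

```python
from collections import Counter
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Sample rows from the INSERT statement, reduced to (uid, in_time).
ROWS = [
    (109, "2021-08-31 10:00:00"),
    (109, "2021-11-04 11:00:55"),
    (108, "2021-09-01 10:00:01"),
    (108, "2021-11-03 10:00:01"),
    (104, "2021-11-02 10:00:28"),
    (104, "2021-09-03 11:00:45"),
    (105, "2021-11-03 11:00:53"),
    (102, "2021-10-30 10:00:00"),
    (103, "2021-10-21 10:00:00"),
    (101, "2021-10-01 10:00:00"),
]


def grade_users(rows):
    """Return [(grade, share of users)], largest share first."""
    first, last = {}, {}  # uid -> first / last active date
    for uid, in_s in rows:
        d = datetime.strptime(in_s, FMT).date()
        first[uid] = min(first.get(uid, d), d)
        last[uid] = max(last.get(uid, d), d)

    today = max(last.values())  # "today" is the latest date in the data
    counts = Counter()
    for uid in first:
        since_first = (today - first[uid]).days
        since_last = (today - last[uid]).days
        if since_last <= 6:
            # Active in [T-6, T]: new if first seen in that window too.
            grade = "New users" if since_first <= 6 else "Loyal users"
        elif since_last <= 29:
            grade = "Sleeping users"
        else:
            grade = "Lost users"
        counts[grade] += 1

    total = len(first)
    shares = [(g, round(c / total, 2)) for g, c in counts.items()]
    return sorted(shares, key=lambda r: -r[1])


print(grade_users(ROWS))
```

With today taken as 2021-11-04, users 109, 108 and 104 are loyal (3/7 ≈ 0.43), users 105 and 102 are new (2/7 ≈ 0.29), user 103 is sleeping and user 101 is lost (1/7 ≈ 0.14 each).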

SQL166 Daily active users and share of new users per day
Table creation statement
DROP TABLE IF EXISTS tb_user_log;
CREATE TABLE tb_user_log (
    id INT PRIMARY KEY AUTO_INCREMENT COMMENT 'Auto-increment ID',
    uid INT NOT NULL COMMENT 'User ID',
    artical_id INT NOT NULL COMMENT 'Article ID',
    in_time datetime COMMENT 'Entry time',
    out_time datetime COMMENT 'Exit time',
    sign_in TINYINT DEFAULT 0 COMMENT 'Whether the user signed in'
) CHARACTER SET utf8 COLLATE utf8_bin;
INSERT INTO tb_user_log(uid, artical_id, in_time, out_time, sign_in) VALUES
(101, 9001, '2021-10-31 10:00:00', '2021-10-31 10:00:09', 0),
(102, 9001, '2021-10-31 10:00:00', '2021-10-31 10:00:09', 0),
(101, 0, '2021-11-01 10:00:00', '2021-11-01 10:00:42', 1),
(102, 9001, '2021-11-01 10:00:00', '2021-11-01 10:00:09', 0),
(108, 9001, '2021-11-01 10:00:01', '2021-11-01 10:01:50', 0),
(108, 9001, '2021-11-02 10:00:01', '2021-11-02 10:01:50', 0),
(104, 9001, '2021-11-02 10:00:28', '2021-11-02 10:00:50', 0),
(106, 9001, '2021-11-02 10:00:28', '2021-11-02 10:00:50', 0),
(108, 9001, '2021-11-03 10:00:01', '2021-11-03 10:01:50', 0),
(109, 9002, '2021-11-03 11:00:55', '2021-11-03 11:00:59', 0),
(104, 9003, '2021-11-03 11:00:45', '2021-11-03 11:00:55', 0),
(105, 9003, '2021-11-03 11:00:53', '2021-11-03 11:00:59', 0),
(106, 9003, '2021-11-03 11:00:45', '2021-11-03 11:00:55', 0);
Requirement
Problem: Compute the number of daily active users (DAU) and the share of new users for each day.
Notes:
Share of new users = number of new users that day ÷ number of active users that day (DAU).
If in_time and out_time fall on different days (the session spans midnight), the user counts as active on both days.
Round the share of new users to 2 decimal places and sort the results by date in ascending order.
Answer
select dt, count(*) as dau, round(sum(is_new) / count(*), 2) as uv_new_ratio
from (
    select uid, dt, case when dt = first_dt then 1 else 0 end as is_new
    from (
        -- one row per (user, active day); UNION removes duplicates
        select uid, date(in_time) as dt
        from tb_user_log
        union
        select uid, date(out_time) as dt
        from tb_user_log
    ) t1
    left join (
        -- each user's first active day
        select uid, min(date(in_time)) as first_dt
        from tb_user_log
        group by uid
    ) t2
    using (uid)
) t
group by dt
order by dt
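
A small Python sketch can confirm the numbers the query should return on the sample rows. The names dau_and_new_ratio and ROWS are illustrative only; the article_id column is left out because the query does not use it.

```python
from collections import Counter
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Sample rows from the INSERT statement: (uid, in_time, out_time).
ROWS = [
    (101, "2021-10-31 10:00:00", "2021-10-31 10:00:09"),
    (102, "2021-10-31 10:00:00", "2021-10-31 10:00:09"),
    (101, "2021-11-01 10:00:00", "2021-11-01 10:00:42"),
    (102, "2021-11-01 10:00:00", "2021-11-01 10:00:09"),
    (108, "2021-11-01 10:00:01", "2021-11-01 10:01:50"),
    (108, "2021-11-02 10:00:01", "2021-11-02 10:01:50"),
    (104, "2021-11-02 10:00:28", "2021-11-02 10:00:50"),
    (106, "2021-11-02 10:00:28", "2021-11-02 10:00:50"),
    (108, "2021-11-03 10:00:01", "2021-11-03 10:01:50"),
    (109, "2021-11-03 11:00:55", "2021-11-03 11:00:59"),
    (104, "2021-11-03 11:00:45", "2021-11-03 11:00:55"),
    (105, "2021-11-03 11:00:53", "2021-11-03 11:00:59"),
    (106, "2021-11-03 11:00:45", "2021-11-03 11:00:55"),
]


def dau_and_new_ratio(rows):
    """Return [(date, dau, share of new users)] in ascending date order."""
    active = set()   # distinct (uid, date) pairs, like the UNION subquery
    first_day = {}   # uid -> earliest in_time date
    for uid, in_s, out_s in rows:
        d_in = datetime.strptime(in_s, FMT).date()
        d_out = datetime.strptime(out_s, FMT).date()
        active.add((uid, d_in))
        active.add((uid, d_out))
        first_day[uid] = min(first_day.get(uid, d_in), d_in)

    dau, new = Counter(), Counter()
    for uid, d in active:
        dau[d] += 1
        if d == first_day[uid]:
            new[d] += 1  # the user is new on their first active day
    return [(d.isoformat(), dau[d], round(new[d] / dau[d], 2)) for d in sorted(dau)]


print(dau_and_new_ratio(ROWS))
# [('2021-10-31', 2, 1.0), ('2021-11-01', 3, 0.33), ('2021-11-02', 3, 0.67), ('2021-11-03', 5, 0.4)]
```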

SQL167 Gold coins for consecutive sign-ins
Table creation statement
DROP TABLE IF EXISTS tb_user_log;
CREATE TABLE tb_user_log (
    id INT PRIMARY KEY AUTO_INCREMENT COMMENT 'Auto-increment ID',
    uid INT NOT NULL COMMENT 'User ID',
    artical_id INT NOT NULL COMMENT 'Article ID',
    in_time datetime COMMENT 'Entry time',
    out_time datetime COMMENT 'Exit time',
    sign_in TINYINT DEFAULT 0 COMMENT 'Whether the user signed in'
) CHARACTER SET utf8 COLLATE utf8_bin;
INSERT INTO tb_user_log(uid, artical_id, in_time, out_time, sign_in) VALUES
(101, 0, '2021-07-07 10:00:00', '2021-07-07 10:00:09', 1),
(101, 0, '2021-07-08 10:00:00', '2021-07-08 10:00:09', 1),
(101, 0, '2021-07-09 10:00:00', '2021-07-09 10:00:42', 1),
(101, 0, '2021-07-10 10:00:00', '2021-07-10 10:00:09', 1),
(101, 0, '2021-07-11 23:59:55', '2021-07-11 23:59:59', 1),
(101, 0, '2021-07-12 10:00:28', '2021-07-12 10:00:50', 1),
(101, 0, '2021-07-13 10:00:28', '2021-07-13 10:00:50', 1),
(102, 0, '2021-10-01 10:00:28', '2021-10-01 10:00:50', 1),
(102, 0, '2021-10-02 10:00:01', '2021-10-02 10:01:50', 1),
(102, 0, '2021-10-03 11:00:55', '2021-10-03 11:00:59', 1),
(102, 0, '2021-10-04 11:00:45', '2021-10-04 11:00:55', 0),
(102, 0, '2021-10-05 11:00:53', '2021-10-05 11:00:59', 1),
(102, 0, '2021-10-06 11:00:45', '2021-10-06 11:00:55', 1);
Requirement
Scenario logic:
artical_id (article ID) is the ID of the article the user viewed.
As a special case, an artical_id of 0 means the user is on a non-article page (such as a list page or an activity page in the app).
Note: sign_in is only meaningful when artical_id is 0.
Starting at 00:00 on July 7, 2021, users can sign in once a day to receive 1 gold coin and begin accumulating consecutive sign-in days.
On the 3rd and 7th consecutive sign-in days they receive an extra 2 and 6 gold coins respectively.
The sign-in count starts over after every 7 consecutive days (that is, the counter is reset:
the 8th consecutive sign-in day counts as day 1 of a new round and earns 1 gold coin).
Problem: Compute the number of gold coins each user earned in each month since July 2021 (the event ends at the end of October;
sign-ins from November 1 onward earn no gold coins). Sort the results by month and then by ID, both ascending.
Note: If the in_time and out_time of a sign-in record fall on different days,
the sign-in is counted only on the date of in_time.
Answer
SELECT uid, DATE_FORMAT(sign_dt, '%Y%m') AS month, SUM(coin) AS coin
FROM (
    SELECT uid, sign_dt,
           -- position inside the streak modulo 7: day 3 pays 1 + 2, day 7 pays 1 + 6
           CASE (DENSE_RANK() OVER (PARTITION BY uid, TIMESTAMPADD(day, -diff+1, sign_dt)
                                    ORDER BY sign_dt)) % 7
               WHEN 3 THEN 3
               WHEN 0 THEN 7
               ELSE 1
           END AS coin
    FROM (
        -- one row per user and sign-in date, ranked by date;
        -- sign_dt minus (rank - 1) days is constant within a streak of consecutive days
        SELECT DISTINCT uid, DATE(in_time) AS sign_dt,
               DENSE_RANK() OVER (PARTITION BY uid ORDER BY DATE(in_time)) AS diff
        FROM tb_user_log
        WHERE DATE(in_time) BETWEEN '2021-07-07' AND '2021-10-31'
          AND artical_id = 0 AND sign_in = 1
    ) t1
) t2
GROUP BY uid, DATE_FORMAT(sign_dt, '%Y%m')
ORDER BY DATE_FORMAT(sign_dt, '%Y%m'), uid
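
The coin rules can also be traced step by step in Python, which makes it easier to see what the window functions have to reproduce. This is a sketch on the sample rows only; coins_per_month and ROWS are hypothetical names, and out_time is omitted because a sign-in is counted on the in_time date.

```python
from collections import defaultdict
from datetime import date, datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Sample rows from the INSERT statement: (uid, artical_id, in_time, sign_in).
ROWS = [
    (101, 0, "2021-07-07 10:00:00", 1),
    (101, 0, "2021-07-08 10:00:00", 1),
    (101, 0, "2021-07-09 10:00:00", 1),
    (101, 0, "2021-07-10 10:00:00", 1),
    (101, 0, "2021-07-11 23:59:55", 1),
    (101, 0, "2021-07-12 10:00:28", 1),
    (101, 0, "2021-07-13 10:00:28", 1),
    (102, 0, "2021-10-01 10:00:28", 1),
    (102, 0, "2021-10-02 10:00:01", 1),
    (102, 0, "2021-10-03 11:00:55", 1),
    (102, 0, "2021-10-04 11:00:45", 0),
    (102, 0, "2021-10-05 11:00:53", 1),
    (102, 0, "2021-10-06 11:00:45", 1),
]


def coins_per_month(rows, start=date(2021, 7, 7), end=date(2021, 10, 31)):
    """Return [(uid, 'YYYYMM', coins)] sorted by month, then uid."""
    sign_days = defaultdict(set)  # uid -> distinct valid sign-in dates
    for uid, artical_id, in_s, sign_in in rows:
        d = datetime.strptime(in_s, FMT).date()
        if artical_id == 0 and sign_in == 1 and start <= d <= end:
            sign_days[uid].add(d)

    totals = defaultdict(int)  # (month, uid) -> coins
    for uid, days in sign_days.items():
        streak, prev = 0, None
        for d in sorted(days):
            # Extend the streak only if the previous sign-in was yesterday.
            streak = streak + 1 if prev is not None and (d - prev).days == 1 else 1
            prev = d
            pos = streak % 7  # the round restarts after every 7 days
            coin = 3 if pos == 3 else 7 if pos == 0 else 1
            totals[(d.strftime("%Y%m"), uid)] += coin
    return [(uid, month, coins) for (month, uid), coins in sorted(totals.items())]


print(coins_per_month(ROWS))
# [(101, '202107', 15), (102, '202110', 7)]
```

User 101 signs in on 7 consecutive days and earns 1 + 1 + 3 + 1 + 1 + 1 + 7 = 15 coins. User 102 misses October 4, so the streak restarts: 1 + 1 + 3 for the first three days plus 1 + 1 for the next two gives 7.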

