What Is Cohort Analysis? A Walkthrough in SQL
2022-06-29 00:41:00 【俊红的数据分析之路】
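Contents
I. Definition of cohort analysis
II. SQL steps
1. Inspect the data
2. Aggregate users by uid and year-month
3. Compute the gap from the first payment month (in days)
4. Compute the gap from the first payment month (in months)
5. Pivot (user counts per month offset, grouped by first payment month)
6. Compute the retention rate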
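I. Definition of cohort analysis
Cohort analysis is a segmentation technique that groups users into cohorts and compares them along two axes:
Horizontally: how a single cohort changes as the periods go by.
Vertically: how different cohorts compare at the same stage of their lifecycle.
A cohort is a group of users from the same period, for example users who registered on the same day or users who made their first payment on the same day.
The metric tracked over the periods is something like the retention rate or the payment rate within a given period.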
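The two reading directions can be sketched in Python. This is only an illustration: the `retention` table and its numbers are invented, and the helper names `read_horizontally` and `read_vertically` are hypothetical.

```python
# Hypothetical retention table, invented for illustration.
# Keys are cohorts (first payment month); values are retention rates
# at month offsets +0, +1, +2.
retention = {
    "2021-01": [1.00, 0.40, 0.30],
    "2021-02": [1.00, 0.50, 0.35],
    "2021-03": [1.00, 0.45],  # the newest cohort has fewer observed periods
}


def read_horizontally(table, cohort):
    """One cohort followed over time: a single row of the table."""
    return table[cohort]


def read_vertically(table, offset):
    """All cohorts at the same lifecycle stage: a single column of the table."""
    return {c: rates[offset] for c, rates in table.items() if len(rates) > offset}


print(read_horizontally(retention, "2021-01"))
print(read_vertically(retention, 1))
```

A row answers "how does this cohort decay over time?", while a column answers "are newer cohorts retained better than older ones at the same age?".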
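Cohort analysis has three core elements:
Time of the customer's first action: the baseline used to split users into cohorts.
Time period dimension: the N in N-day retention or N-day conversion, usually expressed as +N days or +N months.
The metric that changes: registration conversion rate, payment conversion rate, retention rate, and so on.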
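To make the three elements concrete, here is a minimal Python sketch of N-day retention. The `events` rows are invented sample data, and `label_events` and `n_day_retention` are hypothetical helper names; the sketch treats "retained at +N days" as "active exactly N days after the first action", which is one of several possible definitions.

```python
from datetime import date

# Hypothetical event log of (uid, event date); invented sample data.
events = [
    ("u1", date(2021, 1, 1)),
    ("u1", date(2021, 1, 8)),
    ("u2", date(2021, 1, 1)),
    ("u3", date(2021, 1, 2)),
    ("u3", date(2021, 1, 9)),
]


def label_events(rows):
    """Return (uid, cohort_date, day_offset) for every event.

    cohort_date is element 1 (time of first action),
    day_offset is element 2 (the period dimension).
    """
    first_seen = {}
    for uid, day in rows:
        if uid not in first_seen or day < first_seen[uid]:
            first_seen[uid] = day
    return [(uid, first_seen[uid], (day - first_seen[uid]).days) for uid, day in rows]


def n_day_retention(rows, cohort_date, n):
    """Element 3 (the metric): share of a cohort active again exactly n days later."""
    labelled = label_events(rows)
    cohort = {uid for uid, c, _ in labelled if c == cohort_date}
    if not cohort:
        return 0.0  # no users started on this date
    returned = {uid for uid, c, off in labelled if c == cohort_date and off == n}
    return len(returned) / len(cohort)


print(n_day_retention(events, date(2021, 1, 1), 7))  # 0.5
```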
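Cohort analysis gives finer-grained metrics than an overall average. It lets you monitor real user behavior in real time, measure user value, and back up the tuning and improvement of marketing plans, while avoiding vanity numbers that have been "averaged out".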
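II. SQL steps
Below I use PostgreSQL to build, step by step, a cohort report of user retention based on the first-order date. Each step builds on the result of the previous one, which shows up in the code as nested subqueries. Once the logic is clear, it turns out to be quite simple.
The key points are:
Find each user's first-order time.
Compute the date difference between the first-order time and each actual order time.
Count paying users with deduplication (COUNT(DISTINCT ...)).
Pay attention to type conversions between fields (text, date, numeric).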
1. Inspect the data
-- Look at the first 10 rows of the log table ("日志" = log).
SELECT * FROM "日志" LIMIT 10;
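2. Aggregate users by uid and year-month
-- Collapse the log to one row per (uid, month) and attach each user's first payment month.
-- Identifiers: "日志" = log table, "日期" = date, 年月 = year-month, 首次付费年月 = first payment month.
-- to_date() takes text, so "日期" is assumed to be a text column that starts with YYYY-MM.
SELECT
    "日志".uid,
    to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM') AS 年月,
    -- Window functions run after GROUP BY, so min() sees one row per (uid, month).
    min(to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')) OVER (PARTITION BY "日志".uid) AS 首次付费年月
FROM
    "日志"
GROUP BY
    "日志".uid,
    to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')
ORDER BY "日志".uid;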
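For readers who find the mix of GROUP BY and a window function hard to follow, the same logic can be sketched in plain Python. The `log` rows are invented sample data and `monthly_activity` is a hypothetical helper; the sketch assumes dates are 'YYYY-MM-DD' strings, so that comparing the 'YYYY-MM' prefixes as text orders them chronologically.

```python
# Hypothetical rows of (uid, date string); invented sample data.
log = [
    (1, "2021-01-05"),
    (1, "2021-01-20"),
    (1, "2021-03-02"),
    (2, "2021-02-11"),
]


def monthly_activity(rows):
    """Return one (uid, month, first_month) tuple per distinct uid/month pair."""
    # GROUP BY uid, month: keep each (uid, 'YYYY-MM') pair once.
    pairs = sorted({(uid, day[:7]) for uid, day in rows})
    # min(month) OVER (PARTITION BY uid): earliest month per user.
    first = {}
    for uid, month in pairs:
        first[uid] = min(first.get(uid, month), month)
    return [(uid, month, first[uid]) for uid, month in pairs]


for row in monthly_activity(log):
    print(row)
```

User 1 ordered twice in January, but only one January row survives, which is what keeps later user counts from being inflated by repeat orders.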
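3. Compute the gap from the first payment month (in days)
-- Day gap between each active month and the user's first payment month (天数差额 = day gap).
-- Both values parse to the first day of their month, and date - date yields an integer number of days.
SELECT t.*,
       to_date(t.年月, 'YYYY-MM') - to_date(t.首次付费年月, 'YYYY-MM') AS 天数差额
FROM (
    SELECT
        "日志".uid,
        to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM') AS 年月,
        min(to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')) OVER (PARTITION BY "日志".uid) AS 首次付费年月
    FROM
        "日志"
    GROUP BY
        "日志".uid,
        to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')
) AS t
ORDER BY t.uid;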
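It helps to see what these day gaps look like for real calendar months. The following Python sketch mirrors the subtraction of two first-of-month dates; `first_of_month` and `day_gap` are hypothetical helper names.

```python
from datetime import date


def first_of_month(year_month):
    """Parse 'YYYY-MM' as the first day of that month."""
    year, month = year_month.split("-")
    return date(int(year), int(month), 1)


def day_gap(year_month, first_year_month):
    """Days between the first days of two months (date - date)."""
    return (first_of_month(year_month) - first_of_month(first_year_month)).days


print(day_gap("2021-02", "2021-01"))  # 31
print(day_gap("2021-03", "2021-02"))  # 28
print(day_gap("2021-03", "2021-01"))  # 59
```

One calendar month can be 28 days and two calendar months can be 59 days, so a day gap does not map cleanly onto multiples of 30. That matters when the gap is later converted into months.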
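4. Compute the gap from the first payment month (in months)
-- Label each row with its month offset (月差额 = offset label, 首月 = first month).
-- Calendar months are 28 to 31 days long, so 30-day buckets on 天数差额 misplace some rows
-- (2021-02 to 2021-03 is only 28 days and would be labelled 首月).
-- The offset 年月差额 is therefore computed from the year and month parts instead.
SELECT t.*,
       (CASE t.年月差额
            WHEN 0 THEN '首月'
            WHEN 1 THEN '+1月'
            WHEN 2 THEN '+2月'
            WHEN 3 THEN '+3月'
            WHEN 4 THEN '+4月'
            ELSE NULL
        END) AS 月差额
FROM (
    SELECT t.*,
           to_date(t.年月, 'YYYY-MM') - to_date(t.首次付费年月, 'YYYY-MM') AS 天数差额,
           ((date_part('year', to_date(t.年月, 'YYYY-MM')) - date_part('year', to_date(t.首次付费年月, 'YYYY-MM'))) * 12
            + date_part('month', to_date(t.年月, 'YYYY-MM')) - date_part('month', to_date(t.首次付费年月, 'YYYY-MM')))::int AS 年月差额
    FROM (
        SELECT
            "日志".uid,
            to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM') AS 年月,
            min(to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')) OVER (PARTITION BY "日志".uid) AS 首次付费年月
        FROM
            "日志"
        GROUP BY
            "日志".uid,
            to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')
    ) AS t
) AS t
ORDER BY t.uid;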
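A month offset that does not depend on month length can be taken straight from the year and month parts. Here is a minimal Python sketch of that arithmetic; `month_offset` and `offset_label` are hypothetical helpers, and the label strings simply reuse the ones from the query.

```python
def month_offset(year_month, first_year_month):
    """Whole calendar months between two 'YYYY-MM' strings."""
    y1, m1 = map(int, year_month.split("-"))
    y0, m0 = map(int, first_year_month.split("-"))
    return (y1 - y0) * 12 + (m1 - m0)


def offset_label(offset):
    """'首月' (first month) for offset 0, otherwise '+N月' (N months later)."""
    return "首月" if offset == 0 else f"+{offset}月"


print(month_offset("2021-03", "2021-02"))  # 1, although the gap is only 28 days
print(month_offset("2022-01", "2021-11"))  # 2, crossing a year boundary
print(offset_label(month_offset("2021-03", "2021-01")))  # +2月
```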
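5. Pivot (user counts per month offset, grouped by first payment month)
-- One row per cohort (首次付费年月), one column per month offset (年月差额),
-- each holding the number of distinct users active at that offset.
-- Integer-dividing the day gap by 30 misplaces some months (28 / 30 = 0 for 2021-02 to 2021-03,
-- 59 / 30 = 1 for 2021-01 to 2021-03), so the offset is computed from the year and month parts.
SELECT t.首次付费年月,
       count(DISTINCT CASE WHEN t.年月差额 = 0 THEN t.uid ELSE NULL END) AS 首月,
       count(DISTINCT CASE WHEN t.年月差额 = 1 THEN t.uid ELSE NULL END) AS "+1月",
       count(DISTINCT CASE WHEN t.年月差额 = 2 THEN t.uid ELSE NULL END) AS "+2月",
       count(DISTINCT CASE WHEN t.年月差额 = 3 THEN t.uid ELSE NULL END) AS "+3月",
       count(DISTINCT CASE WHEN t.年月差额 = 4 THEN t.uid ELSE NULL END) AS "+4月"
FROM (
    SELECT t.*,
           ((date_part('year', to_date(t.年月, 'YYYY-MM')) - date_part('year', to_date(t.首次付费年月, 'YYYY-MM'))) * 12
            + date_part('month', to_date(t.年月, 'YYYY-MM')) - date_part('month', to_date(t.首次付费年月, 'YYYY-MM')))::int AS 年月差额
    FROM (
        SELECT
            "日志".uid::text AS uid,
            to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM') AS 年月,
            min(to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')) OVER (PARTITION BY "日志".uid) AS 首次付费年月
        FROM
            "日志"
        GROUP BY
            "日志".uid,
            to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')
    ) AS t
) AS t
GROUP BY t.首次付费年月
ORDER BY t.首次付费年月;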
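The pivot itself is just "count distinct users per (cohort, offset) cell". A small Python sketch of that idea follows; the `rows` data is invented and `pivot_cohorts` is a hypothetical helper, with `max_offset=4` chosen to match the five columns of the report.

```python
from collections import defaultdict

# Hypothetical rows of (uid, first payment month, month offset); invented sample data.
rows = [
    ("u1", "2021-01", 0), ("u1", "2021-01", 1), ("u1", "2021-01", 1),
    ("u2", "2021-01", 0),
    ("u3", "2021-02", 0), ("u3", "2021-02", 2),
]


def pivot_cohorts(records, max_offset=4):
    """Return {cohort: [distinct user count at offset 0..max_offset]}."""
    users = defaultdict(set)  # (cohort, offset) -> set of distinct uids
    for uid, cohort, offset in records:
        users[(cohort, offset)].add(uid)
    cohorts = sorted({cohort for _, cohort, _ in records})
    return {
        c: [len(users.get((c, k), ())) for k in range(max_offset + 1)]
        for c in cohorts
    }


print(pivot_cohorts(rows))
```

Using a set per cell plays the role of COUNT(DISTINCT ...): the duplicate ("u1", "2021-01", 1) row is counted once.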
6. Compute the retention rate
-- Retention rate = distinct users active at +N months / cohort size in the first month (首月),
-- formatted as a percentage with two decimals ("1月后" = after 1 month, and so on).
-- The month offset 年月差额 is computed from the year and month parts, because dividing the
-- day gap by 30 misplaces months shorter or longer than 30 days.
SELECT t.首次付费年月, t.首月,
       round((t."+1月"::numeric / t.首月::numeric) * 100, 2)::text || '%' AS "1月后",
       round((t."+2月"::numeric / t.首月::numeric) * 100, 2)::text || '%' AS "2月后",
       round((t."+3月"::numeric / t.首月::numeric) * 100, 2)::text || '%' AS "3月后",
       round((t."+4月"::numeric / t.首月::numeric) * 100, 2)::text || '%' AS "4月后"
FROM (
    SELECT t.首次付费年月,
           count(DISTINCT CASE WHEN t.年月差额 = 0 THEN t.uid ELSE NULL END) AS 首月,
           count(DISTINCT CASE WHEN t.年月差额 = 1 THEN t.uid ELSE NULL END) AS "+1月",
           count(DISTINCT CASE WHEN t.年月差额 = 2 THEN t.uid ELSE NULL END) AS "+2月",
           count(DISTINCT CASE WHEN t.年月差额 = 3 THEN t.uid ELSE NULL END) AS "+3月",
           count(DISTINCT CASE WHEN t.年月差额 = 4 THEN t.uid ELSE NULL END) AS "+4月"
    FROM (
        SELECT t.*,
               ((date_part('year', to_date(t.年月, 'YYYY-MM')) - date_part('year', to_date(t.首次付费年月, 'YYYY-MM'))) * 12
                + date_part('month', to_date(t.年月, 'YYYY-MM')) - date_part('month', to_date(t.首次付费年月, 'YYYY-MM')))::int AS 年月差额
        FROM (
            SELECT
                "日志".uid::text AS uid,
                to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM') AS 年月,
                min(to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')) OVER (PARTITION BY "日志".uid) AS 首次付费年月
            FROM
                "日志"
            GROUP BY
                "日志".uid,
                to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')
        ) AS t
    ) AS t
    GROUP BY t.首次付费年月
) AS t
ORDER BY t.首次付费年月;
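The final step is a division and a formatting step. As a rough Python equivalent, the sketch below turns one cohort's counts into percentage strings. `retention_row` is a hypothetical helper, and it uses `Decimal` with half-up rounding on the assumption that this is close to what rounding a numeric value to two decimals gives in the query.

```python
from decimal import Decimal, ROUND_HALF_UP


def retention_row(counts):
    """counts[0] is the cohort size; return '%' strings for the later offsets."""
    base = counts[0]
    if base == 0:
        raise ValueError("a cohort must contain at least one first-month user")
    out = []
    for c in counts[1:]:
        pct = (Decimal(c) / Decimal(base) * 100).quantize(
            Decimal("0.01"), rounding=ROUND_HALF_UP
        )
        out.append(f"{pct}%")
    return out


print(retention_row([3, 1, 2, 0]))  # ['33.33%', '66.67%', '0.00%']
```

Every rate is divided by the first-month count of the same cohort, which is why cohorts of different sizes stay comparable.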