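What Is Cohort Analysis? A Walkthrough in SQL
2022-06-29 00:41:00 [俊红的数据分析之路]
Contents
I. Definition of cohort analysis
II. SQL steps
1. Inspect the data
2. Aggregate users by uid and year-month
3. Compute the year-month difference (in days)
4. Compute the year-month difference (in months)
5. Pivot (user counts per month offset, by uid and first paying month)
6. Compute the retention rate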
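I. Definition of cohort analysis
Cohort analysis is a segmentation method that groups users and then examines them along two axes:
Horizontally: how a single cohort changes as time passes.
Vertically: how different cohorts compare at the same stage of their lifecycle.
A cohort is a group of users who share the same starting period, for example users who registered on the same day, or users who made their first payment on the same day.
The metric tracked over each period is typically a retention rate, a payment rate, or a similar indicator measured within that period.
Cohort analysis has three core elements:
Time of the customer's first action: the anchor used to assign users to cohorts.
Time period dimension: the N in N-day retention or N-day conversion, usually expressed as +N days or +N months.
The metric that changes: for example registration conversion rate, payment conversion rate, or retention rate.
Cohort analysis yields finer-grained metrics than a single overall figure. It lets you monitor real user behavior in real time, gauge user value, and support the optimization of marketing plans, while avoiding vanity numbers that have been "averaged out".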
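To make the horizontal and vertical readings concrete, here is a minimal Python sketch. The retention figures are made-up sample values, and the names `retention`, `horizontal` and `vertical` are illustrative only.

```python
# Hypothetical retention table: cohort (first paying month) -> retention by month offset.
retention = {
    "2022-01": [1.00, 0.40, 0.30, 0.25],
    "2022-02": [1.00, 0.45, 0.35],
    "2022-03": [1.00, 0.50],
}

def horizontal(cohort):
    """One cohort followed over time: a row of the table."""
    return retention[cohort]

def vertical(offset):
    """All cohorts compared at the same lifecycle stage: a column of the table."""
    return {c: row[offset] for c, row in retention.items() if offset < len(row)}

print(horizontal("2022-01"))  # how the January cohort decays month by month
print(vertical(1))            # +1 month retention across cohorts
```

Reading a row answers "how does this cohort age?", while reading a column answers "are newer cohorts doing better at the same age?".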
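II. SQL steps
Below I use PostgreSQL to build, step by step, a cohort report of user retention based on first-order date. Each step builds on the result of the previous one, which shows up in the code as nested subqueries. Once the logic is clear, it turns out to be quite simple.
The key points are:
Find each user's first-order time.
Compute the date difference between the first-order time and each actual order time.
Count paying users with deduplication (distinct counts).
Pay attention to type and format conversions of the fields.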
1. Inspect the data
-- 1. Inspect the data ("日志" is the log table)
SELECT * FROM "日志" LIMIT 10;
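2. Aggregate users by uid and year-month
-- 2. Aggregate users by uid and year-month
-- "日志" = log table, "日期" = date column (assumed to be text, since to_date() takes text),
-- 年月 = year-month, 首次付费年月 = first paying year-month
SELECT
    "日志".uid,
    to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM') AS 年月,
    -- the window function runs after GROUP BY, so this is each user's earliest month
    min(to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')) OVER (PARTITION BY "日志".uid) AS 首次付费年月
FROM
    "日志"
GROUP BY
    "日志".uid,
    to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')
ORDER BY "日志".uid;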
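As a cross-check outside the database, the same step can be sketched in Python. The rows below are hypothetical sample data, and the assumption that dates are stored as 'YYYY-MM-DD' text is mine, not something the original states.

```python
# Hypothetical rows from the log table: (uid, date stored as text).
rows = [
    (1, "2022-01-05"), (1, "2022-01-28"), (1, "2022-03-02"),
    (2, "2022-02-14"), (2, "2022-03-30"),
]

# GROUP BY uid, year-month: keep one entry per (uid, month) pair.
user_months = sorted({(uid, day[:7]) for uid, day in rows})

# min(...) OVER (PARTITION BY uid): earliest month per user.
first_month = {}
for uid, month in user_months:
    first_month.setdefault(uid, month)  # sorted order makes the first hit the minimum

result = [(uid, month, first_month[uid]) for uid, month in user_months]
for row in result:
    print(row)
```

Each output tuple corresponds to one row of the query: uid, active month, and the user's first paying month.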
3. Compute the year-month difference (in days)
-- 3. Compute the year-month difference (in days); 天数差额 = day difference
-- Subtracting two dates in PostgreSQL yields an integer number of days.
SELECT *, to_date(t.年月, 'YYYY-MM') - to_date(t.首次付费年月, 'YYYY-MM') AS 天数差额
FROM (
    SELECT
        "日志".uid,
        to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM') AS 年月,
        min(to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')) OVER (PARTITION BY "日志".uid) AS 首次付费年月
    FROM
        "日志"
    GROUP BY
        "日志".uid,
        to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')
) AS t
ORDER BY t.uid;
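Because both values are reduced to the first day of their month before subtracting, the day difference depends on the lengths of the months in between. A small sketch with Python's standard `datetime` module shows this; the helper name `days_between_month_starts` is illustrative.

```python
from datetime import date

def days_between_month_starts(first_month, month):
    """Mirror of to_date(month, 'YYYY-MM') - to_date(first_month, 'YYYY-MM')."""
    y1, m1 = map(int, first_month.split("-"))
    y2, m2 = map(int, month.split("-"))
    return (date(y2, m2, 1) - date(y1, m1, 1)).days

print(days_between_month_starts("2022-01", "2022-01"))  # same month
print(days_between_month_starts("2022-01", "2022-02"))  # one month later, across January
print(days_between_month_starts("2022-02", "2022-03"))  # one month later, across February
print(days_between_month_starts("2022-01", "2022-03"))  # two months later
```

A gap of one calendar month can be anywhere from 28 to 31 days, which matters when the day difference is converted to a month offset in the next step.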
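4. Compute the year-month difference (in months)
-- 4. Compute the year-month difference (in months); 月差额 = month offset label
-- ('首月' = first month, '+1月' = +1 month, and so on)
-- The gap between two month starts is 0 days for the same month and 28-31 days per
-- elapsed month, so each cutoff is the largest possible gap for that offset.
-- Cutoffs at multiples of 30 would label a 28-day gap (Feb to Mar) as the first month.
SELECT t.*,
    (CASE WHEN t."天数差额" = 0 THEN '首月'
          WHEN t."天数差额" <= 31 THEN '+1月'
          WHEN t."天数差额" <= 62 THEN '+2月'
          WHEN t."天数差额" <= 92 THEN '+3月'
          WHEN t."天数差额" <= 123 THEN '+4月'
          ELSE NULL
     END) AS 月差额
FROM (
    SELECT *, to_date(t.年月, 'YYYY-MM') - to_date(t.首次付费年月, 'YYYY-MM') AS 天数差额
    FROM (
        SELECT
            "日志".uid,
            to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM') AS 年月,
            min(to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')) OVER (PARTITION BY "日志".uid) AS 首次付费年月
        FROM
            "日志"
        GROUP BY
            "日志".uid,
            to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')
    ) AS t
) AS t
ORDER BY t.uid;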
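The day gap between two month starts is 28 to 31 days per elapsed month, so whichever cutoffs the CASE expression uses have to separate those ranges cleanly. The sketch below assumes offsets 0 to 4 and the cutoffs 0, 31, 62, 92 and 123 (the names `LABELS`, `LIMITS`, `bucket` and `add_months` are illustrative), and checks them exhaustively against exact month arithmetic.

```python
from datetime import date

LABELS = ["首月", "+1月", "+2月", "+3月", "+4月"]
LIMITS = [0, 31, 62, 92, 123]  # largest possible day gap for offsets 0..4

def bucket(days):
    """Return the month-offset label for a day gap between two month starts."""
    for label, limit in zip(LABELS, LIMITS):
        if days <= limit:
            return label
    return None

def add_months(year, month, n):
    """Return (year, month) shifted forward by n calendar months."""
    total = year * 12 + (month - 1) + n
    return total // 12, total % 12 + 1

# Exhaustive check over 2019-2024 (includes the leap years 2020 and 2024).
mismatches = []
for year in range(2019, 2025):
    for month in range(1, 13):
        for n in range(5):
            y2, m2 = add_months(year, month, n)
            gap = (date(y2, m2, 1) - date(year, month, 1)).days
            if bucket(gap) != LABELS[n]:
                mismatches.append((year, month, n, gap))

print(mismatches)  # an empty list means every gap landed in the right bucket
```

The same loop can be pointed at any other set of cutoffs to see which start months they mislabel.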
5. Pivot (user counts per month offset, by uid and first paying month)
-- 5. Pivot: distinct user counts per month offset for each first paying month
-- 年月差额 = month offset. date - date is an integer, and integer / integer truncates
-- in PostgreSQL (28 / 30 = 0), so cast to numeric before dividing by 30 and rounding.
SELECT t.首次付费年月,
    count(DISTINCT CASE WHEN t.年月差额 = 0 THEN t.uid ELSE NULL END) AS 首月,
    count(DISTINCT CASE WHEN t.年月差额 = 1 THEN t.uid ELSE NULL END) AS "+1月",
    count(DISTINCT CASE WHEN t.年月差额 = 2 THEN t.uid ELSE NULL END) AS "+2月",
    count(DISTINCT CASE WHEN t.年月差额 = 3 THEN t.uid ELSE NULL END) AS "+3月",
    count(DISTINCT CASE WHEN t.年月差额 = 4 THEN t.uid ELSE NULL END) AS "+4月"
FROM (
    SELECT *, round((to_date(t.年月, 'YYYY-MM') - to_date(t.首次付费年月, 'YYYY-MM'))::numeric / 30, 0) AS 年月差额
    FROM (
        SELECT
            "日志".uid::text,
            to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM') AS 年月,
            min(to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')) OVER (PARTITION BY "日志".uid) AS 首次付费年月
        FROM
            "日志"
        GROUP BY
            "日志".uid,
            to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')
    ) AS t
) AS t
GROUP BY t.首次付费年月
ORDER BY t.首次付费年月;
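The pivot can be cross-checked with a few lines of Python. The records below are hypothetical, and the sketch assumes the month offset comes from dividing the day gap by 30. That only works if the quotient is rounded rather than truncated, because a 28-day gap would otherwise fall back into offset 0.

```python
from collections import defaultdict

# Hypothetical (uid, first paying month, day gap to the active month) records.
records = [
    ("u1", "2022-01", 0), ("u1", "2022-01", 31),
    ("u2", "2022-01", 0), ("u2", "2022-01", 59),
    ("u3", "2022-02", 0), ("u3", "2022-02", 28),
]

MAX_OFFSET = 4
users = defaultdict(lambda: [set() for _ in range(MAX_OFFSET + 1)])
for uid, cohort, gap in records:
    offset = round(gap / 30)  # rounding, not truncation: 28 // 30 would give 0
    if offset <= MAX_OFFSET:
        users[cohort][offset].add(uid)  # sets give the COUNT(DISTINCT ...) behaviour

pivot = {cohort: [len(s) for s in cols] for cohort, cols in sorted(users.items())}
print(pivot)
```

Each list holds the first-month count followed by the +1 to +4 month counts, matching the column order of the query.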
6. Compute the retention rate
-- 6. Compute the retention rate ("1月后" = after 1 month, and so on)
-- Each rate is the +N month user count divided by the cohort's first-month count.
SELECT t.首次付费年月, t.首月,
    round((t."+1月"::numeric / t.首月::numeric) * 100, 2)::text || '%' AS "1月后",
    round((t."+2月"::numeric / t.首月::numeric) * 100, 2)::text || '%' AS "2月后",
    round((t."+3月"::numeric / t.首月::numeric) * 100, 2)::text || '%' AS "3月后",
    round((t."+4月"::numeric / t.首月::numeric) * 100, 2)::text || '%' AS "4月后"
FROM (
    SELECT t.首次付费年月,
        count(DISTINCT CASE WHEN t.年月差额 = 0 THEN t.uid ELSE NULL END) AS 首月,
        count(DISTINCT CASE WHEN t.年月差额 = 1 THEN t.uid ELSE NULL END) AS "+1月",
        count(DISTINCT CASE WHEN t.年月差额 = 2 THEN t.uid ELSE NULL END) AS "+2月",
        count(DISTINCT CASE WHEN t.年月差额 = 3 THEN t.uid ELSE NULL END) AS "+3月",
        count(DISTINCT CASE WHEN t.年月差额 = 4 THEN t.uid ELSE NULL END) AS "+4月"
    FROM (
        -- cast to numeric first: integer / integer truncates in PostgreSQL
        SELECT *, round((to_date(t.年月, 'YYYY-MM') - to_date(t.首次付费年月, 'YYYY-MM'))::numeric / 30, 0) AS 年月差额
        FROM (
            SELECT
                "日志".uid::text,
                to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM') AS 年月,
                min(to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')) OVER (PARTITION BY "日志".uid) AS 首次付费年月
            FROM
                "日志"
            GROUP BY
                "日志".uid,
                to_char(to_date("日志"."日期", 'YYYY-MM'), 'YYYY-MM')
        ) AS t
    ) AS t
    GROUP BY t.首次付费年月
) AS t
ORDER BY t.首次付费年月;
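The percentage formatting in this last step can be sketched in Python with the standard `decimal` module, which rounds halves away from zero like PostgreSQL's `round` on `numeric`. The counts are made-up sample values, and the zero-base guard is an addition of this sketch, not part of the query.

```python
from decimal import Decimal, ROUND_HALF_UP

def retention_pct(retained, base):
    """Format retained/base as a percentage string with two decimals."""
    if base == 0:
        return None  # no first-month users: the rate is undefined
    pct = (Decimal(retained) / Decimal(base) * 100).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return f"{pct}%"

# Hypothetical pivot row: first-month users followed by the +1..+4 month counts.
row = [200, 90, 66, 51, 0]
print([retention_pct(n, row[0]) for n in row[1:]])
```

In the query itself the base is never zero, because every user is counted in the first month of their own cohort.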