ClickHouse: eliminating the gaps caused by GROUP BY
2022-07-01 02:09:00 【I'm a bad person】
I ran into this problem several months ago. I always thought I had written it up, and only just discovered that I had not (facepalm). The requirement was to compute DAU, retention rate, and similar metrics, grouped by date, country, supplier, device, and other dimensions. I was not responsible for this requirement; a colleague ran into discontinuous dates after a GROUP BY and asked me about it. This article only offers one idea. In the end, my colleague did not use this approach to solve the problem at the database level, but handled it in the Java program instead.
1. Reproducing the simplified problem. For example, because DAU and use_times for China are both 0 from April 5 to April 20, those days are missing from the result, leaving a gap in the dates.
SELECT
event_date,
country_name,
groupBitmapOr(active_user) AS dau,
sum(use_times) AS use_times
FROM dws_app_global_active_retained
WHERE (country_name IN ('China', 'France', 'Hong Kong')) AND (event_date >= '2022-03-25')
GROUP BY
event_date,
country_name
ORDER BY
country_name ASC,
event_date ASC
┌─event_date─┬─country_name─┬─dau─┬─use_times─┐
│ 2022-03-27 │ China │ 1 │ 1 │
│ 2022-03-28 │ China │ 1 │ 2 │
│ 2022-03-29 │ China │ 2 │ 64 │
│ 2022-03-30 │ China │ 1 │ 4 │
│ 2022-03-31 │ China │ 2 │ 4 │
│ 2022-04-01 │ China │ 1 │ 5 │
│ 2022-04-02 │ China │ 1 │ 1 │
│ 2022-04-03 │ China │ 1 │ 1 │
│ 2022-04-04 │ China │ 1 │ 2 │
│ 2022-04-21 │ China │ 1 │ 5 │
│ 2022-04-28 │ China │ 1 │ 6 │
│ 2022-04-29 │ China │ 1 │ 4 │
│ 2022-04-30 │ China │ 1 │ 12 │
│ 2022-05-01 │ China │ 1 │ 10 │
│ 2022-05-02 │ China │ 1 │ 9 │
│ 2022-05-03 │ China │ 1 │ 3 │
│ 2022-05-04 │ China │ 1 │ 2 │
│ 2022-05-05 │ China │ 1 │ 6 │
│ 2022-05-06 │ China │ 1 │ 3 │
│ 2022-05-07 │ China │ 1 │ 5 │
│ 2022-05-08 │ China │ 1 │ 4 │
│ 2022-03-25 │ Hong Kong │ 2 │ 5 │
│ 2022-03-26 │ Hong Kong │ 1 │ 2 │
│ 2022-03-27 │ Hong Kong │ 1 │ 2 │
│ 2022-03-28 │ Hong Kong │ 2 │ 26 │
│ 2022-03-29 │ Hong Kong │ 2 │ 56 │
│ 2022-03-30 │ Hong Kong │ 3 │ 17 │
│ 2022-03-31 │ Hong Kong │ 4 │ 12 │
│ 2022-04-01 │ Hong Kong │ 1 │ 1 │
│ 2022-04-02 │ Hong Kong │ 2 │ 23 │
│ 2022-04-06 │ Hong Kong │ 1 │ 1 │
│ 2022-04-07 │ Hong Kong │ 2 │ 9 │
│ 2022-04-08 │ Hong Kong │ 2 │ 29 │
└────────────┴──────────────┴─────┴───────────┘
2. First, fill in the date gaps (WITH FILL STEP n).
SELECT
event_date,
country_name,
groupBitmapOr(active_user) AS dau,
sum(use_times) AS use_times
FROM dws_app_global_active_retained
WHERE (country_name IN ('China', 'France', 'Hong Kong')) AND (event_date >= '2022-03-25')
GROUP BY
event_date,
country_name
ORDER BY
country_name ASC,
event_date ASC WITH FILL STEP 1
┌─event_date─┬─country_name─┬─dau─┬─use_times─┐
│ 2022-03-27 │ China │ 1 │ 1 │
│ 2022-03-28 │ China │ 1 │ 2 │
│ 2022-03-29 │ China │ 2 │ 64 │
│ 2022-03-30 │ China │ 1 │ 4 │
│ 2022-03-31 │ China │ 2 │ 4 │
│ 2022-04-01 │ China │ 1 │ 5 │
│ 2022-04-02 │ China │ 1 │ 1 │
│ 2022-04-03 │ China │ 1 │ 1 │
│ 2022-04-04 │ China │ 1 │ 2 │
│ 2022-04-05 │ │ 0 │ 0 │
│ 2022-04-06 │ │ 0 │ 0 │
│ 2022-04-07 │ │ 0 │ 0 │
│ 2022-04-08 │ │ 0 │ 0 │
│ 2022-04-09 │ │ 0 │ 0 │
│ 2022-04-10 │ │ 0 │ 0 │
│ 2022-04-11 │ │ 0 │ 0 │
│ 2022-04-12 │ │ 0 │ 0 │
│ 2022-04-13 │ │ 0 │ 0 │
│ 2022-04-14 │ │ 0 │ 0 │
│ 2022-04-15 │ │ 0 │ 0 │
│ 2022-04-16 │ │ 0 │ 0 │
│ 2022-04-17 │ │ 0 │ 0 │
│ 2022-04-18 │ │ 0 │ 0 │
│ 2022-04-19 │ │ 0 │ 0 │
│ 2022-04-20 │ │ 0 │ 0 │
│ 2022-04-21 │ China │ 1 │ 5 │
│ 2022-04-22 │ │ 0 │ 0 │
│ 2022-04-23 │ │ 0 │ 0 │
│ 2022-04-24 │ │ 0 │ 0 │
│ 2022-04-25 │ │ 0 │ 0 │
│ 2022-04-26 │ │ 0 │ 0 │
│ 2022-04-27 │ │ 0 │ 0 │
│ 2022-04-28 │ China │ 1 │ 6 │
│ 2022-04-29 │ China │ 1 │ 4 │
│ 2022-04-30 │ China │ 1 │ 12 │
│ 2022-05-01 │ China │ 1 │ 10 │
│ 2022-05-02 │ China │ 1 │ 9 │
│ 2022-05-03 │ China │ 1 │ 3 │
│ 2022-05-04 │ China │ 1 │ 2 │
│ 2022-05-05 │ China │ 1 │ 6 │
│ 2022-05-06 │ China │ 1 │ 3 │
│ 2022-05-07 │ China │ 1 │ 5 │
│ 2022-05-08 │ China │ 1 │ 4 │
│ 2022-03-25 │ Hong Kong │ 2 │ 5 │
│ 2022-03-26 │ Hong Kong │ 1 │ 2 │
│ 2022-03-27 │ Hong Kong │ 1 │ 2 │
│ 2022-03-28 │ Hong Kong │ 2 │ 26 │
│ 2022-03-29 │ Hong Kong │ 2 │ 56 │
│ 2022-03-30 │ Hong Kong │ 3 │ 17 │
│ 2022-03-31 │ Hong Kong │ 4 │ 12 │
│ 2022-04-01 │ Hong Kong │ 1 │ 1 │
│ 2022-04-02 │ Hong Kong │ 2 │ 23 │
│ 2022-04-06 │ Hong Kong │ 1 │ 1 │
│ 2022-04-07 │ Hong Kong │ 2 │ 9 │
│ 2022-04-08 │ Hong Kong │ 2 │ 29 │
└────────────┴──────────────┴─────┴───────────┘
3. A new problem now appears: country_name is empty in the rows that were filled in. To fix this, we use the "fill with adjacent value" method (arrayFill) described in the earlier article 《Clickhouse Vacancy value processing》.
WITH temp AS
(
SELECT
event_date,
country_name,
groupBitmapOr(active_user) AS dau,
sum(use_times) AS use_times
FROM dws_app_global_active_retained
WHERE (country_name IN ('China', 'France', 'Hong Kong')) AND (event_date >= '2022-03-25')
GROUP BY
event_date,
country_name
ORDER BY
country_name ASC,
event_date ASC WITH FILL STEP 1 )
SELECT
tuple.1 AS event_date,
tuple.2 AS country_name,
tuple.3 AS dau,
tuple.4 AS use_times
FROM
(
SELECT arrayJoin(
arrayZip(
groupArray(event_date),
-- replace each empty country_name with the nearest preceding non-empty value
arrayFill(x -> x != '', groupArray(country_name)),
groupArray(dau),
groupArray(use_times)
)
) AS tuple
FROM temp
)
┌─event_date─┬─country_name─┬─dau─┬─use_times─┐
│ 2022-03-27 │ China │ 1 │ 1 │
│ 2022-03-28 │ China │ 1 │ 2 │
│ 2022-03-29 │ China │ 2 │ 64 │
│ 2022-03-30 │ China │ 1 │ 4 │
│ 2022-03-31 │ China │ 2 │ 4 │
│ 2022-04-01 │ China │ 1 │ 5 │
│ 2022-04-02 │ China │ 1 │ 1 │
│ 2022-04-03 │ China │ 1 │ 1 │
│ 2022-04-04 │ China │ 1 │ 2 │
│ 2022-04-05 │ China │ 0 │ 0 │
│ 2022-04-06 │ China │ 0 │ 0 │
│ 2022-04-07 │ China │ 0 │ 0 │
│ 2022-04-08 │ China │ 0 │ 0 │
│ 2022-04-09 │ China │ 0 │ 0 │
│ 2022-04-10 │ China │ 0 │ 0 │
│ 2022-04-11 │ China │ 0 │ 0 │
│ 2022-04-12 │ China │ 0 │ 0 │
│ 2022-04-13 │ China │ 0 │ 0 │
│ 2022-04-14 │ China │ 0 │ 0 │
│ 2022-04-15 │ China │ 0 │ 0 │
│ 2022-04-16 │ China │ 0 │ 0 │
│ 2022-04-17 │ China │ 0 │ 0 │
│ 2022-04-18 │ China │ 0 │ 0 │
│ 2022-04-19 │ China │ 0 │ 0 │
│ 2022-04-20 │ China │ 0 │ 0 │
│ 2022-04-21 │ China │ 1 │ 5 │
│ 2022-04-22 │ China │ 0 │ 0 │
│ 2022-04-23 │ China │ 0 │ 0 │
│ 2022-04-24 │ China │ 0 │ 0 │
│ 2022-04-25 │ China │ 0 │ 0 │
│ 2022-04-26 │ China │ 0 │ 0 │
│ 2022-04-27 │ China │ 0 │ 0 │
│ 2022-04-28 │ China │ 1 │ 6 │
│ 2022-04-29 │ China │ 1 │ 4 │
│ 2022-04-30 │ China │ 1 │ 12 │
│ 2022-05-01 │ China │ 1 │ 10 │
│ 2022-05-02 │ China │ 1 │ 9 │
│ 2022-05-03 │ China │ 1 │ 3 │
│ 2022-05-04 │ China │ 1 │ 2 │
│ 2022-05-05 │ China │ 1 │ 6 │
│ 2022-05-06 │ China │ 1 │ 3 │
│ 2022-05-07 │ China │ 1 │ 5 │
│ 2022-05-08 │ China │ 1 │ 4 │
│ 2022-03-25 │ Hong Kong │ 2 │ 5 │
│ 2022-03-26 │ Hong Kong │ 1 │ 2 │
│ 2022-03-27 │ Hong Kong │ 1 │ 2 │
│ 2022-03-28 │ Hong Kong │ 2 │ 26 │
│ 2022-03-29 │ Hong Kong │ 2 │ 56 │
│ 2022-03-30 │ Hong Kong │ 3 │ 17 │
│ 2022-03-31 │ Hong Kong │ 4 │ 12 │
│ 2022-04-01 │ Hong Kong │ 1 │ 1 │
│ 2022-04-02 │ Hong Kong │ 2 │ 23 │
│ 2022-04-06 │ Hong Kong │ 1 │ 1 │
│ 2022-04-07 │ Hong Kong │ 2 │ 9 │
│ 2022-04-08 │ Hong Kong │ 2 │ 29 │
└────────────┴──────────────┴─────┴───────────┘
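For comparison, here is a rough sketch of filling the gaps in application code, which is where my colleague ended up solving the problem. Their actual Java code is not shown in this article, so this is only an illustration in Python: the function name `fill_date_gaps`, the tuple layout of a row, and the choice to fill each country from its own first date to its own last date are all assumptions.

```python
from datetime import date, timedelta
from itertools import groupby
from operator import itemgetter


def fill_date_gaps(rows):
    """Insert zero-valued rows for dates missing within each country's range.

    `rows` is a list of (event_date, country_name, dau, use_times) tuples,
    shaped like the GROUP BY result. The name and the row layout are
    illustrative assumptions, not taken from the original program.
    """
    filled = []
    # Sort the same way as the query: by country, then by date.
    ordered = sorted(rows, key=itemgetter(1, 0))
    for country, group in groupby(ordered, key=itemgetter(1)):
        by_date = {row[0]: row for row in group}
        day, last = min(by_date), max(by_date)
        while day <= last:
            # Keep the real row if it exists, otherwise emit a zero row.
            filled.append(by_date.get(day, (day, country, 0, 0)))
            day += timedelta(days=1)
    return filled


# A few rows copied from the result in step 1.
sample = [
    (date(2022, 4, 4), "China", 1, 2),
    (date(2022, 4, 21), "China", 1, 5),
    (date(2022, 4, 2), "Hong Kong", 2, 23),
    (date(2022, 4, 6), "Hong Kong", 1, 1),
]
result = fill_date_gaps(sample)
print(len(result))  # 23: 18 days for China plus 5 days for Hong Kong
```

Because each country is processed separately, every generated row already carries its country_name, so no arrayFill-style back-filling is needed. If the report must cover a fixed date range instead, the start and end dates could be passed in rather than taken from the data.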