ClickHouse materialized view
2022-07-05 03:45:00 【Younger Cheng】
I. Concept
A ClickHouse materialized view persists the result of a query, which improves query efficiency. For the user, querying it is no different from querying a table: in essence it is a table whose contents are always pre-computed. It is created with a dedicated statement:
CREATE MATERIALIZED VIEW [IF NOT EXISTS] [db.]table_name [TO [db.]name] [ENGINE = engine] [POPULATE] AS SELECT …
Note: CREATE sets up a hidden target table that stores the view's data. With TO [db.]name the data is saved to an explicitly named table; without TO, the default table name is .inner.<materialized view name> (in databases that use the Atomic engine, .inner_id.<UUID>).
The "query result set" is a broad notion: it can be a simple copy of some of the data in the base table, the result of a multi-table join or a subset of it, or aggregate metrics computed over the raw data. A ClickHouse materialized view is updated only when new rows are inserted into the source table. Later changes to existing source data (updates, deletes, dropped partitions) are not propagated to the view, which is why it is also described as a snapshot.
Restrictions on creating materialized views :
1. An engine must be specified for the materialized view so that it can store data (unless the TO clause is used, in which case the target table's engine applies).
2. POPULATE cannot be used together with the TO [db.]table syntax. POPULATE loads and transforms all historical data, so with a large amount of data the view is unavailable for some time. Using POPULATE is officially not recommended, because data written to the source table while the view is being created is not inserted into the view.
3. The SELECT query can contain clauses such as DISTINCT, GROUP BY, ORDER BY and LIMIT. These are applied independently to each block of inserted data, not to the table as a whole.
4. If the materialized view is defined with the TO [db.]name clause, the view can be detached (DETACH) and attached again (ATTACH), for example to alter the target table in between.
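The TO form from restrictions 2 and 4 can be sketched as follows. This is a minimal illustration, not the author's setup: the names events_src, events_daily and events_daily_mv are hypothetical, and the backfill cutoff (createDate < today()) is an assumption chosen only to show a manual alternative to POPULATE.

```sql
-- Hypothetical source table (all names here are illustrative).
CREATE TABLE events_src
(
    id UInt32,
    createDate Date,
    income UInt64
)
ENGINE = MergeTree()
ORDER BY (id, createDate);

-- Explicit target table that holds the view's data.
CREATE TABLE events_daily
(
    id UInt32,
    createDate Date,
    total_income UInt64
)
ENGINE = SummingMergeTree()
ORDER BY (id, createDate);

-- With TO, the view has no ENGINE and no POPULATE; it only routes
-- newly inserted rows of events_src into events_daily.
CREATE MATERIALIZED VIEW events_daily_mv TO events_daily AS
SELECT id, createDate, sum(income) AS total_income
FROM events_src
GROUP BY id, createDate;

-- Manual backfill of older rows (assumed cutoff: everything before today).
INSERT INTO events_daily
SELECT id, createDate, sum(income) AS total_income
FROM events_src
WHERE createDate < today()
GROUP BY id, createDate;
```

Because SummingMergeTree collapses rows only during background merges, reads from events_daily should still aggregate with sum(total_income) and GROUP BY.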
Difference between a materialized view and a normal view:
A normal view does not store data, only the query statement. Queries against it still read from the original table, so a normal view can be thought of as a saved subquery.
A materialized view stores the query results on disk or in memory, depending on its engine, and reorganizes the data into a new table.
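The contrast can be illustrated with a small sketch. The table visits and the views visits_v and visits_mv are hypothetical names introduced only for this comparison.

```sql
-- Hypothetical base table.
CREATE TABLE visits
(
    id UInt32,
    createDate Date,
    income UInt64
)
ENGINE = MergeTree()
ORDER BY id;

-- Normal view: only the query text is saved; every SELECT re-reads visits.
CREATE VIEW visits_v AS
SELECT createDate, sum(income) AS total_income
FROM visits
GROUP BY createDate;

-- Materialized view: has its own storage engine and keeps its own rows,
-- written each time new data is inserted into visits.
CREATE MATERIALIZED VIEW visits_mv
ENGINE = SummingMergeTree()
ORDER BY createDate
AS
SELECT createDate, sum(income) AS total_income
FROM visits
GROUP BY createDate;
```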
Advantages and disadvantages:
Advantage: queries are fast. Because the materialized view rules are written in advance, querying the view is much faster than querying the raw data directly (the materialized view keeps the derived data synchronized).
Disadvantage: it is essentially designed for streaming data and works incrementally, so analyses that need historical data, such as deduplication, do not work well in a materialized view, and the usage scenarios are limited. If many materialized views are attached to one table, writing to that table consumes a lot of resources, for example saturated bandwidth and a sudden increase in storage.
-- Create the source table
CREATE TABLE ts_area_info (
    id UInt32,
    createDate Date,
    userId UInt32,
    url String,
    income UInt8
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(createDate)
ORDER BY (id, createDate, intHash32(userId))
SAMPLE BY intHash32(userId)
SETTINGS index_granularity = 8192;
-- Create the materialized view
CREATE MATERIALIZED VIEW area_mv
ENGINE = SummingMergeTree()
PARTITION BY toYYYYMM(createDate)
ORDER BY (id, createDate, intHash32(userId))
AS
SELECT * FROM ts_area_info;
Newly generated materialized view :
Keywords:
POPULATE: when the view is created, backfill it with the data already in the source table
FINAL: at query time, return the latest, fully merged data
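A rough illustration of FINAL, assuming the area_mv view defined earlier exists. SummingMergeTree collapses rows with the same sorting key only during background merges, so a plain read may still return several rows per key.

```sql
-- Plain read: may return rows that have not been merged yet.
SELECT id, createDate, income
FROM area_mv;

-- FINAL merges rows with the same sorting key at query time
-- (slower, but the result is fully collapsed).
SELECT id, createDate, income
FROM area_mv FINAL;
```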
II. Materialized views as aggregate tables
Aggregation is performed while the data is being synchronized.
-- Use a materialized view to maintain an aggregate table
-- 1. Create the detail table
DROP TABLE IF EXISTS tb_order;
CREATE TABLE tb_order (
    id UInt8,
    createDate Date,
    money UInt64
)
ENGINE = MergeTree()
ORDER BY id;
-- 2. Insert data
INSERT INTO tb_order VALUES
    (1, toDate(now()), 100),
    (2, toDate(now()), 100),
    (3, toDate(now()), 100),
    (1, toDate(now()), 100),
    (2, toDate(now()), 200),
    (3, toDate(now()), 300);
-- 3. Create the materialized view to synchronize data
CREATE MATERIALIZED VIEW order_mv
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(createDate)
ORDER BY (id, createDate)
POPULATE AS
SELECT id, createDate, sumState(money) AS ms
FROM tb_order
GROUP BY id, createDate;
-- 4. Query the materialized view (sumMerge combines the partial states written by sumState)
SELECT id, createDate, sumMerge(ms) FROM order_mv GROUP BY id, createDate;
-- 5. Insert more rows and check that the view is synchronized
INSERT INTO tb_order VALUES (1, toDate(now()), 100), (2, toDate(now()), 100);
INSERT INTO tb_order VALUES (1, toDate('2022-06-29'), 100), (2, toDate('2022-06-29'), 100);
-- 6. Query the order table
SELECT * FROM tb_order;
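One way to check the synchronization, assuming tb_order and order_mv were created and filled exactly as in the steps above, is to compare a direct aggregation over the detail table with the merged states in the view. The alias total is introduced here for illustration.

```sql
-- Aggregate computed directly from the detail table.
SELECT id, createDate, sum(money) AS total
FROM tb_order
GROUP BY id, createDate
ORDER BY id, createDate;

-- Same aggregate read from the materialized view's partial states.
SELECT id, createDate, sumMerge(ms) AS total
FROM order_mv
GROUP BY id, createDate
ORDER BY id, createDate;
```

Both queries should return the same rows, including the ones inserted in step 5.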
Note:
If no primary key is specified when the table is created, the ORDER BY fields are used as the primary key by default.
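For comparison, the same aggregation can be written with an explicit target table instead of POPULATE and a hidden inner table. This is only a sketch under the assumption that tb_order exists as defined above; order_agg and order_agg_mv are hypothetical names. It makes the stored column type, AggregateFunction(sum, UInt64), visible.

```sql
-- Explicit target table; ms holds partial sum states rather than plain numbers.
CREATE TABLE order_agg
(
    id UInt8,
    createDate Date,
    ms AggregateFunction(sum, UInt64)
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(createDate)
ORDER BY (id, createDate);

-- The view writes the states produced for each inserted block into order_agg.
CREATE MATERIALIZED VIEW order_agg_mv TO order_agg AS
SELECT id, createDate, sumState(money) AS ms
FROM tb_order
GROUP BY id, createDate;

-- Reads still need sumMerge to turn the states into final totals.
SELECT id, createDate, sumMerge(ms) AS total
FROM order_agg
GROUP BY id, createDate;
```

Unlike the POPULATE version, this view only sees rows inserted into tb_order after it is created.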