ClickHouse Secondary Indexes
2022-08-04 22:56:00 【jasong】
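1 ClickHouse sparse index
My understanding: the sparse index works like a table of contents. It stores one key per "page" (granule) plus the location of that key:
index (n-th index entry, key value),
mrk (mark: the offset at which that entry's data starts),
lookup path: index -> mrk -> bin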
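The index -> mrk -> bin lookup described above can be sketched in a few lines of Python. This is a simplified model, not ClickHouse's actual on-disk format: the granule size of 4, the `keys` list standing in for the .bin column, and the `lookup` helper are all assumptions made for illustration.

```python
from bisect import bisect_right

INDEX_GRANULARITY = 4  # tiny value for illustration; ClickHouse defaults to 8192

# Sorted primary-key column, standing in for the data in the .bin file.
keys = list(range(0, 40, 2))  # 20 rows: 0, 2, 4, ..., 38

# Sparse index: the first key of every granule.
index = keys[::INDEX_GRANULARITY]  # [0, 8, 16, 24, 32]

# Marks: where each granule starts in the column data. Here this is a plain
# row offset; real marks hold offsets into compressed data files.
marks = list(range(0, len(keys), INDEX_GRANULARITY))  # [0, 4, 8, 12, 16]


def lookup(key):
    """Return the rows of the one granule that may contain `key`."""
    granule = bisect_right(index, key) - 1  # last index entry <= key
    if granule < 0:
        return []  # key is smaller than every stored key
    start = marks[granule]
    return keys[start:start + INDEX_GRANULARITY]


print(lookup(18))  # [16, 18, 20, 22]
```

Only one granule is read instead of the whole column, which is the point of keeping the index sparse: the index stays small, at the cost of scanning a few extra rows inside the selected granule.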
2 Secondary indexes
| Keyword | Description |
|---|---|
| index name | Index alias |
| Index expression | Source column or expression the index is built on |
| Type | minmax, set, bloom filter, map |
| GRANULARITY | Index granularity. The ClickHouse sparse index defaults to 8192 rows per granule; as I understand it, 8192 * GRANULARITY is the number of rows covered by one block in skip_index.mrk |
| skp_idx_{index_name}.idx | Contains the ordered expression values |
| skp_idx_{index_name}.mrk2 | Contains the corresponding offsets into the associated data column files |
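To make the GRANULARITY arithmetic concrete, here is a minimal Python sketch of a minmax skip index. It assumes the reading given in the table (one index block covers 8192 * GRANULARITY rows); the helper names `rows_per_index_block`, `build_minmax`, and `blocks_to_read` are hypothetical and are not ClickHouse APIs.

```python
INDEX_GRANULARITY = 8192  # rows per primary-index granule (ClickHouse default)


def rows_per_index_block(granularity):
    """Rows covered by one skip-index block for a given GRANULARITY."""
    return INDEX_GRANULARITY * granularity


def build_minmax(values, granularity):
    """Store one (min, max) pair per skip-index block."""
    step = rows_per_index_block(granularity)
    blocks = []
    for start in range(0, len(values), step):
        chunk = values[start:start + step]
        blocks.append((min(chunk), max(chunk)))
    return blocks


def blocks_to_read(minmax, lo, hi):
    """Keep only the blocks whose [min, max] range overlaps [lo, hi]."""
    return [n for n, (mn, mx) in enumerate(minmax) if mx >= lo and mn <= hi]


values = list(range(100_000))              # a column of 100,000 increasing values
minmax = build_minmax(values, granularity=3)

print(rows_per_index_block(3))             # 24576
print(len(minmax))                         # 5
print(blocks_to_read(minmax, 30_000, 30_010))  # [1]
```

A larger GRANULARITY gives a smaller index but coarser skipping: each surviving block forces a read of more rows.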
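3 Index-related settings
- use_skip_indexes (0 or 1, default 1). When enabled (the default), queries use all applicable skip indexes for filtering.
- force_data_skipping_indices: names the indexes a query must use; the query fails if any of them is not used.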
4 Managing indexes
- ALTER TABLE [db.]table_name [ON CLUSTER cluster] ADD INDEX name expression TYPE type GRANULARITY value [FIRST|AFTER name]: adds the index description to the table's metadata.
- ALTER TABLE [db.]table_name [ON CLUSTER cluster] DROP INDEX name: removes the index description from the table's metadata and deletes the index files from disk.
- ALTER TABLE [db.]table_name [ON CLUSTER cluster] MATERIALIZE INDEX name [IN PARTITION partition_name]: rebuilds the secondary index name for the specified partition_name. Implemented as a mutation. If the IN PARTITION part is omitted, the index is rebuilt for the whole table.
5 Example
CREATE TABLE table_name
(
u64 UInt64,
i32 Int32,
s String,
...
INDEX a (u64 * i32, s) TYPE minmax GRANULARITY 3,
INDEX b (u64 * length(s)) TYPE set(1000) GRANULARITY 4
) ENGINE = MergeTree()
-- Queries like the following can use the indexes defined above:
SELECT count() FROM table_name WHERE s < 'z';
SELECT count() FROM table_name WHERE u64 * i32 == 10 AND u64 * length(s) >= 1234;
CREATE TABLE data_01515
(
key Int,
d1 Int,
d1_null Nullable(Int),
INDEX d1_idx d1 TYPE minmax GRANULARITY 1,
INDEX d1_null_idx assumeNotNull(d1_null) TYPE minmax GRANULARITY 1
)
Engine=MergeTree()
ORDER BY key;
SELECT * FROM data_01515;
SELECT * FROM data_01515 SETTINGS force_data_skipping_indices=''; -- query will produce CANNOT_PARSE_TEXT error.
SELECT * FROM data_01515 SETTINGS force_data_skipping_indices='d1_idx'; -- query will produce INDEX_NOT_USED error.
SELECT * FROM data_01515 WHERE d1 = 0 SETTINGS force_data_skipping_indices='d1_idx'; -- Ok.
SELECT * FROM data_01515 WHERE d1 = 0 SETTINGS force_data_skipping_indices='`d1_idx`'; -- Ok (example of full featured parser).
SELECT * FROM data_01515 WHERE d1 = 0 SETTINGS force_data_skipping_indices='`d1_idx`, d1_null_idx'; -- query will produce INDEX_NOT_USED error, since d1_null_idx is not used.
SELECT * FROM data_01515 WHERE d1 = 0 AND assumeNotNull(d1_null) = 0 SETTINGS force_data_skipping_indices='`d1_idx`, d1_null_idx'; -- Ok.
6 Index types
7 Supported functions
| Function (operator) / Index | primary key | minmax | ngrambf_v1 | tokenbf_v1 | bloom_filter |
|---|---|---|---|---|---|
| equals (=, ==) | ✔ | ✔ | ✔ | ✔ | ✔ |
| notEquals (!=, <>) | ✔ | ✔ | ✔ | ✔ | ✔ |
| like | ✔ | ✔ | ✔ | ✔ | ✗ |
| notLike | ✔ | ✔ | ✔ | ✔ | ✗ |
| startsWith | ✔ | ✔ | ✔ | ✔ | ✗ |
| endsWith | ✗ | ✗ | ✔ | ✔ | ✗ |
| multiSearchAny | ✗ | ✗ | ✔ | ✗ | ✗ |
| in | ✔ | ✔ | ✔ | ✔ | ✔ |
| notIn | ✔ | ✔ | ✔ | ✔ | ✔ |
| less (<) | ✔ | ✔ | ✗ | ✗ | ✗ |
| greater (>) | ✔ | ✔ | ✗ | ✗ | ✗ |
| lessOrEquals (<=) | ✔ | ✔ | ✗ | ✗ | ✗ |
| greaterOrEquals (>=) | ✔ | ✔ | ✗ | ✗ | ✗ |
| empty | ✔ | ✔ | ✗ | ✗ | ✗ |
| notEmpty | ✔ | ✔ | ✗ | ✗ | ✗ |
| hasToken | ✗ | ✗ | ✗ | ✔ | ✗ |
8 Demo
https://clickhouse.com/docs/en/guides/improving-query-performance/skipping-indexes#skip-best-practices
1 Create a table with the default sparse index granularity of 8192
CREATE TABLE skip_table
(
my_key UInt64,
my_value UInt64
)
ENGINE MergeTree primary key my_key
SETTINGS index_granularity=8192;
INSERT INTO skip_table SELECT number, intDiv(number,4096) FROM numbers(100000000);
SELECT * FROM skip_table WHERE my_value IN (125, 700)
┌─my_key─┬─my_value─┐
│ 512000 │ 125 │
│ 512001 │ 125 │
│ ...    │ ...      │
└────────┴──────────┘
2 Create a secondary index with granularity 2 (8192 * 2 rows per index block)
ALTER TABLE skip_table ADD INDEX vix my_value TYPE set(100) GRANULARITY 2;
/*ALTER TABLE xx ADD INDEX game_id_index game_id TYPE bloom_filter(0.01) GRANULARITY 1;*/
3 Apply the index to existing data
ALTER TABLE skip_table MATERIALIZE INDEX vix;
4 Verify
SELECT * FROM skip_table WHERE my_value IN (125, 700)
┌─my_key─┬─my_value─┐
│ 512000 │ 125 │
│ 512001 │ 125 │
│ ...    │ ...      │
└────────┴──────────┘
8192 rows in set. Elapsed: 0.051 sec. Processed 32.77 thousand rows, 360.45 KB (643.75 thousand rows/s., 7.08 MB/s.)
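The row count reported for this query can be reproduced with simple arithmetic. The Python sketch below is only a model of the demo data (my_value = intDiv(number, 4096) over 100,000,000 rows) and of a set index with GRANULARITY 2. It assumes each index block stores the distinct my_value values of 8192 * 2 consecutive rows; it does not run ClickHouse.

```python
TOTAL_ROWS = 100_000_000   # rows inserted by numbers(100000000)
INDEX_GRANULARITY = 8192   # rows per granule
SKIP_GRANULARITY = 2       # GRANULARITY of the skip index vix
ROWS_PER_VALUE = 4096      # my_value = intDiv(number, 4096)

block_rows = INDEX_GRANULARITY * SKIP_GRANULARITY  # 16384 rows per index block
wanted = {125, 700}        # the IN (125, 700) predicate

total_blocks = 0
kept_blocks = 0
rows_read = 0
for start in range(0, TOTAL_ROWS, block_rows):
    end = min(start + block_rows, TOTAL_ROWS)
    # Distinct my_value values stored for rows [start, end). At most 4 values
    # per block, well under the set(100) limit.
    values = set(range(start // ROWS_PER_VALUE, (end - 1) // ROWS_PER_VALUE + 1))
    total_blocks += 1
    if values & wanted:
        kept_blocks += 1
        rows_read += end - start

dropped = total_blocks - kept_blocks
print(f"dropped {dropped}/{total_blocks} blocks, read {rows_read} rows")
# dropped 6102/6104 blocks, read 32768 rows
```

The 32768 rows read correspond to the "Processed 32.77 thousand rows" figure: only the two index blocks that contain 125 or 700 survive, and each of them covers 16384 rows.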
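To see the details, enable trace logging:
SET send_logs_level='trace';
<Debug> default.skip_table (933d4b2c-8cea-4bf9-8c93-c56e900eefd1) (SelectExecutor): Index `vix` has dropped 6102/6104 granules.
The original post illustrates this with a figure: each skip index entry covers 8192 * 2 rows, so every 2 granules form one skip index block.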