当前位置:网站首页>ClickHouse 数据插入、更新与删除操作 SQL

ClickHouse 数据插入、更新与删除操作 SQL

2022-08-03 09:16:00 及时机芯

f5a290eafca4532cc7ba12bc985f2166.png


1.1.数据****操作

数据操作语言( DML,Data Manipulation Language) 用于在数据库表中添加(插入)、删除和修改(更新)数据。本节主要介绍ClickHouse中的数据插入、更新与删除操作。

1.1.1.概述

数据操作语言DML,包括SQL数据更改语句,它修改存储的数据,但不修改数据模型,例如数据库模式或数据库表结构。DML语言常见的语法模式如下:

INSERT INTO ... VALUES ...
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...

但是,在 ClickHouse 中,UPDATE 与DELETE 是设计在了 ALTER 指令体系中的。

1.1.2.插入****数据

一次一条数据插入

INSERT INTO clickhouse_tutorial.user_tag (user_id, gender, age, active_level, date)
VALUES (1, 'male', '18', '1', '2022-03-21');
INSERT INTO clickhouse_tutorial.user_tag (user_id, gender, age, active_level, date)
VALUES (2, 'female', '16', '2', '2022-03-21');

一次多条数据插入

INSERT INTO clickhouse_tutorial.user_tag (user_id, gender, age, active_level, date)
VALUES (3, 'female', '20', '3', '2022-03-21'),
       (4, 'female', '22', '4', '2022-03-21');

插入SELECT查询返回数据

INSERT INTO clickhouse_tutorial.user_tag
(UserID, WatchID, EventTime, Sex, Age, OS, RegionID, RequestNum, EventDate)
SELECT
    UserID,
    WatchID,
    EventTime,
    Sex,
    Age,
    OS,
    RegionID,
    RequestNum,
    EventDate
FROM tutorial.hits_v1
 
Query id: bfed9d12-b838-4125-9ee2-f61049bf0a56
 
 Progress: 30.91 million rows, 835.62 MB (6.66 million rows/s., 180.00 MB/s.) (0.0 CPU, 172.08 MB RAM)
████████████████████████████████████████████████ 99%
 
Ok.
 
0 rows in set. Elapsed: 7.552 sec. Processed 53.24 million rows, 1.37 GB (7.05 million rows/s., 180.96 MB/s.)

1.1.3.UPDATE更新数据

语法

ALTER TABLE [db.]table UPDATE column1 = expr1 [, ...] WHERE filter_expr

功能说明

更新表数据。ClickHouse 中ALTER TABLE 前缀种语法与大多数其他支持 SQL 的数据库系统不同。它旨在表明,与 OLTP 数据库中的类似查询不同,这是一项并非为频繁使用而设计的繁重操作。ALTER 查询是通过一种称为“突变”(Mutation)的机制实现的。

关于 ALTERTABLE … UPDATE命令,详细说明如下:

1.WHERE子句中的过滤表达式filter_expr的值是UInt8类型,指定要更新的数据行。

2.不支持更新用于计算主键或分区键的列。

3.一个UPDATE操作可以包含多个用逗号分隔的命令,例如column1 = expr1,column2= expr2。 UPDATE操作数据处理是同步还是异步,由系统配置项 mutation_sync 设置,可取值为:

0 - execute asynchronously。

1 - wait current server。

2 - wait all replicas if they exist。默认为0,异步后台进程执行,类似于 *MergeTree 表中的合并操作。

4.对于 *MergeTree 表,Mutation操作通过重写整个数据Part来执行,Mutation不具备原子性。数据 Part一旦准备好就被MutationPart替换,并且在Mutation执行期间,SELECT查询结果中可以看到,来自已经变异Part的数据,以及来自尚未变异Part的数据。

5.Mutation按照创建顺序排序,并按该顺序应用于每个MutationPart。

6.在Mutation提交之前插入到表中的数据会被执行Mutation操作,提交之后插入的数据不会执行Mutation操作。

7.Mutation操作不会阻塞数据插入。

8.可以查看 system.mutations 表跟踪突变的进度。

9.即使重新启动 ClickHouse 服务器,成功提交的变更仍将继续执行。一旦提交,就无法回滚突变。

10.如果Mutation由于某种原因被卡住,可以使用 KILL MUTATION 查询取消它。

11.已经完成Mutation的条目不会立即删除。保留条目的数量由 finished_mutations_to_keep 存储引擎参数确定。

12.在系统配置表system.settings中,有关mutation的配置项如下:

SELECT *
FROM system.settings
WHERE name LIKE '%mutation%'
FORMAT Vertical
 
Query id: 24f6ca70-7117-41c5-bc3e-dd6615d5ee6d
 
Row 1:
──────
name:        background_merges_mutations_concurrency_ratio
value:       2
changed:     0
description: Ratio between a number of how many operations could be processed and a number threads to process them. Only has meaning at server startup.
min:         ????
max:         ????
readonly:    0
type:        Float
 
Row 2:
──────
name:        mutations_sync
value:       0
changed:     0
description: Wait for synchronous execution of ALTER TABLE UPDATE/DELETE queries (mutations). 0 - execute asynchronously. 1 - wait current server. 2 - wait all replicas if they exist.
min:         ????
max:         ????
readonly:    0
type:        UInt64
 
Row 3:
──────
name:        allow_nondeterministic_mutations
value:       0
changed:     0
description: Allow non-deterministic functions in ALTER UPDATE/ALTER DELETE statements
min:         ????
max:         ????
readonly:    0
type:        Bool
 
3 rows in set. Elapsed: 0.003 sec.

实例讲解

1、更新之前的数据
SELECT
    WatchID,
    JavaEnable,
    GoodEvent
FROM tutorial.hits_v1
WHERE WatchID = 7043438415214026105
 
Query id: e0dc9ae5-8f24-48e5-a56d-d107afa1dfe3
 
┌─────────────WatchID─┬─JavaEnable─┬─GoodEvent─┐
│ 7043438415214026105 │          1 │         1 │
└─────────────────────┴────────────┴───────────┘
2**、**UPDATE 目标数据行
ALTER TABLE tutorial.hits_v1
    UPDATE JavaEnable = 0, GoodEvent = 0 WHERE WatchID = 7043438415214026105
 
Query id: 32ac8c6b-c78c-4a3c-ab72-29ed38fda687
 
Ok.
3**、**查看更新结果
SELECT
    WatchID,
    JavaEnable,
    GoodEvent
FROM tutorial.hits_v1
WHERE WatchID = 7043438415214026105
 
Query id: 26591216-0cb4-4c6d-9b89-db9ff588d469
 
┌─────────────WatchID─┬─JavaEnable─┬─GoodEvent─┐
│ 7043438415214026105 │          0 │         0 │
└─────────────────────┴────────────┴───────────┘

可以看到,目标数据已经被更新。

4**、**查看 Mutation 执行日志

我们可以去服务器端查看 UPDATE 操作提交之后的日志:

2022.03.31 03:06:47.898420 [ 6154029 ] {} TCP-Session: fd75399f-85bc-4d6d-a86b-d69fa899f6d9 Creating query context from session context, user_id: 94309d50-4f52-5250-31bd-74fecac179db, parent context user: default

2022.03.31 03:06:47.898553 [ 6154029 ] {32ac8c6b-c78c-4a3c-ab72-29ed38fda687} executeQuery: (from 127.0.0.1:52757) ALTER TABLE tutorial.hits_v1 UPDATE JavaEnable = 0, GoodEvent=0 WHERE WatchID=7043438415214026105;

2022.03.31 03:06:47.898600 [ 6154029 ] {32ac8c6b-c78c-4a3c-ab72-29ed38fda687} ContextAccess (default): Access granted: ALTER UPDATE(JavaEnable, GoodEvent) ON tutorial.hits_v1

2022.03.31 03:06:47.900244 [ 6154029 ] {32ac8c6b-c78c-4a3c-ab72-29ed38fda687} tutorial.hits_v1 (0fa45bfe-c9ca-4df7-b7bf-7bd268a6225d): Added mutation: mutation_33.txt

2022.03.31 03:06:47.900311 [ 6154029 ] {32ac8c6b-c78c-4a3c-ab72-29ed38fda687} MemoryTracker: Peak memory usage (for query): 0.00 B.

2022.03.31 03:06:47.900365 [ 6154029 ] {} TCPHandler: Processed in 0.00198 sec.

5**、**查看 Mutation 详情

可以看到“Added mutation: mutation_33.txt”这样一行关键日志。去系统表system.mutations中查看mutation_33.txt的详情如下:

SELECT *
FROM system.mutations
WHERE mutation_id = 'mutation_33.txt'
FORMAT Vertical
 
Query id: 3f7c995b-82e3-41fa-bf5b-fc50be8da824
 
Row 1:
──────
database:                   tutorial
table:                      hits_v1
mutation_id:                mutation_33.txt
command:                    UPDATE JavaEnable = 0, GoodEvent = 0 WHERE WatchID = 7043438415214026105
create_time:                2022-03-31 03:06:47
block_numbers.partition_id: ['']
block_numbers.number:       [33]
parts_to_do_names:          []
parts_to_do:                0
is_done:                    1
latest_failed_part:         
latest_fail_time:           1970-01-01 08:00:00
latest_fail_reason:         
 
1 rows in set. Elapsed: 0.003 sec.

1.1.4.DELETE删除****数据

语法

ALTER TABLE [db.]table [ON CLUSTER cluster] DELETE WHERE filter_expr

功能说明

删除表数据。

实例讲解

1**、**要删除的目标数据行
SELECT
    WatchID,
    JavaEnable,
    GoodEvent
FROM tutorial.hits_v1
WHERE WatchID = 7043438415214026105
 
Query id: 1444174c-142b-43ec-8ad6-54da1d871277
 
┌─────────────WatchID─┬─JavaEnable─┬─GoodEvent─┐
│ 7043438415214026105 │          0 │         0 │
└─────────────────────┴────────────┴───────────┘
2、执行删除操作
ALTER TABLE tutorial.hits_v1
    DELETE WHERE WatchID = 7043438415214026105
3**、**验证删除结果
SELECT
    WatchID,
    JavaEnable,
    GoodEvent
FROM tutorial.hits_v1
WHERE WatchID = 7043438415214026105
 
Query id: fd6a7536-4f2e-4fe3-8b03-bf1aed45302f
 
Ok.
 
0 rows in set. Elapsed: 0.018 sec. Processed 8.87 million rows, 70.99 MB (480.03 million rows/s., 3.84 GB/s.)
4、查看删除操作的服务端日志

根据 query_id: a303b0d4-564d-48e5-9f32-c7e2df554b1f 查询ClickHouseServer 端日志如下:

2022.03.31 03:15:10.905286 [ 6154029 ] {} TCP-Session: fd75399f-85bc-4d6d-a86b-d69fa899f6d9 Creating query context from session context, user_id: 94309d50-4f52-5250-31bd-74fecac179db, parent context user: default

2022.03.31 03:15:10.905429 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} executeQuery: (from 127.0.0.1:52757) ALTER TABLE tutorial.hits_v1 DELETE WHERE WatchID=7043438415214026105;

2022.03.31 03:15:10.905482 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} ContextAccess (default): Access granted: ALTER DELETE ON tutorial.hits_v1

2022.03.31 03:15:10.906794 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} InterpreterSelectQuery: MergeTreeWhereOptimizer: condition “isZeroOrNull(WatchID = 7043438415214026105)” moved to PREWHERE

2022.03.31 03:15:10.911000 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} tutorial.hits_v1 (0fa45bfe-c9ca-4df7-b7bf-7bd268a6225d): Added mutation: mutation_35.txt

2022.03.31 03:15:10.911081 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} MemoryTracker: Peak memory usage (for query): 0.00 B.

2022.03.31 03:15:10.911139 [ 6154029 ] {} TCPHandler: Processed in 0.005891 sec.

可以看到关键日志:“Added mutation: mutation_35.txt”。

另外,我们还看到了WHERE过滤自动转为PREWHERE优化的日志:“InterpreterSelectQuery: MergeTreeWhereOptimizer: condition “isZeroOrNull(WatchID = 7043438415214026105)” moved to PREWHERE”。

5**、**查看 mutation 详情

根据日志内容“Added mutation: mutation_35.txt”,查询 mutation_35.txt详情如下:

SELECT *
FROM system.mutations
WHERE mutation_id = 'mutation_35.txt'
FORMAT Vertical
 
Query id: c7e42d42-feef-4a18-a46f-ea97aa9d7b7e
 
Row 1:
──────
database:                   tutorial
table:                      hits_v1
mutation_id:                mutation_35.txt
command:                    DELETE WHERE WatchID = 7043438415214026105
create_time:                2022-03-31 03:15:10
block_numbers.partition_id: ['']
block_numbers.number:       [35]
parts_to_do_names:          []
parts_to_do:                0
is_done:                    1
latest_failed_part:         
latest_fail_time:           1970-01-01 08:00:00
latest_fail_reason:         
 
1 rows in set. Elapsed: 0.003 sec.

1.1.5.EXCHANGE****操作

语法

EXCHANGE TABLES|DICTIONARIES [db0.]name_A AND [db1.]name_B
EXCHANGE DICTIONARIES [db0.]dict_A AND [db1.]dict_B

功能说明

1.EXCHANGE 操作以原子操作的方式交换两个表或字典的名称。

2.EXCHANGE 操作也可以通过使用RENAME 操作来完成,区别是RENAME不是原子操作。 例如,RENAME重命名交换两张表new_table、old_table:

RENAME TABLE new_table TO tmp, old_table TO new_table, tmp TO old_table;

直接使用EXCHANGE 命令实现如下:

EXCHANGE TABLES new_table AND old_table;

3.EXCHANGE 底层是通过系统调用renameat2() 实现,LinuxKernel3.15+版本才支持。

4.只有Atomic数据库引擎支持EXCHANGE操作。

在Atomic 数据库引擎下创建的数据表,支持无锁原子CREATE/DROP/RENAME 操作,并且支持原子EXCHANGE TABLES A and B 直接交换两张表。

应用场景

EXCHANGE 命令可以实现 AB 两张表的快速切换。AB 表切换的使用场景很广泛,比如历史表归档、批量抽数、数据同步过程写临时表等等,都可以采用 AB 表切换的思路来实现。

实例讲解

1、创建两张表
drop table if exists tutorial.hits_v2;
drop table if exists tutorial.hits_v3;
CREATE TABLE tutorial.hits_v2
(
    `WatchID` UInt64,
    `UserID` UInt64,
    `JavaEnable` UInt8,
    `Title` String,
    `GoodEvent` Int16,
    `EventTime` DateTime,
    `EventDate` Date,
    `RequestNum` UInt32,
    `RequestTry` UInt8
)
    ENGINE = MergeTree()
        PARTITION BY toYYYYMM(EventDate)
        ORDER BY (WatchID, EventDate, intHash32(UserID))
        SAMPLE BY intHash32(UserID);
 
CREATE TABLE tutorial.hits_v3
(
    `WatchID` UInt64,
    `UserID` UInt64,
    `JavaEnable` UInt8,
    `Title` String,
    `GoodEvent` Int16,
    `EventTime` DateTime,
    `EventDate` Date,
    `RequestNum` UInt32,
    `RequestTry` UInt8
)
    ENGINE = MergeTree()
        PARTITION BY toYYYYMM(EventDate)
        ORDER BY (WatchID, EventDate, intHash32(UserID))
        SAMPLE BY intHash32(UserID);
2**、**交换两张表的名字
EXCHANGE TABLES tutorial.hits_v1 AND tutorial.hits_v2;

上面的命令,在笔者的 MacOS电脑上是报错的:

Received exception from server (version 22.4.1):
Code: 48. DB::Exception: Received from 127.0.0.1:9009. DB::Exception: RENAME EXCHANGE is not supported. (NOT_IMPLEMENTED)

查看内核版本:

$uname -a
Darwin C02FJ0KMMD6V 20.3.0 Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64 x86_64

可见, Mac OS Darwin Kernel Version 20.3.0内核版本,还不支持renameat2函数系统调用。

EXCHANGE****实现原理

ClickHouse源代码实现在renameat2.cpp中,相关代码行如下:

#if !defined(__NR_renameat2)
    #if defined(__x86_64__)
        #define __NR_renameat2 316
    #elif defined(__aarch64__)
        #define __NR_renameat2 276
    #elif defined(__ppc64__)
        #define __NR_renameat2 357
    #elif defined(__riscv)
        #define __NR_renameat2 276
    #else
        #error "Unsupported architecture"
    #endif
#endif
...
static bool renameat2(const std::string & old_path, const std::string & new_path, int flags)
{
    if (!supportsRenameat2())
        return false;
    if (old_path.empty() || new_path.empty())
        throw Exception(ErrorCodes::LOGICAL_ERROR, "Cannot rename {} to {}: path is empty", old_path, new_path);
 
    /// int olddirfd (ignored for absolute oldpath), const char *oldpath,
    /// int newdirfd (ignored for absolute newpath), const char *newpath,
    /// unsigned int flags
    if (0 == syscall(__NR_renameat2, AT_FDCWD, old_path.c_str(), AT_FDCWD, new_path.c_str(), flags))
        return true;
 
    /// EINVAL means that filesystem does not support one of the flags.
    /// It also may happen when running clickhouse in docker with Mac OS as a host OS.
    /// supportsRenameat2() with uname is not enough in this case, because virtualized Linux kernel is used.
    /// Other cases when EINVAL can be returned should never happen.
    if (errno == EINVAL)
        return false;
    /// We should never get ENOSYS on Linux, because we check kernel version in supportsRenameat2Impl().
    /// However, we can get in on WSL.
    if (errno == ENOSYS)
        return false;
 
    if (errno == EEXIST)
        throwFromErrno(fmt::format("Cannot rename {} to {} because the second path already exists", old_path, new_path), ErrorCodes::ATOMIC_RENAME_FAIL);
    if (errno == ENOENT)
        throwFromErrno(fmt::format("Paths cannot be exchanged because {} or {} does not exist", old_path, new_path), ErrorCodes::ATOMIC_RENAME_FAIL);
    throwFromErrnoWithPath(fmt::format("Cannot rename {} to {}", old_path, new_path), new_path, ErrorCodes::SYSTEM_ERROR);
}
...
bool supportsRenameat2()
{
    static bool supports = supportsRenameat2Impl();
    return supports;
}
...
static bool supportsRenameat2Impl()
{
    VersionNumber renameat2_minimal_version(3, 15, 0); // since linux kernel 3.15
    VersionNumber linux_version(Poco::Environment::osVersion());
    return linux_version >= renameat2_minimal_version;
}

1.1.6.OPTIMIZE****操作

语法

OPTIMIZE TABLE [db.]name [ON CLUSTER cluster]
[PARTITION partition | PARTITION ID 'partition_id'] [FINAL]
[DEDUPLICATE [BY expression]]

功能说明

1.OPTIMIZE操作尝试为数据库表db.table ,初始化一个调度计划外的数据part合并操作。

2.OPTIMIZE操作仅支持MergeTree系列表引擎、MaterializedView和 Buffer表引擎。

3.当 OPTIMIZE 与 ReplicatedMergeTree 系列表引擎一起使用时,ClickHouse 创建一个用于合并的任务,并等待在所有副本上执行(replication_alter_partitions_sync=2)或者等待当前副本上执行(replication_alter_partitions_sync=1)。

4.OPTIMIZE无法修复“Too many parts”错误。

配置项replication_alter_partitions_sync说明

系统配置项replication_alter_partitions_sync,用来指定等待副本分区变更操作(ALTER、OPTIMIZE 或 TRUNCATE等)执行的sync策略。

replication_alter_partitions_sync可取值:

0 — 不等待,直接异步执行。

1 — 同步等待在当前副本上执行。

2 — 同步等待在所有副本上执行。

实例讲解

执行OPTIMIZE TABLE命令:

OPTIMIZE TABLE clickhouse_tutorial.user_tag

Query id: c03335ce-fee8-42b6-bc36-3e23a8b59b29

Ok.

0 rows in set. Elapsed: 0.210 sec.

查看 Server 端日志:

2022.03.31 04:52:31.277919 [ 6156710 ] {} TCP-Session: 610442b9-37d5-49e0-821d-c25984fc7f41 Creating query context from session context, user_id: 94309d50-4f52-5250-31bd-74fecac179db, parent context user: default

2022.03.31 04:52:31.278067 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} executeQuery: (from 127.0.0.1:53198) optimize table clickhouse_tutorial.user_tag;

2022.03.31 04:52:31.278128 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} ContextAccess (default): Access granted: OPTIMIZE ON clickhouse_tutorial.user_tag

2022.03.31 04:52:31.278239 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} clickhouse_tutorial.user_tag (edac2738-6508-46fa-aee7-7c5560f4539c) (MergerMutator): Selected 3 parts from 20140320_46_46_0 to 20140320_59_59_0

2022.03.31 04:52:31.278294 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} DiskLocal: Reserving 11.19 MiB on disk `default`, having unreserved 192.05 GiB.

2022.03.31 04:52:31.278340 [ 6156710 ] {edac2738-6508-46fa-aee7-7c5560f4539c::20140320_46_59_1} MergeTask::PrepareStage: Merging 3 parts: from 20140320_46_46_0 to 20140320_59_59_0 into Wide

MergeTreeSequentialSource: Reading 2 marks from part pRegionID, total 1443 rows starting from the beginning of the part

2022.03.31 04:52:31.470792 [ 6156710 ] {edac2738-6508-46fa-aee7-7c5560f4539c::20140320_46_59_1} MergeTask::MergeProjectionsStage: Merge sorted 3917 rows, containing 3 columns (3 merged, 0 gathered) in 0.099764 sec., 39262.65987731045 rows/sec., 15.67 MiB/sec.

2022.03.31 04:52:31.479793 [ 6156710 ] {edac2738-6508-46fa-aee7-7c5560f4539c::20140320_46_59_1} MergedBlockOutputStream: filled checksums pRegionID (state Active)

clickhouse_tutorial.user_tag (edac2738-6508-46fa-aee7-7c5560f4539c): Renaming temporary part tmp_merge_20140320_46_59_1 to 20140320_46_59_1.

2022.03.31 04:52:31.487058 [ 6156710 ] {edac2738-6508-46fa-aee7-7c5560f4539c::20140320_46_59_1} clickhouse_tutorial.user_tag (edac2738-6508-46fa-aee7-7c5560f4539c) (MergerMutator): Merged 3 parts: from 20140320_46_46_0 to 20140320_59_59_0

2022.03.31 04:52:31.487648 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} MemoryTracker: Peak memory usage Mutate/Merge: 36.46 MiB.

2022.03.31 04:52:31.487756 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} MemoryTracker: Peak memory usage (for query): 0.00 B.

2022.03.31 04:52:31.487861 [ 6156710 ] {} TCPHandler: Processed in 0.20999 sec.

可以看到执行OPTIMIZE TABLE命名,服务端通过MergerMutator(源码MergeTreeDataMergerMutator.cpp)发起了一个MergeTask(源码MergeTask.cpp)任务,执行了一组数据 Part 的合并(merge)、突变(mutation)和移动(move)操作。

其中,MergerMutator为后台进程选择数据Part,并执行合并、突变和移动等操作。

1.1.7.ATTACH操作

语法

ATTACH操作与 CREATE 的功能相同。

挂载数据库表或者字典:

ATTACH TABLE|DICTIONARY [IF NOT EXISTS] [db.]name [ON CLUSTER cluster] ...

从数据文件和指定表结构挂载表:

ATTACH TABLE name FROM 'path/to/data/' (col1 Type1, ...)

该操作使用提供的结构创建一个新表,并将表数据从提供的目录,挂载到 `user_files` 中。

功能说明

挂载表或字典,执行ATTACH查询后,服务器将知道表或字典的存在。例如,在将ClickHouse数据库移动到另一台服务器时,可以使用此操作迁移数据。ATTACH 操作不会在磁盘上创建数据,而是假设数据已经在适当的位置,并且只是将有关表或字典的信息添加到服务器。

并将表数据从提供的目录中附加到 `user_files` 中。

实例讲解

从数据文件ATTACH 建表

1、准备数据文件data.CSV:

DROP TABLE IF EXISTS clickhouse_tutorial.my_test_table;

INSERT INTO TABLE FUNCTION file(‘/Users/data/clickhouse/user_files/my_test_table/data.CSV’, ‘CSV’, ‘s String, n UInt8’) VALUES (‘abc’, 3);

2、使用 ATTACH从表数据文件目录/Users/data/clickhouse/user_files/my_test_table/建表:

ATTACH TABLE clickhouse_tutorial.my_test_table

FROM ‘/Users/data/clickhouse/user_files/my_test_table’

(s String, n UInt8)

ENGINE = File(CSV);

3、查看表数据:

SELECT *

FROM clickhouse_tutorial.my_test_table

Query id: 33a9e9d9-e95f-4aa0-806c-6362a2f1baeb

┌─s───┬─n─┐

│ abc │ 3 │

└─────┴───┘

1 rows in set. Elapsed: 0.002 sec.

ATTACH 被分离的表

1、分离表

DETACH TABLE clickhouse_tutorial.my_test_table

2、查询分离表

SELECT *

FROM clickhouse_tutorial.my_test_table

Query id: 6a16501c-23ca-47ed-970c-1a14cf30a897

0 rows in set. Elapsed: 0.001 sec.

Received exception from server (version 22.4.1):

Code: 60. DB::Exception: Received from 127.0.0.1:9009. DB::Exception: Table clickhouse_tutorial.my_test_table doesn’t exist. (UNKNOWN_TABLE)

3、挂载表

ATTACH TABLE clickhouse_tutorial.my_test_table

Query id: 09cbec26-cbb6-41fc-ad58-eb6264ff111f

Ok.

0 rows in set. Elapsed: 0.001 sec.

4、查询表

SELECT *

FROM clickhouse_tutorial.my_test_table

Query id: a9a7cad8-3c06-4397-8762-ad6160091d91

┌─s───┬─n─┐

│ abc │ 3 │

└─────┴───┘

1 rows in set. Elapsed: 0.002 sec.

1.1.8.DETACH****操作

语法

DETACH TABLE|VIEW|DICTIONARY [IF EXISTS] [db.]name [ON CLUSTER cluster] [PERMANENTLY]

功能说明

DETACH : 分离表、视图或字典。详细说明如下:

分离操作不会删除表、物化视图、字典数据或元数据。

如果实体未“永久”分离,则在下一次服务器启动时,服务器将读取元数据并再次召回该表、视图或字典。如果实体被“永久”分离,则不会自动召回。

无论表或字典是否被永久分离,都可以使用ATTACH操作重新挂载它。

不能RENAME TABLE、DROPTABLE已经分离的表。

不能CREATE TABLE 与永久分离表名称相同。

实例讲解

创建test表:

CREATE TABLE clickhouse_tutorial.test ENGINE = Log AS SELECT * FROM numbers(10);

查询数据:

SELECT *
FROM clickhouse_tutorial.test
 
Query id: 55175f17-9d56-4909-93e3-99c7b945b02c
 
┌─number─┐
│      0 │
│      1 │
│      2 │
│      3 │
│      4 │
│      5 │
│      6 │
│      7 │
│      8 │
│      9 │
└────────┘
 
10 rows in set. Elapsed: 0.002 sec.

DETACH test表:

DETACH TABLE clickhouse_tutorial.test
 
Query id: f16d9541-3640-4556-9466-850b99d131aa
 
Ok.
 
0 rows in set. Elapsed: 0.001 sec.

再次查询test 表,提示表不存在::

SELECT *
FROM clickhouse_tutorial.test
 
Query id: 8338bbc9-6d19-4726-a550-c773851ddf58
 
0 rows in set. Elapsed: 0.001 sec.
 
Received exception from server (version 22.4.1):
Code: 60. DB::Exception: Received from 127.0.0.1:9009. DB::Exception: Table clickhouse_tutorial.test doesn't exist. (UNKNOWN_TABLE)

但是,此时也不能使用表名test创建另外一张表:

CREATE TABLE clickhouse_tutorial.test
ENGINE = Log AS
SELECT *
FROM numbers(10)
 
Query id: c338b28a-45c9-41a3-a6c0-f6ddb374df52
 
0 rows in set. Elapsed: 0.001 sec.
 
Received exception from server (version 22.4.1):
Code: 57. DB::Exception: Received from 127.0.0.1:9009. DB::Exception: Table `clickhouse_tutorial`.`test` already exists (detached). (TABLE_ALREADY_EXISTS)

先自我介绍一下,小编13年上师交大毕业,曾经在小公司待过,去过华为OPPO等大厂,18年进入阿里,直到现在。深知大多数初中级java工程师,想要升技能,往往是需要自己摸索成长或是报班学习,但对于培训机构动则近万元的学费,着实压力不小。自己不成体系的自学效率很低又漫长,而且容易碰到天花板技术停止不前。因此我收集了一份《java开发全套学习资料》送给大家,初衷也很简单,就是希望帮助到想自学又不知道该从何学起的朋友,同时减轻大家的负担。添加下方名片,即可获取全套学习资料哦

原网站

版权声明
本文为[及时机芯]所创,转载请带上原文链接,感谢
https://blog.csdn.net/m0_67401055/article/details/126098952