Synchronizing MySQL to ClickHouse (using the MaterializedMySQL engine)

2022-07-05 03:45:00 Younger Cheng

I. MySQL engine (not recommended)

Synchronization note: database names and table names must not contain a hyphen (“-”).
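The naming rule above can be checked before any synchronization is set up. The following is a minimal sketch in Python; the function name `names_with_hyphen` and the sample names are hypothetical, and it only tests for the hyphen mentioned in this article, not for every character ClickHouse or MySQL might reject.

```python
def names_with_hyphen(names):
    """Return the database or table names that contain a hyphen."""
    return [name for name in names if "-" in name]


if __name__ == "__main__":
    # Hypothetical names, used only to illustrate the check.
    candidates = ["yfc", "sku_info", "order-detail"]
    for bad in names_with_hyphen(candidates):
        print(f"rename before syncing: {bad}")
```

Any name the function returns would need to be renamed (for example by replacing “-” with “_”) before it is used with the engines described here.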

CREATE DATABASE [IF NOT EXISTS] db_name [ON CLUSTER cluster]
ENGINE = MySQL('host:port', ['database' | database], 'user', 'password')

Parameters:

  • host:port - MySQL server address
  • database - MySQL database name
  • user - MySQL user name
  • password - MySQL user password

The MySQL database engine maps the tables of a remote MySQL server into ClickHouse. It converts each query into MySQL syntax and sends it to the MySQL server, which is equivalent to querying MySQL directly.

Database synchronization:
CREATE DATABASE IF NOT EXISTS mysql_db ENGINE = MySQL ('localhost:3306', 'mysql', 'root', '123456');

Table synchronization (a one-time copy through the mysql table function):
CREATE TABLE IF NOT EXISTS tmp ENGINE = MergeTree ORDER BY id AS SELECT * FROM mysql('localhost:3306', 'test', 'user', 'root', '123456');

II. MaterializedMySQL (usable, but officially still experimental)

CREATE DATABASE [IF NOT EXISTS] db_name [ON CLUSTER cluster]
ENGINE = MaterializedMySQL('host:port', ['database' | database], 'user', 'password') [SETTINGS ...]
[TABLE OVERRIDE table1 (...), TABLE OVERRIDE table2 (...)]

Creates a ClickHouse database that contains all the tables of the MySQL database and all the data in those tables.

The ClickHouse server works as a MySQL replica: it reads the binlog and executes the DDL and DML queries it finds there.

1. Enable binlog and GTID in MySQL

vim /etc/mysql/my.cnf

[mysqld]
# Specify where the binlog is stored
#log-bin=/data/logs/mysql/mysql-bin.log
log-bin=/var/lib/mysql/mysql-bin

# Enable GTID mode
gtid-mode=ON
# Enforce GTID consistency
enforce-gtid-consistency=1
# Write replicated updates to this server's own binlog
log-slave-updates=1
# Use row-based binlog format
binlog_format=ROW
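Before restarting MySQL, it can help to confirm that the configuration file really contains the settings listed above. The following is a rough sketch in Python that parses a my.cnf fragment with the standard `configparser` module. The function name `check_mysqld_settings` and the set of accepted spellings (`ON` or `1`) are assumptions made for this example, and the parser does not understand MySQL-specific directives such as `!includedir`, so it is only suited to simple files like the one shown here.

```python
import configparser

# Settings taken from the my.cnf fragment in this article.
# The accepted spellings for each value are an assumption of this sketch.
EXPECTED = {
    "gtid_mode": {"on"},
    "enforce_gtid_consistency": {"on", "1"},
    "log_slave_updates": {"on", "1"},
    "binlog_format": {"row"},
}


def check_mysqld_settings(cnf_text):
    """Return a list of problems found in the [mysqld] section of cnf_text."""
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return ["missing [mysqld] section"]

    # MySQL treats '-' and '_' in option names as equivalent, so normalize them.
    options = {
        key.replace("-", "_"): (value or "").strip()
        for key, value in parser.items("mysqld")
    }

    problems = []
    if "log_bin" not in options:
        problems.append("log_bin is not set")
    for name, accepted in EXPECTED.items():
        value = options.get(name, "").lower()
        if value not in accepted:
            problems.append(
                f"{name} should be one of {sorted(accepted)}, found {value!r}"
            )
    return problems


if __name__ == "__main__":
    sample = "\n".join([
        "[mysqld]",
        "log-bin=/var/lib/mysql/mysql-bin",
        "gtid-mode=ON",
        "enforce-gtid-consistency=1",
        "log-slave-updates=1",
        "binlog_format=ROW",
    ])
    print(check_mysqld_settings(sample) or "all expected settings found")
```

The check only looks at the file contents. Whether the running server has picked the values up still has to be confirmed on the MySQL side after a restart.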

2. Create the replication pipeline

# Enable the experimental MaterializedMySQL engine
SET allow_experimental_database_materialized_mysql=1;
CREATE DATABASE yfc
ENGINE = MaterializedMySQL('localhost:3306', 'yfc', 'root', '123456');

Advantage: by monitoring the MySQL binlog, the engine applies incremental updates, which improves efficiency.

Data restrictions:

1. Before MySQL data is synchronized, every MySQL table must have a primary key (without one, synchronization reports an error).

2. MaterializedMySQL is a database-level engine: synchronization covers the table data of the whole database.

3. Indexes are converted when MySQL data is synchronized to ClickHouse: in the ClickHouse table, the MySQL PRIMARY KEY and INDEX clauses are converted into ORDER BY tuples.

4. When a MySQL table is converted into a ClickHouse table, two fields are added to each table: _sign (1: write, -1: delete) and _version.

5. During synchronization, ClickHouse does not physically delete rows; deleted rows are filtered out through the _sign flag field.

6. When MySQL tables are converted to ClickHouse, the ReplacingMergeTree engine is used by default, which keeps duplicate rows from appearing.
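The role of the _sign and _version fields described above can be illustrated with a small model. The following Python sketch is a conceptual simulation only, not ClickHouse's actual implementation: it assumes that every change arrives as a new row version, that a later change carries a higher _version, and that an update is recorded with _sign = 1. The function name `visible_rows` and the sample data are hypothetical.

```python
def visible_rows(versions, key="id"):
    """Collapse a history of row versions into the rows a reader should see.

    For each key, keep only the version with the highest _version, then
    drop keys whose latest version is a delete marker (_sign == -1).
    """
    latest = {}
    for row in versions:
        current = latest.get(row[key])
        if current is None or row["_version"] >= current["_version"]:
            latest[row[key]] = row
    kept = [row for row in latest.values() if row["_sign"] == 1]
    return sorted(kept, key=lambda row: row[key])


if __name__ == "__main__":
    # Hypothetical history: insert 1, insert 3, update 1, delete 3.
    history = [
        {"id": 1, "sku_code": "10001", "_sign": 1, "_version": 1},
        {"id": 3, "sku_code": "10003", "_sign": 1, "_version": 2},
        {"id": 1, "sku_code": "10001-b", "_sign": 1, "_version": 3},
        {"id": 3, "sku_code": "10003", "_sign": -1, "_version": 4},
    ]
    for row in visible_rows(history):
        print(row)
```

In this model the deleted row with id 3 is still present in the stored history, which matches the point that nothing is physically removed; it simply stops being visible once its latest version carries _sign = -1.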

Example:

# 1. List the synchronized databases and tables (in ClickHouse)
show databases;
use yfc;
show tables;

# 2. View the table creation statement
show create table sku_info;

# 3. Query the synchronized data
select *,_sign,_version from sku_info;

# 4. Insert data (run in MySQL, against the sku_info table)
INSERT INTO `yfc`.`sku_info`(`id`, `sku_code`) VALUES (3, '10003');

# 5. Update data (in MySQL)
UPDATE `yfc`.`ts_store_info` SET cust_name = 'Continuous commissioning test' WHERE id = 111;

# 6. Delete data (in MySQL)
DELETE FROM `yfc`.`sku_info` WHERE id = 3;

III. mysql table function

IV. DataX

Original site

Copyright notice
This article was written by [Younger Cheng]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/186/202207050309085144.html