Merge and Migrate Data from Sharded MySQL (Small Data Volume) to TiDB
If you want to merge and migrate multiple upstream MySQL database instances into a single downstream TiDB database, and the amount of data involved is small, you can use the DM tool to merge and migrate the sharded databases and tables. In this document, "small data volume" usually means below the TiB level. This document walks through the merge-and-migration steps with an example, and covers the precautions and troubleshooting involved. It applies to:
- Merging and migrating sharded databases and tables within the TiB level
- Incremental, continuous merge and migration of shards based on the MySQL binlog
If the total amount of sharded data to migrate exceeds 1 TiB, the DM tool takes a long time to complete the migration; in that case, refer to Merge and Migrate Data from Sharded MySQL (Large Data Volume) to TiDB.
This document uses a simple scenario as an example: the sharded tables of two MySQL instances, which serve as the data sources, are merged and migrated to a downstream TiDB cluster. The schematic diagram is as follows.

The data source MySQL instances 1 and 2 both use the following schemas and tables. The plan is to merge the sale tables under store_01 and store_02 and import them into the downstream store.sale table:
| Schema | Tables |
|---|---|
| store_01 | sale_01, sale_02 |
| store_02 | sale_01, sale_02 |
The schema of the migration target is as follows:
| Schema | Tables |
|---|---|
| store | sale |
Prerequisites
Check for data conflicts across sharded tables
If the migration involves merging sharded databases and tables, data from multiple sharded tables may conflict on primary keys or unique indexes. Before migrating, therefore, examine the business characteristics of the data in each sharded table. For details, refer to Handling Conflicts Between Primary Keys or Unique Indexes Across Multiple Sharded Tables.
In this example, sale_01 and sale_02 share the same table structure:
```sql
CREATE TABLE `sale_01` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `sid` bigint(20) NOT NULL,
  `pid` bigint(20) NOT NULL,
  `comment` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `sid` (`sid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
```
The id column is the primary key and the sid column is the shard key, which is globally unique. The id column has the auto-increment attribute, so overlapping id ranges across the sharded tables would cause data conflicts. The sid column satisfies the unique index globally, so you can follow the steps in Remove the PRIMARY KEY Attribute from the Auto-Increment Primary Key to bypass the id column. When creating the downstream sale table, remove the unique key attribute of the id column:
```sql
CREATE TABLE `sale` (
  `id` bigint(20) NOT NULL,
  `sid` bigint(20) NOT NULL,
  `pid` bigint(20) NOT NULL,
  `comment` varchar(255) DEFAULT NULL,
  INDEX (`id`),
  UNIQUE KEY `sid` (`sid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
```
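Before starting the migration, it may help to confirm how the id ranges overlap across shards. A minimal sketch, assuming you can run ad-hoc queries on each upstream instance (schema and table names come from the example above; on instance 2, replace store_01 with store_02):

```sql
-- Run on each upstream MySQL instance. Overlapping [MIN(id), MAX(id)] ranges
-- across shards confirm that id cannot be kept as a unique key downstream.
SELECT 'store_01.sale_01' AS shard, MIN(id), MAX(id), COUNT(*) FROM store_01.sale_01
UNION ALL
SELECT 'store_01.sale_02', MIN(id), MAX(id), COUNT(*) FROM store_01.sale_02;
```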
Step 1: Create data sources
Create a new file named source1.yaml with the following content:
```yaml
# Unique name; must not be duplicated.
source-id: "mysql-01"

# Whether DM-worker uses the global transaction identifier (GTID) to pull binlog.
# This requires that GTID mode is enabled on the upstream MySQL. If the upstream
# switches between primary and secondary automatically, GTID mode must be used.
enable-gtid: true

from:
  host: "${host}"           # For example: 172.16.10.81
  user: "root"
  password: "${password}"   # Plaintext passwords are supported but not recommended; encrypt them with dmctl encrypt first
  port: 3306
```
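As the comment above suggests, you can encrypt the plaintext password before writing it into the configuration file. A minimal sketch (the plaintext value is a placeholder; the command prints a ciphertext that you paste into the password field):

```shell
tiup dmctl encrypt '${password}'
```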
Run the following command in a terminal to load the data source configuration into the DM cluster with tiup dmctl:
```shell
tiup dmctl --master-addr ${advertise-addr} operate-source create source1.yaml
```
The parameters in this command are described as follows:

| Parameter | Description |
|---|---|
| --master-addr | The {advertise-addr} of any DM-master node in the cluster that dmctl connects to, for example: 172.16.10.71:8261 |
| operate-source create | Loads the data source into the DM cluster |
Repeat the above steps until all data sources have been added.
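For example, the second data source in this scenario would be a source2.yaml for the mysql-02 instance; a minimal sketch, assuming the same connection placeholders as above:

```yaml
source-id: "mysql-02"
enable-gtid: true

from:
  host: "${host}"           # The host of MySQL instance 2
  user: "root"
  password: "${password}"
  port: 3306
```

Load it the same way, and optionally list the registered sources to verify:

```shell
tiup dmctl --master-addr ${advertise-addr} operate-source create source2.yaml
tiup dmctl --master-addr ${advertise-addr} operate-source show
```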
Step 2: Create a migration task
Create a new file named task1.yaml with the following content:
name: "shard_merge" # Task mode , May be set as # full: Only full data migration # incremental: binlog Real time synchronization # all: Total quantity + binlog transfer task-mode: all # Database and table consolidation tasks need to be configured shard-mode. Pessimistic coordination mode is used by default "pessimistic", After in-depth understanding of the principles and limitations of optimistic coordination mode , It can also be set to optimistic coordination mode "optimistic" # Details are available :https://docs.pingcap.com/zh/tidb/dev/feature-shard-merge/ shard-mode: "pessimistic" meta-schema: "dm_meta" # Will be created in the downstream database schema Used to store metadata ignore-checking-items: ["auto_increment_ID"] # In this example, there is a self incrementing primary key in the upstream , Therefore, this check item needs to be ignored target-database: host: "${host}" # for example :192.168.0.1 port: 4000 user: "root" password: "${password}" # Clear text password is supported but not recommended , It is recommended to use dmctl encrypt Encrypt the plaintext password and use mysql-instances: - source-id: "mysql-01" # data source ID, namely source1.yaml Medium source-id route-rules: ["sale-route-rule"] # Applied to this data source table route The rules filter-rules: ["store-filter-rule", "sale-filter-rule"] # Applied to this data source binlog event filter The rules block-allow-list: "log-bak-ignored" # Applied to this data source Block & Allow Lists The rules - source-id: "mysql-02" route-rules: ["sale-route-rule"] filter-rules: ["store-filter-rule", "sale-filter-rule"] block-allow-list: "log-bak-ignored" # Split table consolidation configuration routes: sale-route-rule: schema-pattern: "store_*" table-pattern: "sale_*" target-schema: "store" target-table: "sale" # Filter section DDL event filters: sale-filter-rule: schema-pattern: "store_*" table-pattern: "sale_*" events: ["truncate table", "drop table", "delete"] action: Ignore store-filter-rule: schema-pattern: "store_*" events: ["drop database"] action: Ignore # Black and white list block-allow-list: log-bak-ignored: do-dbs: ["store_*"]
The above is the minimum task configuration needed to perform the migration. For more task configuration items, refer to the introduction to the complete DM task configuration file.
If you want to learn more about how routes, filters, and other features are used, refer to:
- Table routing
- Block & Allow Table Lists
- How to filter binlog events
- How to filter DML with SQL expressions
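As an illustration of what the sale-route-rule above does in this example (derived from its schema-pattern and table-pattern wildcards; this is not additional configuration), all four upstream shards are written to the same downstream table:

```
store_01.sale_01 -> store.sale
store_01.sale_02 -> store.sale
store_02.sale_01 -> store.sale
store_02.sale_02 -> store.sale
```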
Step 3: Start the task
Before starting the data migration task, it is recommended to run the check-task command to verify that the configuration meets DM's requirements, which reduces the chance of errors later:
```shell
tiup dmctl --master-addr ${advertise-addr} check-task task1.yaml
```
Then run the following command with tiup dmctl to start the data migration task:
```shell
tiup dmctl --master-addr ${advertise-addr} start-task task1.yaml
```
The parameters in this command are described as follows:

| Parameter | Description |
|---|---|
| --master-addr | The {advertise-addr} of any DM-master node in the cluster that dmctl connects to, for example: 172.16.10.71:8261 |
| start-task | Creates the data migration task |
If the task fails to start, adjust the configuration according to the error message in the returned result, and then run start-task task1.yaml again to restart the task. If you run into trouble, refer to Handle Errors and FAQ.
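If you need to re-run a task after fixing its configuration, a minimal sketch of the sequence (the task name shard_merge comes from task1.yaml above):

```shell
# Stop the failed task, then start it again with the corrected configuration.
tiup dmctl --master-addr ${advertise-addr} stop-task shard_merge
tiup dmctl --master-addr ${advertise-addr} start-task task1.yaml
```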
Step 4: Check the task status
To check whether there are running migration tasks in the DM cluster and to see their status, run the query-status command with tiup dmctl:
```shell
tiup dmctl --master-addr ${advertise-addr} query-status ${task-name}
```
For a detailed interpretation of the query results, refer to Query Status.
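For instance, with the values used in this document (the master address from the parameter table above, the task name from task1.yaml), the call might look like this:

```shell
tiup dmctl --master-addr 172.16.10.71:8261 query-status shard_merge
```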
Step 5: Monitor the task and view logs (optional)
You can view the historical status of the migration task and other internal operational metrics through Grafana or the logs.
Viewing via Grafana
If Prometheus, Alertmanager, and Grafana were correctly deployed when you deployed the DM cluster with TiUP, you can access Grafana using the IP address and port configured during deployment, then select the DM dashboard to view DM-related monitoring metrics.
Viewing the logs
While DM is running, DM-worker, DM-master, and dmctl write relevant information, including details about the migration tasks, to their logs. The log directory of each component is as follows:
- DM-master log directory: set with the DM-master process parameter `--log-file`. If DM is deployed with TiUP, the log directory is `/dm-deploy/dm-master-8261/log/`.
- DM-worker log directory: set with the DM-worker process parameter `--log-file`. If DM is deployed with TiUP, the log directory is `/dm-deploy/dm-worker-8262/log/`.
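For a quick look at recent activity, you can tail these files directly. A minimal sketch for a TiUP deployment; the exact log file name is an assumption and may differ in your deployment:

```shell
# Follow the DM-master log (file name assumed to be dm-master.log).
tail -f /dm-deploy/dm-master-8261/log/dm-master.log
```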