Migrate Data from Amazon Aurora to TiDB
This document describes how to migrate data from Amazon Aurora to TiDB. The migration uses a DB snapshot, which saves considerable space and time. The whole migration consists of two processes:
- Use TiDB Lightning to import the full data into TiDB
- Use DM to continuously replicate incremental data to TiDB (optional)
Prerequisites
- Install Dumpling and TiDB Lightning.
- Obtain the upstream database privileges required by Dumpling.
- Obtain the downstream database privileges required by TiDB Lightning.
Import full data to TiDB
Step 1: Export an Aurora snapshot to Amazon S3
On Aurora, run the following command to query and record the current binlog position:

```sql
SHOW MASTER STATUS;
```

The output is similar to the following. Record the binlog name and position for later steps:

```
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000002 | 52806    |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.012 sec)
```

Then export the Aurora snapshot files to Amazon S3. For details, refer to the Aurora official documentation: Exporting DB snapshot data to Amazon S3.
Note that the interval between these two steps should not exceed 5 minutes; otherwise, the recorded binlog position may be too old and cause data conflicts during incremental replication.
After completing these two steps, prepare the following information:
- The Aurora binlog name and position at the time the snapshot was created.
- The S3 path of the snapshot files, and a SecretKey and AccessKey with access to it.
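For reference, the snapshot export can also be started from the AWS CLI instead of the console. The sketch below is illustrative only; the export-task identifier, ARNs, bucket name, and role are hypothetical placeholders:

```shell
# Illustrative sketch: start an export of a DB snapshot to S3 with the AWS CLI.
# All identifiers and ARNs below are placeholders; substitute your own values.
aws rds start-export-task \
    --export-task-identifier my-aurora-snapshot-export \
    --source-arn arn:aws:rds:us-west-2:123456789012:cluster-snapshot:my-aurora-snapshot \
    --s3-bucket-name my-bucket \
    --s3-prefix sql-backup \
    --iam-role-arn arn:aws:iam::123456789012:role/my-export-role \
    --kms-key-id arn:aws:kms:us-west-2:123456789012:key/my-key-id
```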
Step 2: Export the schema
Because the snapshot files generated by Aurora do not contain table-creation statements, you need to export the schema yourself with Dumpling and create it in the downstream with TiDB Lightning. You can also skip this step and create the schema in the downstream manually.
Run the following command to export the schema. It is recommended to use the --filter parameter to export only the schema of the tables you need:

```shell
tiup dumpling --host ${host} --port 3306 --user root --password ${password} \
    --filter 'my_db1.table[12]' --no-data \
    --output 's3://my-bucket/schema-backup?region=us-west-2'
```
The parameters used in the command are described below. For more information, refer to Dumpling Overview.
| Parameters | Description |
|---|---|
| -u or --user | Aurora MySQL database user |
| -p or --password | Password of the database user |
| -P or --port | Port of the database |
| -h or --host | IP address of the database |
| -t or --thread | Number of threads used for export |
| -o or --output | Directory that stores the exported files; supports a local file path or an external storage URL |
| -r or --row | Maximum number of rows per file |
| -F | Maximum size of a single file, in MiB; the recommended value is 256 MiB |
| -B or --database | Export the specified databases |
| -T or --tables-list | Export the specified tables |
| -d or --no-data | Do not export data; export the schema only |
| -f or --filter | Export tables that match the pattern; cannot be used together with -T. For the syntax, see table filter |
Step 3: Write the TiDB Lightning configuration file
Create the tidb-lightning.toml configuration file:

```shell
vim tidb-lightning.toml
```
```toml
[tidb]
# Information of the target TiDB cluster.
host = ${host}                # e.g.: 172.16.32.1
port = ${port}                # e.g.: 4000
user = "${user_name}"         # e.g.: "root"
password = "${password}"      # e.g.: "rootroot"
status-port = ${status-port}  # The table schema information is fetched from the "status port" of TiDB, e.g.: 10080
pd-addr = "${ip}:${port}"     # The address of the cluster's PD. Lightning obtains some information through PD, e.g. 172.16.31.3:2379. When backend = "local", status-port and pd-addr must be filled in correctly; otherwise, the import will encounter errors.

[tikv-importer]
# "local": the default mode, suitable for datasets of TB level and above. During the import, however, the downstream TiDB cannot provide services.
# "tidb": for datasets below TB level, the `tidb` backend mode can also be used, and the downstream TiDB can provide services normally during the import. For more information about backend modes, see: https://docs.pingcap.com/tidb/stable/tidb-lightning-backends
backend = "local"

# Sets the temporary storage directory for the sorted key-value pairs. The directory must be empty, and its space must be larger than the size of the dataset to be imported. A directory on a disk different from `data-source-dir`, on flash media, and with exclusive I/O gives better import performance.
sorted-kv-dir = "${path}"

[mydumper]
# The path of the snapshot files.
data-source-dir = "${s3_path}"  # e.g.: s3://my-bucket/sql-backup?region=us-west-2

[[mydumper.files]]
# The expression needed to parse the parquet files.
pattern = '(?i)^(?:[^/]*/)*([a-z0-9_]+)\.([a-z0-9_]+)/(?:[^/]*/)*(?:[a-z0-9\-_.]+\.(parquet))$'
schema = '$1'
table = '$2'
type = '$3'
```
If you need to enable TLS in the TiDB cluster, refer to TiDB Lightning Configuration.
Step 4: Import the full data into TiDB
Use TiDB Lightning to create the tables in the downstream TiDB:

```shell
tiup tidb-lightning -config tidb-lightning.toml -d 's3://my-bucket/schema-backup?region=us-west-2'
```

Then run tidb-lightning to import the data. If you start the program directly from the command line, it might exit due to the SIGHUP signal, so it is recommended to run it with a tool such as nohup or screen. Pass the SecretKey and AccessKey of the account that can access the Amazon S3 backend storage to the Lightning node as environment variables; reading the credential file from ~/.aws/credentials is also supported. For example:

```shell
export AWS_ACCESS_KEY_ID=${access_key}
export AWS_SECRET_ACCESS_KEY=${secret_key}
nohup tiup tidb-lightning -config tidb-lightning.toml > nohup.out 2>&1 &
```

After the import starts, you can check the progress in any of the following ways:
- Grep the log for the keyword `progress`, which is updated every 5 minutes by default; see the sketch after this list.
- Check the progress through the monitoring dashboard. For details, see TiDB Lightning Monitoring.
- Check the progress through the web page. For details, see Web Interface.
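A minimal sketch of the first option, assuming the log was redirected to nohup.out as in the command above:

```shell
# Print the most recent progress line from the TiDB Lightning log.
# The progress keyword is written roughly every 5 minutes by default.
grep 'progress' nohup.out | tail -n 1
```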
After the import finishes, TiDB Lightning exits automatically. Check whether the last 5 lines of the log contain `the whole procedure completed`; if so, the import was successful.
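For example, still assuming the nohup.out log path used above:

```shell
# The success marker appears in the last lines of the log after a completed import.
tail -n 5 nohup.out | grep 'the whole procedure completed'
```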
Note
Regardless of whether the import succeeds, the last line of the log shows `tidb lightning exit`. It only means that TiDB Lightning exited normally, not that the import task completed successfully.
If you encounter problems during the import, see TiDB Lightning FAQ.
Continuously replicate incremental data to TiDB (optional)
Prerequisites
- A deployed DM cluster.
Step 1: Create a data source
Create a source1.yaml file with the following content:

```yaml
# Unique name; must not duplicate the name of any other source.
source-id: "mysql-01"

# Whether DM-worker uses the global transaction identifier (GTID) to pull binlog.
# Using it requires GTID mode to be enabled on the upstream MySQL.
# If the upstream has automatic primary-secondary switching, GTID mode must be used.
enable-gtid: false

from:
  host: "${host}"          # e.g.: 172.16.10.81
  user: "root"
  password: "${password}"  # Plaintext passwords are supported but not recommended; it is recommended to encrypt the plaintext password with dmctl encrypt before use.
  port: 3306
```

Run the following command in a terminal to load the data source configuration into the DM cluster with tiup dmctl:

```shell
tiup dmctl --master-addr ${advertise-addr} operate-source create source1.yaml
```

The parameters in this command are described as follows:
| Parameters | Description |
|---|---|
| --master-addr | The {advertise-addr} of any DM-master node in the cluster that dmctl connects to, e.g.: 172.16.10.71:8261 |
| operate-source create | Loads the data source into the DM cluster |
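To confirm that the data source was loaded, you can list the sources known to the DM cluster; this is a sketch assuming your DM version's dmctl provides the operate-source show subcommand:

```shell
# List the data sources loaded into the DM cluster; mysql-01 should appear.
tiup dmctl --master-addr ${advertise-addr} operate-source show
```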
Step 2: Create a migration task
Create a task1.yaml file with the following content:
```yaml
# Task name. Tasks running at the same time must each have a unique name.
name: "test"
# Task mode. Can be set to:
# full: full data migration only
# incremental: binlog real-time replication only
# all: full migration + binlog replication
task-mode: "incremental"
# Configuration of the downstream TiDB.
target-database:
  host: "${host}"          # e.g.: 172.16.10.83
  port: 4000
  user: "root"
  password: "${password}"  # Plaintext passwords are supported but not recommended; it is recommended to encrypt the plaintext password with dmctl encrypt before use.

# Global configuration of block and allow lists, referenced by each instance.
block-allow-list:          # Use black-white-list instead if the DM version is earlier than v2.0.0-beta.2.
  listA:                   # Name of the list.
    do-tables:             # Allow list of upstream tables to be migrated.
    - db-name: "test_db"   # Name of the database the table belongs to.
      tbl-name: "test_table"  # Name of the table to be migrated.

# Configure the data source.
mysql-instances:
  - source-id: "mysql-01"          # Data source ID, i.e. the source-id in source1.yaml.
    block-allow-list: "listA"      # References the block and allow list configuration above.
    # syncer-config-name: "global" # References the syncers incremental data configuration below.
    meta:                          # When task-mode is incremental and the downstream database has no checkpoint, this is where binlog replication starts; if a checkpoint exists, the checkpoint takes precedence.
      binlog-name: "mysql-bin.000004"  # The binlog position recorded in "Step 1: Export an Aurora snapshot to Amazon S3". If the upstream has primary-secondary switching, GTID must be used instead.
      binlog-pos: 109227
      # binlog-gtid: "09bec856-ba95-11ea-850a-58f2b4af5188:1-9"

# (Optional) If incremental replication needs to re-migrate data that was already covered by the full migration, enable safe mode to avoid incremental replication errors.
# This scenario typically occurs when the fully migrated data does not belong to a consistent snapshot of the data source, and incremental data is then replicated from a position earlier than the full migration.
# syncers:            # Configuration parameters of the sync processing unit.
#   global:           # Configuration name.
#     safe-mode: true # If set to true, DM rewrites INSERT from the data source to REPLACE, and UPDATE to DELETE plus REPLACE, so that DML can be imported repeatedly as long as the table schema has a primary key or a unique index. In the first minute after starting or resuming an incremental replication task, TiDB DM enables safe mode automatically.
```
The above is the minimum task configuration required to perform the migration. For more task configuration items, refer to the introduction to the complete DM task configuration file.
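Both configuration files above recommend encrypted rather than plaintext passwords. A minimal sketch of producing one, assuming your DM version's dmctl provides the encrypt subcommand:

```shell
# Encrypt a plaintext password; paste the output into the password fields
# of source1.yaml and task1.yaml instead of the plaintext value.
tiup dmctl encrypt 'your-plaintext-password'
```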
Step 3: Start the task
Before starting the data migration task, it is recommended to use the check-task command to verify that the configuration meets DM's requirements, which reduces the chance of errors later:

```shell
tiup dmctl --master-addr ${advertise-addr} check-task task1.yaml
```
Then use tiup dmctl to run the following command to start the data migration task:

```shell
tiup dmctl --master-addr ${advertise-addr} start-task task1.yaml
```

The parameters in this command are described as follows:
| Parameters | Description |
|---|---|
| --master-addr | The {advertise-addr} of any DM-master node in the cluster that dmctl connects to, e.g.: 172.16.10.71:8261 |
| start-task | Creates and starts a data migration task |
If the task fails to start, modify the configuration according to the error message in the returned result, and then run the command above again to restart the task. For details, refer to Handle Errors and FAQ.
Step 4: Check the task status
To check whether the DM cluster has running migration tasks and to view their status, run the query-status command with tiup dmctl:

```shell
tiup dmctl --master-addr ${advertise-addr} query-status ${task-name}
```
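For example, with the task name "test" defined in task1.yaml above:

```shell
# Query the status of the "test" migration task.
tiup dmctl --master-addr ${advertise-addr} query-status test
```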
For a detailed interpretation of the query results, refer to Query Status.
Step 5: Monitor the task and view logs
To view the historical status of the migration task and other internal operational metrics, follow the steps below.
If Prometheus, Alertmanager, and Grafana were correctly deployed when you deployed the DM cluster with TiUP, you can log in to Grafana using the IP address and port specified during deployment, and then select the DM dashboard to view DM-related monitoring metrics.
When DM is running, DM-worker, DM-master, and dmctl output relevant information in their logs. The log directories of these components are as follows:
- DM-master log directory: set by the DM-master process parameter --log-file. If DM is deployed with TiUP, the log directory is /dm-deploy/dm-master-8261/log/.
- DM-worker log directory: set by the DM-worker process parameter --log-file. If DM is deployed with TiUP, the log directory is /dm-deploy/dm-worker-8262/log/.
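For example, to follow the DM-master log under the TiUP deployment path listed above; note that the log file name here is an assumption, so check the directory for the actual name:

```shell
# Follow the DM-master log; the dm-master.log file name is assumed.
tail -f /dm-deploy/dm-master-8261/log/dm-master.log
```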