Use Dumpling to back up TiDB cluster data to S3-compatible storage
This document describes how to back up the data of a TiDB cluster on Kubernetes to S3-compatible storage. "Backup" in this document refers to full backup, i.e., Ad-hoc full backup and scheduled full backup.
The backup method described in this document is implemented based on the CustomResourceDefinition (CRD) of TiDB Operator (v1.1 and later). At the bottom layer, the Dumpling tool is used to get a logical backup of the cluster, and the backup data is then uploaded to S3-compatible storage.
Dumpling is a data export tool that exports data stored in TiDB or MySQL as SQL or CSV files. You can use it to make a logical full backup or a full export.
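For reference, here is a minimal sketch of invoking Dumpling directly against a cluster, outside of TiDB Operator; the host, port, and output path are placeholders, and the flags shown are standard Dumpling options:

```shell
# Export all data from a TiDB instance as SQL files, using 8 export
# threads and splitting large tables into chunks of 200000 rows.
dumpling -h 127.0.0.1 -P 4000 -u root \
  --filetype sql -t 8 -r 200000 -o /tmp/backup
```

In the Kubernetes workflow below, TiDB Operator assembles an equivalent invocation for you from the fields of the Backup CR.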
Usage scenarios
If you need to back up TiDB cluster data to S3-compatible storage using Ad-hoc full backup or scheduled full backup, and you have any of the following requirements, consider the backup solution introduced in this document:
- Export the data in SQL or CSV format
- Limit the memory usage of a single SQL statement
- Export a snapshot of TiDB historical data
Ad-hoc full backup
An Ad-hoc full backup is described by creating a customized Backup custom resource (CR) object. TiDB Operator performs the specific backup operation according to this Backup object. If an error occurs during the backup, the program does not retry automatically, and you need to handle the failure manually.
Among S3-compatible storage services, Ceph and Amazon S3 have been tested and work normally. The following examples show how to back up TiDB cluster data to Ceph and Amazon S3. The examples assume that you back up the data of the TiDB cluster demo1 deployed in the Kubernetes namespace tidb-cluster; the specific steps are as follows.
Prerequisites
Before using Dumpling to back up TiDB cluster data to S3, make sure that you have the following privileges on the database to be backed up:
- The SELECT and UPDATE privileges on the mysql.tidb table: before and after the backup, the Backup CR needs a database account with these privileges to adjust the GC time.
- Global privileges: SELECT, RELOAD, LOCK TABLES, and REPLICATION CLIENT.
The following is an example of creating a backup user:
```sql
CREATE USER 'backup'@'%' IDENTIFIED BY '...';
GRANT SELECT, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'backup'@'%';
GRANT UPDATE, SELECT ON mysql.tidb TO 'backup'@'%';
```
Step 1: Prepare for Ad-hoc full backup
1. Execute the following command to create the role-based access control (RBAC) resources in the tidb-cluster namespace based on backup-rbac.yaml:

```shell
kubectl apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/v1.3.6/manifests/backup/backup-rbac.yaml -n tidb-cluster
```

2. Grant permissions to access the remote storage.

If you back up the cluster to Amazon S3, you can grant permissions in one of three ways; refer to AWS account authorization for access to S3-compatible remote storage. When Ceph is used as the backend storage for a test backup, permissions are granted via AccessKey and SecretKey; refer to Grant permissions via AccessKey and SecretKey. A sketch of creating the AccessKey/SecretKey secret follows this list.
3. Create the backup-demo1-tidb-secret secret, which stores the root account and password needed to access the TiDB cluster:

```shell
kubectl create secret generic backup-demo1-tidb-secret --from-literal=password=${password} --namespace=tidb-cluster
```
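For the AccessKey and SecretKey method mentioned above, the s3-secret referenced in the following examples can be created with a command like this (a sketch; the access_key/secret_key literal names follow the TiDB Operator convention, and the values are placeholders for your own credentials):

```shell
# Store the S3 credentials in the namespace where the Backup CR runs
kubectl create secret generic s3-secret \
  --from-literal=access_key=${access_key} \
  --from-literal=secret_key=${secret_key} \
  --namespace=tidb-cluster
```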
Step 2: Back up data to S3-compatible storage
Note:

Because of an issue in rclone, if the backup is stored in Amazon S3 and AWS-KMS encryption is enabled on Amazon S3, you need to add the following spec.s3.options configuration to the yaml file to ensure a successful backup:

```yaml
spec:
  ...
  s3:
    ...
    options:
    - --ignore-checksum
```
This section lists multiple ways to grant access to the storage. Use only the method that matches your situation:
- Back up data to Amazon S3 by importing AccessKey and SecretKey
- Back up data to Ceph by importing AccessKey and SecretKey
- Back up data to Amazon S3 by binding IAM with Pod
- Back up data to Amazon S3 by binding IAM with ServiceAccount
Method 1: Create the Backup CR, and back up data to Amazon S3 via AccessKey and SecretKey:

```shell
kubectl apply -f backup-s3.yaml
```

The content of backup-s3.yaml is as follows:

```yaml
---
apiVersion: pingcap.com/v1alpha1
kind: Backup
metadata:
  name: demo1-backup-s3
  namespace: tidb-cluster
spec:
  from:
    host: ${tidb_host}
    port: ${tidb_port}
    user: ${tidb_user}
    secretName: backup-demo1-tidb-secret
  s3:
    provider: aws
    secretName: s3-secret
    region: ${region}
    bucket: ${bucket}
    # prefix: ${prefix}
    # storageClass: STANDARD_IA
    # acl: private
    # endpoint:
  # dumpling:
  #  options:
  #  - --threads=16
  #  - --rows=10000
  #  tableFilter:
  #  - "test.*"
  # storageClassName: local-storage
  storageSize: 10Gi
```

Method 2: Create the Backup CR, and back up data to Ceph via AccessKey and SecretKey:

```shell
kubectl apply -f backup-s3.yaml
```

The content of backup-s3.yaml is as follows:

```yaml
---
apiVersion: pingcap.com/v1alpha1
kind: Backup
metadata:
  name: demo1-backup-s3
  namespace: tidb-cluster
spec:
  from:
    host: ${tidb_host}
    port: ${tidb_port}
    user: ${tidb_user}
    secretName: backup-demo1-tidb-secret
  s3:
    provider: ceph
    secretName: s3-secret
    endpoint: ${endpoint}
    # prefix: ${prefix}
    bucket: ${bucket}
  # dumpling:
  #  options:
  #  - --threads=16
  #  - --rows=10000
  #  tableFilter:
  #  - "test.*"
  # storageClassName: local-storage
  storageSize: 10Gi
```

Method 3: Create the Backup CR, and back up data to Amazon S3 by binding IAM with Pod:

```shell
kubectl apply -f backup-s3.yaml
```

The content of backup-s3.yaml is as follows:

```yaml
---
apiVersion: pingcap.com/v1alpha1
kind: Backup
metadata:
  name: demo1-backup-s3
  namespace: tidb-cluster
  annotations:
    iam.amazonaws.com/role: arn:aws:iam::123456789012:role/user
spec:
  backupType: full
  from:
    host: ${tidb_host}
    port: ${tidb_port}
    user: ${tidb_user}
    secretName: backup-demo1-tidb-secret
  s3:
    provider: aws
    region: ${region}
    bucket: ${bucket}
    # prefix: ${prefix}
    # storageClass: STANDARD_IA
    # acl: private
    # endpoint:
  # dumpling:
  #  options:
  #  - --threads=16
  #  - --rows=10000
  #  tableFilter:
  #  - "test.*"
  # storageClassName: local-storage
  storageSize: 10Gi
```

Method 4: Create the Backup CR, and back up data to Amazon S3 by binding IAM with ServiceAccount:

```shell
kubectl apply -f backup-s3.yaml
```

The content of backup-s3.yaml is as follows:

```yaml
---
apiVersion: pingcap.com/v1alpha1
kind: Backup
metadata:
  name: demo1-backup-s3
  namespace: tidb-cluster
spec:
  backupType: full
  serviceAccount: tidb-backup-manager
  from:
    host: ${tidb_host}
    port: ${tidb_port}
    user: ${tidb_user}
    secretName: backup-demo1-tidb-secret
  s3:
    provider: aws
    region: ${region}
    bucket: ${bucket}
    # prefix: ${prefix}
    # storageClass: STANDARD_IA
    # acl: private
    # endpoint:
  # dumpling:
  #  options:
  #  - --threads=16
  #  - --rows=10000
  #  tableFilter:
  #  - "test.*"
  # storageClassName: local-storage
  storageSize: 10Gi
```
The above examples export all data of the TiDB cluster and back it up to Amazon S3 and Ceph. The acl, endpoint, and storageClass configuration items of Amazon S3 can be omitted. S3-compatible storage other than Amazon S3 can use a configuration similar to that of Amazon S3; you can also follow the Ceph configuration in the examples above and omit the fields that do not need to be configured. For more S3-compatible storage configuration, refer to the S3 storage fields.
In the examples above, .spec.dumpling refers to the Dumpling-related configuration. You can specify Dumpling's operation parameters in the options field; see the Dumpling documentation for details. By default, this field can be left empty. When you do not specify the Dumpling configuration, the options field defaults to the following values:
```yaml
options:
- --threads=16
- --rows=10000
```
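To override these defaults, uncomment the dumpling section of the CR. A minimal sketch (the flag values and the "db1.*" filter are illustrative choices, not requirements; --threads, --rows, and --filetype are standard Dumpling options):

```yaml
spec:
  dumpling:
    options:
    - --threads=8        # fewer export threads to reduce load on the cluster
    - --rows=10000       # chunk size for splitting large tables
    - --filetype=csv     # export CSV files instead of the default SQL
    tableFilter:
    - "db1.*"            # only export tables in database db1
```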
For more detailed explanations of the Backup CR fields, refer to the Backup CR field introduction.
After you create the Backup CR, you can check the backup status with the following command:
```shell
kubectl get bk -n tidb-cluster -owide
```
To get detailed information about a backup job, use the following command. For $backup_job_name, use the name from the output of the previous command:
```shell
kubectl describe bk -n tidb-cluster $backup_job_name
```
If you want to run the Ad-hoc backup again, you need to delete the backed-up Backup CR and re-create it.
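A minimal sketch of that delete-and-recreate cycle, assuming the demo1-backup-s3 CR from the examples above:

```shell
# Delete the finished Backup CR ("bk" is the short name for Backup, as used above)
kubectl delete bk demo1-backup-s3 -n tidb-cluster
# Re-apply the manifest to trigger a new ad-hoc backup
kubectl apply -f backup-s3.yaml
```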
Scheduled full backup
You can set a backup policy to perform scheduled backups of the TiDB cluster, and set a retention policy to avoid keeping too many backups. A scheduled full backup is described by a customized BackupSchedule CR object. A full backup is triggered at each scheduled time point; the underlying implementation of scheduled full backup is the Ad-hoc full backup. The following are the specific steps to create a scheduled full backup:
Step 1: Prepare for scheduled full backup
The preparation for scheduled full backup is the same as that for Ad-hoc full backup.
Step 2: Back up data to S3-compatible storage on a schedule
Note:

Because of an issue in rclone, if the backup is stored in Amazon S3 and AWS-KMS encryption is enabled on Amazon S3, you need to add the following spec.backupTemplate.s3.options configuration to the yaml file to ensure a successful backup:

```yaml
spec:
  ...
  backupTemplate:
    ...
    s3:
      ...
      options:
      - --ignore-checksum
```
Method 1: Create the BackupSchedule CR to enable scheduled full backup for the TiDB cluster, and back up data to Amazon S3 via AccessKey and SecretKey:

```shell
kubectl apply -f backup-schedule-s3.yaml
```

The content of backup-schedule-s3.yaml is as follows:

```yaml
---
apiVersion: pingcap.com/v1alpha1
kind: BackupSchedule
metadata:
  name: demo1-backup-schedule-s3
  namespace: tidb-cluster
spec:
  #maxBackups: 5
  #pause: true
  maxReservedTime: "3h"
  schedule: "*/2 * * * *"
  backupTemplate:
    from:
      host: ${tidb_host}
      port: ${tidb_port}
      user: ${tidb_user}
      secretName: backup-demo1-tidb-secret
    s3:
      provider: aws
      secretName: s3-secret
      region: ${region}
      bucket: ${bucket}
      # prefix: ${prefix}
      # storageClass: STANDARD_IA
      # acl: private
      # endpoint:
    # dumpling:
    #  options:
    #  - --threads=16
    #  - --rows=10000
    #  tableFilter:
    #  - "test.*"
    # storageClassName: local-storage
    storageSize: 10Gi
```

Method 2: Create the BackupSchedule CR to enable scheduled full backup for the TiDB cluster, and back up data to Ceph via AccessKey and SecretKey:

```shell
kubectl apply -f backup-schedule-s3.yaml
```

The content of backup-schedule-s3.yaml is as follows:

```yaml
---
apiVersion: pingcap.com/v1alpha1
kind: BackupSchedule
metadata:
  name: demo1-backup-schedule-ceph
  namespace: tidb-cluster
spec:
  #maxBackups: 5
  #pause: true
  maxReservedTime: "3h"
  schedule: "*/2 * * * *"
  backupTemplate:
    from:
      host: ${tidb_host}
      port: ${tidb_port}
      user: ${tidb_user}
      secretName: backup-demo1-tidb-secret
    s3:
      provider: ceph
      secretName: s3-secret
      endpoint: ${endpoint}
      bucket: ${bucket}
      # prefix: ${prefix}
    # dumpling:
    #  options:
    #  - --threads=16
    #  - --rows=10000
    #  tableFilter:
    #  - "test.*"
    # storageClassName: local-storage
    storageSize: 10Gi
```

Method 3: Create the BackupSchedule CR to enable scheduled full backup for the TiDB cluster, and back up data to Amazon S3 by binding IAM with Pod:

```shell
kubectl apply -f backup-schedule-s3.yaml
```

The content of backup-schedule-s3.yaml is as follows:

```yaml
---
apiVersion: pingcap.com/v1alpha1
kind: BackupSchedule
metadata:
  name: demo1-backup-schedule-s3
  namespace: tidb-cluster
  annotations:
    iam.amazonaws.com/role: arn:aws:iam::123456789012:role/user
spec:
  #maxBackups: 5
  #pause: true
  maxReservedTime: "3h"
  schedule: "*/2 * * * *"
  backupTemplate:
    from:
      host: ${tidb_host}
      port: ${tidb_port}
      user: ${tidb_user}
      secretName: backup-demo1-tidb-secret
    s3:
      provider: aws
      region: ${region}
      bucket: ${bucket}
      # prefix: ${prefix}
      # storageClass: STANDARD_IA
      # acl: private
      # endpoint:
    # dumpling:
    #  options:
    #  - --threads=16
    #  - --rows=10000
    #  tableFilter:
    #  - "test.*"
    # storageClassName: local-storage
    storageSize: 10Gi
```

Method 4: Create the BackupSchedule CR to enable scheduled full backup for the TiDB cluster, and back up data to Amazon S3 by binding IAM with ServiceAccount:

```shell
kubectl apply -f backup-schedule-s3.yaml
```

The content of backup-schedule-s3.yaml is as follows:

```yaml
---
apiVersion: pingcap.com/v1alpha1
kind: BackupSchedule
metadata:
  name: demo1-backup-schedule-s3
  namespace: tidb-cluster
spec:
  #maxBackups: 5
  #pause: true
  maxReservedTime: "3h"
  schedule: "*/2 * * * *"
  serviceAccount: tidb-backup-manager
  backupTemplate:
    from:
      host: ${tidb_host}
      port: ${tidb_port}
      user: ${tidb_user}
      secretName: backup-demo1-tidb-secret
    s3:
      provider: aws
      region: ${region}
      bucket: ${bucket}
      # prefix: ${prefix}
      # storageClass: STANDARD_IA
      # acl: private
      # endpoint:
    # dumpling:
    #  options:
    #  - --threads=16
    #  - --rows=10000
    #  tableFilter:
    #  - "test.*"
    # storageClassName: local-storage
    storageSize: 10Gi
```
After the scheduled full backup is created, you can check its status with the following command:
```shell
kubectl get bks -n tidb-cluster -owide
```
To check all the backups under the scheduled full backup:
```shell
kubectl get bk -l tidb.pingcap.com/backup-schedule=demo1-backup-schedule-s3 -n tidb-cluster
```
From the above examples, you can see that the backupSchedule configuration consists of two parts: the configuration unique to backupSchedule, and backupTemplate. backupTemplate specifies the configuration related to the cluster and remote storage; its fields are the same as the spec of the Backup CR, see the Backup CR field introduction for details. For the configuration items unique to backupSchedule, refer to the BackupSchedule CR field introduction.
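A minimal skeleton of that two-part layout (the daily 02:00 cron expression and 24h retention are illustrative choices, not taken from the examples above):

```yaml
spec:
  # Part 1: fields unique to BackupSchedule
  maxReservedTime: "24h"   # prune backups older than 24 hours
  schedule: "0 2 * * *"    # standard cron syntax: every day at 02:00
  # Part 2: same fields as .spec of a Backup CR
  backupTemplate:
    from:
      host: ${tidb_host}
      port: ${tidb_port}
      user: ${tidb_user}
      secretName: backup-demo1-tidb-secret
    s3:
      provider: aws
      region: ${region}
      bucket: ${bucket}
```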
Note:

TiDB Operator creates a PVC that is shared by Ad-hoc full backup and scheduled full backup. The backup data is first stored in the PV and then uploaded to the remote storage. If you want to delete this PVC after the backup, refer to Delete resources: first delete the backup Pod, then delete the PVC, as in the sketch below.

If the backup data is successfully uploaded to the remote storage, TiDB Operator automatically deletes the local backup files. If the upload fails, the local backup files are kept.
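A minimal sketch of that cleanup order (the Pod and PVC names are placeholders; list the resources first to find the actual names):

```shell
# Find the backup Pod and the PVC it used
kubectl get pod,pvc -n tidb-cluster
# Delete the backup Pod first, then the PVC
kubectl delete pod ${backup_pod_name} -n tidb-cluster
kubectl delete pvc ${backup_pvc_name} -n tidb-cluster
```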
Delete the backed-up Backup CR
When the backup is complete, you might need to delete the backed-up Backup CR. For how to do this, refer to Delete the backed-up Backup CR.