TiDB unsafe-recover (the number of failed TiKV nodes is at least half the replica count)
2022-06-11 17:33:00 【On the way to data communication】
I. Background
| Item | Count |
|---|---|
| TiKV nodes | 4 |
| Replicas | 3 |
II. Environment preparation
1. Install jq
# ubuntu
apt install jq
# centos
yum install jq
2. Prepare the data
Use sysbench or write your own script to load test data.
3. Simulate downtime
Delete the data directories of the affected TiKV nodes, or force scale-in those TiKV nodes.
4. Symptoms
mysql> select * from region_1 limit 10;
ERROR 9005 (HY000): Region is unavailable
ERROR 9002 (HY000): TiKV server timeout
III. Simulated scenarios
1. Two TiKV nodes down
With three replicas, losing only two TiKV nodes leaves every Region with at least one surviving replica, so no data is lost. For the recovery procedure, refer to the "three TiKV nodes, two down" case.
2. Three TiKV nodes down
2.1. Find the disconnected stores
# Record the store IDs whose "state_name" is "Disconnected" (mine are 4, 5, 1253)
tiup ctl:v4.0.13 pd -u http://pd_ip:pd_port store
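The store IDs can also be collected programmatically rather than by eye. The following is a minimal Python sketch, assuming the `store` command prints JSON shaped like `{"stores": [{"store": {"id": ..., "state_name": ...}}]}`. The sample payload and the function name `disconnected_store_ids` are illustrative and not taken from a real cluster.

```python
import json
from typing import List


def disconnected_store_ids(store_json: str) -> List[int]:
    """Return the IDs of stores whose state_name is "Disconnected"."""
    payload = json.loads(store_json)
    return sorted(
        entry["store"]["id"]
        for entry in payload.get("stores", [])
        if entry["store"].get("state_name") == "Disconnected"
    )


# Hypothetical sample, trimmed to the two fields this sketch reads.
sample = json.dumps({
    "count": 4,
    "stores": [
        {"store": {"id": 1, "state_name": "Up"}},
        {"store": {"id": 4, "state_name": "Disconnected"}},
        {"store": {"id": 5, "state_name": "Disconnected"}},
        {"store": {"id": 1253, "state_name": "Disconnected"}},
    ],
})

print(disconnected_store_ids(sample))
```

In practice the JSON would come from the saved output of the `store` command above instead of the inline sample.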
2.2. Check for lost replicas
# List Regions that have lost at least half of their replicas (failed peers >= surviving peers)
region --jq='.regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(length as $total | map(if .==(1253,4,5) then . else empty end) | length>=$total-length)}'
# List Regions that have lost all of their replicas
region --jq='.regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(length as $total | map(if .==(1253,4,5) then . else empty end) | length>=$total)}'
# For jq usage, see https://asktug.com/t/topic/63086
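The counting logic behind these filters may be easier to follow outside jq. Below is a rough Python sketch under the assumption that each Region object carries an `id` and a `peers` list with `store_id` fields, as the jq expressions imply. The sample Regions and the name `classify_regions` are made up for illustration. Unlike the jq filters, it reports the two groups separately.

```python
FAILED_STORES = {1253, 4, 5}  # store IDs recorded in step 2.1


def classify_regions(regions, failed_stores):
    """Split Region IDs into (lost a majority but not all, lost every replica)."""
    majority_lost, all_lost = [], []
    for region in regions:
        peers = [peer["store_id"] for peer in region["peers"]]
        if not peers:
            continue  # nothing to classify
        failed = sum(1 for store_id in peers if store_id in failed_stores)
        if failed == len(peers):
            all_lost.append(region["id"])
        elif failed >= len(peers) - failed:
            # Failed peers are at least as many as surviving ones: no quorum.
            majority_lost.append(region["id"])
    return majority_lost, all_lost


# Hypothetical Regions with three peers each.
sample_regions = [
    {"id": 10, "peers": [{"store_id": 1}, {"store_id": 2}, {"store_id": 4}]},
    {"id": 11, "peers": [{"store_id": 1}, {"store_id": 4}, {"store_id": 5}]},
    {"id": 12, "peers": [{"store_id": 4}, {"store_id": 5}, {"store_id": 1253}]},
]

print(classify_regions(sample_regions, FAILED_STORES))
```

Regions in the first list can be repaired with unsafe-recover remove-fail-stores; Regions in the second list have no surviving replica and have to be recreated empty.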
2.3. Disable PD scheduling to avoid anomalies during recovery
# Enter interactive mode
tiup ctl:v4.0.13 pd -u http://pd_ip:pd_port -i
# Run the following commands one by one
config set region-schedule-limit 0
config set replica-schedule-limit 0
config set leader-schedule-limit 0
config set merge-schedule-limit 0
# Confirm that scheduling has stopped (no operators should be listed)
operator show
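Because the limits have to be restored later, it helps to record their current values before setting them to 0. The following Python sketch assumes that `config show` in pd-ctl prints JSON with a `schedule` section containing these four keys; that layout, the sample values, and the helper names are assumptions made for illustration.

```python
import json
from typing import Dict, List

LIMIT_KEYS = [
    "region-schedule-limit",
    "replica-schedule-limit",
    "leader-schedule-limit",
    "merge-schedule-limit",
]


def current_limits(config_json: str) -> Dict[str, int]:
    """Pick the four scheduling limits out of the (assumed) config JSON."""
    schedule = json.loads(config_json)["schedule"]
    return {key: schedule[key] for key in LIMIT_KEYS}


def config_set_commands(limits: Dict[str, int]) -> List[str]:
    """Render pd-ctl `config set` lines for the given limits."""
    return [f"config set {key} {value}" for key, value in limits.items()]


# Hypothetical config output, trimmed to the relevant section.
sample_config = json.dumps({"schedule": {
    "region-schedule-limit": 2048,
    "replica-schedule-limit": 64,
    "leader-schedule-limit": 4,
    "merge-schedule-limit": 8,
}})

saved = current_limits(sample_config)
disable = config_set_commands({key: 0 for key in saved})
restore = config_set_commands(saved)

print("\n".join(disable))
print("\n".join(restore))
```

The `disable` lines match the commands above, and the `restore` lines are what would be replayed once recovery is finished.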
2.4. Stop the TiKV processes (otherwise unsafe-recover remove-fail-stores fails to acquire the file lock)
tiup cluster stop cluster_name -R tikv
2.5. Run unsafe-recover remove-fail-stores
2.5.1 Copy tikv-ctl to every TiKV machine that is still healthy
scp /data/tidb/.tiup/components/ctl/v4.0.13/tikv-ctl [email protected]:/home/tidb
scp /data/tidb/.tiup/components/ctl/v4.0.13/tikv-ctl [email protected]:/home/tidb
scp /data/tidb/.tiup/components/ctl/v4.0.13/tikv-ctl [email protected]:/home/tidb
2.5.2 Run the tikv-ctl command
# 4.0.x command: -s lists the failed store IDs; --all-regions applies to every Region; -r can be used to name specific Regions instead of --all-regions
# unsafe-recover remove-fail-stores removes the failed machines from the peer list of the specified Regions
./tikv-ctl --db /data/tikv/tikv-data28016/db unsafe-recover remove-fail-stores -s 1253,4,5 --all-regions
# 5.x command
./tikv-ctl --data-dir /data/tikv/tikv-data28016 unsafe-recover remove-fail-stores -s 1253,4,5 --all-regions
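The only difference between the two variants is how the data location is passed. A small Python sketch of that choice, using only the flags shown above; the helper name `remove_fail_stores_cmd` is hypothetical, and treating every major version up to 4 like 4.0.x is an assumption:

```python
import shlex
from typing import List


def remove_fail_stores_cmd(major_version: int, data_dir: str, store_ids: List[int]) -> str:
    """Build the tikv-ctl command line for the given TiKV major version."""
    if major_version <= 4:
        # 4.0.x points --db at the db subdirectory of the data directory.
        location = ["--db", data_dir.rstrip("/") + "/db"]
    else:
        # 5.x takes the data directory itself.
        location = ["--data-dir", data_dir]
    argv = [
        "./tikv-ctl", *location,
        "unsafe-recover", "remove-fail-stores",
        "-s", ",".join(str(store_id) for store_id in store_ids),
        "--all-regions",
    ]
    return " ".join(shlex.quote(arg) for arg in argv)


print(remove_fail_stores_cmd(4, "/data/tikv/tikv-data28016", [1253, 4, 5]))
print(remove_fail_stores_cmd(5, "/data/tikv/tikv-data28016", [1253, 4, 5]))
```

The command has to be run on each healthy TiKV machine with that machine's own data directory.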
The steps above recover the Regions that lost two replicas. The next step is to recover the Regions that lost all three replicas.
2.6 Repair Regions that lost all three replicas
2.6.1 Inspect the Region
curl http://tidb_ip:10080/regions/1189
{
"start_key": "dIAAAAAAAAAZ",
"end_key": "dIAAAAAAAAAb",
"start_key_hex": "748000000000000019",
"end_key_hex": "74800000000000001b",
"region_id": 52,
"frames": [
{
"db_name": "mysql",
"table_name": "stats_buckets",
"table_id": 25,
"is_record": false,
"index_name": "tbl",
"index_id": 1
},
{
"db_name": "mysql",
"table_name": "stats_buckets",
"table_id": 25,
"is_record": true
}
]
}
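The `start_key_hex` and `end_key_hex` values show which tables the Region covers. As a sketch, assuming the usual TiDB table-key layout (the byte `t` followed by the table ID as an 8-byte big-endian integer with its sign bit flipped), the table ID can be decoded like this; the function name is illustrative:

```python
def table_id_from_key_hex(key_hex: str) -> int:
    """Decode the table ID from a hex-encoded table key prefix."""
    raw = bytes.fromhex(key_hex)
    if len(raw) < 9 or raw[0:1] != b"t":
        raise ValueError("not a table key: " + key_hex)
    encoded = int.from_bytes(raw[1:9], "big")
    # Undo the sign-bit flip, then reinterpret as a signed 64-bit integer.
    value = encoded ^ (1 << 63)
    if value >= 1 << 63:
        value -= 1 << 64
    return value


print(table_id_from_key_hex("748000000000000019"))  # 25
print(table_id_from_key_hex("74800000000000001b"))  # 27
```

The start key decodes to 25, which matches the `table_id` reported in `frames`, so this Region spans table 25 up to the start of table 27.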
2.6.2 Create an empty Region
# v4
./tikv-ctl --db /data/tidb/tidb-data/tikv-20160/db recreate-region -p pd_ip:pd_port -r region_id
# v5
./tikv-ctl --data-dir /data/tidb/tidb-data/tikv-20160/ recreate-region -p pd_ip:pd_port -r region_id
2.7. Restore PD scheduling
# Enter interactive mode
tiup ctl:v4.0.13 pd -u http://pd_ip:pd_port -i
# Run the following commands one by one (use the values that were in effect before scheduling was disabled)
config set region-schedule-limit 2048
config set replica-schedule-limit 64
config set leader-schedule-limit 4
config set merge-schedule-limit 8
2.8. Start the TiKV cluster
tiup cluster start cluster_name -R tikv
Note: the cluster is accessible again at this point, but some data has been lost. Scale out TiKV nodes so that every Region is back to three replicas.