Troubleshooting the TiDB DM alarm DM_sync_process_exists_with_error
2022-07-02 11:18:00 【On the way of data communication】
I. Background
A DM synchronization task raised the alarm DM_sync_process_exists_with_error and recovered automatically within a minute. This post looks into the cause.
II. Errors observed in the logs
1. DM log
[2022/06/28 14:31:13.364 +00:00] [ERROR] [db.go:201] ["execute statements failed after retry"] [task=task-name] [unit="binlog replication"] [queries="[sql]"] [arguments="[[]]"] [error="[code=10006:class=database:scope=not-set:level=high], Message: execute statement failed: commit, RawCause: invalid connection"]
2. Upstream MySQL log
2022-06-28T14:31:19.413211Z 28801 [Note] Aborted connection 28801 to db: 'unconnected' user: '***' host: 'ip' (Got an error reading communication packets)
2022-06-28T14:31:22.154980Z 28802 [Note] Aborted connection 28802 to db: 'unconnected' user: '***' host: 'ip' (Got an error reading communication packets)
2022-06-28T14:31:32.158508Z 28804 [Note] Start binlog_dump to master_thread_id(28804) slave_server(429505412), pos(mysql-bin-changelog.103037, 36247149)
2022-06-28T14:31:32.158739Z 28803 [Note] Start binlog_dump to master_thread_id(28803) slave_server(429505202), pos(mysql-bin-changelog.103037, 40373779)
3. Downstream TiDB log
[2022/06/28 14:31:12.419 +00:00] [WARN] [client_batch.go:638] ["wait response is cancelled"] [to=dm_worker_ip:20160] [cause="context canceled"]
[2022/06/28 14:31:12.419 +00:00] [WARN] [client_batch.go:638] ["wait response is cancelled"] [to=dm_worker_ip:20160] [cause="context canceled"]
[2022/06/28 14:31:12.419 +00:00] [WARN] [client_batch.go:638] ["wait response is cancelled"] [to=dm_worker_ip:20160] [cause="context canceled"]
[2022/06/28 14:31:12.419 +00:00] [WARN] [client_batch.go:638] ["wait response is cancelled"] [to=dm_worker_ip:20160] [cause="context canceled"]
[2022/06/28 14:31:12.419 +00:00] [WARN] [client_batch.go:638] ["wait response is cancelled"] [to=dm_worker_ip:20160] [cause="context canceled"]
4. Downstream TiKV log
[2022/06/28 14:31:12.585 +00:00] [WARN] [endpoint.rs:537] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 2641161, leader may Some(id: 2641164 store_id: 5)\" not_leader { region_id: 2641161 leader { id: 2641164 store_id: 5 } }"]
[2022/06/28 14:31:12.585 +00:00] [WARN] [endpoint.rs:537] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 2641165, leader may Some(id: 2641167 store_id: 4)\" not_leader { region_id: 2641165 leader { id: 2641167 store_id: 4 } }"]
[2022/06/28 14:31:12.585 +00:00] [WARN] [endpoint.rs:537] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 2709997, leader may Some(id: 2709999 store_id: 4)\" not_leader { region_id: 2709997 leader { id: 2709999 store_id: 4 } }"]
[2022/06/28 14:31:12.585 +00:00] [WARN] [endpoint.rs:537] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 2839445, leader may Some(id: 2839447 store_id: 4)\" not_leader { region_id: 2839445 leader { id: 2839447 store_id: 4 } }"]
[2022/06/28 14:31:20.400 +00:00] [WARN] [endpoint.rs:537] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 2957169, leader may Some(id: 2957170 store_id: 1)\" not_leader { region_id: 2957169 leader { id: 2957170 store_id: 1 } }"]
[2022/06/28 14:31:20.400 +00:00] [WARN] [endpoint.rs:537] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 2957169, leader may Some(id: 2957170 store_id: 1)\" not_leader { region_id: 2957169 leader { id: 2957170 store_id: 1 } }"]
[2022/06/28 14:31:20.400 +00:00] [WARN] [endpoint.rs:537] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 2957169, leader may Some(id: 2957170 store_id: 1)\" not_leader { region_id: 2957169 leader { id: 2957170 store_id: 1 } }"]
[2022/06/28 14:31:05.617 +00:00] [WARN] [endpoint.rs:537] [error-response] [err="Key is locked (will clean up) primary_lock: 748000000F000 lock_version: 434222311815512066 key: 748000009725552F000 lock_ttl: 3003 txn_size: 1"]
[2022/06/28 14:31:05.634 +00:00] [WARN] [endpoint.rs:537] [error-response] [err="Key is locked (will clean up) primary_lock: 7480000000092F000 lock_version: 434222311815512092 key: 748000000000 lock_ttl: 3018 txn_size: 5"]
[2022/06/28 14:31:15.389 +00:00] [ERROR] [kv.rs:931] ["KvService response batch commands fail"]
[2022/06/28 14:31:15.432 +00:00] [ERROR] [kv.rs:931] ["KvService response batch commands fail"]
5. PD log
[2022/06/28 14:30:55.329 +00:00] [INFO] [operator_controller.go:424] ["add operator"] [region-id=2641161] [operator="\"transfer-hot-read-leader {transfer leader: store 1 to 5} (kind:hot-region,leader, region:2641161(25913,5), createAt:2022-06-28 14:30:55.329497692 +0000 UTC m=+8421773.911777457, startAt:0001-01-01 00:00:00 +0000 UTC, currentStep:0, steps:[transfer leader from store 1 to store 5])\""] ["additional info"=]
[2022/06/28 14:30:55.329 +00:00] [INFO] [operator_controller.go:620] ["send schedule command"] [region-id=2641161] [step="transfer leader from store 1 to store 5"] [source=create]
[2022/06/28 14:30:55.342 +00:00] [INFO] [cluster.go:567] ["leader changed"] [region-id=2641161] [from=1] [to=5]
[2022/06/28 14:30:55.342 +00:00] [INFO] [operator_controller.go:537] ["operator finish"] [region-id=2641161] [takes=12.961676ms] [operator="\"transfer-hot-read-leader {transfer leader: store 1 to 5} (kind:hot-region,leader, region:2641161(25913,5), createAt:2022-06-28 14:30:55.329497692 +0000 UTC m=+8421773.911777457, startAt:2022-06-28 14:30:55.329597613 +0000 UTC m=+8421773.911877386, currentStep:1, steps:[transfer leader from store 1 to store 5]) finished\""] ["additional info"=]
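The TiKV and PD excerpts above can be cross-checked by region ID. The following is a minimal sketch, assuming the log lines are available as plain strings; the function names (`not_leader_regions`, `leader_changes`, `correlate`) are illustrative helpers, not part of any TiDB tooling, and the `from`/`to` values in PD's "leader changed" line are treated as store IDs because they match the operator text in the same excerpt.

```python
import re

# Patterns for the two line shapes shown in the TiKV and PD excerpts.
NOT_LEADER = re.compile(
    r"not_leader \{ region_id: (\d+) leader \{ id: \d+ store_id: (\d+) \}"
)
LEADER_CHANGED = re.compile(
    r'\["leader changed"\] \[region-id=(\d+)\] \[from=(\d+)\] \[to=(\d+)\]'
)


def not_leader_regions(tikv_lines):
    """Map region_id -> store_id that TiKV named as the current leader."""
    result = {}
    for line in tikv_lines:
        m = NOT_LEADER.search(line)
        if m:
            result[int(m.group(1))] = int(m.group(2))
    return result


def leader_changes(pd_lines):
    """Map region_id -> (from_store, to_store) taken from PD 'leader changed' lines."""
    result = {}
    for line in pd_lines:
        m = LEADER_CHANGED.search(line)
        if m:
            result[int(m.group(1))] = (int(m.group(2)), int(m.group(3)))
    return result


def correlate(tikv_lines, pd_lines):
    """Return the regions that appear in both logs, with both views side by side."""
    tikv = not_leader_regions(tikv_lines)
    pd = leader_changes(pd_lines)
    return {
        region: {"pd_transfer": pd[region], "tikv_leader_store": tikv[region]}
        for region in tikv.keys() & pd.keys()
    }


# One line from each excerpt above, copied verbatim.
tikv_sample = [
    r'[2022/06/28 14:31:12.585 +00:00] [WARN] [endpoint.rs:537] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 2641161, leader may Some(id: 2641164 store_id: 5)\" not_leader { region_id: 2641161 leader { id: 2641164 store_id: 5 } }"]',
]
pd_sample = [
    '[2022/06/28 14:30:55.342 +00:00] [INFO] [cluster.go:567] ["leader changed"] [region-id=2641161] [from=1] [to=5]',
]

if __name__ == "__main__":
    print(correlate(tikv_sample, pd_sample))
```

For the sample lines, region 2641161 shows up on both sides: PD moved its leader from store 1 to store 5, and TiKV later answered a request for the same region with a not_leader error pointing at store 5.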
6. Monitoring: cluster_tidb --> kv errors

III. Conclusion
The alarm was triggered by the invalid connection error in dm-worker. That error follows from the "wait response is cancelled" warnings in TiDB, which in turn were caused by locks and backoff in TiKV. As for where the locks and backoff came from: the PD log shows a transfer-hot-read-leader operator being scheduled (for example, the leader of region 2641161 moved from store 1 to store 5, and TiKV then returned not_leader for that same region), which is the key cause of the backoff. The cause of the locks has to be traced in the business SQL.
Official documentation: lock conflict troubleshooting document
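The chain above is easier to check when the lines from all components are merged into a single timeline. Below is a minimal sketch, assuming Python 3.7 or later (for `%z` accepting `+00:00` and `Z`); `parse_timestamp` and `build_timeline` are hypothetical helper names, and the sample lines are copied from the excerpts in section II. It only orders events in time and does not prove causality by itself.

```python
import re
from datetime import datetime

# DM/TiDB/TiKV/PD lines start with "[2022/06/28 14:31:13.364 +00:00]".
TIDB_TS = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+ [+-]\d{2}:\d{2})\]")
# MySQL error-log lines start with "2022-06-28T14:31:19.413211Z".
MYSQL_TS = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z)")


def parse_timestamp(line):
    """Return a timezone-aware datetime for a log line, or None if no timestamp is found."""
    m = TIDB_TS.match(line)
    if m:
        return datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S.%f %z")
    m = MYSQL_TS.match(line)
    if m:
        return datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S.%f%z")
    return None


def build_timeline(entries):
    """entries: iterable of (component, line). Returns (timestamp, component, line) sorted by time."""
    timeline = []
    for component, line in entries:
        ts = parse_timestamp(line)
        if ts is not None:
            timeline.append((ts, component, line))
    timeline.sort(key=lambda item: item[0])
    return timeline


# One representative line per component, taken from the excerpts in section II.
sample = [
    ("dm", '[2022/06/28 14:31:13.364 +00:00] [ERROR] [db.go:201] ["execute statements failed after retry"] [task=task-name] [unit="binlog replication"] [queries="[sql]"] [arguments="[[]]"] [error="[code=10006:class=database:scope=not-set:level=high], Message: execute statement failed: commit, RawCause: invalid connection"]'),
    ("mysql", "2022-06-28T14:31:19.413211Z 28801 [Note] Aborted connection 28801 to db: 'unconnected' user: '***' host: 'ip' (Got an error reading communication packets)"),
    ("tidb", '[2022/06/28 14:31:12.419 +00:00] [WARN] [client_batch.go:638] ["wait response is cancelled"] [to=dm_worker_ip:20160] [cause="context canceled"]'),
    ("tikv", '[2022/06/28 14:31:05.617 +00:00] [WARN] [endpoint.rs:537] [error-response] [err="Key is locked (will clean up) primary_lock: 748000000F000 lock_version: 434222311815512066 key: 748000009725552F000 lock_ttl: 3003 txn_size: 1"]'),
    ("pd", '[2022/06/28 14:30:55.342 +00:00] [INFO] [cluster.go:567] ["leader changed"] [region-id=2641161] [from=1] [to=5]'),
]

if __name__ == "__main__":
    for ts, component, line in build_timeline(sample):
        print(ts.isoformat(), component, line[:100])
```

With these five lines the order comes out as PD, TiKV, TiDB, DM, MySQL, which is consistent with the reasoning above: the leader transfer and lock warnings come first, the DM error follows, and the upstream MySQL notices the aborted connection last.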