当前位置:网站首页>Case analysis of data inconsistency caused by Pt OSC table change
Case analysis of data inconsistency caused by Pt OSC table change
2022-07-06 11:22:00 【wx5caecf2ed0645】
We usually solve our own problems , Sometimes I help people in the circle , Do some troubleshooting , This case is to help a company DBA Failure analysis conducted , Because it's typical , Let's share , But it's just sharing what happened , Do not make too much evaluation on the occurrence of this case and how to avoid it !
pt-online-schema-change: It is online for big tables alter operation , And try to avoid affecting online business , This is the best mysql One of management work , In normal work , Help us win more .
Environmental statement
pt-osc edition :percona-toolkit-2.2.14
mysql edition : percona-server-5.5
Database architecture : Double master replication ( This time pt-osc The table change is performed on the main database that is not online )
Problem description 
One day, I received help from friends in the circle , Feedback use pt-online-schema-change Adding a field but causing an unexpected deadlock , And there may be a problem with the data , Brother, I can't think of riding. Sister hopes I can help analyze . However, due to the online environment, it is impossible to test and reproduce , Therefore, only the engine log at the time of deadlock is given ( perform SHOW ENGINE innodb STATUS see ).
Let's take a look at the logs of the storage engine at that time , Only transaction related logs are intercepted here for convenience , Other log information is skipped , The specific logs are as follows :
TRANSACTION1
*** (1) TRANSACTION:
TRANSACTION 107BF2CDD, ACTIVE 1 sec setting auto-inc lock
mysql tables in use 2, locked 2
LOCK WAIT 4 lock struct(s), heap size 1248, 1 row lock(s), undo log entries 2
MySQL thread id 6, OS thread handle 0x7fd210190700, query id 1080843123 Reading event from the relay log
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
TABLE LOCK table `redcliff`.`_rider_new` trx id 107BF2CDD lock mode AUTO-INC waiting
Here we can read two messages :
1: The transaction is from relaylog Read log 2: Business 1( Business id by 107BF2CDD) Is waiting for _rider_new surface AUTO-INC lock
TRANSACTION2
*** (2) TRANSACTION:
TRANSACTION 107BF2CDC, ACTIVE 1 sec fetching rows
mysql tables in use 2, locked 2
253 lock struct(s), heap size 31160, 10864 row lock(s), undo log entries 10616
MySQL thread id 22433333, OS thread handle 0x7fc781b16700, query id 1080843120 127.0.0.1 dwbdba_mgr Sending data
INSERT LOW_PRIORITY IGNORE INTO `redcliff`.`_rider_new`
************************************( Omitted )
`frozen_provision`, `bloc…. LOCK IN SHARE MODE /*pt-online-schema-change 18153 copy nibble*/
*** (2) HOLDS THE LOCK(S):
TABLE LOCK table `redcliff`.`_rider_new` trx id 107BF2CDC lock mode AUTO-INC
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 636 page no 4599 n bits 112 index `PRIMARY` of table
`redcliff`.`rider` trx id 107BF2CDC lock mode S waiting
*** WE ROLL BACK TRANSACTION (1)
We can read the following information :
1、 Business 2( Business id by 107BF2CDC) Hold the watch _rider_new Of auto-inc Self increasing lock
2、 Business 2 wait for rider surface S lock
3、pt-osc Tool pass LOCK IN SHARE MODE To read the current read, you also need to ensure that other concurrent transactions cannot modify the currently read records , Ensure the new and old data 100% Agreement , Therefore, add S lock
Through the information read above, we analyze as follows :
Business 1
1、Reading event from the relay log To perform the rider The modification of table
( Here, through analysis afterwards relaylog Confirm that it is right rider Table changes )
Therefore, it is held rider On the table record x lock
2、 wait for _rider_new On the table auto-inc lock
( notes :pt-osc When the tool modifies the table, it will create three triggers for adding, deleting and modifying the table . so rider There are already three triggers on the table , And right rider Tabular update,insert After the action trigger is triggered, it will be converted to _rider_new On the table replace operation , There is self increase id On your watch replace Operation will generate new self increment id value )
Business 2
1、INSERT LOW_PRIORITY IGNORE INTO `redcliff`.`_rider_new` (`id`, `city_id`,
This statement needs to go to _rider_new The table writes data in batches , Here already hold _rider_new On the table auto-inc lock, From the above analysis, we can see that transactions need to wait rider Shared read lock on the table !
By cutting out the superfluous 
Business one :
hold :rider On the table record x lock
wait for :rider_new On the table auto-inc lock
Business two :
hold :_rider_new On the table auto-inc lock
wait for :rider On the table S lock
Perfect deadlock
Finally, rollback the transaction 1( That is, the copy update operation is rolled back , The master and slave data are inconsistent )
My point of view
In the above analysis , We come to the conclusion that ,pt-osc Tools in some cases , Data inconsistency may be caused by deadlock rollback , According to the principle , We can't avoid , Only try to alleviate ( for example : --chunk-size Parameters Set smaller , Or in TPS Great online does not use pt-osc), stay mysql online ddl The development is not perfect , Believe in the present mysql DBA The mainstream of table modification tools used online is still pt-online-schema-change , So I hope that through this sharing, we can reduce the number of pits , Go home early and sleep well .
边栏推荐
- Some notes of MySQL
- 數據庫高級學習筆記--SQL語句
- 报错解决 —— io.UnsupportedOperation: can‘t do nonzero end-relative seeks
- 虚拟机Ping通主机,主机Ping不通虚拟机
- 【博主推荐】C# Winform定时发送邮箱(附源码)
- Image recognition - pyteseract TesseractNotFoundError: tesseract is not installed or it‘s not in your path
- 虚拟机Ping通主机,主机Ping不通虚拟机
- Remember a company interview question: merge ordered arrays
- Classes in C #
- Rhcsa certification exam exercise (configured on the first host)
猜你喜欢

Use dapr to shorten software development cycle and improve production efficiency

基于apache-jena的知识问答

Dotnet replaces asp Net core's underlying communication is the IPC Library of named pipes

图像识别问题 — pytesseract.TesseractNotFoundError: tesseract is not installed or it‘s not in your path

Django running error: error loading mysqldb module solution

【博主推荐】SSM框架的后台管理系统(附源码)

Error connecting to MySQL database: 2059 - authentication plugin 'caching_ sha2_ The solution of 'password'

机器学习笔记-Week02-卷积神经网络

Knowledge Q & A based on Apache Jena

QT creator runs the Valgrind tool on external applications
随机推荐
[蓝桥杯2017初赛]方格分割
1. Mx6u learning notes (VII): bare metal development (4) -- master frequency and clock configuration
A trip to Macao - > see the world from a non line city to Macao
Knowledge Q & A based on Apache Jena
AcWing 1298. Solution to Cao Chong's pig raising problem
Classes in C #
QT creator specify editor settings
Data dictionary in C #
neo4j安装教程
Ansible实战系列三 _ task常用命令
Did you forget to register or load this tag 报错解决方法
[AGC009D]Uninity
LeetCode #461 汉明距离
Record a problem of raspberry pie DNS resolution failure
Detailed reading of stereo r-cnn paper -- Experiment: detailed explanation and result analysis
[Thesis Writing] how to write function description of jsp online examination system
One click extraction of tables in PDF
csdn-Markdown编辑器
[recommended by bloggers] C MVC list realizes the function of adding, deleting, modifying, checking, importing and exporting curves (with source code)
Navicat 導出錶生成PDM文件