Case analysis: data inconsistency caused by a pt-osc table change
2022-07-06 11:22:00 【wx5caecf2ed0645】
We usually solve our own problems, but sometimes I also help people in the community with troubleshooting. This case is a failure analysis I did for another company's DBA. Because it is typical, I am sharing it here. This post only describes what happened; it does not pass much judgment on how the incident arose or how to avoid it.
pt-online-schema-change performs ALTER operations on large tables online while trying to avoid affecting production traffic. It is one of the best MySQL administration tools and a great help in day-to-day work.
Environment
pt-osc version: percona-toolkit-2.2.14
MySQL version: percona-server-5.5
Database architecture: dual-master replication (this pt-osc table change was run on the master that was not serving online traffic)
Problem description
One day a friend in the community asked for help. They had used pt-online-schema-change to add a column, which caused an unexpected deadlock and possibly a data problem. They were baffled and hoped I could help analyze it. Because this was a production environment, the problem could not be tested or reproduced, so all I received was the storage engine log from the time of the deadlock (obtained with SHOW ENGINE INNODB STATUS).
Let's look at that log. For convenience only the transaction-related parts are excerpted here and the rest is skipped:
Transaction 1
*** (1) TRANSACTION:
TRANSACTION 107BF2CDD, ACTIVE 1 sec setting auto-inc lock
mysql tables in use 2, locked 2
LOCK WAIT 4 lock struct(s), heap size 1248, 1 row lock(s), undo log entries 2
MySQL thread id 6, OS thread handle 0x7fd210190700, query id 1080843123 Reading event from the relay log
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
TABLE LOCK table `redcliff`.`_rider_new` trx id 107BF2CDD lock mode AUTO-INC waiting
Two things can be read from this:
1. The transaction is applying an event read from the relay log, that is, it belongs to the replication SQL thread.
2. Transaction 1 (trx id 107BF2CDD) is waiting for the AUTO-INC lock on table _rider_new.
Transaction 2
*** (2) TRANSACTION:
TRANSACTION 107BF2CDC, ACTIVE 1 sec fetching rows
mysql tables in use 2, locked 2
253 lock struct(s), heap size 31160, 10864 row lock(s), undo log entries 10616
MySQL thread id 22433333, OS thread handle 0x7fc781b16700, query id 1080843120 127.0.0.1 dwbdba_mgr Sending data
INSERT LOW_PRIORITY IGNORE INTO `redcliff`.`_rider_new`
************************************( Omitted )
`frozen_provision`, `bloc…. LOCK IN SHARE MODE /*pt-online-schema-change 18153 copy nibble*/
*** (2) HOLDS THE LOCK(S):
TABLE LOCK table `redcliff`.`_rider_new` trx id 107BF2CDC lock mode AUTO-INC
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 636 page no 4599 n bits 112 index `PRIMARY` of table
`redcliff`.`rider` trx id 107BF2CDC lock mode S waiting
*** WE ROLL BACK TRANSACTION (1)
The following can be read from this:
1. Transaction 2 (trx id 107BF2CDC) holds the AUTO-INC lock on table _rider_new.
2. Transaction 2 is waiting for an S lock on records of the rider table (index PRIMARY).
3. pt-osc copies rows with LOCK IN SHARE MODE. This is a current (locking) read: it must guarantee that no concurrent transaction modifies the rows being read, so that the old and new data stay 100% consistent. That is why it requests S locks.
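The reading of the log above can be sketched in code. The following is a minimal Python sketch, not a general InnoDB status parser: the regular expressions and the `parse_deadlock` helper are my own assumptions, written only against the lines excerpted in this post (with the long statement text left out).

```python
import re

# Excerpt of the deadlock section quoted above (statement text omitted).
DEADLOCK_LOG = """\
*** (1) TRANSACTION:
TRANSACTION 107BF2CDD, ACTIVE 1 sec setting auto-inc lock
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
TABLE LOCK table `redcliff`.`_rider_new` trx id 107BF2CDD lock mode AUTO-INC waiting
*** (2) TRANSACTION:
TRANSACTION 107BF2CDC, ACTIVE 1 sec fetching rows
*** (2) HOLDS THE LOCK(S):
TABLE LOCK table `redcliff`.`_rider_new` trx id 107BF2CDC lock mode AUTO-INC
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 636 page no 4599 n bits 112 index `PRIMARY` of table
`redcliff`.`rider` trx id 107BF2CDC lock mode S waiting
*** WE ROLL BACK TRANSACTION (1)
"""

# Hypothetical patterns, fitted to the excerpt above only.
HEADER = re.compile(
    r"^\*\*\* \((\d+)\) "
    r"(TRANSACTION|HOLDS THE LOCK\(S\)|WAITING FOR THIS LOCK TO BE GRANTED):$"
)
LOCK = re.compile(r"`(\w+)`\.`(\w+)` trx id (\w+) lock mode ([\w-]+)")
VICTIM = re.compile(r"^\*\*\* WE ROLL BACK TRANSACTION \((\d+)\)")


def parse_deadlock(text):
    """Return (holds, waits, victim) keyed by the transaction number in the log."""
    holds, waits, victim = {}, {}, None
    section = None  # (transaction number, section kind) of the last header seen
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            section = (int(header.group(1)), header.group(2))
            continue
        rollback = VICTIM.match(line)
        if rollback:
            victim = int(rollback.group(1))
            continue
        lock = LOCK.search(line)
        if lock and section:
            number, kind = section
            entry = (lock.group(2), lock.group(4))  # (table, lock mode)
            if kind.startswith("HOLDS"):
                holds.setdefault(number, []).append(entry)
            elif kind.startswith("WAITING"):
                waits.setdefault(number, []).append(entry)
    return holds, waits, victim


holds, waits, victim = parse_deadlock(DEADLOCK_LOG)
print("holds:", holds)
print("waits:", waits)
print("rolled back: transaction", victim)
```

Note that the log only prints the locks held by transaction (2). The X lock that transaction (1) holds on rider does not appear in it and has to be inferred, which is what the analysis does next.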
Based on the information above, the analysis is as follows:
Transaction 1
1. It is in the state "Reading event from the relay log" and is applying a modification to the rider table
(analysis of the relay log afterwards confirmed that the change was to the rider table).
It therefore holds a record X lock on the rider table.
2. It is waiting for the AUTO-INC lock on the _rider_new table.
(Note: when pt-osc changes a table, it creates three triggers on it, for INSERT, UPDATE, and DELETE. So the rider table already has three triggers, and an UPDATE or INSERT on rider fires a trigger that turns into a REPLACE on _rider_new. A REPLACE on a table with an auto-increment id can generate a new auto-increment value, which is why it needs the AUTO-INC lock.)
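The trigger mechanism described in the note can be illustrated roughly as follows. This is a hedged Python sketch that only builds SQL text: the `trigger_sketch` helper and the `sketch_*` trigger names are hypothetical, the column list is cut down to the two columns visible in the log, and the statements are an approximation of the idea, not the exact text that pt-osc 2.2.14 generates.

```python
def trigger_sketch(db, table, new_table, columns, pk="id"):
    """Build approximate CREATE TRIGGER statements for an online table copy.

    Assumption: INSERT and UPDATE on the original table are mirrored as
    REPLACE into the new table, as the note above describes; DELETE is
    mirrored as a plain DELETE by primary key in this sketch.
    """
    cols = ", ".join(f"`{c}`" for c in columns)
    new_vals = ", ".join(f"NEW.`{c}`" for c in columns)
    src = f"`{db}`.`{table}`"
    dst = f"`{db}`.`{new_table}`"
    replace = f"REPLACE INTO {dst} ({cols}) VALUES ({new_vals})"
    delete = f"DELETE FROM {dst} WHERE `{pk}` = OLD.`{pk}`"
    return {
        "insert": f"CREATE TRIGGER `sketch_{table}_ins` AFTER INSERT ON {src} "
                  f"FOR EACH ROW {replace}",
        "update": f"CREATE TRIGGER `sketch_{table}_upd` AFTER UPDATE ON {src} "
                  f"FOR EACH ROW {replace}",
        "delete": f"CREATE TRIGGER `sketch_{table}_del` AFTER DELETE ON {src} "
                  f"FOR EACH ROW {delete}",
    }


triggers = trigger_sketch("redcliff", "rider", "_rider_new", ["id", "city_id"])
for event, sql in triggers.items():
    print(f"-- {event}\n{sql};\n")
```

The point of the sketch is that a single replicated UPDATE on rider, applied by the SQL thread, also executes a REPLACE against _rider_new inside the same transaction, so that transaction ends up needing locks on both tables.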
Transaction 2
1、INSERT LOW_PRIORITY IGNORE INTO `redcliff`.`_rider_new` (`id`, `city_id`,
This statement writes rows into the _rider_new table in batches. It already holds the AUTO-INC lock on _rider_new, and from the log above we can see that it is waiting for a shared (S) read lock on the rider table.
Stripped down to the essentials:
Transaction 1:
holds: record X lock on the rider table
waits for: AUTO-INC lock on the _rider_new table
Transaction 2:
holds: AUTO-INC lock on the _rider_new table
waits for: S lock on the rider table
A perfect deadlock.
In the end, transaction 1 was rolled back (that is, the replicated update was rolled back, leaving the master and slave data inconsistent).
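The hold/wait summary above is a cycle in a wait-for graph, which can be checked with a few lines of Python. This is a simplified model under stated assumptions: each lock is treated as a single resource with one exclusive owner (the real S versus X conflict on the rider record is collapsed into that), and the names `find_deadlock`, `trx1`, and `trx2` are illustrative, not anything InnoDB exposes.

```python
def find_deadlock(holds, waits):
    """Return a list of transactions forming a wait-for cycle, or None.

    holds: {transaction: [resources it owns]}
    waits: {transaction: the one resource it is blocked on}
    """
    # Map each resource to its owner, then derive "A waits for B" edges.
    owner = {res: trx for trx, resources in holds.items() for res in resources}
    wait_for = {trx: owner[res] for trx, res in waits.items() if res in owner}
    for start in wait_for:
        path, current = [start], wait_for[start]
        # Follow the chain of blockers until it ends or revisits a node.
        while current in wait_for and current not in path:
            path.append(current)
            current = wait_for[current]
        if current == start:
            return path
    return None


# trx1: replication SQL thread applying a change to rider (fires the trigger).
# trx2: pt-osc chunk copy (INSERT ... SELECT ... LOCK IN SHARE MODE).
holds = {"trx1": ["rider record"], "trx2": ["_rider_new AUTO-INC"]}
waits = {"trx1": "_rider_new AUTO-INC", "trx2": "rider record"}

print(find_deadlock(holds, waits))
```

When such a cycle exists, InnoDB picks one transaction as the victim and rolls it back. Here the victim was the replicated change, which is why the problem shows up as a data difference between master and slave rather than as a failed pt-osc run.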
My point of view
From the analysis above we conclude that, in some situations, pt-osc can cause data inconsistency when a deadlock rolls back a replicated transaction. Given how the tool works, this cannot be avoided entirely, only mitigated (for example, by setting the --chunk-size parameter smaller, or by not using pt-osc on instances with very high TPS). While MySQL online DDL is still immature, I believe pt-online-schema-change remains the mainstream tool that MySQL DBAs use to change tables online. I hope this write-up helps you step into fewer pits, go home early, and sleep well.
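As an illustration of the --chunk-size mitigation, here is a small Python sketch that assembles a pt-online-schema-change command line. The helper name `build_pt_osc_command`, the column in the ALTER clause, and the chunk size of 200 are all hypothetical (the original case does not say which column was added or which chunk size was used); only the options --alter, --chunk-size, --dry-run, --execute and the D=...,t=... DSN form are taken from the tool itself.

```python
import shlex


def build_pt_osc_command(database, table, alter, chunk_size, execute=False):
    """Assemble the argument list for a pt-online-schema-change run.

    A smaller chunk_size means each copy statement touches fewer rows and
    holds its locks for a shorter time, which narrows the deadlock window
    but does not remove it.
    """
    return [
        "pt-online-schema-change",
        "--alter", alter,
        "--chunk-size", str(chunk_size),
        # --dry-run and --execute are mutually exclusive; default to the safe one.
        "--execute" if execute else "--dry-run",
        f"D={database},t={table}",
    ]


# Hypothetical example values; the column name is not from the original case.
command = build_pt_osc_command(
    "redcliff", "rider", "ADD COLUMN example_col INT", chunk_size=200
)
print(" ".join(shlex.quote(arg) for arg in command))
```

Starting with --dry-run and switching to --execute only after reviewing the output keeps the sketch on the cautious side.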