Case analysis of data inconsistency caused by a pt-osc table change
2022-07-06 11:22:00 【wx5caecf2ed0645】
We usually solve our own problems, but sometimes I also help people in the community troubleshoot theirs. This case comes from a failure analysis I did for another company's DBA. Because it is fairly typical, I am sharing it here. Note that I only describe what happened; I do not pass much judgment on why it occurred or how it could have been avoided.
pt-online-schema-change performs ALTER operations on large tables online while trying to avoid impact on production traffic. It is one of the best MySQL administration tools and saves us a lot of trouble in day-to-day work.
Environment
pt-osc version: percona-toolkit-2.2.14
MySQL version: percona-server-5.5
Database architecture: dual-master replication (this pt-osc table change was run on the master that was not serving online traffic)
Problem description
One day a friend in the community asked for help: using pt-online-schema-change to add a column had caused an unexpected deadlock, and the data might be affected. He could not work out why and hoped I could help analyze it. Because this was a production environment, the problem could not be tested or reproduced, so all I was given was the engine log from the time of the deadlock (obtained with SHOW ENGINE INNODB STATUS).
Let's look at the storage engine log from that time. For convenience, only the transaction-related parts are excerpted here and the rest is omitted. The log reads as follows:
TRANSACTION1
*** (1) TRANSACTION:
TRANSACTION 107BF2CDD, ACTIVE 1 sec setting auto-inc lock
mysql tables in use 2, locked 2
LOCK WAIT 4 lock struct(s), heap size 1248, 1 row lock(s), undo log entries 2
MySQL thread id 6, OS thread handle 0x7fd210190700, query id 1080843123 Reading event from the relay log
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
TABLE LOCK table `redcliff`.`_rider_new` trx id 107BF2CDD lock mode AUTO-INC waiting
From this we can read two things:
1. The transaction is applying an event read from the relay log, so it is the replication SQL thread.
2. Transaction 1 (transaction id 107BF2CDD) is waiting for the AUTO-INC lock on table _rider_new.
TRANSACTION2
*** (2) TRANSACTION:
TRANSACTION 107BF2CDC, ACTIVE 1 sec fetching rows
mysql tables in use 2, locked 2
253 lock struct(s), heap size 31160, 10864 row lock(s), undo log entries 10616
MySQL thread id 22433333, OS thread handle 0x7fc781b16700, query id 1080843120 127.0.0.1 dwbdba_mgr Sending data
INSERT LOW_PRIORITY IGNORE INTO `redcliff`.`_rider_new`
************************************( Omitted )
`frozen_provision`, `bloc…. LOCK IN SHARE MODE /*pt-online-schema-change 18153 copy nibble*/
*** (2) HOLDS THE LOCK(S):
TABLE LOCK table `redcliff`.`_rider_new` trx id 107BF2CDC lock mode AUTO-INC
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 636 page no 4599 n bits 112 index `PRIMARY` of table
`redcliff`.`rider` trx id 107BF2CDC lock mode S waiting
*** WE ROLL BACK TRANSACTION (1)
From this we can read the following:
1. Transaction 2 (transaction id 107BF2CDC) holds the AUTO-INC lock on table _rider_new.
2. Transaction 2 is waiting for an S lock on table rider.
3. pt-osc reads with LOCK IN SHARE MODE. This is a current read that must also prevent concurrent transactions from modifying the rows being read, so that the old and new data match 100%; hence the S lock.
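The hold/wait facts read off the log above can also be extracted mechanically. The following is a minimal sketch, assuming the deadlock section has the same layout as the excerpt in this article; the helper name parse_deadlock and the two regular expressions are illustrative and are not part of any MySQL or Percona tool.

```python
import re

# Trimmed copy of the deadlock excerpt quoted in this article.
DEADLOCK_LOG = """\
*** (1) TRANSACTION:
TRANSACTION 107BF2CDD, ACTIVE 1 sec setting auto-inc lock
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
TABLE LOCK table `redcliff`.`_rider_new` trx id 107BF2CDD lock mode AUTO-INC waiting
*** (2) TRANSACTION:
TRANSACTION 107BF2CDC, ACTIVE 1 sec fetching rows
*** (2) HOLDS THE LOCK(S):
TABLE LOCK table `redcliff`.`_rider_new` trx id 107BF2CDC lock mode AUTO-INC
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 636 page no 4599 n bits 112 index `PRIMARY` of table
`redcliff`.`rider` trx id 107BF2CDC lock mode S waiting
*** WE ROLL BACK TRANSACTION (1)
"""

# Section headers that introduce a held or awaited lock (assumed layout).
HEADER = re.compile(
    r"^\*\*\* \((\d+)\) (HOLDS THE LOCK\(S\)|WAITING FOR THIS LOCK TO BE GRANTED):$"
)
# Table name, transaction id and lock mode inside a lock description.
LOCK = re.compile(r"table `[^`]+`\.`([^`]+)` trx id (\w+) lock mode (\S+)")


def parse_deadlock(text):
    """Return a list of (trx_no, 'holds' | 'waits', table, lock_mode) tuples."""
    results = []
    current = None  # (trx_no, kind) of the section being read, if any
    buffer = ""
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            kind = "holds" if header.group(2).startswith("HOLDS") else "waits"
            current = (int(header.group(1)), kind)
            buffer = ""
            continue
        if line.startswith("***"):
            current = None  # some other section, e.g. "(1) TRANSACTION:"
            continue
        if current is None:
            continue
        buffer += " " + line  # a lock description may wrap across lines
        lock = LOCK.search(buffer)
        if lock:
            results.append((current[0], current[1], lock.group(1), lock.group(3)))
            current = None
    return results


for entry in parse_deadlock(DEADLOCK_LOG):
    print(entry)
```

Each printed tuple corresponds to one of the facts listed above: which transaction it is, whether it holds or waits, on which table, and in which lock mode.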
Combining the information above, the analysis is as follows:
Transaction 1
1. It is in the state "Reading event from the relay log" and is applying a modification to the rider table
(analysis of the relay log afterwards confirmed that the change was to the rider table),
so it holds a record X lock on the rider table.
2. It is waiting for the AUTO-INC lock on the _rider_new table.
(Note: when pt-osc alters a table, it creates three triggers on it, for INSERT, DELETE and UPDATE. The rider table therefore already has three triggers, and an UPDATE or INSERT on rider fires a trigger that turns it into a REPLACE on _rider_new. A REPLACE on a table with an auto-increment id generates a new auto-increment value, which is why the AUTO-INC lock is requested.)
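The trigger mechanism described in the note can be illustrated with a small, self-contained sketch. It uses SQLite from the Python standard library as a stand-in, so it only shows how writes to rider are mirrored into _rider_new as REPLACE statements; it does not reproduce InnoDB's AUTO-INC or record locks. The trigger names demo_rider_*, the extra column, and the function name mirror_demo are made up for the example and are not what pt-osc generates.

```python
import sqlite3


def mirror_demo():
    """Mirror writes on rider into _rider_new via triggers; return the copy."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE rider      (id INTEGER PRIMARY KEY, city_id INTEGER);
    -- The shadow table already carries the new column (hypothetical name).
    CREATE TABLE _rider_new (id INTEGER PRIMARY KEY, city_id INTEGER, extra TEXT);

    -- INSERT and UPDATE on the original table both become REPLACE on the copy.
    CREATE TRIGGER demo_rider_ins AFTER INSERT ON rider BEGIN
        REPLACE INTO _rider_new (id, city_id) VALUES (NEW.id, NEW.city_id);
    END;
    CREATE TRIGGER demo_rider_upd AFTER UPDATE ON rider BEGIN
        REPLACE INTO _rider_new (id, city_id) VALUES (NEW.id, NEW.city_id);
    END;
    -- DELETE on the original table removes the row from the copy.
    CREATE TRIGGER demo_rider_del AFTER DELETE ON rider BEGIN
        DELETE FROM _rider_new WHERE id = OLD.id;
    END;
    """)
    conn.execute("INSERT INTO rider (id, city_id) VALUES (1, 10)")
    conn.execute("UPDATE rider SET city_id = 20 WHERE id = 1")
    conn.execute("INSERT INTO rider (id, city_id) VALUES (2, 30)")
    conn.execute("DELETE FROM rider WHERE id = 2")
    rows = conn.execute("SELECT id, city_id FROM _rider_new ORDER BY id").fetchall()
    conn.close()
    return rows


print(mirror_demo())
```

The point of the sketch is that every write to rider also writes to _rider_new inside the same transaction. On InnoDB this means a transaction that already holds row locks on rider goes on to request locks on _rider_new as well.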
Transaction 2
1、INSERT LOW_PRIORITY IGNORE INTO `redcliff`.`_rider_new` (`id`, `city_id`,
This statement writes data into _rider_new in batches and already holds the AUTO-INC lock on _rider_new. As the log analysis above shows, the transaction still has to wait for a shared (S) read lock on the rider table.
Stripped down to the essentials:
Transaction 1:
holds: record X lock on the rider table
waits for: AUTO-INC lock on the _rider_new table
Transaction 2:
holds: AUTO-INC lock on the _rider_new table
waits for: S lock on the rider table
A perfect deadlock.
In the end transaction 1 was rolled back (that is, the replicated update was rolled back, leaving the master and slave data inconsistent).
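The hold/wait summary above forms a cycle in a wait-for graph, which is what a deadlock is. Below is a minimal sketch of that check. It simplifies by treating each lock target as a single resource and assuming that any two requests on the same resource conflict (as the X and S record locks on rider do here); the names find_deadlock, HOLDS and WAITS are illustrative. It is not how InnoDB implements deadlock detection.

```python
def find_deadlock(holds, waits):
    """Return a list of transactions forming a wait cycle, or None.

    holds / waits map a transaction name to the set of resources it
    holds / is waiting for. Requests on the same resource are assumed
    to conflict.
    """
    # Edge T -> U means: T waits for a resource that U currently holds.
    waits_for = {
        t: {u for u, held in holds.items() if u != t and held & wanted}
        for t, wanted in waits.items()
    }

    def visit(t, path):
        if t in path:  # came back to a transaction already on the path
            return path[path.index(t):]
        for u in sorted(waits_for.get(t, ())):
            cycle = visit(u, path + [t])
            if cycle:
                return cycle
        return None

    for start in sorted(waits_for):
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# The situation from this case, with simplified resource names.
HOLDS = {"trx1": {"rider row"}, "trx2": {"_rider_new AUTO-INC"}}
WAITS = {"trx1": {"_rider_new AUTO-INC"}, "trx2": {"rider row"}}

print(find_deadlock(HOLDS, WAITS))
```

Each transaction waits for something the other holds, so neither can proceed until one is rolled back. In this case InnoDB chose transaction 1, the replicated update.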
My view
From the analysis above we conclude that in some situations pt-osc can cause data inconsistency through a deadlock rollback. Given how the tool works, this cannot be avoided entirely, only mitigated (for example, set the --chunk-size parameter smaller, or do not use pt-osc on a busy production system with high TPS). MySQL's native online DDL is still not mature, and I believe pt-online-schema-change remains the mainstream tool MySQL DBAs use to alter tables online. I hope this write-up helps you fall into fewer pits, get home early and sleep well.