Troubleshooting a MySQL outage caused by damaged disk sectors
2022-07-01 06:23:00 【Great masses】
Description of the incident
One night my phone suddenly received a flood of SMS alerts reporting hosts as unreachable. I immediately logged in to the affected servers to investigate, but found nothing abnormal, and the business tests passed.
Because almost every host had raised an unreachable alert, I suspected that Zabbix itself was malfunctioning and checked it next.
Investigation approach
1. Check the Zabbix server log: MySQL connection failures show up immediately
The log first shows a large number of lost database connections:
55187:20220629:035108.704 [Z3005] query failed: [2013] Lost connection to MySQL server at 'reading initial communication packet', system error: 104 [begin;]
55235:20220629:035108.704 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select h.hostid,h.status,h.tls_accept,h.tls_issuer,h.tls_subject,h.tls_psk_identity,a.host_metadata from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='testserver' and h.status in (0,1) and h.flags<>2 and h.proxy_hostid is null]
55235:20220629:035108.704 slow query: 11.197553 sec, "select h.hostid,h.status,h.tls_accept,h.tls_issuer,h.tls_subject,h.tls_psk_identity,a.host_metadata from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='testserver' and h.status in (0,1) and h.flags<>2 and h.proxy_hostid is null"
54930:20220629:035108.704 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select escalationid,actionid,triggerid,eventid,r_eventid,nextcheck,esc_step,status,itemid,acknowledgeid from escalations where triggerid is not null and nextcheck<=1656445826 order by actionid,triggerid,itemid,escalationid]
54930:20220629:035108.705 slow query: 45.073163 sec, "select escalationid,actionid,triggerid,eventid,r_eventid,nextcheck,esc_step,status,itemid,acknowledgeid from escalations where triggerid is not null and nextcheck<=1656445826 order by actionid,triggerid,itemid,escalationid"
55220:20220629:035108.705 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select h.hostid,h.status,h.tls_accept,h.tls_issuer,h.tls_subject,h.tls_psk_identity,a.host_metadata from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='testserver2' and h.status in (0,1) and h.flags<>2 and h.proxy_hostid is null]
54918:20220629:035108.705 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select refresh_unsupported,discovery_groupid,snmptrap_logging,severity_name_0,severity_name_1,severity_name_2,severity_name_3,severity_name_4,severity_name_5,hk_events_mode,hk_events_trigger,hk_events_internal,hk_events_discovery,hk_events_autoreg,hk_services_mode,hk_services,hk_audit_mode,hk_audit,hk_sessions_mode,hk_sessions,hk_history_mode,hk_history_global,hk_history,hk_trends_mode,hk_trends_global,hk_trends,default_inventory_mode from config order by configid]
54922:20220629:035108.705 [Z3005] query failed: [2013] Lost connection to MySQL server during query [delete from history where itemid=44780 and clock<1655830340]
54922:20220629:035108.705 slow query: 101.087109 sec, "delete from history where itemid=44780 and clock<1655830340"
54922:20220629:035108.705 database is down: retrying in 10 seconds
55187:20220629:035108.706 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.2.99' (111)
These are followed by a large number of MySQL reconnection entries:
55174:20220629:041906.024 database connection re-established
55228:20220629:041906.024 database connection re-established
54925:20220629:041906.024 database connection re-established
55195:20220629:041906.024 database connection re-established
55291:20220629:041906.110 database connection re-established
55318:20220629:041906.137 database connection re-established
55248:20220629:041906.317 database connection re-established
55367:20220629:041906.898 database connection re-established
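To see how long Zabbix was cut off from the database, the first failure and the first reconnect after it can be pulled out of the log. The following is a minimal sketch, not part of the original investigation: `outage_window` is a hypothetical helper name, and it assumes every log line starts with the `PID:YYYYMMDD:HHMMSS.mmm` prefix seen in the excerpts above.

```shell
# Sketch: print the first database failure and the first reconnect after it.
# outage_window is a hypothetical helper; $1 is the path to the Zabbix server log.
outage_window() {
    awk '
        # Field 1 looks like PID:YYYYMMDD:HHMMSS.mmm
        /Lost connection to MySQL server|database is down/ && !down {
            split($1, f, ":"); down = f[2] " " f[3]
        }
        /database connection re-established/ && down && !up {
            split($1, f, ":"); up = f[2] " " f[3]
        }
        END {
            print "first failure:   " down
            print "first reconnect: " up
        }
    ' "$1"
}

# Usage (the log path is an assumption; adjust it to your installation):
# outage_window /var/log/zabbix/zabbix_server.log
```

Applied to only the lines quoted above, this would report 20220629 035108.704 and 20220629 041906.024, a gap of roughly 28 minutes.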
2. Check the MySQL error log, which points to a disk problem
2022-06-28T18:24:16.532664Z 23 [Warning] InnoDB: Retry attempts for reading partial data failed.
2022-06-28T18:24:16.532718Z 23 [ERROR] InnoDB: Tried to read 16384 bytes at offset 5098094592, but was only able to read 0
2022-06-28T18:24:16.532751Z 23 [ERROR] InnoDB: Operating system error number 5 in a file operation.
2022-06-28T18:24:16.532769Z 23 [ERROR] InnoDB: Error number 5 means 'Input/output error'
2022-06-28T18:24:16.532784Z 23 [Note] InnoDB: Some operating system error numbers are described at http://dev.mysql.com/doc/refman/5.7/en/operating-system-error-codes.html
2022-06-28T18:24:16.532796Z 23 [ERROR] InnoDB: File (unknown): 'read' returned OS error 105. Cannot continue operation
2022-06-28T18:24:16.532806Z 23 [ERROR] InnoDB: Cannot continue operation.
2022-06-28T18:24:19.274054Z 0 [Note] InnoDB: FTS optimize thread exiting.
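The failed read in this log can be expressed in units of InnoDB pages, which makes it easier to see that a single page-sized read failed rather than some arbitrary range. A minimal sketch follows; `page_of_offset` is a hypothetical helper, and the 16384-byte page size is taken from the read size in the log line rather than from the server configuration.

```shell
# Sketch: locate the failed read in units of InnoDB pages.
# page_of_offset is a hypothetical helper; 16384 is the read size from the log.
page_of_offset() {
    offset=$1
    page_size=${2:-16384}
    echo "page=$(( offset / page_size )) remainder=$(( offset % page_size ))"
}

page_of_offset 5098094592   # prints: page=311163 remainder=0
```

The offset is an exact multiple of 16384, so the read that returned error 5 (Input/output error) covered exactly one page, about 4.7 GiB into the data file.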
3. Check the system messages log: MySQL is restarting repeatedly and the disk has a damaged sector
Jun 29 04:24:29 localhost systemd: mysqld.service: main process exited, code=exited, status=3/NOTIMPLEMENTED
Jun 29 04:24:29 localhost systemd: Unit mysqld.service entered failed state.
Jun 29 04:24:29 localhost systemd: mysqld.service failed.
Jun 29 04:24:29 localhost systemd: mysqld.service holdoff time over, scheduling restart.
Jun 29 04:24:29 localhost systemd: Cannot add dependency job for unit sshd.socket, ignoring: Unit not found.
Jun 29 04:24:29 localhost systemd: Stopped MySQL Server.
Jun 29 04:24:29 localhost systemd: Starting MySQL Server...
Jun 29 04:24:32 localhost systemd: Started MySQL Server.
Jun 29 04:24:39 localhost kernel: sd 0:0:1:0: [sdb] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jun 29 04:24:39 localhost kernel: sd 0:0:1:0: [sdb] Sense Key : Medium Error [current]
Jun 29 04:24:39 localhost kernel: sd 0:0:1:0: [sdb] Add. Sense: No additional sense information
Jun 29 04:24:39 localhost kernel: sd 0:0:1:0: [sdb] CDB: Read(10) 28 00 40 aa da 20 00 00 08 00
Jun 29 04:24:39 localhost kernel: blk_update_request: I/O error, dev sdb, sector 1084938784
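The CDB line and the `blk_update_request` line describe the same failed read, which can be confirmed by decoding the command bytes. In a READ(10) command the opcode is 0x28, bytes 2 to 5 carry the big-endian logical block address, and bytes 7 to 8 carry the transfer length in blocks. The sketch below uses a hypothetical helper name, `decode_read10`.

```shell
# Sketch: decode a READ(10) CDB given as ten hex bytes.
# decode_read10 is a hypothetical helper name.
decode_read10() {
    if [ "$#" -ne 10 ] || [ "$1" != "28" ]; then
        echo "expected a 10-byte READ(10) CDB starting with 28" >&2
        return 1
    fi
    # Counting from byte 0: bytes 2-5 are the LBA, bytes 7-8 the transfer length.
    echo "lba=$(( 0x$3$4$5$6 )) blocks=$(( 0x$8$9 ))"
}

decode_read10 28 00 40 aa da 20 00 00 08 00   # prints: lba=1084938784 blocks=8
```

The decoded LBA equals the sector number the kernel reports (1084938784), so the Medium Error belongs to this 8-block read. Assuming 512-byte logical blocks, that is a 4 KiB read.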
4. Check the disk with badblocks
Run the following command in a Linux terminal:
badblocks -s -v /dev/sdb
A large number of bad blocks were detected.
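To cross-check the scan against the kernel log, the bad block list can be saved with the `-o` option and compared with the failing sector. This is a sketch under stated assumptions: `sector_in_badblocks` and the file name `sdb-badblocks.txt` are hypothetical, kernel sector numbers are taken to be in 512-byte units, and badblocks is assumed to run with its default 1024-byte block size (pass a third argument if `-b` was used).

```shell
# Read-only scan that also writes one bad block number per line to a file:
#   badblocks -s -v -o sdb-badblocks.txt /dev/sdb

# Sketch: check whether a kernel sector falls on a block listed by badblocks.
# sector_in_badblocks is a hypothetical helper.
sector_in_badblocks() {
    sector=$1
    list=$2
    block_size=${3:-1024}
    # Convert the 512-byte sector number to a badblocks block number.
    block=$(( sector * 512 / block_size ))
    if grep -qxF "$block" "$list"; then
        echo "sector $sector -> block $block: listed"
    else
        echo "sector $sector -> block $block: not listed"
    fi
}

# Usage, with the sector from the kernel log:
# sector_in_badblocks 1084938784 sdb-badblocks.txt
```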

At this point the root cause had been identified.
Resolution
I tried to copy the data off the failing disk, but some of it could no longer be read. In the end I restored the latest snapshot to new storage, at the cost of losing historical data. Fortunately, this was not a production environment.
I then tested the replaced disk. Attempts at a logical repair failed, so the damage was judged to be physical.
Lesson learned: always keep backup copies of important business data.
References
blog.51cto.com/u_13236892/5278888