MySQL index optimization in practice
2022-08-02 02:52:00 【LG_985938339】
1. Preface
When querying a database, we usually create indexes on commonly used query conditions to speed up lookups. For business tables whose data volume is predictably small, a missing index has little impact. For tables that keep growing, however, choosing appropriate indexes is essential; otherwise query performance degrades severely.
2. Project background
Here we use xxl-job, a widely used open-source distributed job scheduling project, and analyze it against our own workload. xxl-job records every scheduling run of every job in the database, which makes this log table the largest table in xxl-job by data volume. Its structure is as follows:
CREATE TABLE `XXL_JOB_QRTZ_TRIGGER_LOG` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `job_group` int(11) NOT NULL COMMENT 'Executor primary key ID',
  `job_id` int(11) NOT NULL COMMENT 'Job primary key ID',
  `executor_address` varchar(255) DEFAULT NULL COMMENT 'Executor address used for this execution',
  `executor_handler` varchar(255) DEFAULT NULL COMMENT 'Executor job handler',
  `executor_param` varchar(512) DEFAULT NULL COMMENT 'Executor job parameters',
  `executor_sharding_param` varchar(20) DEFAULT NULL COMMENT 'Executor job sharding parameter, e.g. 1/2',
  `executor_fail_retry_count` int(11) NOT NULL DEFAULT '0' COMMENT 'Fail retry count',
  `trigger_time` datetime DEFAULT NULL COMMENT 'Trigger time',
  `trigger_code` int(11) NOT NULL COMMENT 'Trigger result code',
  `trigger_msg` text COMMENT 'Trigger log',
  `handle_time` datetime DEFAULT NULL COMMENT 'Handle time',
  `handle_code` int(11) NOT NULL COMMENT 'Handle status code',
  `handle_msg` text COMMENT 'Handle log',
  `alarm_status` tinyint(4) NOT NULL DEFAULT '0' COMMENT 'Alarm status: 0 = default, 1 = no alarm needed, 2 = alarm succeeded, 3 = alarm failed',
  PRIMARY KEY (`id`),
  KEY `I_trigger_time` (`trigger_time`),
  KEY `I_handle_code` (`handle_code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
A seven-day query covers more than 260,000 rows, an average of about 40,000 rows per day. Because the table is large and older data is less useful, we regularly purge rows older than two months, which leaves roughly 2.65 million rows in the table.
We often need to view all scheduling records of one job over a period of time, for example one job's records from 2022-06-01 to 2022-06-10. That page frequently timed out, so the table needed optimizing.
The table already has an index on trigger_time, the time at which the job was triggered. The query page also filters by executor, job, and status, which map to the columns job_group, job_id, and handle_code respectively.
Before optimizing, we first measured the query. In our test, querying one week of data took more than 20 seconds.
3. EXPLAIN analysis
We ran an EXPLAIN analysis. (PostgreSQL and MySQL 8.0.18 or later support EXPLAIN ANALYZE, which executes the SQL and reports its actual cost; we are on MySQL 5.7, so only plain EXPLAIN is available.)
When the query covers June 1 to June 7, MySQL uses the trigger_time index. Because this is a range condition, the index is not selective enough: the index lookup returns 260,744 rows (EXPLAIN estimated 560,046), and those 260,744 rows are then filtered one by one to produce the 18 matching rows.
When the query covers June 1 to June 10, the optimizer decides that the trigger_time index is too unselective and that a full table scan would be cheaper than using the index, so it goes straight to a full table scan. In our tests the full table scan took several minutes.
When reading EXPLAIN output, we focus on the type, key, rows, and filtered columns. Here rows is far too high and filtered far too low, which means there is a lot of room for optimization.
The main problem is the low selectivity of the index: it matches too many rows, and a large number of rows must then be scanned one by one to filter out a small result.
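The analysis in this section can be reproduced with a statement along the following lines. This is only a sketch: the exact query used on the log page is not reproduced in the text, so the selected columns and the job_group and job_id values are assumptions.

```sql
-- Hypothetical reconstruction of the log-page query; the literal values
-- for job_group and job_id are placeholders, not taken from the article.
EXPLAIN
SELECT id, job_group, job_id, trigger_time, trigger_code, handle_code
FROM XXL_JOB_QRTZ_TRIGGER_LOG
WHERE job_group = 1
  AND job_id = 10
  AND trigger_time >= '2022-06-01 00:00:00'
  AND trigger_time <  '2022-06-08 00:00:00';
-- In the output, check `key` (index chosen), `type` (range vs. ALL),
-- `rows` (estimated rows examined) and `filtered` (estimated % kept).
```

Widening the trigger_time range to ten days and re-running the same EXPLAIN is a quick way to see whether the optimizer switches from a range scan to a full table scan (type = ALL).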
4. Optimization plan
Almost all queries against XXL_JOB_QRTZ_TRIGGER_LOG come from the scheduling log query page described above, and trigger_time, job_group, job_id, and handle_code are the frequently used query conditions. We can therefore consider a composite index to optimize the query.
Briefly, the principles for creating a composite index are:
- Because of the leftmost-prefix rule, the most frequently used and always-present columns should come first
- Columns with higher selectivity should come earlier
- Columns used in range (non-equality) conditions should come last, because index columns to their right cannot be used for the lookup
- The composite index should not be too long; try not to exceed 3 columns
Applying these principles: trigger_time is the most frequently used condition and is always present, but it is a range condition, so it goes last. Whenever job_id is selected, job_group is also part of the query, so job_group is used more often and goes before job_id. handle_code is the least used, has few distinct values, and is therefore poorly selective; to keep the composite index short, handle_code is left out.
In summary, the SQL to create the index is:
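One way to put numbers behind the selectivity argument is to compare distinct values to total rows for each candidate column. The query below is a minimal sketch and was not part of the original analysis.

```sql
-- Approximate selectivity of each candidate column: distinct values / total rows.
-- Values closer to 1 indicate a more selective column.
-- Note: this scans the whole table, so run it on a replica or off-peak.
SELECT
  COUNT(DISTINCT job_group)   / COUNT(*) AS job_group_selectivity,
  COUNT(DISTINCT job_id)      / COUNT(*) AS job_id_selectivity,
  COUNT(DISTINCT handle_code) / COUNT(*) AS handle_code_selectivity
FROM XXL_JOB_QRTZ_TRIGGER_LOG;
```

On a log table like this, every individual ratio will be small. What matters is the relative order, and that the equality columns combined with the trigger_time range narrow the result to a small number of rows.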
alter table XXL_JOB_QRTZ_TRIGGER_LOG add index I_union_index(job_group,job_id,trigger_time) using btree;
5. Online DDL and caveats
5.1 Introduction to Online DDL
Note that MySQL added the Online DDL feature in version 5.6. It lets DML (inserts, deletes, updates, and queries) run concurrently with DDL statements, which reduces the impact of DDL on the database. This matters most in production, where we want to keep the service available. Before MySQL 5.6, DDL had to lock the table for the duration of the operation, so the application had to be stopped before changing the schema.
Operation | In Place | Rebuilds Table | Permits Concurrent DML | Only Modifies Metadata |
---|---|---|---|---|
Add a secondary (non-primary-key) index | Yes | No | Yes | No |
Drop an index | Yes | No | Yes | Yes |
Rename an index | Yes | No | Yes | Yes |
Add a FULLTEXT index | Yes | No | No | No |
Add a SPATIAL index | Yes | No | No | No |
Change the index type | Yes | No | Yes | Yes |
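To make the expected behaviour explicit, the ALGORITHM and LOCK clauses can be added to the index DDL. The statement below is a sketch of that variant; the article itself ran the plain ALTER TABLE without these clauses.

```sql
-- Same index as in section 4, but fail fast if MySQL cannot build it in place
-- without blocking concurrent reads and writes.
ALTER TABLE XXL_JOB_QRTZ_TRIGGER_LOG
  ADD INDEX I_union_index (job_group, job_id, trigger_time) USING BTREE,
  ALGORITHM = INPLACE,
  LOCK = NONE;
```

If the requested algorithm or lock level is not supported for the operation, MySQL returns an error instead of silently falling back to a more disruptive mode. These clauses do not remove the need for metadata locks at the start and end of the operation.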
5.2 The problem we ran into
Online DDL does not mean everything will go smoothly. Before changing production, we ran the index creation SQL in both the test and pre-production environments, and it did not go as smoothly as expected. While waiting for the statement to finish, we opened another session and queried the table. That ordinary query did not return in time and appeared to be blocked, so we ran show full processlist to investigate.
The output showed that every statement on the XXL_JOB_QRTZ_TRIGGER_LOG table was waiting for a lock to be released, in the state Waiting for table metadata lock.
We then ran select * from information_schema.innodb_trx and saw one SQL statement still running. That statement has no condition on the indexed trigger_time column, so it runs as a full table scan, and as we had measured earlier, a full table scan takes about five or six minutes to finish.
(We later learned that this SQL is used by xxl-job-admin for failure retry and alarming and is executed every ten seconds. It is itself a statement that needs improving. At our data volume, every run is a full table scan, so a failed job gets a response only after five or six minutes, which is too late. Since xxl-job-admin runs continuously, there are no failed jobs left waiting for a long time to be retried, and re-triggering a job that failed long ago is pointless anyway. Adding a trigger_time condition limited to the last three or seven days would be entirely sufficient.)
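The suggested fix could look roughly like the following. This is a hypothetical shape only and not the actual xxl-job-admin statement, which is not reproduced in the text; the failure predicate and the three-day window are assumptions made for illustration.

```sql
-- Hypothetical shape only; NOT the actual xxl-job-admin statement.
-- The failure predicate (handle_code <> 200) and the 3-day window are assumptions.
SELECT id
FROM XXL_JOB_QRTZ_TRIGGER_LOG
WHERE trigger_time >= NOW() - INTERVAL 3 DAY   -- bounded range, can use I_trigger_time
  AND alarm_status = 0
  AND handle_code <> 200
ORDER BY id;
```

Bounding trigger_time gives the optimizer a range it can serve from the existing I_trigger_time index, so the scan only has to touch the most recent few days of rows.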
5.3 Online DDL caveats
Before an online DDL operation can finish, it must wait for transactions that hold metadata locks on the table to commit or roll back. The operation briefly acquires an exclusive metadata lock on the table during the prepare phase, and it needs an exclusive metadata lock again at the end, when the table definition is updated (the commit phase). Transactions that already hold metadata locks on the table can therefore block the online DDL, and if such a transaction runs for a long time, the online DDL may even time out. Pay particular attention to this: the pending exclusive metadata lock requested by the online DDL blocks all subsequent transactions on the table. The online DDL waits for the long-running transaction, and every later request waits for the online DDL. That is exactly what happened to us.
So before running an online DDL, it is best to check whether there are long-running transactions on the table. If there are, kill them first and then run the DDL.
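The pre-check can be sketched with the following queries. The thread ID in the KILL statement is a placeholder; everything else uses standard information_schema tables.

```sql
-- List open InnoDB transactions, oldest first, with how long each has been running.
SELECT trx_mysql_thread_id,
       trx_state,
       trx_started,
       TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS running_seconds,
       trx_query
FROM information_schema.innodb_trx
ORDER BY trx_started;

-- Sessions currently stuck behind a metadata lock.
SELECT id, user, time, state, info
FROM information_schema.processlist
WHERE state = 'Waiting for table metadata lock';

-- Terminate the blocking connection; 12345 is a placeholder for a
-- trx_mysql_thread_id value returned by the first query.
KILL 12345;
```

KILL terminates the whole connection and rolls back its open transaction. Use KILL QUERY with the same ID if only the running statement should be stopped.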
5.4 Production procedure
Since we found this problem in the test and pre-production environments, we needed a plan for production. This is what we did:
1. Prepare three SQL windows in advance: the index creation DDL, show full processlist, and select * from information_schema.innodb_trx
2. Run the index creation DDL and wait for it to enter the prepare phase
3. Run show full processlist until Waiting for table metadata lock appears
4. Run select * from information_schema.innodb_trx, then kill that SQL thread
5. Run show full processlist again; once Waiting for table metadata lock disappears, the prepare phase has acquired its exclusive lock
6. Repeat step 3 while waiting for the commit phase, then repeat step 4 to kill the blocking SQL thread
7. Wait for the DDL to complete
The goal is to let the prepare and commit phases of the online DDL acquire the exclusive lock promptly, minimizing the time business SQL is blocked.
We watched the xxl-job-admin application logs: apart from entries for the two killed SQL statements, no other errors appeared, which indicates the business was essentially unaffected.
6. Final result
After the index was created, we ran the earlier query again to see how much the composite index changed things.
Query time dropped from about 20 seconds to under 100 ms.
References (official MySQL documentation):
https://dev.mysql.com/doc/refman/5.7/en/innodb-online-ddl-operations.html
https://dev.mysql.com/doc/refman/5.7/en/innodb-online-ddl-limitations.html
https://dev.mysql.com/doc/refman/5.7/en/innodb-online-ddl-performance.html