How to solve MySQL deep paging problem
2022-07-28 03:37:00 [Yisu cloud]
This article shares how to solve the MySQL deep paging problem, with a worked optimization case at the end. I hope you gain something from reading it.

During day-to-day development you have surely used LIMIT. But when the offset is very large, you will find the query getting slower and slower. With limit 2000 at the beginning of the table, the data you need comes back in about 200ms; with limit 4000 offset 100000, the query already takes about 1s, and the larger the offset, the slower it gets.
Overview
This article discusses how to optimize deep paging when a MySQL table holds a large amount of data, and attaches the pseudocode from a recent slow-SQL optimization case.
1、Description of the LIMIT deep paging problem
First look at the table structure (just an example; the structure is simplified and unused fields are not shown):
CREATE TABLE `p2p_detail_record` (
  `id` varchar(32) COLLATE utf8mb4_bin NOT NULL DEFAULT '' COMMENT 'primary key',
  `batch_num` int NOT NULL DEFAULT '0' COMMENT 'number of reports',
  `uptime` bigint NOT NULL DEFAULT '0' COMMENT 'reporting time',
  `uuid` varchar(64) COLLATE utf8mb4_bin NOT NULL DEFAULT '' COMMENT 'meeting id',
  `start_time_stamp` bigint NOT NULL DEFAULT '0' COMMENT 'start time',
  `answer_time_stamp` bigint NOT NULL DEFAULT '0' COMMENT 'answer time',
  `end_time_stamp` bigint NOT NULL DEFAULT '0' COMMENT 'end time',
  `duration` int NOT NULL DEFAULT '0' COMMENT 'duration',
  PRIMARY KEY (`id`),
  KEY `idx_uuid` (`uuid`),
  KEY `idx_start_time_stamp` (`start_time_stamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin COMMENT='p2p call log details table';
Suppose the deep-paging SQL we want to run looks like this:
select * from p2p_detail_record ppdr where ppdr.start_time_stamp > 1656666798000 limit 0, 2000;

The query takes 94ms. Fast, isn't it? But if we run limit 100000, 2000, the query takes 1.5s, which is already very slow. And what if the offset is larger still?

2、Why this SQL is slow
Look at the execution plan of this SQL (the execution-plan screenshot is not reproduced here):
The query does hit the index, so why is it still slow? Let's review some relevant MySQL internals.
Clustered index and non-clustered index
Clustered index: the leaf nodes store the entire row of data.
Non-clustered (secondary) index: the leaf nodes store the primary key value of the corresponding row.

The process of a query through a non-clustered index:
1. Traverse the non-clustered index tree to the matching leaf nodes and obtain the primary key values.
2. With those primary key values, go back to the clustered index tree and find the corresponding full rows. (This whole process is called a "back to table" lookup, i.e. a table lookup.)
Back to why this SQL is slow:
1、A LIMIT offset, n statement scans offset + n rows, throws away the first offset rows, and returns the remaining n rows. That is, limit 100000, 10 scans 100,010 rows, while limit 0, 10 scans only 10. Here, each of those 100,010 rows needs a table lookup, so most of the time is spent going back to the table.
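The scan-and-discard behavior described above can be illustrated with a small toy emulation (plain Python, not MySQL itself, with made-up row data): to return a page deep in the result set, the engine still has to walk past every row before it.

```python
def limit_offset(rows, offset, n):
    """Emulate LIMIT offset, n: scan offset + n rows, keep only the last n."""
    scanned = 0
    kept = []
    for row in rows:
        scanned += 1
        if scanned > offset:       # the first `offset` rows are thrown away
            kept.append(row)
        if len(kept) == n:
            break
    return kept, scanned

rows = list(range(1, 200_001))     # pretend these are 200k table rows
page, scanned = limit_offset(rows, 100_000, 10)
print(scanned)                     # 100010 rows touched to return just 10
```

Only 10 rows come back, yet 100,010 rows were walked, which is exactly why deep offsets get slower in proportion to the offset.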
Core idea of the optimizations: can we know in advance which primary key id to start from, and thus reduce the number of table lookups?
Common solutions
Optimization via subquery
select * from p2p_detail_record ppdr where id >= (select id from p2p_detail_record ppdr2 where ppdr2.start_time_stamp > 1656666798000 limit 100000, 1) limit 2000;
Same query result, still 2,000 rows starting from the 100,000th, but the query now takes 200ms. A lot faster, isn't it?
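As a sanity check, here is a sketch in SQLite (an in-memory stand-in for the p2p_detail_record table with invented ids and timestamps) showing that the subquery rewrite returns the same page as the plain deep-offset query; an explicit ORDER BY id is added so both forms are deterministic:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT PRIMARY KEY, start_time_stamp INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(f"{i:06d}", 1656666798000 + i) for i in range(1, 5001)])

# Plain deep offset: walks 1000 + 20 full rows.
plain = conn.execute(
    "SELECT * FROM t WHERE start_time_stamp > 1656666798000 "
    "ORDER BY id LIMIT 20 OFFSET 1000").fetchall()

# Rewrite: locate the starting primary key in the index first, then
# read full rows only from that id onward -- far fewer table lookups.
rewritten = conn.execute(
    "SELECT * FROM t WHERE id >= "
    "(SELECT id FROM t WHERE start_time_stamp > 1656666798000 "
    " ORDER BY id LIMIT 1 OFFSET 1000) "
    "ORDER BY id LIMIT 20").fetchall()

assert plain == rewritten
print(len(rewritten))  # 20
```

The subquery only touches the secondary index to find the starting id; the expensive full-row lookups happen for just the 20 rows actually returned.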

Label recording method
Label recording method (also known as keyset or seek pagination): record where the last query ended, and on the next query scan onward from that record, much like a bookmark.
select * from p2p_detail_record ppdr where ppdr.id > 'bb9d67ee6eac4cab9909bad7c98f54d4' order by id limit 2000;
Note: bb9d67ee6eac4cab9909bad7c98f54d4 is the id of the last record returned by the previous query.
With label recording, performance is very good, because the query hits the id index. But the approach has several shortcomings:
1、Only consecutive pages can be queried; you cannot jump to an arbitrary page.
2、It needs a field similar to a continuously increasing one (so the query can use order by id).
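The page-by-page loop of the label recording method can be sketched as follows, again against a hypothetical in-memory SQLite table (names and sizes are illustrative): each query continues from the id remembered from the previous page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(f"{i:05d}", f"row-{i}") for i in range(1, 10_001)])

def fetch_page(last_id, page_size=2000):
    """Return the next page after last_id (None means the first page)."""
    if last_id is None:
        return conn.execute(
            "SELECT id, payload FROM t ORDER BY id LIMIT ?",
            (page_size,)).fetchall()
    return conn.execute(
        "SELECT id, payload FROM t WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size)).fetchall()

total, last_id = 0, None
while True:
    page = fetch_page(last_id)
    if not page:
        break
    total += len(page)
    last_id = page[-1][0]          # remember the bookmark for the next page

print(total)  # 10000
```

Every page costs the same regardless of depth: one index seek to last_id plus page_size sequential reads, which is why the efficiency stays flat where OFFSET would degrade.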
Scheme comparison
Using the subquery optimization:
Advantage: supports jumping to an arbitrary page and fetching exactly the page you want.
Disadvantage: not as efficient as the label recording method. Reason: for example, to fetch 1,000 rows after the first 100,000, the subquery still has to scan 100,000 + 1,000 entries of the non-clustered index to locate the id at position 100,000, and only then query from there.
Using the label recording method:
Advantage: query efficiency is very stable and very fast.
Shortcomings:
1、Cannot jump across pages; only consecutive pages can be queried.
2、Needs a field similar to a continuously increasing one.
On the second point: this is usually easy to solve; you can sort by any non-repeating field. If you sort by a field whose values may repeat, MySQL's ordering among rows with the same value is undefined, and if a page boundary happens to fall among them, the same row may appear on both the previous and the next page (or be skipped).
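When the natural sort column does repeat (start_time_stamp, say), a common fix is to add the primary key as a tie-breaker and seek on the pair. The sketch below (hypothetical table and column names, SQLite again) spells out the "(ts, id) > (last_ts, last_id)" condition explicitly, which also works on engines without row-value syntax:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT PRIMARY KEY, ts INTEGER)")
# ts repeats: ten rows share each timestamp value
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(f"{i:04d}", i // 10) for i in range(1, 1001)])

def fetch_page(last_ts, last_id, page_size=100):
    if last_id is None:
        return conn.execute(
            "SELECT ts, id FROM t ORDER BY ts, id LIMIT ?",
            (page_size,)).fetchall()
    # (ts, id) > (last_ts, last_id), written out long-hand
    return conn.execute(
        "SELECT ts, id FROM t "
        "WHERE ts > ? OR (ts = ? AND id > ?) "
        "ORDER BY ts, id LIMIT ?",
        (last_ts, last_ts, last_id, page_size)).fetchall()

seen, last_ts, last_id = [], None, None
while True:
    page = fetch_page(last_ts, last_id)
    if not page:
        break
    seen.extend(page)
    last_ts, last_id = page[-1]

print(len(seen), len(set(seen)))  # 1000 1000 -- no duplicates, no gaps
```

Because (ts, id) is unique, the sort order is total, so a page boundary can never split a group of equal timestamps ambiguously.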
Practical case
Requirement: query all the data in a certain time period; suppose there are hundreds of thousands of rows to fetch and process.
Requirement analysis: batch (paged) queries involve the deep paging problem, which makes them slower and slower.
(The table structure is the same as shown in section 1 above.)
Pseudocode implementation:
// id of the last record of the previous page (null for the first page)
String lastId = null;
// number of entries per page
Integer pageSize = 2000;
List<P2pRecordVo> list;
do {
    list = listP2pRecordByPage(lastId, pageSize);
    if (list.isEmpty()) {
        break;
    }
    // Label recording: remember the id of the last record of this query
    lastId = list.get(list.size() - 1).getId();
    // business logic that processes the data
    xxxxx(list);
} while (true);

<select id="listP2pRecordByPage">
    select *
    from p2p_detail_record ppdr
    where 1 = 1
    <if test="lastId != null">
        and ppdr.id > #{lastId}
    </if>
    order by id asc
    limit #{pageSize}
</select>

A small optimization point here: some people first sort all the data to obtain the smallest id, but sorting everything just to take min(id) costs a long time. In fact, the first query can simply omit the lastId condition; the result is the same, and it is faster.
That's all for "How to solve the MySQL deep paging problem". Thank you for reading, and I hope you gained something from this article.