[MySQL] Paging queries over million-row tables in MySQL, and how to optimize them
2022-06-26 05:25:00 【weixin_43224…】
Method 1: use a raw SQL statement directly
Statement style: in MySQL, use: SELECT * FROM table_name LIMIT M,N
Applicable scenario: small data sets (hundreds to thousands of rows)
Reason/drawback: full table scan, so it gets slow. Also, some result sets come back in an unstable order (one run returns 1,2,3, another returns 2,1,3). LIMIT takes N rows starting at offset M of the result set and discards the rest.
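For illustration, a minimal instance of this method (the product table name is borrowed from the benchmarks later in this article): fetching page 3 at 10 rows per page skips the first (3-1)*10 = 20 rows and returns the next 10.
-- page 3, 10 rows per page: skip 20 rows, return 10
SELECT * FROM product LIMIT 20, 10;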
Method 2: build a primary key or unique index and page on the index (assume 10 rows per page)
Statement style: in MySQL, use: SELECT * FROM table_name WHERE id_pk > (pageNum*10) LIMIT M
Applicable scenario: large data sets (tens of thousands of rows)
Reason: an index scan, so it is fast. A friend pointed out a flaw: because the rows are not returned sorted by id_pk, some rows can be missed; this can only be fixed by method 3.
Method 3: reorder on the index
Statement style: in MySQL, use: SELECT * FROM table_name WHERE id_pk > (pageNum*10) ORDER BY id_pk ASC LIMIT M
Applicable scenario: large data sets (tens of thousands of rows). The ORDER BY column should ideally be the primary key or a unique key, so that the ORDER BY work can be eliminated by the index while the result set stays stable (for what stability means, see method 1).
Reason: an index scan, so it is fast. But note that at the time of writing, MySQL's index-assisted sort only truly supported ASC, not DESC (DESC in an index definition was parsed but ignored; real descending indexes only arrived later, in MySQL 8.0).
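A concrete instance of method 3, with assumed names: page 5 at 10 rows per page (pageNum = 5). Note that this arithmetic only lands exactly on page boundaries when id_pk values are contiguous, without gaps.
-- rows with id_pk above 5*10 = 50, in index order, first 10 of them
SELECT * FROM your_table WHERE id_pk > 50 ORDER BY id_pk ASC LIMIT 10;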
Method 4: use PREPARE, based on the index
The first question mark stands for pageNum, the second for the number of rows per page.
Statement style: in MySQL, use: PREPARE stmt_name FROM 'SELECT * FROM table_name WHERE id_pk > (? * ?) ORDER BY id_pk ASC LIMIT M'
Applicable scenario: very large data sets
Reason: an index scan, so it is fast. A prepared statement is also somewhat faster than an equivalent ad-hoc query, since it is parsed only once.
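A runnable sketch of the prepared-statement flow (your_table, id_pk and the page size of 10 are assumptions):
-- parse the statement once, with two placeholders
PREPARE stmt_name FROM 'SELECT * FROM your_table WHERE id_pk > (? * ?) ORDER BY id_pk ASC LIMIT 10';
-- bind pageNum and rows per page, then execute
SET @pageNum = 5, @pageSize = 10;
EXECUTE stmt_name USING @pageNum, @pageSize;
-- release the statement when done
DEALLOCATE PREPARE stmt_name;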
Method 5: exploit the fact that MySQL can resolve the ORDER BY with an index to locate a slice of rows directly, avoiding a full table scan
For example: read row tuples 1000 to 1019 (pk is the primary key / a unique key).
SELECT * FROM your_table WHERE pk >= 1000 ORDER BY pk ASC LIMIT 0,20
Method 6: use a "subquery / join + index" to quickly locate the boundary row, then read the page from there.
For example (id is the primary key / a unique key; $page and $pagesize are variables substituted by the application before the statement is sent, since LIMIT itself does not accept expressions):
Subquery example:
SELECT * FROM your_table WHERE id <=
(SELECT id FROM your_table ORDER BY id DESC LIMIT ($page-1)*$pagesize, 1)
ORDER BY id DESC LIMIT $pagesize
Join example:
SELECT * FROM your_table AS t1
JOIN (SELECT id FROM your_table ORDER BY id DESC LIMIT ($page-1)*$pagesize, 1) AS t2
WHERE t1.id <= t2.id ORDER BY t1.id DESC LIMIT $pagesize;
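With literals substituted, say page 100 at 20 rows per page so the boundary offset is (100-1)*20 = 1980, the join example becomes:
SELECT * FROM your_table AS t1
JOIN (SELECT id FROM your_table ORDER BY id DESC LIMIT 1980, 1) AS t2
WHERE t1.id <= t2.id ORDER BY t1.id DESC LIMIT 20;
The derived table touches only the id index to find the boundary row, and the outer query then reads just the 20 rows of the page.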
When MySQL pages over big data with LIMIT, query efficiency drops as the page number grows.
1. Test experiment
- Use the limit start, count paging statement directly, which is also the method my program uses:
select * from product limit start, count
When the starting offset is small, the query has no performance problem. Let's look at the execution times of paging starting from offsets 10, 100, 1000 and 10000 (fetching 20 rows per page).
The results are as follows:
select * from product limit 10, 20      0.016 seconds
select * from product limit 100, 20     0.016 seconds
select * from product limit 1000, 20    0.047 seconds
select * from product limit 10000, 20   0.094 seconds
We can see that as the starting offset grows, the time grows with it. This shows that the LIMIT paging statement's cost is strongly tied to the starting offset. So let's move the starting offset to 400,000, which is about half of the table's records:
select * from product limit 400000, 20   3.229 seconds
Now let's look at the time taken to fetch the last page of records:
select * from product limit 866613, 20   37.44 seconds
Clearly this kind of time is intolerable for the pages with the largest offsets.
From this we can also draw two conclusions:
- The query time of a LIMIT statement is proportional to the position of the starting record.
- MySQL's LIMIT statement is very convenient, but it is not suitable for direct use on tables with many records.
2. Optimizing the performance of the LIMIT paging problem
Use a covering index to speed up the paging query
As we all know, if a query uses an index and needs only the columns contained in that index (a covering index), it will run fast.
That is because the index lookup has an optimized search algorithm, and the data the query needs sits in the index itself, so there is no extra lookup of the row's data address, which saves a lot of time. In addition, MySQL has index caches, and under high concurrency a warm index cache works even better.
In our case, we know that the id field is the primary key, so it is naturally covered by the default primary key index. Let's see how a covering index performs.
This time we query the data on the last page using a covering index, i.e. selecting only the id column, as follows:
select id from product limit 866613, 20   0.2 seconds
Compared with the 37.44 seconds before, this is roughly a 100x speedup.
So if we also want to query all columns, there are two ways: one is the id >= form, the other is a join. Here is the actual situation:
SELECT * FROM product WHERE id >= (select id from product limit 866613, 1) limit 20
The query time is 0.2 seconds!
The other way of writing it:
SELECT * FROM product a JOIN (select id from product limit 866613, 20) b ON a.ID = b.id
The query time is also very short!
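To verify that the inner scan really is covered by the index, one way to check is EXPLAIN; for a covering scan MySQL reports "Using index" in the Extra column:
EXPLAIN SELECT id FROM product LIMIT 866613, 20;
-- Extra: Using index  => satisfied from the index alone, no row lookups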
3. The composite index optimization method
How high can MySQL's performance go? MySQL is absolutely a database for DBA-level players. Normally, a small system holding 10,000 news items can be written with any framework and developed quickly. But when the data volume reaches 100,000, a million, ten million rows, is its performance still that high? One small mistake may force the whole system to be rewritten, or worse, leave it unable to run at all. Okay, enough nonsense.
Let the facts speak. Look at the example:
The data table collect has 4 fields (id, title, info, vtype), where title is fixed-length, info is text, id is auto-increment, vtype is tinyint and is indexed. This is a simple model of a basic news system. Now fill it with data: 100,000 news items. In the end collect has 100,000 records and the table takes up about 1.6 GB of disk.
OK, look at this SQL statement:
select id,title from collect limit 1000,10;
Very fast; basically done in 0.01 seconds. Now look at the following:
select id,title from collect limit 90000,10;
Paging from offset 90,000. The result?
It finishes in 8-9 seconds. My god, what went wrong? Actually, the standard way to optimize this can be found all over the internet. Look at this statement:
select id from collect order by id limit 90000,10;
Very fast, done in 0.04 seconds. Why? Because walking the primary key index on id is of course fast. The variation circulating online is:
select id,title from collect where id>=(select id from collect order by id limit 90000,1) limit 10;
This is the result of locating the offset via the id index first. But the problem is a little more complicated than that. Look at the following statement:
select id from collect where vtype=1 order by id limit 90000,10;   very slow, it took 8-9 seconds!
I believe many people will feel, as I did, a sense of collapse at this point. Isn't vtype indexed? How can it be slow? vtype is indeed indexed, and a direct
select id from collect where vtype=1 limit 1000,10;
is very fast, basically 0.05 seconds. But scaling that up 90 times, paging from offset 90,000, should take about 0.05*90 = 4.5 seconds, which is in the same order of magnitude as the measured 8-9 seconds.
From here on, people started proposing table splitting, the same idea as the Discuz! forum. The idea is as follows:
Build an index table t (id, title, vtype), make its rows fixed-length, do the pagination on it, and then go to collect for info with the ids of the paged-out results. Is it feasible? The experiment will tell.
Put the 100,000 records into t(id,title,vtype); the data table is about 20 MB. Using
select id from t where vtype=1 order by id limit 90000,10;
it is very fast, basically running in 0.1-0.2 seconds. Why? I guessed it is because collect simply has too much data, so paging has a long way to travel. LIMIT is entirely bound to the size of the data table. In fact this is still a full table scan; it is fast only because the data volume is small, a mere 100,000 rows. OK, let's do a crazy experiment: grow it to 1 million rows and test the performance. With 10 times the data, table t immediately exceeds 200 MB, still fixed-length. The same query statement: it finishes in 0.1-0.2 seconds! So the split table's performance is fine?
Wrong! Because our LIMIT offset is still 90,000, it stays fast. Make it big: start from 900,000.
select id from t where vtype=1 order by id limit 900000,10;
Look at the result: 1-2 seconds! Why?
Even after splitting, the time is still this long; very depressing. Some claim that fixed-length rows improve LIMIT performance, and at first I believed it too: since every record has a fixed length, MySQL should be able to compute the position of row 900,000, right? But we overestimated MySQL's intelligence; it is not a commercial-grade database, and it turns out fixed-length versus variable-length has little effect on LIMIT. No wonder people say Discuz! becomes slow at a million records; I now believe it. This has everything to do with database design!
Can't MySQL break through the million-row barrier??? Is a million pages really its limit?
The answer is: NO. The reason it can't break a million is only that the database wasn't designed well enough. So here is the non-split-table method, under one crazy test: a single table with 1 million records, a 10 GB database, and fast pagination anyway!
Okay, the test goes back to the collect table. The conclusion so far is:
At 300,000 rows, the split-table method is feasible; beyond 300,000 its speed becomes unbearable! Of course, split table + this method would be absolutely perfect. But with my method alone it can be solved perfectly, without splitting the table!
The answer is: a composite index! Once, while designing a mysql index, I stumbled on the fact that the index name can be arbitrary and that several fields can be picked into one index. What is that good for?
At the beginning we saw that
select id from collect order by id limit 90000,10;
is so fast because it walks the index, but once you add a WHERE clause it no longer uses the index. With a let's-try-it attitude I added an index like search(vtype,id).
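As a sketch, the DDL corresponding to that index would be (the index name search is just the article's choice):
-- composite index: the filter column first, the paging/order key second
ALTER TABLE collect ADD INDEX search (vtype, id);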
Then I tested
select id from collect where vtype=1 limit 90000,10;
Very fast! Done in 0.04 seconds!
Tested again:
select id, title from collect where vtype=1 limit 90000,10;
Very regrettable: 8-9 seconds. It did not use the search index!
Tested again with search(id,vtype): still the select id statement, and regrettably, 0.5 seconds.
To sum up: if there is a WHERE condition and you also want to page with LIMIT, you must design an index that puts the WHERE column first and the primary key used by the LIMIT second, and you must select only the primary key!
That solves the paging problem perfectly. Since ids come back fast, there is hope for optimizing LIMIT; by this logic, a LIMIT over millions of rows can finish in 0.0x seconds. It seems that MySQL statement optimization and indexing are hugely important!
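Putting the rule together, a sketch of the full pattern under this article's assumptions (the collect table with a search(vtype,id) index): page on the covering index first, then join back by primary key to pick up the non-index columns.
-- the derived table pages over the covering index (vtype, id) only
-- the outer join then fetches title for just those 10 ids
SELECT c.id, c.title
FROM collect AS c
JOIN (SELECT id FROM collect WHERE vtype = 1 ORDER BY id LIMIT 90000, 10) AS tmp
  ON c.id = tmp.id;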