Issue 42: Does MySQL Need Multi-Column Partitioning?
2022-06-29 17:44:00 【ActionTech】

The previous issues covered tables partitioned on a single column. Is it ever necessary to partition a table on multiple columns? Is the data in such a table evenly distributed? Are there particular scenarios where it fits, or particular optimization strategies? This article works through these questions.
MySQL supports partitioning on multiple columns as well as on a single column. A partitioned table can, for example, be built on the columns (f1,f2,f3). How it is used, and where it fits, closely resembles a composite index. The following query, for example, filters on all of the columns (f1,f2,f3) at once.
select * from p1 where f1 = 2 and f2 = 2 and f3 = 2;
Multi-column partitioning presupposes that the partitioning columns are queried about equally often. If they are not, there is no need for multi-column partitioning.
Let's use a concrete example to examine the advantages, disadvantages, and applicable scenarios of multi-column partitioning.
Create table p1, whose columns r1, r2, and r3 take values in the ranges 1-8, 1-5, and 1-5 respectively.
create table p1(r1 int,r2 int,r3 int,log_date datetime);
Based on the value ranges of the columns (r1,r2,r3), the following stored procedure converts p1 into a partitioned table:
DELIMITER $$
USE `ytt_new`$$
DROP PROCEDURE IF EXISTS `sp_add_partition_ytt_new_p1`$$
CREATE DEFINER=`root`@`%` PROCEDURE `sp_add_partition_ytt_new_p1`()
BEGIN
  DECLARE i,j,k INT UNSIGNED DEFAULT 1;
  SET @stmt = '';
  SET @stmt_begin = 'ALTER TABLE p1 PARTITION BY RANGE COLUMNS (r1,r2,r3)(';
  -- One partition per (r1,r2,r3) combination: 8 * 5 * 5 = 200 partitions.
  WHILE i <= 8 DO
    SET j = 1;
    WHILE j <= 5 DO
      SET k = 1;
      WHILE k <= 5 DO
        -- The upper bound is exclusive, so row (i,j,k) is stored in the partition after p<i><j><k>.
        SET @stmt = CONCAT(@stmt,' PARTITION p',i,j,k,' VALUES LESS THAN (',i,',',j,',',k,'),');
        SET k = k + 1;
      END WHILE;
      SET j = j + 1;
    END WHILE;
    SET i = i + 1;
  END WHILE;
  -- Catch-all partition number 201 for everything at or above (8,5,5).
  SET @stmt_end = 'PARTITION p_max VALUES LESS THAN (maxvalue,maxvalue,maxvalue))';
  SET @stmt = CONCAT(@stmt_begin,@stmt,@stmt_end);
  -- Run the assembled ALTER TABLE as a prepared statement.
  PREPARE s1 FROM @stmt;
  EXECUTE s1;
  DROP PREPARE s1;
  -- Clear the session variables used to build the statement.
  SET @stmt = NULL;
  SET @stmt_begin = NULL;
  SET @stmt_end = NULL;
END$$
DELIMITER ;
Call the stored procedure to convert p1 into a multi-column partitioned table. The table now has 201 partitions and holds 5 million rows.
mysql> call sp_add_partition_ytt_new_p1;
Query OK, 0 rows affected (14.89 sec)
mysql> select count(partition_name) as partition_count from information_schema.partitions where table_schema = 'ytt_new' and table_name ='p1';
+-----------------+
| partition_count |
+-----------------+
| 201 |
+-----------------+
1 row in set (0.00 sec)
mysql> select count(*) from p1;
+----------+
| count(*) |
+----------+
| 5000000 |
+----------+
1 row in set (12.01 sec)
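The row count above says nothing about how the 5 million rows are spread across the 201 partitions. A minimal sketch for checking this, assuming the schema ytt_new and table p1 used in this article: information_schema.partitions exposes a per-partition table_rows value, which for InnoDB is an estimate rather than an exact count. The alias upper_bound is made up for readability.

```sql
-- Per-partition row estimates for p1; table_rows is approximate for InnoDB tables.
SELECT partition_name,
       partition_description AS upper_bound,
       table_rows
FROM information_schema.partitions
WHERE table_schema = 'ytt_new'
  AND table_name = 'p1'
ORDER BY partition_ordinal_position;
```

If the column values really start at 1, partition p111 should stay empty with this naming scheme, since no row compares less than (1,1,1).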
Create a second partitioned table, p2, to compare single-column and multi-column partitioning in several scenarios.
Table p2 is partitioned on column r1 only, into just 9 partitions.
mysql> CREATE TABLE `p2` (
`r1` int DEFAULT NULL,
`r2` int DEFAULT NULL,
`r3` int DEFAULT NULL,
`log_date` datetime DEFAULT NULL
) ENGINE=InnoDB
PARTITION BY RANGE COLUMNS(r1)
(PARTITION p1 VALUES LESS THAN (1) ,
PARTITION p2 VALUES LESS THAN (2) ,
PARTITION p3 VALUES LESS THAN (3) ,
PARTITION p4 VALUES LESS THAN (4) ,
PARTITION p5 VALUES LESS THAN (5) ,
PARTITION p6 VALUES LESS THAN (6) ,
PARTITION p7 VALUES LESS THAN (7) ,
PARTITION p8 VALUES LESS THAN (8) ,
PARTITION p_max VALUES LESS THAN (MAXVALUE)
);
mysql> insert into p2 select * from p1;
Query OK, 5000000 rows affected (1 min 37.92 sec)
Records: 5000000 Duplicates: 0 Warnings: 0
Equality filter on all three columns: with the same query condition, table p1 (0.02 seconds) is more than 20 times faster than table p2 (0.49 seconds).
mysql> select count(*) from p1 where r1 = 2 and r2 = 2 and r3 = 2;
+----------+
| count(*) |
+----------+
| 24992 |
+----------+
1 row in set (0.02 sec)
mysql> select count(*) from p2 where r1 = 2 and r2 = 2 and r3 = 2;
+----------+
| count(*) |
+----------+
| 24992 |
+----------+
1 row in set (0.49 sec)
Compare the two execution plans: for the same query, table p1 scans fewer than 25,000 rows, while table p2 scans more than 620,000. The difference is huge.
mysql> explain select count(*) from p1 where r1 = 2 and r2 = 2 and r3 = 2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: p1
partitions: p223
type: ALL
...
rows: 24711
filtered: 0.10
Extra: Using where
1 row in set, 1 warning (0.00 sec)
mysql> explain select count(*) from p2 where r1 = 2 and r2 = 2 and r3 = 2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: p2
partitions: p3
type: ALL
...
rows: 623239
filtered: 0.10
Extra: Using where
1 row in set, 1 warning (0.00 sec)
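Why does the plan for p1 list p223 rather than p222? RANGE COLUMNS compares the partitioning columns as a tuple, column by column from left to right, and VALUES LESS THAN is an exclusive upper bound. The statement below is a small sketch of that rule using MySQL's row-constructor comparison; the column aliases are made up for illustration.

```sql
SELECT (2,2,2) < (2,2,2) AS fits_p222,  -- 0: the bound itself is excluded
       (2,2,2) < (2,2,3) AS fits_p223;  -- 1: first bound greater than the row
```

A row is stored in the first partition whose bound is greater than its (r1,r2,r3) tuple, so rows with r1 = 2, r2 = 2, r3 = 2 land in p223.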
What if the filter does not cover all of the partitioning columns? Leave out the last column and compare again: table p1 (0.10 seconds) still takes several times less time than table p2 (0.52 seconds).
mysql> select count(*) from p1 where r1 = 2 and r2 = 2;
+----------+
| count(*) |
+----------+
| 124649 |
+----------+
1 row in set (0.10 sec)
mysql> select count(*) from p2 where r1 = 2 and r2 = 2;
+----------+
| count(*) |
+----------+
| 124649 |
+----------+
1 row in set (0.52 sec)
Filtering on the first column only: this time p1 and p2 take about the same time, with p2 slightly ahead.
mysql> select count(*) from p1 where r1 = 2 ;
+----------+
| count(*) |
+----------+
| 624599 |
+----------+
1 row in set (0.56 sec)
mysql> select count(*) from p2 where r1 = 2 ;
+----------+
| count(*) |
+----------+
| 624599 |
+----------+
1 row in set (0.45 sec)
Compare the execution plans: table p1 scans 26 partitions, while table p2 scans only 1, far fewer than p1.
mysql> explain select count(*) from p1 where r1 = 2 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: p1
partitions: p211,p212,p213,p214,p215,p221,p222,p223,p224,p225,p231,p232,p233,p234,p235,p241,p242,p243,p244,p245,p251,p252,p253,p254,p255,p311
type: ALL
...
rows: 648074
filtered: 10.00
Extra: Using where
1 row in set, 1 warning (0.00 sec)
mysql> explain select count(*) from p2 where r1 = 2 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: p2
partitions: p3
type: ALL
...
rows: 623239
filtered: 10.00
Extra: Using where
1 row in set, 1 warning (0.00 sec)
What if column r1 is dropped from the filter? The execution times are almost identical, because both p1 and p2 have to scan all partitions.
mysql> select count(*) from p1 where r2 = 2;
+----------+
| count(*) |
+----------+
| 998700 |
+----------+
1 row in set (3.87 sec)
mysql> select count(*) from p2 where r2 = 2;
+----------+
| count(*) |
+----------+
| 998700 |
+----------+
1 row in set (3.75 sec)
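To confirm that neither table can prune partitions for this predicate, the execution plans can be checked in the same way as in the earlier comparisons. This is only a sketch; no output is reproduced here because the article does not show it.

```sql
-- r2 is not the leading partitioning column of p1,
-- and it is not a partitioning column of p2 at all.
EXPLAIN SELECT count(*) FROM p1 WHERE r2 = 2;
EXPLAIN SELECT count(*) FROM p2 WHERE r2 = 2;
```

The partitions column of each plan lists the partitions that remain after pruning.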
This raises another question: for multi-column partitioning, does the order of the columns matter?
The right order depends on the filter conditions of the queries that will run against the table. Consider the following three kinds of SQL:
SQL 1: select * from p1 where r1 = 2 and r2 = 2 and r3 = 2;
For SQL 1, the order does not matter, because the query filters on all three columns.
SQL 2: select * from p1 where r1 = 2 and r2 = 2;
For SQL 2, both (r1,r2,r3) and (r2,r1,r3) work.
SQL 3: select * from p1 where r2 = 2 and r3 = 2;
For SQL 3, both (r2,r3,r1) and (r3,r2,r1) work.
Create partitioned table p3 in the same way, with the partitioning columns in the order (r2,r3,r1):
mysql> show create table p3\G
*************************** 1. row ***************************
Table: p3
Create Table: CREATE TABLE `p3` (
`r1` int DEFAULT NULL,
`r2` int DEFAULT NULL,
`r3` int DEFAULT NULL,
`log_date` datetime DEFAULT NULL
) ENGINE=InnoDB
/*!50500 PARTITION BY RANGE COLUMNS(r2,r3,r1)
(PARTITION p111 VALUES LESS THAN (1,1,1) ENGINE = InnoDB,
...
On table p3, the following SQL runs more than 20 times faster than on table p1 (0.22 seconds versus 5.05 seconds). Because the partitioning columns are in a different order, table p1 has to scan all partitions to get the result.
mysql> select count(*) from p3 where r2 = 1 and r3 = 4 ;
+----------+
| count(*) |
+----------+
| 199648 |
+----------+
1 row in set (0.22 sec)
mysql> select count(*) from p1 where r2 = 1 and r3 = 4 ;
+----------+
| count(*) |
+----------+
| 199648 |
+----------+
1 row in set (5.05 sec)
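The effect of column order on pruning can also be explored without building all 201 partitions. The following is a reduced sketch, not the article's actual p3 definition: the table name p3_demo and its four partitions are hypothetical, and only the column order (r2,r3,r1) is taken from the article.

```sql
-- Hypothetical reduced table: same columns as p3, only four partitions.
CREATE TABLE p3_demo (
  r1 int,
  r2 int,
  r3 int,
  log_date datetime
) ENGINE=InnoDB
PARTITION BY RANGE COLUMNS (r2, r3, r1) (
  PARTITION p111 VALUES LESS THAN (1, 1, 1),
  PARTITION p121 VALUES LESS THAN (1, 2, 1),
  PARTITION p211 VALUES LESS THAN (2, 1, 1),
  PARTITION p_max VALUES LESS THAN (MAXVALUE, MAXVALUE, MAXVALUE)
);

-- Filters on the two leading partitioning columns (r2, r3).
EXPLAIN SELECT count(*) FROM p3_demo WHERE r2 = 1 AND r3 = 1;
```

Because the filter covers the leading columns of the partitioning key, the partitions column of the plan should list only a subset of the partitions. The same filter against a table partitioned on (r1,r2,r3) has no leading column to prune on.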
So, as stated at the beginning, a multi-column partitioned table resembles a composite index in how it is used, in what to watch out for, and in where it fits. In the right scenarios, multi-column partitioning can significantly improve query performance.
