Issue 42: is it necessary for MySQL to have multiple column partitions
2022-06-29 17:08:00 【Aikesheng open source community】
The previous installments covered tables partitioned on a single column. Is it ever necessary to partition a table on multiple columns? Is the data in such a table evenly distributed? Are there particular use cases or optimization strategies? This article works through these questions.
MySQL supports partitioning on multiple columns as well as on a single column. For example, a table can be partitioned on the columns (f1,f2,f3); the usage and applicable scenarios resemble those of a composite index. The following query, for instance, filters on all three columns (f1,f2,f3) at once.
select * from p1 where f1 = 2 and f2 = 2 and f3 = 2;
A multi-column partitioned table is worthwhile only when the partitioning columns are filtered on with roughly equal frequency. If they are not, there is no need for multi-column partitioning.
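As a minimal sketch of what such a definition can look like, the statement below partitions a hypothetical table t_demo on (f1,f2,f3). The table name and the boundary values are illustrative assumptions and are not taken from the article.

```sql
-- Hypothetical table; the boundaries are chosen only for illustration.
CREATE TABLE t_demo (
  f1 INT,
  f2 INT,
  f3 INT
)
PARTITION BY RANGE COLUMNS (f1, f2, f3) (
  -- Boundaries are compared as tuples and must be strictly increasing.
  PARTITION p0    VALUES LESS THAN (1, 1, 1),
  PARTITION p1    VALUES LESS THAN (2, 2, 2),
  PARTITION p_max VALUES LESS THAN (MAXVALUE, MAXVALUE, MAXVALUE)
);
```

A row is stored in the first partition whose boundary tuple is greater than the row's (f1,f2,f3) tuple.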
Let's use a concrete example to examine the advantages, drawbacks, and applicable scenarios of multi-column partitioning.
Create table p1, whose columns r1, r2, and r3 take values in the ranges 1-8, 1-5, and 1-5 respectively.
create table p1(r1 int,r2 int,r3 int,log_date datetime);
Based on the value ranges of (r1,r2,r3), the following stored procedure converts table p1 into a partitioned table:
DELIMITER $$
USE `ytt_new`$$
DROP PROCEDURE IF EXISTS `sp_add_partition_ytt_new_p1`$$
CREATE DEFINER=`root`@`%` PROCEDURE `sp_add_partition_ytt_new_p1`()
BEGIN
DECLARE i,j,k INT UNSIGNED DEFAULT 1;
SET @stmt = '';
SET @stmt_begin = 'ALTER TABLE p1 PARTITION BY RANGE COLUMNS (r1,r2,r3)(';
WHILE i <= 8 DO
set j = 1;
while j <= 5 do
set k = 1;
while k <= 5 do
SET @stmt = CONCAT(@stmt,' PARTITION p',i,j,k,' VALUES LESS THAN (',i,',',j,',',k,'),');
set k = k + 1;
end while;
set j = j + 1;
end while;
SET i = i + 1;
END WHILE;
SET @stmt_end = 'PARTITION p_max VALUES LESS THAN (maxvalue,maxvalue,maxvalue))';
SET @stmt = CONCAT(@stmt_begin,@stmt,@stmt_end);
PREPARE s1 FROM @stmt;
EXECUTE s1;
DROP PREPARE s1;
SET @stmt = NULL;
SET @stmt_begin = NULL;
SET @stmt_end = NULL;
END$$
DELIMITER ;
Call the stored procedure to convert table p1 into a multi-column partitioned table. Table p1 then has 201 partitions and holds 5 million rows.
mysql> call sp_add_partition_ytt_new_p1;
Query OK, 0 rows affected (14.89 sec)
mysql> select count(partition_name) as partition_count from information_schema.partitions where table_schema = 'ytt_new' and table_name ='p1';
+-----------------+
| partition_count |
+-----------------+
| 201 |
+-----------------+
1 row in set (0.00 sec)
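To see how evenly the rows are spread across those partitions, one option is the TABLE_ROWS column of the same view. This is only a sketch: for InnoDB tables the figure is an estimate rather than an exact count, and it may be stale until statistics are refreshed.

```sql
-- Estimated row count per partition, in partition definition order.
SELECT partition_name,
       table_rows
FROM information_schema.partitions
WHERE table_schema = 'ytt_new'
  AND table_name = 'p1'
ORDER BY partition_ordinal_position;
```

If the estimates look off, running ANALYZE TABLE p1 first should refresh them.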
mysql> select count(*) from p1;
+----------+
| count(*) |
+----------+
| 5000000 |
+----------+
1 row in set (12.01 sec)
Create a partitioned table p2 in the same way, to compare single-column and multi-column partitioning in a few scenarios. Table p2 is partitioned on column r1 alone, into just 9 partitions.
mysql> CREATE TABLE `p2` (
`r1` int DEFAULT NULL,
`r2` int DEFAULT NULL,
`r3` int DEFAULT NULL,
`log_date` datetime DEFAULT NULL
) ENGINE=InnoDB
PARTITION BY RANGE COLUMNS(r1)
(PARTITION p1 VALUES LESS THAN (1) ,
PARTITION p2 VALUES LESS THAN (2) ,
PARTITION p3 VALUES LESS THAN (3) ,
PARTITION p4 VALUES LESS THAN (4) ,
PARTITION p5 VALUES LESS THAN (5) ,
PARTITION p6 VALUES LESS THAN (6) ,
PARTITION p7 VALUES LESS THAN (7) ,
PARTITION p8 VALUES LESS THAN (8) ,
PARTITION p_max VALUES LESS THAN (MAXVALUE)
)
1 row in set (0.00 sec)
mysql> insert into p2 select * from p1;
Query OK, 5000000 rows affected (1 min 37.92 sec)
Records: 5000000 Duplicates: 0 Warnings: 0
Equality filtering on all three columns: for the same query, table p1 (0.02 seconds) is roughly 25 times faster than table p2 (0.49 seconds).
mysql> select count(*) from p1 where r1 = 2 and r2 = 2 and r3 = 2;
+----------+
| count(*) |
+----------+
| 24992 |
+----------+
1 row in set (0.02 sec)
mysql> select count(*) from p2 where r1 = 2 and r2 = 2 and r3 = 2;
+----------+
| count(*) |
+----------+
| 24992 |
+----------+
1 row in set (0.49 sec)
Comparing the two execution plans: for the same query, table p1 scans an estimated 25,000 rows or so, while table p2 scans about 620,000, a huge difference.
mysql> explain select count(*) from p1 where r1 = 2 and r2 = 2 and r3 = 2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: p1
partitions: p223
type: ALL
...
rows: 24711
filtered: 0.10
Extra: Using where
1 row in set, 1 warning (0.00 sec)
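The partitions: p223 entry above follows from how RANGE COLUMNS compares boundaries: tuples are compared column by column from left to right, and a row belongs to the first partition whose boundary is strictly greater than it. MySQL's row comparison operator applies the same ordering, so the placement of the row (2,2,2) can be checked by hand. The column aliases are illustrative names only.

```sql
-- Partition p222 has boundary (2,2,2); partition p223 has boundary (2,2,3).
SELECT (2,2,2) < (2,2,2) AS below_p222_bound,  -- 0: not below p222's boundary
       (2,2,2) < (2,2,3) AS below_p223_bound;  -- 1: p223 is the first match
```

One consequence is that each partition's name is offset by one from the values it holds: p223 stores the rows (2,2,2), and p111 holds rows less than (1,1,1), so it stays empty for this data set.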
mysql> explain select count(*) from p2 where r1 = 2 and r2 = 2 and r3 = 2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: p2
partitions: p3
type: ALL
...
rows: 623239
filtered: 0.10
Extra: Using where
1 row in set, 1 warning (0.00 sec)
What if the filter does not cover all of the partitioning columns? For example, drop the last column and compare again: table p1 (0.10 seconds) is still about five times faster than table p2 (0.52 seconds).
mysql> select count(*) from p1 where r1 = 2 and r2 = 2;
+----------+
| count(*) |
+----------+
| 124649 |
+----------+
1 row in set (0.10 sec)
mysql> select count(*) from p2 where r1 = 2 and r2 = 2;
+----------+
| count(*) |
+----------+
| 124649 |
+----------+
1 row in set (0.52 sec)
Filtering on the first column only: this time p1 and p2 take about the same time, with p2 slightly ahead.
mysql> select count(*) from p1 where r1 = 2 ;
+----------+
| count(*) |
+----------+
| 624599 |
+----------+
1 row in set (0.56 sec)
mysql> select count(*) from p2 where r1 = 2 ;
+----------+
| count(*) |
+----------+
| 624599 |
+----------+
1 row in set (0.45 sec)
Comparing the execution plans: table p1 scans 26 partitions, while table p2 scans only 1, far fewer than p1.
mysql> explain select count(*) from p1 where r1 = 2 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: p1
partitions: p211,p212,p213,p214,p215,p221,p222,p223,p224,p225,p231,p232,p233,p234,p235,p241,p242,p243,p244,p245,p251,p252,p253,p254,p255,p311
type: ALL
...
rows: 648074
filtered: 10.00
Extra: Using where
1 row in set, 1 warning (0.00 sec)
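The count of 26 rather than 25 can be traced with the same tuple comparison that RANGE COLUMNS uses. The sketch below checks the smallest and largest (r1,r2,r3) combinations with r1 = 2 against the neighbouring partition boundaries. The aliases are illustrative names only.

```sql
SELECT (2,1,1) < (2,1,1) AS below_p211_bound,  -- 0: (2,1,1) is not in p211
       (2,1,1) < (2,1,2) AS below_p212_bound,  -- 1: (2,1,1) lands in p212
       (2,5,5) < (2,5,5) AS below_p255_bound,  -- 0: (2,5,5) is not in p255
       (2,5,5) < (3,1,1) AS below_p311_bound;  -- 1: (2,5,5) lands in p311
```

The rows with r1 = 2 therefore occupy p212 through p311. Partition p211 presumably stays in the list because a row such as (2,0,0) would fall below its boundary (2,1,1), so the optimizer cannot rule it out from the condition r1 = 2 alone.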
mysql> explain select count(*) from p2 where r1 = 2 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: p2
partitions: p3
type: ALL
...
rows: 623239
filtered: 10.00
Extra: Using where
1 row in set, 1 warning (0.00 sec)
What if column r1 is dropped from the filter? The execution times are almost identical, because both p1 and p2 have to scan all partitions.
mysql> select count(*) from p1 where r2 = 2;
+----------+
| count(*) |
+----------+
| 998700 |
+----------+
1 row in set (3.87 sec)
mysql> select count(*) from p2 where r2 = 2;
+----------+
| count(*) |
+----------+
| 998700 |
+----------+
1 row in set (3.75 sec)
This raises another question: for multi-column partitioning, does the order of the columns matter?
The answer depends on the filter conditions of the queries being run. Consider the following three kinds of SQL:
SQL 1: select * from p1 where r1 = 2 and r2 = 2 and r3 = 2;
For SQL 1, the order does not matter, because the query filters on all three columns.
SQL 2: select * from p1 where r1 = 2 and r2 = 2;
For SQL 2, either (r1,r2,r3) or (r2,r1,r3) works.
SQL 3: select * from p1 where r2 = 2 and r3 = 2;
For SQL 3, either (r2,r3,r1) or (r3,r2,r1) works.
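One way to check whether a given column order suits a query is to look at the partitions column of its execution plan. The sketch below runs EXPLAIN for SQL 2 and SQL 3 against table p1, which is partitioned on (r1,r2,r3). No output is shown here because it depends on the server and the data.

```sql
-- SQL 2 filters on the leading partitioning columns, so the partitions
-- column is expected to list only a small subset of partitions.
EXPLAIN SELECT * FROM p1 WHERE r1 = 2 AND r2 = 2\G

-- SQL 3 omits the leading column r1, so no partitions are expected to be
-- pruned and the partitions column should list all of them.
EXPLAIN SELECT * FROM p1 WHERE r2 = 2 AND r3 = 2\G
```

The \G terminator is specific to the mysql command-line client; in other clients, end each statement with a semicolon instead.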
Create partitioned table p3 in the same way, with the partitioning columns in the order (r2,r3,r1):
mysql> show create table p3\G
*************************** 1. row ***************************
Table: p3
Create Table: CREATE TABLE `p3` (
`r1` int DEFAULT NULL,
`r2` int DEFAULT NULL,
`r3` int DEFAULT NULL,
`log_date` datetime DEFAULT NULL
) ENGINE=InnoDB
/*!50500 PARTITION BY RANGE COLUMNS(r2,r3,r1)
(PARTITION p111 VALUES LESS THAN (1,1,1) ENGINE = InnoDB,
...
For table p3, the following SQL runs more than 20 times faster than on table p1. Because the partitioning columns are in a different order, table p1 has to scan all partitions to produce the result.
mysql> select count(*) from p3 where r2 = 1 and r3 = 4 ;
+----------+
| count(*) |
+----------+
| 199648 |
+----------+
1 row in set (0.22 sec)
mysql> select count(*) from p1 where r2 = 1 and r3 = 4 ;
+----------+
| count(*) |
+----------+
| 199648 |
+----------+
1 row in set (5.05 sec)
So, as noted at the beginning, a multi-column partitioned table resembles a composite index in how it is used, in its caveats, and in the scenarios it suits. In the right scenarios, multi-column partitioning can significantly improve query performance.