Database storage - table partition

2022-07-07 08:43:00 Blue sky ⊙ white clouds

        As a project grows, individual tables in the database hold more and more data, and the operations on them get slower. How can we make those operations efficient again? Many people have heard of splitting databases and splitting tables (sharding), but partitioning is an option that is easy to overlook. Even before the data volume reaches tens of millions of rows, partitioning can spread the data of one table across several files. Keep in mind that table data is stored on disk as files. Partitioning and table splitting are different things. Partitioning distributes the rows of one table into different files according to a condition: before partitioning the data lives in a single file, afterwards it lives in several, but it is still the same table. Table splitting distributes the data into different tables, which share the same structure but have different names. Partitioning reduces the amount of data each operation has to touch, and that is how it improves performance.

        When we run into this kind of large data volume, we can approach it along the following lines:

       1. Split the load (the principle: keep the amount of data each operation touches as small as possible):

       1.1. Separate data that is used from data that is not, and frequently used data from rarely used data.

       1.2. For data stored in a database: partitioning, splitting databases, splitting tables.

       1.3. For data stored in files: split the files.

       1.4. Consider batch processing.

       2. Caching: cache data that is read often and written rarely.

       3. Database optimization: design the schema sensibly, build indexes sensibly, use a database cluster.

       4. Processing optimization: optimize the SQL, consider temporary tables and intermediate tables.

       5. Sensible use of NoSQL: MongoDB, Redis, HBase, etc.

       6. Distributed big data processing: Hadoop, Spark, Storm, etc.

        This article focuses on how to partition a table, the advantages and disadvantages of partitioning, and what to watch out for. MySQL is used for the examples.

        Let's first look at the advantages and disadvantages of partitioning:

       1. Advantages:

       • Data is split logically, and the split data can live under several physical file paths.

       • More data can be stored, beyond the maximum size of a single file on the system.

       • Better performance: reads and writes within a partition are faster, and so are range queries that can be limited to a few partitions.

       • Data can be removed quickly by dropping the partitions that hold it.

       • Queries can be spread across multiple disks, which improves disk I/O performance.

       • Queries involving aggregate functions such as SUM() and COUNT() are easy to process in parallel.

       • Individual partitions can be backed up and restored independently, which helps with large data volumes.

       2. Disadvantages (restrictions):

       • MySQL supports partitioning for most storage engines, such as MyISAM and InnoDB. (This holds for MySQL 5.7 and earlier; MySQL 8.0 supports partitioning only with InnoDB and NDB.)

       • Engines such as MERGE and CSV do not support partitioning.

       • All partitions of a partitioned table must use the same storage engine.
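
       The "remove data quickly by dropping partitions" advantage can be sketched as follows. This is a minimal illustration, not part of the original article: the table tbl_logs_demo and its columns are hypothetical, and a MySQL version with partitioning enabled is assumed.

```sql
-- Hypothetical log table, partitioned by year so that old data can be removed cheaply.
CREATE TABLE tbl_logs_demo (
    id INT NOT NULL,
    created DATE NOT NULL,
    message VARCHAR(200)
)
PARTITION BY RANGE (YEAR(created)) (
    PARTITION p2020 VALUES LESS THAN (2021),
    PARTITION p2021 VALUES LESS THAN (2022),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- Remove every row from 2020 and earlier by dropping the partition,
-- instead of running DELETE ... WHERE created < '2021-01-01'.
ALTER TABLE tbl_logs_demo DROP PARTITION p2020;

-- Keep the partition definition but empty its contents.
ALTER TABLE tbl_logs_demo TRUNCATE PARTITION p2021;
```

       Dropping a partition removes its rows together with the partition definition, whereas TRUNCATE PARTITION keeps the definition so the partition can be filled again.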

      Next, let's look at the partitioning methods:

      1. RANGE partitioning: a row is assigned to a partition when its column value falls within that partition's contiguous range.

        Create it as follows:


    
     
    CREATE TABLE tbl_users1 (
        uuid INT NOT NULL,
        name VARCHAR(20),
        registerTime VARCHAR(100)
    )
    -- Each partition holds the uuid values below its limit
    -- that are not already covered by the previous partition.
    PARTITION BY RANGE (uuid) (
        PARTITION p0 VALUES LESS THAN (5),
        PARTITION p1 VALUES LESS THAN (10),
        PARTITION p2 VALUES LESS THAN (15),
        PARTITION p3 VALUES LESS THAN MAXVALUE  -- everything from 15 upwards
    );
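
        To see the ranges at work, the following sketch inserts a few rows into tbl_users1 and inspects where they end up. The sample rows are made up for illustration, and the statements assume MySQL 5.6 or later (explicit partition selection) with the table created as above.

```sql
-- Assumes tbl_users1 from the RANGE example already exists.
INSERT INTO tbl_users1 (uuid, name, registerTime) VALUES
    (1,  'alice', '2022-01-01'),   -- uuid < 5        -> p0
    (7,  'bob',   '2022-02-01'),   -- 5 <= uuid < 10  -> p1
    (12, 'carol', '2022-03-01'),   -- 10 <= uuid < 15 -> p2
    (99, 'dave',  '2022-04-01');   -- uuid >= 15      -> p3

-- Read a single partition directly.
SELECT uuid, name FROM tbl_users1 PARTITION (p1);

-- Row counts per partition; for InnoDB, TABLE_ROWS is only an estimate.
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'tbl_users1';

-- The "partitions" column of EXPLAIN lists the partitions the optimizer plans to scan
-- (MySQL 5.7+; older versions use EXPLAIN PARTITIONS instead).
EXPLAIN SELECT * FROM tbl_users1 WHERE uuid BETWEEN 6 AND 8;
```

        A range condition on the partitioning column lets the optimizer skip partitions that cannot contain matching rows, which is where the range-query speedup comes from.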

       2. LIST partitioning: a row is assigned to a partition when its column value matches one of the values in that partition's discrete value list.

        Create it as follows:


    
     
    CREATE TABLE tbl_users2 (
        uuid INT NOT NULL,
        name VARCHAR(20),
        registerTime VARCHAR(100)
    )
    -- A row goes to the partition whose list contains its uuid;
    -- values that appear in no list (such as 4 or 6) cannot be inserted.
    PARTITION BY LIST (uuid) (
        PARTITION p0 VALUES IN (1, 2, 3, 5),
        PARTITION p1 VALUES IN (7, 9, 10),
        PARTITION p2 VALUES IN (11, 15)
    );
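
        The sketch below shows what happens with a value that no list covers, and one possible way to handle it. The rows and the extra partition p3 are illustrative assumptions, not part of the original example.

```sql
-- Assumes tbl_users2 from the LIST example already exists.
INSERT INTO tbl_users2 (uuid, name) VALUES (3, 'alice');   -- 3 is listed in p0

-- 4 is not in any partition's value list, so this statement is rejected
-- (MySQL reports that the table has no partition for the value).
-- When run as a script, the client may stop here unless errors are ignored.
INSERT INTO tbl_users2 (uuid, name) VALUES (4, 'bob');

-- One way to accept the value: add a partition that lists it.
ALTER TABLE tbl_users2 ADD PARTITION (PARTITION p3 VALUES IN (4, 6, 8));
INSERT INTO tbl_users2 (uuid, name) VALUES (4, 'bob');     -- now stored in p3
```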

       3. HASH partitioning: the partition is chosen by hashing the return value of a user-defined expression. The expression is evaluated on the column values of the row being inserted and must produce a non-negative integer.

        Create it as follows:


    
     
    CREATE TABLE tbl_users4 (
        uuid INT NOT NULL,
        name VARCHAR(20),
        registerTime VARCHAR(100)
    )
    -- uuid can be replaced by an expression such as uuid DIV 2 or MOD(uuid, 2)
    -- (the / operator is not allowed in partitioning expressions).
    -- This costs performance: the expression is evaluated and hashed for every inserted row.
    PARTITION BY HASH (uuid)
    PARTITIONS 3;
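
        As a rough illustration of how rows are distributed, the sketch below compares the remainder of uuid divided by the partition count with the actual contents of one partition. The sample rows are made up; the partition names p0, p1 and p2 are the ones MySQL generates when only a partition count is given.

```sql
-- Assumes tbl_users4 from the HASH example already exists (3 partitions: p0, p1, p2).
INSERT INTO tbl_users4 (uuid, name) VALUES (3, 'a'), (4, 'b'), (5, 'c'), (7, 'd');

-- For plain HASH partitioning the partition number is
-- MOD(expression, number_of_partitions).
SELECT uuid, MOD(uuid, 3) AS expected_partition
FROM tbl_users4
ORDER BY uuid;

-- p1 should hold the rows whose uuid leaves remainder 1 (here 4 and 7).
SELECT uuid, name FROM tbl_users4 PARTITION (p1);
```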

       4. KEY partitioning: similar to HASH partitioning, except that the MySQL server supplies the hash function itself.

        Create it as follows:


    
     
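    CREATE TABLE tbl_users5 (
        uuid INT NOT NULL,
        name VARCHAR(20),
        registerTime VARCHAR(100)
    )
    -- LINEAR is optional: it selects the linear (powers-of-two) variant of the algorithm;
    -- plain PARTITION BY KEY (uuid) also works.
    PARTITION BY LINEAR KEY (uuid)
    PARTITIONS 3;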
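
        Because the server supplies the hash function, KEY partitioning is not limited to integer columns. The sketch below partitions on a string column; the table tbl_users_key_demo and its rows are hypothetical and only meant to illustrate the idea.

```sql
-- Hypothetical table partitioned by KEY on a VARCHAR column.
CREATE TABLE tbl_users_key_demo (
    name VARCHAR(20) NOT NULL,
    registerTime VARCHAR(100)
)
PARTITION BY KEY (name)
PARTITIONS 4;

INSERT INTO tbl_users_key_demo (name, registerTime) VALUES
    ('alice', '2022-01-01'),
    ('bob',   '2022-02-01'),
    ('carol', '2022-03-01');

-- Which partition each name lands in is decided by MySQL's internal hash,
-- so inspect the distribution rather than predicting it.
-- For InnoDB, TABLE_ROWS is only an estimate.
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'tbl_users_key_demo';
```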

        Partition operations will be covered in a later article, so they are not described in detail here. When partitioning, pay attention to the following:

       1. If the table has a primary key or unique keys, the partitioning column must be part of every one of them.

       2. If the table has no primary key or unique key, any column can be used as the partitioning column.

       3. Before MySQL 5.5, RANGE, LIST and HASH partitioning required an integer partitioning key. MySQL 5.5 and later support non-integer RANGE and LIST partitioning through RANGE COLUMNS and LIST COLUMNS.

       4. MySQL partitioning does nothing to prevent NULL as the value of a partitioning expression, whether it is a column value or a user-defined expression. With HASH and KEY partitioning, MySQL treats NULL as 0; with RANGE partitioning the row goes into the lowest partition; with LIST partitioning NULL is accepted only if a partition lists it. To avoid this behaviour, declare the column NOT NULL when designing the table.
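
        Points 1 and 3 can be combined in one sketch: a table whose primary key includes the partitioning column and which is partitioned on a DATE column with RANGE COLUMNS. The table tbl_orders_demo and its columns are hypothetical, and MySQL 5.5 or later is assumed.

```sql
-- Hypothetical orders table. The partitioning column "created" is part of the
-- primary key; with PRIMARY KEY (id) alone MySQL would reject the statement.
CREATE TABLE tbl_orders_demo (
    id INT NOT NULL,
    created DATE NOT NULL,          -- NOT NULL avoids the NULL handling described above
    amount DECIMAL(10, 2),
    PRIMARY KEY (id, created)
)
-- RANGE COLUMNS accepts the DATE column directly, without an integer-valued function.
PARTITION BY RANGE COLUMNS (created) (
    PARTITION p2021 VALUES LESS THAN ('2022-01-01'),
    PARTITION p2022 VALUES LESS THAN ('2023-01-01'),
    PARTITION pmax  VALUES LESS THAN (MAXVALUE)
);

INSERT INTO tbl_orders_demo (id, created, amount) VALUES
    (1, '2021-06-15', 10.00),   -- stored in p2021
    (2, '2022-06-15', 20.00);   -- stored in p2022
```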
 
      Finally, a few precautions:

       The number of partitions cannot exceed 1024 (the limit was raised to 8192 in MySQL 5.6.7). It is generally recommended to keep a single table below 150 partitions.
       If the table has unique indexes or a primary key, the partitioning column must be included in every unique index and in the primary key.
       Foreign keys are not supported.
       Full-text indexes are not supported. An index created on a partitioned table is partitioned along with the table.
       Partitioning by date is a good fit, because many date functions can be used in the partitioning expression. There are few suitable partitioning functions for strings.
       Only RANGE and LIST partitions can be subpartitioned; HASH and KEY partitions cannot.
       Partitioned tables offer no advantage for single-row lookups.
       Mind the cost of partition selection: every inserted row has to be routed to a partition by evaluating the partitioning expression.
       The partitioning column should not be NULL.
       Partition names generally follow the same rules as other MySQL identifiers, such as table and database names. Note, however, that partition names are case-insensitive.
       Whatever the partitioning type, partitions are numbered automatically when they are created, starting from 0.
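
       Two of these points, partitioning by date and subpartitioning a RANGE partition, can be sketched together. The table tbl_sales_demo is hypothetical and only illustrates the syntax.

```sql
-- Hypothetical sales table: RANGE partitions by year,
-- each split into 2 HASH subpartitions.
CREATE TABLE tbl_sales_demo (
    id INT NOT NULL,
    purchased DATE NOT NULL
)
PARTITION BY RANGE (YEAR(purchased))
SUBPARTITION BY HASH (TO_DAYS(purchased))
SUBPARTITIONS 2 (
    PARTITION p0 VALUES LESS THAN (2021),
    PARTITION p1 VALUES LESS THAN (2022),
    PARTITION p2 VALUES LESS THAN MAXVALUE
);

-- List the generated partitions and subpartitions (3 x 2 = 6 rows).
SELECT PARTITION_NAME, SUBPARTITION_NAME
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'tbl_sales_demo';
```

       YEAR() and TO_DAYS() both return integers, which is what makes date columns convenient for RANGE and HASH expressions.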
Copyright notice
This article was written by [Blue sky ⊙ white clouds]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/188/202207070547398909.html