Doris' table creation and data division
2022-06-13 03:28:00 【TRX1024】
Basic concepts
Row & Column
A table consists of rows (Row) and columns (Column). A Row is one row of user data; a Column describes one field in a row of data.
Columns fall into two categories: Key and Value. From a business perspective, Key and Value correspond to dimension columns and metric columns, respectively. From the perspective of the aggregation model, rows with identical Key columns are merged into one row, and the aggregation method of each Value column is specified by the user at table creation time. For more detail, refer to the Doris data model documentation.
Tablet & Partition
In Doris' storage engine, user data is horizontally divided into data shards (Tablet, also called data buckets). Each Tablet contains several rows of data; the data of different Tablets does not intersect, and each Tablet is stored physically independently.
Multiple Tablets logically belong to different partitions (Partition). A Tablet belongs to exactly one Partition, and a Partition contains several Tablets. Since Tablets are stored physically independently, Partitions can also be regarded as physically independent. A Tablet is the smallest physical storage unit for operations such as data movement and replication.
Several Partitions make up a Table. A Partition can be regarded as the smallest logical management unit: data import and deletion can (or can only) be performed against a single Partition.
Data partitioning
The following table creation statement illustrates Doris' data division.
CREATE TABLE IF NOT EXISTS example_db.example_tbl
(
    `user_id` LARGEINT NOT NULL COMMENT "user id",
    `date` DATE NOT NULL COMMENT "date the data was ingested",
    `timestamp` DATETIME NOT NULL COMMENT "timestamp of data ingestion",
    `city` VARCHAR(20) COMMENT "user's city",
    `age` SMALLINT COMMENT "user's age",
    `sex` TINYINT COMMENT "user's gender",
    `last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "user's last visit time",
    `cost` BIGINT SUM DEFAULT "0" COMMENT "user's total spend",
    `max_dwell_time` INT MAX DEFAULT "0" COMMENT "user's maximum dwell time",
    `min_dwell_time` INT MIN DEFAULT "99999" COMMENT "user's minimum dwell time"
)
ENGINE=olap
AGGREGATE KEY(`user_id`, `date`, `timestamp`, `city`, `age`, `sex`)
PARTITION BY RANGE(`date`)
(
    PARTITION `p202001` VALUES LESS THAN ("2020-02-01"),
    PARTITION `p202002` VALUES LESS THAN ("2020-03-01"),
    PARTITION `p202003` VALUES LESS THAN ("2020-04-01")
)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 16
PROPERTIES
(
    "replication_num" = "3",
    "storage_medium" = "SSD",
    "storage_cooldown_time" = "2021-01-01 12:00:00"
);
Column definition
In the AGGREGATE KEY data model, all columns that do not specify an aggregation method (SUM, REPLACE, MAX, MIN) are treated as Key columns, and the rest are Value columns.
When defining columns, consider the following suggestions:
- Key columns must come before all Value columns.
- Prefer integer types where possible, since integers are far more efficient to compute and search than strings.
- When choosing among integer types of different widths, pick the smallest that suffices.
- Likewise, for the length of VARCHAR and STRING types, pick the smallest that suffices.
- The total byte length of all columns (Key and Value) must not exceed 100KB.
Partition and bucket
Doris supports two levels of data division. The first level is Partition, which supports only Range division. The second level is Bucket (Tablet), which supports only Hash division.
You can also use just a single level of division; in that case, only Bucket division is supported.
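As a sketch, a single-level (bucket-only) table could be created as follows; the table name example_db.flat_tbl and its columns are hypothetical, not part of the example above:

```sql
-- Single-level division: no PARTITION BY clause, only hash bucketing.
-- Table and column names here are illustrative.
CREATE TABLE IF NOT EXISTS example_db.flat_tbl
(
    `user_id` LARGEINT NOT NULL COMMENT "user id",
    `cost` BIGINT SUM DEFAULT "0" COMMENT "user's total spend"
)
ENGINE=olap
AGGREGATE KEY(`user_id`)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 16
PROPERTIES ("replication_num" = "3");
```

With no PARTITION BY clause, the whole table behaves as one partition, and the DISTRIBUTED clause divides the entire table's data.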
1.Partition
- Partition columns can be one or more columns, and they must be KEY columns. The use of multi-column partitioning is introduced later in the multi-column partition section.
- Regardless of the partition column's type, partition values must be written in double quotes.
- Partition columns are usually time columns, to make managing old and new data easier.
- There is theoretically no upper limit on the number of partitions.
- When a table is created without a Partition clause, the system automatically generates a Partition with the same name as the table, covering the full value range. This Partition is invisible to users and cannot be deleted.
Partition supports VALUES LESS THAN (...), which specifies only the upper bound; the system takes the upper bound of the previous partition as this partition's lower bound, generating a left-closed, right-open interval. It also supports VALUES [...), which specifies both the upper and lower bound and likewise generates a left-closed, right-open interval; this form is easier to understand because both bounds are explicit.
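As a sketch, the two forms might be written as follows inside a PARTITION BY RANGE clause (the partition names and bounds are illustrative):

```sql
-- Upper bound only: the previous partition's upper bound
-- becomes this partition's lower bound.
PARTITION `p202001` VALUES LESS THAN ("2020-02-01"),
-- Both bounds written explicitly as a left-closed,
-- right-open interval [lower, upper).
PARTITION `p202002` VALUES [("2020-02-01"), ("2020-03-01"))
```

Both partitions above cover the same ranges; only the notation differs.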
2.Bucket
If Partition is used, the DISTRIBUTED ... clause describes the division rules of the data within each partition; if Partition is not used, it describes the division rules of the whole table's data. Bucket columns can be multiple columns, but they must be Key columns. Bucket columns can be the same as or different from the Partition columns.
The choice of bucket columns is a trade-off between query throughput and query concurrency:
- If multiple bucket columns are selected, data is distributed more evenly. If a query condition does not contain equality conditions on all bucket columns, the query triggers a simultaneous scan of all buckets, so query throughput increases and single-query latency decreases. This approach suits high-throughput, low-concurrency query scenarios.
- If only one or a few bucket columns are selected, a corresponding point query can trigger a scan of just one bucket. Then, when multiple point queries run concurrently, they are likely to scan different buckets, so the IO impact between queries is small (especially when different buckets sit on different disks). This approach suits high-concurrency point-query scenarios.
There is theoretically no upper limit on the number of buckets.
3. Suggestions on Partition and Bucket counts and data volume
- The total number of Tablets in a table equals (Partition num * Bucket num).
- The recommended Tablet count for a table, without considering expansion, is slightly more than the number of disks in the whole cluster.
- A single Tablet theoretically has no upper or lower bound on data volume, but 1GB - 10GB is recommended. If a single Tablet holds too little data, data aggregation works poorly and metadata management pressure is high. If it holds too much, replica migration and repair are hampered, and the cost of retries when a Schema Change or Rollup operation fails increases (these retries happen at Tablet granularity).
- When the data-volume principle and the count principle conflict, the data-volume principle takes priority.
- Within a table, the Bucket count is specified uniformly for every partition. However, when dynamically adding a partition (ADD PARTITION), the Bucket count of the new partition can be specified separately. This feature can be used to cope with shrinking or growing data volumes.
- Once a Partition's Bucket count is specified, it cannot be modified. So cluster expansion must be considered in advance when choosing the Bucket count. For example, suppose there are only 3 hosts, each with 1 disk. If the Bucket count is set to 3 or less, then even if more machines are added later, concurrency cannot be improved.
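As a sketch, adding a partition with its own Bucket count might look like this (the table name, partition name, and bound are illustrative; check the ALTER TABLE syntax against your Doris version):

```sql
-- The new partition uses 32 buckets instead of the table's default 16.
ALTER TABLE example_db.example_tbl
ADD PARTITION `p202004` VALUES LESS THAN ("2020-05-01")
DISTRIBUTED BY HASH(`user_id`) BUCKETS 32;
```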
- For example, suppose there are 10 BEs, each with one disk. If a table's total size is:
  - 500MB: consider 4-8 tablets.
  - 5GB: 8-16 tablets.
  - 50GB: 32 tablets.
  - 500GB: partition the table, with each partition around 50GB and 16-32 tablets per partition.
  - 5TB: partition the table, with each partition around 50GB and 16-32 tablets per partition.
Note: a table's data volume can be viewed with the show data command; dividing the result by the number of replicas gives the table's actual data volume.
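For instance, assuming the example table above:

```sql
-- Reports the table's size including all replicas;
-- with "replication_num" = "3", divide the reported size by 3
-- to get the table's actual data volume.
SHOW DATA FROM example_db.example_tbl;
```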
PROPERTIES
In the PROPERTIES section at the end of the CREATE TABLE statement, the following two parameters can be specified:
1.replication_num
- The number of replicas of each Tablet. The default is 3, and keeping the default is recommended. In the CREATE TABLE statement, the Tablet replica count of all Partitions is specified uniformly; when adding a new partition, the Tablet replica count of that partition can be specified separately.
- The replica count can be modified at runtime. Keeping it an odd number is strongly recommended.
- The maximum replica count depends on the number of independent IPs in the cluster (note: not the number of BEs). The replica-distribution principle in Doris is that replicas of the same Tablet may not be placed on the same physical machine, and physical machines are identified by IP. Therefore, even if 3 or more BE instances are deployed on the same physical machine, if those BEs share one IP, the replica count can only be set to 1.
- For small, infrequently updated dimension tables, consider setting more replicas. This raises the probability of a local-data Join when running Join queries.
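As a sketch of these per-partition and runtime settings (the table and partition names are illustrative, and the exact ALTER forms should be checked against your Doris version):

```sql
-- Add a partition whose Tablets have their own replica count.
ALTER TABLE example_db.example_tbl
ADD PARTITION `p202005` VALUES LESS THAN ("2020-06-01")
("replication_num" = "1");

-- Modify the table's default replica count at runtime.
ALTER TABLE example_db.example_tbl
SET ("default.replication_num" = "3");
```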
2.storage_medium & storage_cooldown_time
- A BE's data storage directories can be explicitly marked as SSD or HDD (distinguished by the .SSD or .HDD suffix). When creating a table, the initial storage medium of all its Partitions can be specified uniformly. Note that the suffix only declares the disk medium explicitly; whether it matches the actual medium type is not checked.
- The default initial storage medium can be specified via default_storage_medium=xxx in the FE configuration file fe.conf; if not specified, the default is HDD. If specified as SSD, data is initially stored on SSD.
- If storage_cooldown_time is not specified, data automatically migrates from SSD to HDD after 30 days by default. If storage_cooldown_time is specified, data migrates once that time is reached.
- Note: when storage_medium is specified, if the FE parameter enable_strict_storage_medium_check is False, the parameter is only a "best effort" setting. Even if no SSD storage medium is configured in the cluster, no error is reported, and data is automatically stored in whatever data directories are available. Likewise, if the SSD medium is inaccessible or out of space, the initial data may be stored directly on other available media; and when data is due to migrate to HDD, if the HDD medium is inaccessible or out of space, the migration may fail (but will keep retrying). If enable_strict_storage_medium_check is True and no SSD storage medium is configured in the cluster, the error Failed to find enough host in all backends with storage medium is SSD is reported.
ENGINE
In this example, the ENGINE type is olap, the default ENGINE type. In Doris, only this ENGINE type has its data managed and stored by Doris itself. Other ENGINE types, such as mysql, broker, es, and so on, are essentially just mappings to tables in external databases or systems, so that Doris can read their data; Doris does not create, manage, or store any table or data of a non-olap ENGINE type.
`IF NOT EXISTS` means the table is created only if it does not already exist. Note that this checks only whether the table name exists; it does not check whether the new schema matches that of an existing table. So if a table with the same name but a different schema already exists, the command still returns success, but that does not mean a new table with the new schema was created.