当前位置：网站首页>Mysql: summary of common sub database and sub table schemes of Internet companies

Mysql: summary of common sub database and sub table schemes of Internet companies

2022-07-27 03:20:00 【Fat technology house】

One 、 Database bottleneck

IO bottleneck
CPU bottleneck

Two 、 Sub database and sub table

Horizontal sub database
Horizontal sub table
Vertical sub database
Vertical sub table

3、 ... and 、 Sub database and sub table tool

Four 、 Steps of sub database and sub table

5、 ... and 、 The problem of sub database and sub table

Not partition key Query questions for
Not partition key Cross database and cross table paging query
Capacity expansion

6、 ... and 、 Summary of sub database and sub table

7、 ... and 、 Example of sub database and sub table

One 、 Database bottleneck

Whether it's IO bottleneck , still CPU bottleneck , Eventually, the number of active connections to the database will increase , Furthermore, the threshold of the number of active connections that the database can carry is approached or even reached . In the business Service Come and see , Few or no database connections available . Now you can imagine （ Concurrency 、 throughput 、 collapse ）.

1、IO bottleneck

The first one is ： Disk read IO bottleneck , Too much hot data , The database cache can't hold , A large number of IO, Reduce query speed -> Sub database and vertical sub table .

The second kind ： The Internet IO bottleneck , Too much data requested , The network bandwidth is not enough -> sub-treasury .

2、CPU bottleneck

The first one is ：SQL problem , Such as SQL Contained in the join,group by,order by, Non index field condition query, etc , increase CPU The operation of operations -> SQL Optimize , Build the right index , In the business Service Layer for Business Computing .

The second kind ： The amount of data in a single table is too large , Too many rows scanned during query ,SQL Low efficiency ,CPU First bottleneck -> Horizontal sub table .

Two 、 Sub database and sub table

1、 Horizontal sub database

Concept ： Based on fields , According to a certain strategy （hash、range etc. ）, Split data from one library into multiple libraries .

result ：

The structure of each library is the same ;
The data in each library is different , There is no intersection ;
The union of all libraries is full data .

scene ： The absolute concurrency of the system is up , It is difficult to solve the problem fundamentally by dividing tables , And there is no obvious business ownership to divide the database vertically .

analysis ： There are more libraries ,io and cpu Of course, the pressure of .

2、 Horizontal sub table

Concept ： Based on fields , According to a certain strategy （hash、range etc. ）, Split the data in one table into multiple tables .

result ：

Every table has the same structure ;
The data for each table is different , There is no intersection ;
The union of all tables is full data ;

scene ： The absolute concurrent amount of the system is not up , Just too much data in a single table , Affected SQL efficiency , It's aggravating CPU burden , So that it becomes a bottleneck .

analysis ： There is less data in the table , A single SQL High execution efficiency , Nature lightens CPU The burden of .

3、 Vertical sub database

Concept ： Based on the table , According to the business ownership , Split different tables into different libraries .

result ：

The structure of each library is different ;
The data of each library is different , There is no intersection ;
The union of all libraries is full data ;

scene ： The absolute concurrency of the system is up , And separate business modules can be abstracted .

analysis ： To this step , Basically, it can be serviced . for example , Some common configuration tables with the development of business 、 There are more and more dictionaries , In this case, you can detach these tables into a separate library , It can even serve . There's more , With the development of business, a set of business model has been hatched , In this case, you can split the related tables into a separate library , It can even serve .

4、 Vertical sub table

Concept ： Based on fields , According to the activity of the field , Split the fields in the table into different tables （ Main table and extended table ） in .

result ：

Every watch has a different structure ;
Each table has different data , Generally speaking , At least one column intersection for each table's fields , It's usually a primary key , Used to correlate data ;
The union of all tables is full data ;

scene ： The absolute concurrent amount of the system is not up , There are not many records in the table , But there are many fields , And hot data and non hot data are together , Large storage space required for single row data . So that the number of data rows in the database cache is reduced , When querying, it will read the disk data and generate a large number of random reads IO, produce IO bottleneck .

analysis ： You can use the list page and the details page to help you understand . The splitting principle of vertical tables is to split hot data （ Data that may be redundant and often queried together ） Put it together as the main table , Non hot data together as extension table . So more hot data can be cached , And that reduces random reading IO. After the demolition , To get all the data, you need to associate two tables to get the data .

But remember , Never use it join, because join Not only will it increase CPU Burden and will say that two tables are coupled together （ Must be on a database instance ）. Linked data , Should be in business Service Layer to do article , Get the data of the main table and the extended table respectively, and then associate them with the associated fields to get all the data .

3、 ... and 、 Sub database and sub table tool

sharding-sphere：jar, Formerly known as sharding-jdbc;
TDDL：jar,Taobao Distribute Data Layer;
Mycat： middleware .

notes ： Advantages and disadvantages of tools , Please do your own research , Website and community priority .

Four 、 Steps of sub database and sub table

According to capacity （ Current capacity and growth ） Number of evaluation sub databases or sub tables -> choose key（ uniform ）-> Table rules （hash or range etc. ）-> perform （ General double writing ）-> Capacity expansion （ Minimize data movement ）.

5、 ... and 、 The problem of sub database and sub table

1、 Not partition key Query questions for

Based on horizontal sub base and sub table , Split strategy is common hash Law .

Except for partition key There is only one non partition key Query as a condition .

Mapping method ：

Gene method ：

notes ： When writing , Genetic generation user_id, Pictured . About xbit gene , For example, to divide 8 A watch ,23=8, so x take 3, namely 3bit gene . according to user_id When querying, the module can be directly routed to the corresponding sub database or sub table .
according to user_name When inquiring , Through the first user_name_code Generating function generation user_name_code Then the module is routed to the corresponding sub database or sub table .id Generate common snowflake Algorithm .

Except for partition key More than one non partition key Query as a condition

Mapping method ：

Redundancy method ：

notes ： according to order_id or buyer_id Route to on query db_o_buyer In the library , according to seller_id Route to on query db_o_seller In the library . I feel like putting the cart before the horse ！ Is there any other good way ？ Change the technology stack ？

Backstage except partition key And all kinds of non partition key Combined condition query

NoSQL Law ：

Redundancy method ：

2、 Not partition key Cross database and cross table paging query

Based on horizontal sub base and sub table , Split strategy is common hash Law .

notes ： use NoSQL Method to solve （ES etc. ）.

3、 Capacity expansion

Based on horizontal sub base and sub table , Split strategy is common hash Law .

Horizontal expansion warehouse （ Upgrade from library method ）

notes ： The expansion is multiple .

Horizontal expansion table （ Double write transfer ）

First step ：（ Synchronous double write ） Modify application configuration and code , Plus double writing , Deploy ;
The second step ：（ Synchronous double write ） Copy the old data in the old database to the new database ;
The third step ：（ Synchronous double write ） Proofread the old data in the new database according to the old database ;
Step four ：（ Synchronous double write ） Modify application configuration and code , Remove double write , Deploy ;

notes ： Double writing is a general solution .

6、 ... and 、 Summary of sub database and sub table

Sub database and sub table , First of all, we need to know where the bottleneck is , Then we can split it reasonably （ Sub database or sub table ？ Horizontal or vertical ？ Several ？）. And cannot be split for the sake of database and table splitting .
choose key Very important , It is necessary to consider the even split , We should also consider the non partition key Query for .
As long as it can meet the demand , The simpler the split rule, the better .

7、 ... and 、 Example of sub database and sub table

Example GitHub Address ：https://github.com/littlecharacter4s/study-sharding