
[Essentials of the database and table splitting trilogy]

2022-06-10 04:03:00 dj1540225203

1. What is database and table splitting? What types are there? Which scenarios call for each? What are their advantages and disadvantages?

A thorough explanation of database and table splitting (vertical database splitting, vertical table splitting, horizontal database splitting, horizontal table splitting): https://mp.weixin.qq.com/s?__biz=MzA5Mzg4MDg0Ng==&mid=2648851078&idx=1&sn=9bb2ad7e1e91179d6dd721fb860b5b30&chksm=8841f5e7bf367cf15fdf131786774d48d2f2e2e226e318c5d7a20bc366fc7fa25afd6d9397a5&scene=21#wechat_redirect

Summary

Vertical table splitting: The fields of a wide table can be split into multiple tables according to access frequency and whether a field is large. This makes the business clearer and can also improve performance somewhat. After the split, try to avoid join queries at the business level; otherwise the cost outweighs the performance gain.

Vertical database splitting: Tables can be grouped by how tightly their business logic is coupled and stored in different databases. These databases can be placed on different servers so that the access load is shared by multiple servers, which greatly improves performance. It also makes the overall architecture clearer from a business perspective, and each business database can be optimized according to its own situation. The cost is that all the complications of cross-database access have to be solved.

Horizontal database splitting: The rows of one table can be split across different databases, so each database holds only part of the table's data. These databases can be placed on different servers so that the access load is shared by multiple servers, which greatly improves performance. In addition to the cross-database complications, the data routing problem must also be solved (data routing is covered later).

Horizontal table splitting: The rows of one table can be split into multiple tables within the same database, so each table holds only part of the data. This improves performance only slightly and is used only as a supplementary optimization to horizontal database splitting.

Generally speaking, vertical database splitting and vertical table splitting should be decided at the system design stage, based on how tightly the business is coupled. When data volume and access pressure are not especially large, consider caching, read/write separation, and indexing first. If the data volume is huge and keeps growing, then consider horizontal database splitting and horizontal table splitting.
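The data routing problem mentioned for horizontal splitting can be illustrated with a small sketch. It assumes a numeric sharding key and simple modulo routing; the names db_order_N and t_order_N and the counts 4 and 8 are made up for illustration and do not come from the referenced articles.

```python
from typing import Tuple


def route(shard_key: int, db_count: int = 4, tables_per_db: int = 8) -> Tuple[str, str]:
    """Map a numeric sharding key to a (database, table) pair using modulo routing."""
    if shard_key < 0:
        raise ValueError("shard_key must be non-negative")
    # The low part of the key picks the database ...
    db_index = shard_key % db_count
    # ... and the remaining part picks the table inside that database.
    table_index = (shard_key // db_count) % tables_per_db
    return f"db_order_{db_index}", f"t_order_{table_index}"


print(route(10))  # ('db_order_2', 't_order_2')
print(route(37))  # ('db_order_1', 't_order_1')
```

Modulo routing spreads rows evenly, but changing db_count later moves most keys, which is one reason the number of shards is usually planned generously up front.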

2. How do you split databases and tables? [For high concurrency, the Snowflake algorithm is recommended for generating unique distributed IDs.]

After splitting databases and tables, how should the primary key id be handled, and how is it kept globally unique? https://mp.weixin.qq.com/s/jAbUD-iGAcoZb0rMjsF8Aw

php simple version of the Snowflake algorithm (myeye520's blog, CSDN)

Principle: The core idea of Snowflake is to divide a 64-bit binary number into several parts, each storing data with a specific meaning, such as a timestamp, data center ID, machine ID, and sequence number, and finally produce a globally unique, ordered ID. The standard layout is:

sign bit (1 bit) | timestamp (41 bits, enough for about 69 years) | data center + machine ID (10 bits) | sequence number (12 bits)

For how the specific bit widths are allocated, see https://blog.csdn.net/myeye520/article/details/122243057
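The bit layout can be made concrete with a short sketch that packs and unpacks the three fields. The function names compose and decompose are illustrative, and the 10 machine bits are treated here as a single worker_id field rather than separate data center and machine parts.

```python
TIMESTAMP_BITS = 41
WORKER_BITS = 10      # data center + machine ID, treated as one field here
SEQUENCE_BITS = 12


def compose(timestamp_ms: int, worker_id: int, sequence: int) -> int:
    """Pack the three fields into one integer; the sign bit stays 0."""
    if not 0 <= timestamp_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 41 bits")
    if not 0 <= worker_id < (1 << WORKER_BITS):
        raise ValueError("worker_id does not fit in 10 bits")
    if not 0 <= sequence < (1 << SEQUENCE_BITS):
        raise ValueError("sequence does not fit in 12 bits")
    return (timestamp_ms << (WORKER_BITS + SEQUENCE_BITS)) | (worker_id << SEQUENCE_BITS) | sequence


def decompose(snowflake_id: int):
    """Split an ID back into (timestamp_ms, worker_id, sequence)."""
    sequence = snowflake_id & ((1 << SEQUENCE_BITS) - 1)
    worker_id = (snowflake_id >> SEQUENCE_BITS) & ((1 << WORKER_BITS) - 1)
    timestamp_ms = snowflake_id >> (WORKER_BITS + SEQUENCE_BITS)
    return timestamp_ms, worker_id, sequence


# 41 bits of milliseconds cover roughly 69 years.
years = (1 << TIMESTAMP_BITS) / (1000 * 60 * 60 * 24 * 365)
print(int(years))                 # 69
print(decompose(compose(1, 1, 1)))  # (1, 1, 1)
```

Because the timestamp occupies the highest bits, IDs generated later compare as larger, which is what makes them roughly time-ordered.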

Summary:
1、 Database auto-increment ID

Each time the system needs an id, it inserts a row with no business meaning into a table in one database and takes the auto-increment id that the database assigns. That id is then used when writing to the corresponding split database and table.

The advantage of this scheme is that it is convenient and simple; anyone can use it.

The disadvantage is that the id is generated by a single database, so under high concurrency it becomes a bottleneck. If you insist on improving it, set up a dedicated service: on each request the service reads the current maximum id, reserves several ids beyond it, returns that batch, and then updates the stored maximum to the last id of the batch. Either way, it is still based on a single database.

Suitable scenarios: There are two reasons to split databases and tables: either the concurrency on a single database is too high, or the data volume in a single database is too large. This scheme fits only when concurrency is not high but the data volume forces the split. In that case peak concurrency may be only a few hundred requests per second, which a separate database and table can handle when generating auto-increment primary keys.
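The batch improvement described above might look like the following sketch. It uses Python's built-in sqlite3 module as a stand-in for the single database; the table name id_allocator, the row name 'order', and the function names are assumptions made for this example.

```python
import sqlite3


def init_allocator(conn: sqlite3.Connection) -> None:
    """Create the one-row table that remembers the highest ID handed out so far."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS id_allocator ("
        "name TEXT PRIMARY KEY, max_id INTEGER NOT NULL)"
    )
    conn.execute("INSERT OR IGNORE INTO id_allocator (name, max_id) VALUES ('order', 0)")
    conn.commit()


def next_batch(conn: sqlite3.Connection, batch_size: int = 100) -> range:
    """Reserve batch_size consecutive IDs and return them as a range."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE id_allocator SET max_id = max_id + ? WHERE name = 'order'",
            (batch_size,),
        )
        (new_max,) = conn.execute(
            "SELECT max_id FROM id_allocator WHERE name = 'order'"
        ).fetchone()
    return range(new_max - batch_size + 1, new_max + 1)


conn = sqlite3.connect(":memory:")
init_allocator(conn)
print(list(next_batch(conn, 3)))  # [1, 2, 3]
print(list(next_batch(conn, 3)))  # [4, 5, 6]
```

The service touches the database once per batch instead of once per id, which reduces load, but every batch still goes through the same single database.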



2、 Generating IDs with Redis

Redis's atomic increment commands INCR/INCRBY guarantee that the generated IDs are unique and ordered. In essence, the implementation is the same as the database approach.

Applicable scenarios: Best suited to counting scenarios, such as user visit counts and order serial numbers (date + serial number).

Disadvantage: After a Redis instance or cluster goes down, recovering the latest ID value is troublesome.

Advantage: Overall throughput is higher than with a database.
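As a rough illustration of the "date + serial number" order number, the sketch below uses an in-memory class that only imitates the behaviour of INCR, so it runs without a Redis server. The class InMemoryCounter, the key format order:seq:YYYYMMDD, and the six-digit padding are all assumptions for this example; with a real client, an object exposing a compatible incr method would be passed in instead.

```python
from collections import defaultdict
from datetime import date


class InMemoryCounter:
    """Stand-in for Redis: incr() imitates INCR (add 1, return the new value)."""

    def __init__(self):
        self._values = defaultdict(int)

    def incr(self, key: str) -> int:
        self._values[key] += 1
        return self._values[key]


def next_order_number(counter, day: date) -> str:
    """Build an order number from the date plus a per-day serial number."""
    key = f"order:seq:{day:%Y%m%d}"   # one counter per day
    serial = counter.incr(key)
    return f"{day:%Y%m%d}{serial:06d}"


counter = InMemoryCounter()
print(next_order_number(counter, date(2022, 6, 10)))  # 20220610000001
print(next_order_number(counter, date(2022, 6, 10)))  # 20220610000002
print(next_order_number(counter, date(2022, 6, 11)))  # 20220611000001
```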



3、 Generating distributed unique IDs with the Snowflake algorithm

The snowflake algorithm is more reliable than the other approaches. For distributed id generation under high concurrency, it is the recommended choice: performance is good, and it is generally enough for scenarios with tens of thousands of concurrent requests per second. It does have a clock rollback problem. To handle it, check on each request whether the current time is earlier than the time of the previous request. If it is, the clock has been rolled back, so wait a while and generate the ID only once the clock has caught up with the last recorded time.
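The rollback check can be sketched as follows. This is a minimal single-process generator, not the implementation from the referenced articles: the class name, the custom epoch (2022-01-01 UTC), and the injectable clock parameter used for testing are assumptions.

```python
import threading
import time


class SnowflakeGenerator:
    """Minimal Snowflake-style generator that waits out clock rollbacks."""

    def __init__(self, worker_id: int, epoch_ms: int = 1640995200000, clock=None):
        if not 0 <= worker_id < 1024:
            raise ValueError("worker_id must fit in 10 bits")
        self.worker_id = worker_id
        self.epoch_ms = epoch_ms  # assumed custom epoch: 2022-01-01 00:00:00 UTC
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            # Clock rolled back: wait until it catches up with the last timestamp.
            while now < self._last_ms:
                time.sleep((self._last_ms - now) / 1000)
                now = self._clock()
            if now == self._last_ms:
                # Same millisecond: bump the 12-bit sequence.
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # Sequence exhausted: spin until the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return ((now - self.epoch_ms) << 22) | (self.worker_id << 12) | self._sequence


gen = SnowflakeGenerator(worker_id=1)
first, second = gen.next_id(), gen.next_id()
print(second > first)  # True
```

Waiting is the simplest policy and matches the description above; it only suits small rollbacks, since a large backward jump would block ID generation for as long as the clock is behind.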

3. How do you split databases and tables in practice?

After splitting tables in MySQL, the question you always care about: how do you query? https://mp.weixin.qq.com/s?__biz=MzA5Mzg4MDg0Ng==&mid=2648851087&idx=1&sn=b46ea3455fcd8b920df6ffdda4e9f23b&chksm=8841f5eebf367cf835b0a3cccf9ddef25f777f2466667196c58e5da06b617b96b35744376d8d&scene=21#wechat_redirect

4. How are databases and tables commonly split in real projects?

Splitting databases:

1) By function

User database, product database, order database, log database, statistics database ...

2) By region

Each city or province has a database with the same structure, distinguished by a suffix or prefix, such as db_click_bj, db_click_sh ...


Splitting tables:

1、 Horizontal table splitting: solves the problem of a table having too many records

1) Split by a field

For example, the attachment table in discuz is split into 10 attachment sub-tables, pre_forum_attachment_0 to pre_forum_attachment_9, plus 1 attachment index table, pre_forum_attachment, which stores the relationship between tid and attachment id.

The last digit of the thread's tid decides which sub-table an attachment is stored in.
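The "last digit of tid" rule is the same as taking tid modulo 10. A minimal sketch, where the function name attachment_table is made up for illustration:

```python
def attachment_table(tid: int) -> str:
    """Pick the attachment sub-table from the last decimal digit of the thread id."""
    if tid < 0:
        raise ValueError("tid must be non-negative")
    return f"pre_forum_attachment_{tid % 10}"


print(attachment_table(12345))  # pre_forum_attachment_5
print(attachment_table(980))    # pre_forum_attachment_0
```

All attachments of one thread land in the same sub-table, so listing a thread's attachments never needs to touch more than one table.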


2) Split by date

Some log and statistics tables can be split by year, month, day, or week.

For example, click statistics: click_201601, click_201602
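With monthly tables, a write needs the table for one date and a range query needs the list of tables it spans. A small sketch under that assumption, with illustrative function names:

```python
from datetime import date
from typing import List


def click_table(day: date) -> str:
    """Table that holds click statistics for the month containing `day`."""
    return f"click_{day:%Y%m}"


def click_tables_between(start: date, end: date) -> List[str]:
    """All monthly tables a query over [start, end] has to read."""
    if start > end:
        raise ValueError("start must not be after end")
    tables = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        tables.append(f"click_{year:04d}{month:02d}")
        # Step to the next month, rolling over the year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return tables


print(click_table(date(2016, 1, 15)))
# click_201601
print(click_tables_between(date(2015, 11, 20), date(2016, 2, 3)))
# ['click_201511', 'click_201512', 'click_201601', 'click_201602']
```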



3) Use MySQL's MERGE engine

First create the sub-tables (they must be MyISAM tables with identical structure), then create a summary table that specifies ENGINE=MERGE UNION=(table1,table2) INSERT_METHOD=LAST;



2、 Vertical table splitting: solves the problem of too many columns

1) Put columns that are frequently queried together in one table; a table of frequently used fields can consider the MEMORY engine

2) Split rarely used fields into a separate table

3) Split large fields such as text and blob into a secondary table

For example, the article table in phpcms is split into the main table v9_news and the secondary table v9_news_data. The main table stores the title, keywords, view count, and so on; the secondary table stores the body content, template, and so on.

Final conclusion:

Precautions for splitting databases and tables:

1、 The dimension problem

Suppose a user buys a product and the transaction record needs to be saved and retrieved. If tables are split by the user dimension, all of a user's transaction records are stored in the same table, so that user's purchases can be found quickly and easily. The purchases of a given product, however, are likely spread across multiple tables and are troublesome to find. Conversely, if tables are split by the product dimension, purchases of a product are easy to find, but a buyer's transaction records are troublesome to find.

Common solutions are:

Scan all the tables. This is rarely feasible because it is too inefficient.

Store two copies of the data: one set of tables split by the user dimension and one split by the product dimension.

Use a search engine. If the real-time requirement is very high, this also requires real-time search.
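The "two copies" solution can be sketched with in-memory lists standing in for the split tables. The class TradeStore, the table names trade_by_user_N and trade_by_product_N, and the table count of 4 are hypothetical.

```python
from collections import defaultdict

TABLE_COUNT = 4  # hypothetical number of tables per dimension


class TradeStore:
    """Keeps each trade twice: once routed by user, once routed by product."""

    def __init__(self):
        self.tables = defaultdict(list)  # table name -> rows

    def save(self, user_id: int, product_id: int, amount: int) -> None:
        record = {"user_id": user_id, "product_id": product_id, "amount": amount}
        # Copy 1: table chosen by the user dimension.
        self.tables[f"trade_by_user_{user_id % TABLE_COUNT}"].append(record)
        # Copy 2: table chosen by the product dimension.
        self.tables[f"trade_by_product_{product_id % TABLE_COUNT}"].append(record)

    def by_user(self, user_id: int):
        rows = self.tables[f"trade_by_user_{user_id % TABLE_COUNT}"]
        return [r for r in rows if r["user_id"] == user_id]

    def by_product(self, product_id: int):
        rows = self.tables[f"trade_by_product_{product_id % TABLE_COUNT}"]
        return [r for r in rows if r["product_id"] == product_id]


store = TradeStore()
store.save(user_id=1, product_id=10, amount=5)
store.save(user_id=2, product_id=10, amount=7)
store.save(user_id=1, product_id=11, amount=3)
print(len(store.by_user(1)))      # 2
print(len(store.by_product(10)))  # 2
```

Each lookup reads exactly one table, at the cost of double storage and of keeping the two copies consistent when a write to one of them fails.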

2、 Avoid join operations across split tables, because the joined tables may not be in the same database

3、 Avoid cross-database transactions

Avoid modifying a table in db0 and a table in db1 in the same transaction. The operation is more complex, and efficiency also suffers.

4、 Prefer more tables over fewer; this is mainly to avoid a possible second split later

5、 Try to put the same group of data on the same DB server

For example, if seller a's products and transaction information are both stored in db0, then when db1 goes down, everything related to seller a still works normally. In other words, avoid having data in one database depend on data in another database.


Copyright notice
This article was created by [dj1540225203]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/161/202206100346508478.html