当前位置:网站首页>[essence of the trilogy of sub database and sub table]
[essence of the trilogy of sub database and sub table]
2022-06-10 04:03:00 【dj1540225203】
1. What is sub database and sub table ? What are the types of sub database and sub table ? Which scenarios use ? Their advantages and disadvantages are ?
summary
Vertical sub table : You can set the fields of a wide table according to the access frequency 、 The principle of whether a large field is split into multiple tables , This not only makes the business clear , It can also improve some performance . After break up , Try to avoid associated queries from a business perspective , Otherwise, the performance will not be worth it .
Vertical sub database : Multiple tables can be classified by business coupling , Store them in different warehouses , These libraries can be distributed on different servers , Thus, the access pressure is loaded by multiple servers , Greatly improve performance , At the same time, it can improve the business clarity of the overall architecture , Different business libraries can customize the optimization scheme according to their own situation . But it needs to solve all the complex problems brought about by cross Library .
Horizontal sub database : You can put the data of a table ( Press the data line ) It's divided into different warehouses , Each database has only part of the data of this table , These libraries can be distributed on different servers , Thus, the access pressure is loaded by multiple servers , Greatly improve performance . It not only needs to solve all the complex problems caused by cross Library , We also need to solve the problem of data routing ( Data routing problem is introduced later ).
Horizontal sub table : You can put the data of a table ( Press the data line ) It is divided into multiple tables in the same database , Each table has only part of the data of this table , This can slightly improve performance , It is only used as a supplementary optimization of the horizontal sub database .
Generally speaking , In the system design stage, we should determine the vertical sub base according to the business coupling tightness , Vertical table scheme , When the amount of data and access pressure is not particularly large , Consider caching first 、 Read / write separation 、 Index technology, etc . If the amount of data is huge , And continue to grow , Then consider the plan of horizontal sub warehouse level sub table .
2. How to divide the database and table ?【 For high concurrency, it is recommended to use snowflake algorithm to generate unique distributed ID】.
After sub warehouse and sub table ,id How to deal with primary key ? How to keep globally unique https://mp.weixin.qq.com/s/jAbUD-iGAcoZb0rMjsF8Awphp Simple version of snowflake algorithm Snowflake_myeye520 The blog of -CSDN Blog _php Snowflake algorithm Principle introduction :Snowflake The core idea is to 64bit The binary number of is divided into several parts , Each part stores data with a specific meaning , For example, time stamp 、 Computer room ID、 machine ID、 Serial number, etc , Finally, a globally unique order is generated ID. Its standard algorithm is like this :0 0000000000000000000000000000000000000000 0000000000 000000000000 Sign bit 41 A time stamp , About enough 69 year 10 position ( Computer room + machine ID) 12 A serial number How to allocate the specific number of digits
https://blog.csdn.net/myeye520/article/details/122243057
summary :
1、 Database self growth ID
This means that you get one at a time in your system id, It is to insert a piece of data with no business meaning into a table of a library , Then get a database auto increment one id. Get this id Then write it into the corresponding sub database and sub table .
The advantage of this scheme is convenience and simplicity , Anyone can use ;
The disadvantage is that a single library is generated automatically id, If it's highly concurrent , There will be bottlenecks ; If you insist on improving , Then open a special service , Every time this service gets the current id Maximum , Then I'll add a few more by myself id, Return a batch of id, And then put the current maximum id The value is modified to increase by several id The next value is ; But it is based on a single database anyway .
Suitable scene : There are two reasons why you divide your database and your table , Or the concurrency of a single database is too high , Or the amount of data in a single database is too large ; Unless you don't have high concurrency , But too much data leads to the expansion of sub database and sub table , You can use this plan , Because maybe the highest concurrency per second is a few hundred , Then go to a separate database and table to generate an auto increment primary key .
2、Redis Generate ID
adopt Redis Of INCR/INCRBY Auto increment atomic operation command , It can guarantee that ID It must be the only orderly , In essence, the implementation method is consistent with the database .
Applicable scenario : It's more suitable for counting scenes , Such as user visits , Order serial number ( date + Serial number ) etc. .
shortcoming :Redis After instance or cluster downtime , Find the latest ID Value is more troublesome .
advantage : The overall throughput is higher than the database .
3. Snowflake algorithm generates distributed unique ID,snowflake The algorithm is more reliable than the other three methods , If we are distributed id Generate , If it is highly concurrent , It is suggested that this one should have good performance , In general, tens of thousands of concurrent scenarios per second , Enough . There is a clock callback problem. You can determine whether the time of each request is greater than the last time , Small words indicate that you have called back , You need to wait for a while to judge whether it is greater than regeneration ID.3. How to divide databases and tables in actual combat ?
MySQL After tabulation , You always care about : How to query ?https://mp.weixin.qq.com/s?__biz=MzA5Mzg4MDg0Ng==&mid=2648851087&idx=1&sn=b46ea3455fcd8b920df6ffdda4e9f23b&chksm=8841f5eebf367cf835b0a3cccf9ddef25f777f2466667196c58e5da06b617b96b35744376d8d&scene=21#wechat_redirect4. How to plan the common real sub databases and sub tables in the project ?
sub-treasury :
1) By function
User class library 、 Commodity class library 、 Order class library 、 Log class 、 Statistical class library ...
2) By Region
Every city or province has the same library , Add a suffix or prefix such as :db_click_bj、db_click_sh...
table :
1、 Horizontal table Solve the problem that the table record is too large
1) Divide by a certain field
Such as :discuz The attachment table of is divided into 10 Attachment sub table pre_forum_attachment_0 To pre_forum_attachment_9, also 1 Attachment index tables pre_forum_attachment Storage tid And accessories id Relationship .
According to the theme tid The last one decides which sub table the attachment should be saved in .
2) Sub table by date
Some logs 、 Statistics can be calculated by year 、 month 、 Japan 、 Weekly minute table
Such as : Click statistics click_201601、click_201602
3) Use mysql Of merge
First create the sub table , Then create a summary table to specify engine= MERGE UNION=(table1,table2) INSERT_METHOD = LAST;
2、 Vertical dividing table Solve the problem of too many columns
1) The columns that often combine queries are placed in a table , The table of common fields can consider Memory engine
2) Infrequently used fields are listed separately
3) hold text、blob Such large fields are split and placed in the attached table
Such as :phpcms The article table of is divided into the main table v9_news And watch v9_news_data, Main table save title 、 keyword 、 Views, etc , Save the specific contents from the table 、 Templates, etc
The ultimate conclusion :
Precautions for sub warehouse and sub table :
1、 Dimension problem
If the user buys a product , Need to be To save and retrieve transaction records , If you divide the table according to the user's latitude , Then the transaction records of each user are saved in the same table , So quickly and easily find the purchase situation of a user , But when a commodity is purchased In other words, it is likely to be distributed in multiple tables , It's troublesome to find . conversely , Divide the table according to commodity dimension , You can easily find the purchase of this product , But it is troublesome to find the buyer's transaction records .
So common solutions are :
Solve the problem by scanning the table , This method is almost impossible , Too inefficient .
Record two copies of the data , A table by user latitude , A table by commodity dimension .
Solve... Through search engines , But if the real-time requirement is very high , It's about real-time search
2、 Avoid splitting tables join operation Because the associated tables may not be in the same database
3、 Avoid cross database transactions
Avoid modifying... In a transaction db0 Modify the table in at the same time db1 In the table , One is that the operation is more complex , Efficiency will also have an impact
4、 It is better to have more than one table ; This is mainly to avoid the possible secondary splitting in the later stage
5、 Try to put the same set of data into the same DB Server
For example, the seller a Our products and transaction information are put in db0 in , When db1 When I hang up , The seller a Related things can be used normally . In other words, avoid data in a database from relying on data in another database
边栏推荐
- [model compression pruning / quantification / distillation /automl]
- “阿里/字节“大厂自动化测试面试题一般会问什么?以及技巧和答案
- SSTI(模板注入) ——(8)
- Code writing method of wechat applet search box
- [team learning] [phase 38] details of team learning content!
- [singleshotmultiboxdetector (SSD)
- 【主流Nivida显卡深度学习/强化学习/AI算力汇总】
- [semi supervised classification] semi supervised web page classification based on K-means and label+propagation
- Online JSON to CSV tool
- 研究生工程伦理考试要点
猜你喜欢
随机推荐
MySQL - data type
SSTI (template injection) - (6)
Probabilistic programming tool: official introduction to tensorflow probability
Business card wechat applet error version 2
As a software testing engineer, give advice to young people (Part 1)
[mysql] database - View
为什么使用三层交换机
Vulnhub's hacksudo:thor
【TFLite, ONNX, CoreML, TensorRT Export】
Leetcode weekly buckle race 296
Lit(一):创建组件
How to open an account on your mobile phone? Is it safe to open an account online?
Using tensorflow in MATLAB
Informatics Aosai yibentong 1260 [example 9.4] interceptor missile (noip1999) | Luogu p1020 [noip1999 popularization group] missile interception
C encapsulates fluentvalidation. After using it, the whole article is still abstractvalidator. I really can't see it anymore
10n60-asemi medium and small power MOS transistor 10n60
[L1, L2 and smooth L1 loss functions]
研究生工程伦理考试要点
C#封装FluentValidation,用了之后通篇还是AbstractValidator,真的看不下去了
"Chinese characteristics" of Web3








