当前位置:网站首页>Mysql: summary of common sub database and sub table schemes of Internet companies
Mysql: summary of common sub database and sub table schemes of Internet companies
2022-07-27 03:20:00 【Fat technology house】
One 、 Database bottleneck
IO bottleneck
CPU bottleneck
Two 、 Sub database and sub table
Horizontal sub database
Horizontal sub table
Vertical sub database
Vertical sub table
3、 ... and 、 Sub database and sub table tool
Four 、 Steps of sub database and sub table
5、 ... and 、 The problem of sub database and sub table
Not partition key Query questions for
Not partition key Cross database and cross table paging query
Capacity expansion
6、 ... and 、 Summary of sub database and sub table
7、 ... and 、 Example of sub database and sub table

One 、 Database bottleneck
Whether it's IO bottleneck , still CPU bottleneck , Eventually, the number of active connections to the database will increase , Furthermore, the threshold of the number of active connections that the database can carry is approached or even reached . In the business Service Come and see , Few or no database connections available . Now you can imagine ( Concurrency 、 throughput 、 collapse ).
1、IO bottleneck
The first one is : Disk read IO bottleneck , Too much hot data , The database cache can't hold , A large number of IO, Reduce query speed -> Sub database and vertical sub table .
The second kind : The Internet IO bottleneck , Too much data requested , The network bandwidth is not enough -> sub-treasury .
2、CPU bottleneck
The first one is :SQL problem , Such as SQL Contained in the join,group by,order by, Non index field condition query, etc , increase CPU The operation of operations -> SQL Optimize , Build the right index , In the business Service Layer for Business Computing .
The second kind : The amount of data in a single table is too large , Too many rows scanned during query ,SQL Low efficiency ,CPU First bottleneck -> Horizontal sub table .
Two 、 Sub database and sub table
1、 Horizontal sub database

Concept : Based on fields , According to a certain strategy (hash、range etc. ), Split data from one library into multiple libraries .
result :
The structure of each library is the same ;
The data in each library is different , There is no intersection ;
The union of all libraries is full data .
scene : The absolute concurrency of the system is up , It is difficult to solve the problem fundamentally by dividing tables , And there is no obvious business ownership to divide the database vertically .
analysis : There are more libraries ,io and cpu Of course, the pressure of .
2、 Horizontal sub table

Concept : Based on fields , According to a certain strategy (hash、range etc. ), Split the data in one table into multiple tables .
result :
Every table has the same structure ;
The data for each table is different , There is no intersection ;
The union of all tables is full data ;
scene : The absolute concurrent amount of the system is not up , Just too much data in a single table , Affected SQL efficiency , It's aggravating CPU burden , So that it becomes a bottleneck .
analysis : There is less data in the table , A single SQL High execution efficiency , Nature lightens CPU The burden of .
3、 Vertical sub database

Concept : Based on the table , According to the business ownership , Split different tables into different libraries .
result :
The structure of each library is different ;
The data of each library is different , There is no intersection ;
The union of all libraries is full data ;
scene : The absolute concurrency of the system is up , And separate business modules can be abstracted .
analysis : To this step , Basically, it can be serviced . for example , Some common configuration tables with the development of business 、 There are more and more dictionaries , In this case, you can detach these tables into a separate library , It can even serve . There's more , With the development of business, a set of business model has been hatched , In this case, you can split the related tables into a separate library , It can even serve .
4、 Vertical sub table

Concept : Based on fields , According to the activity of the field , Split the fields in the table into different tables ( Main table and extended table ) in .
result :
Every watch has a different structure ;
Each table has different data , Generally speaking , At least one column intersection for each table's fields , It's usually a primary key , Used to correlate data ;
The union of all tables is full data ;
scene : The absolute concurrent amount of the system is not up , There are not many records in the table , But there are many fields , And hot data and non hot data are together , Large storage space required for single row data . So that the number of data rows in the database cache is reduced , When querying, it will read the disk data and generate a large number of random reads IO, produce IO bottleneck .
analysis : You can use the list page and the details page to help you understand . The splitting principle of vertical tables is to split hot data ( Data that may be redundant and often queried together ) Put it together as the main table , Non hot data together as extension table . So more hot data can be cached , And that reduces random reading IO. After the demolition , To get all the data, you need to associate two tables to get the data .
But remember , Never use it join, because join Not only will it increase CPU Burden and will say that two tables are coupled together ( Must be on a database instance ). Linked data , Should be in business Service Layer to do article , Get the data of the main table and the extended table respectively, and then associate them with the associated fields to get all the data .
3、 ... and 、 Sub database and sub table tool
sharding-sphere:jar, Formerly known as sharding-jdbc;
TDDL:jar,Taobao Distribute Data Layer;
Mycat: middleware .
notes : Advantages and disadvantages of tools , Please do your own research , Website and community priority .
Four 、 Steps of sub database and sub table
According to capacity ( Current capacity and growth ) Number of evaluation sub databases or sub tables -> choose key( uniform )-> Table rules (hash or range etc. )-> perform ( General double writing )-> Capacity expansion ( Minimize data movement ).
5、 ... and 、 The problem of sub database and sub table
1、 Not partition key Query questions for
Based on horizontal sub base and sub table , Split strategy is common hash Law .
Except for partition key There is only one non partition key Query as a condition .
Mapping method :

Gene method :

notes : When writing , Genetic generation user_id, Pictured . About xbit gene , For example, to divide 8 A watch ,23=8, so x take 3, namely 3bit gene . according to user_id When querying, the module can be directly routed to the corresponding sub database or sub table .
according to user_name When inquiring , Through the first user_name_code Generating function generation user_name_code Then the module is routed to the corresponding sub database or sub table .id Generate common snowflake Algorithm .
Except for partition key More than one non partition key Query as a condition
Mapping method :

Redundancy method :

notes : according to order_id or buyer_id Route to on query db_o_buyer In the library , according to seller_id Route to on query db_o_seller In the library . I feel like putting the cart before the horse ! Is there any other good way ? Change the technology stack ?
Backstage except partition key And all kinds of non partition key Combined condition query
NoSQL Law :

Redundancy method :

2、 Not partition key Cross database and cross table paging query
Based on horizontal sub base and sub table , Split strategy is common hash Law .
notes : use NoSQL Method to solve (ES etc. ).
3、 Capacity expansion
Based on horizontal sub base and sub table , Split strategy is common hash Law .
Horizontal expansion warehouse ( Upgrade from library method )

notes : The expansion is multiple .
Horizontal expansion table ( Double write transfer )

First step :( Synchronous double write ) Modify application configuration and code , Plus double writing , Deploy ;
The second step :( Synchronous double write ) Copy the old data in the old database to the new database ;
The third step :( Synchronous double write ) Proofread the old data in the new database according to the old database ;
Step four :( Synchronous double write ) Modify application configuration and code , Remove double write , Deploy ;
notes : Double writing is a general solution .
6、 ... and 、 Summary of sub database and sub table
Sub database and sub table , First of all, we need to know where the bottleneck is , Then we can split it reasonably ( Sub database or sub table ? Horizontal or vertical ? Several ?). And cannot be split for the sake of database and table splitting .
choose key Very important , It is necessary to consider the even split , We should also consider the non partition key Query for .
As long as it can meet the demand , The simpler the split rule, the better .
7、 ... and 、 Example of sub database and sub table
Example GitHub Address :https://github.com/littlecharacter4s/study-sharding
边栏推荐
- IDEA 连接数据库查询数据后控制台表头中文乱码的解决方法
- 关于OpenFeign的源码分析
- A test class understands beanutils.copyproperties
- Analysis of [paper] pointlanenet papers
- Yilingsi T35 FPGA drives LVDS display screen
- How to uniquely identify a user SQL in Youxuan database cluster
- Integrated water conservancy video monitoring station telemetry terminal video image water level water quality water quantity flow velocity monitoring
- 175. Combine two tables (very simple)
- 毕业2年转行软件测试获得12K+,不考研月薪过万的梦想实现了
- 优炫数据库集群如何唯一标识一条用户SQL
猜你喜欢

记录一次,php程序访问系统文件访问错误的问题

196. 删除重复的电子邮箱

After two years of graduation, I switched to software testing and got 12k+, and my dream of not taking the postgraduate entrance examination with a monthly salary of more than 10000 was realized

Comprehensive care analysis lyriq Ruige battery safety design

196. Delete duplicate email addresses

数模1232

Annotation summary of differences between @autowired and @resource

Abbkine AbFluor 488 细胞凋亡检测试剂盒特点及实验建议

window对象的常见事件

Worthington果胶酶的特性及测定方案
随机推荐
Yilingsi T35 FPGA drives LVDS display screen
抖音服务器带宽有多大,才能供上亿人同时刷?
impala 执行计划详解
Functions that should be selected for URL encoding and decoding
Marqueeview realizes sliding display effect
A math problem cost the chip giant $500million!
DNS记录类型及相关名词解释
window对象的常见事件
Localstorage and sessionstorage
身家破亿!86版「红孩儿」拒绝出道成学霸,已是中科院博士,名下52家公司
2649: segment calculation
be based on. NETCORE development blog project starblog - (16) some new functions (monitoring / statistics / configuration / initialization)
【flask】服务端获取客户端请求的文件
Okaleido tiger is about to log in to binance NFT in the second round, which has aroused heated discussion in the community
Social wechat applet of fanzhihu forum community
How big is the bandwidth of the Tiktok server for hundreds of millions of people to brush at the same time?
Make ppt timeline
2649: 段位计算
Worthington过氧化物酶活性的6种测定方法
Hcip day 14 notes