
Scalable Databases (Part 2)

2022-06-28 03:33:00 【Ultipa】

Scaling the database layer is the fourth of the five layers in a typical cloud application architecture, and the most complex one. (Some people consider scalable storage systems more complex. In my view, it depends on the application's business pattern: for applications with complex transaction processing, the database layer is clearly the harder challenge; for applications that only handle massive data and simple events, a database layer may not even be needed, and the cloud storage layer is the more complex one to implement.)

There are four types of solutions for scaling a database:

· Scale-Up
· Master-Slaves (one master, multiple slaves) read-proxy pattern
· Master-Master pattern
· Sharding pattern

[Continued from Part 1] Taking the distributed database in the figure below as an example, we can configure the databases as follows so that the Master nodes in San Francisco (SF), New York (NY), and Dallas (DL) can accept writes at the same time without conflicts (see the table below).

Figure: Synchronization among three database nodes (clusters)

Table: How three Master database nodes avoid overlapping write ranges

Parameter	Master SF	Master NY	Master DL
Start ID (e.g., primary key)	1	2	3
Increment by	10	10	10
Possible values	1, 11, 21, 31, 41…	2, 12, 22, 32…	3, 13, 23, 33…
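
The scheme in the table can be sketched in a few lines of Python. This is only an illustration, not the configuration of any particular database: the names `id_sequence`, `MASTERS`, and `first_ids` are hypothetical, and only the start IDs and the increment come from the table.

```python
from itertools import islice


def id_sequence(start, step):
    """Yield start, start + step, start + 2 * step, ... without end."""
    current = start
    while True:
        yield current
        current += step


# Start IDs and increment are taken from the table; the dictionary layout
# and all names here are illustrative assumptions.
MASTERS = {"SF": 1, "NY": 2, "DL": 3}
INCREMENT = 10


def first_ids(n):
    """Return the first n IDs that each master would hand out."""
    return {
        name: list(islice(id_sequence(start, INCREMENT), n))
        for name, start in MASTERS.items()
    }
```

Because every master uses the same increment and a different start ID smaller than that increment, the three sequences never share a value, so each node can assign primary keys without coordinating with the others. MySQL, for instance, exposes this idea through the `auto_increment_increment` and `auto_increment_offset` system variables.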

(4) Partitioning pattern

Database partitioning usually takes one of two forms:

· Horizontal partitioning
· Vertical partitioning

Horizontal partitioning is also often called sharding (splitting a table), a technique that Google engineers first used in the BigTable project. In short, the main idea of sharding is to split the rows of one table into multiple tables according to certain rules and to place these new tables on different physical nodes in order to achieve high I/O throughput.

Vertical partitioning splits a table by column, so each resulting table usually has fewer columns. It is worth pointing out that vertical partitioning is somewhat similar to database normalization. The difference is that even a table that has already been normalized can still be partitioned vertically, so that the system can scale out and improve overall performance. For this reason, vertical partitioning is usually also called row splitting.
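
The difference between the two forms can be shown on a small in-memory table. The sketch below is an assumption-laden illustration: the `USERS` rows, the column names, and both function names are made up, and the horizontal rule (key modulo shard count) is just one possible rule.

```python
def split_horizontally(rows, n_shards, key="id"):
    """Distribute whole rows across n_shards by key modulo n_shards."""
    shards = [[] for _ in range(n_shards)]
    for row in rows:
        shards[row[key] % n_shards].append(row)
    return shards


def split_vertically(rows, column_groups, key="id"):
    """Split every row into narrower rows; each piece keeps the key column."""
    tables = [[] for _ in column_groups]
    for row in rows:
        for table, columns in zip(tables, column_groups):
            piece = {key: row[key]}
            piece.update({column: row[column] for column in columns})
            table.append(piece)
    return tables


# Hypothetical sample data, used only for illustration.
USERS = [
    {"id": 1, "name": "Ann", "email": "ann@example.com", "bio": "likes graphs"},
    {"id": 2, "name": "Bob", "email": "bob@example.com", "bio": "likes tables"},
    {"id": 3, "name": "Cai", "email": "cai@example.com", "bio": "likes logs"},
    {"id": 4, "name": "Dee", "email": "dee@example.com", "bio": "likes caches"},
]
```

Calling `split_horizontally(USERS, 2)` keeps every column but puts different rows in each shard, while `split_vertically(USERS, [["name", "email"], ["bio"]])` keeps every row but gives each table fewer columns, linked by the shared `id`.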

Partitioning generally follows simple logical rules, which fall into four types:

· Range partitioning
· List partitioning
· Hash partitioning
· Composite partitioning

Range partitioning is easy to understand. For example, an e-commerce database can be partitioned by product sale price range: 0-10 yuan, 10-25 yuan, 25-50 yuan, 50-100 yuan, 100-250 yuan, 250-500 yuan, 500-1,000 yuan, and 1,000 yuan and above. Using the product price in the original table as the key, the table is split into 8 tables.
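
A minimal sketch of this price-range rule follows. The boundaries come from the text; the names `BOUNDS` and `price_partition` are hypothetical. Since the listed ranges share their endpoints, the sketch assumes that a price exactly on a boundary belongs to the higher range.

```python
from bisect import bisect_right

# Upper bounds (in yuan) of the first seven ranges; the eighth range is
# everything from 1,000 yuan upward.
BOUNDS = [10, 25, 50, 100, 250, 500, 1000]


def price_partition(price):
    """Return the index (0-7) of the table that stores a product's row."""
    if price < 0:
        raise ValueError("price must be non-negative")
    # bisect_right counts how many bounds are <= price, so a price equal
    # to a bound falls into the higher range (an assumption of this sketch).
    return bisect_right(BOUNDS, price)
```

For example, `price_partition(5)` selects table 0 and `price_partition(5000)` selects table 7, the open-ended last range.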

List partitioning is also simple. For example, in WeChat's back-end database, users could be split into tables by the province or municipality where they registered (this can be taken from the registration information or inferred automatically from the IP address used at registration), giving tables for Beijing, Shanghai, Guangdong, and so on.
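
List partitioning amounts to a lookup from an explicit list of values to a table. In the sketch below, the table names and the fallback table are hypothetical; only the three regions come from the text.

```python
# Explicit value-to-table mapping; all table names are illustrative.
PARTITION_BY_PROVINCE = {
    "Beijing": "users_beijing",
    "Shanghai": "users_shanghai",
    "Guangdong": "users_guangdong",
}

# Assumed catch-all table for regions that are not listed explicitly.
DEFAULT_PARTITION = "users_other"


def province_partition(province):
    """Return the name of the table that stores users from this province."""
    return PARTITION_BY_PROVINCE.get(province, DEFAULT_PARTITION)
```

Unlike range partitioning, the values here have no ordering, so each one is listed explicitly and anything unlisted needs a default.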

Hash partitioning usually applies a hash (or modulo) operation to a table's primary key and partitions the table by the result. In the figure below, three tables (users, group messages, and photo albums) are each split horizontally into n tables by hashing, while the fourth table, events, uses typical range partitioning.
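
The hash (or modulo) rule can be sketched as follows. The function name is hypothetical, and the choice of CRC32 for non-integer keys is an assumption made for this example: Python's built-in `hash()` for strings is randomized per process by default, so a stable checksum is used instead.

```python
import zlib


def hash_partition(key, n_partitions):
    """Map a primary key to a partition index in range(n_partitions)."""
    if n_partitions <= 0:
        raise ValueError("n_partitions must be positive")
    if isinstance(key, int):
        # Integer keys: a plain modulo is enough.
        return key % n_partitions
    # Other keys: hash the text form with a checksum that is stable
    # across processes, then reduce it modulo the partition count.
    return zlib.crc32(str(key).encode("utf-8")) % n_partitions
```

With `n_partitions = 4`, user 10 lands in partition 2 and user 7 in partition 3. Rows spread evenly as long as the keys do, but changing `n_partitions` later moves most keys to a different partition, which is the usual trade-off of this rule.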

Composite partitioning combines the three methods above.
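
As one possible combination, the sketch below applies list partitioning by region first and then hash partitioning by user ID inside each region. The region list, the table-naming scheme, and the default of four buckets per region are all assumptions for illustration.

```python
# Regions with their own partitions; everything else shares "other".
REGIONS = ("Beijing", "Shanghai", "Guangdong")


def composite_partition(province, user_id, buckets_per_region=4):
    """Pick a table by region (list rule), then by user ID (hash rule)."""
    region = province if province in REGIONS else "other"
    bucket = user_id % buckets_per_region
    return f"users_{region.lower()}_{bucket}"
```

For example, `composite_partition("Beijing", 10)` returns `"users_beijing_2"`: the list rule selects the Beijing group and the modulo rule selects bucket 2 within it.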

The figure below shows how vertical and horizontal partitioning work together. The single database accessed by the application server is first partitioned vertically, so that each table forms an independent logical database node; each of these nodes can then be partitioned horizontally into multiple finer-grained logical nodes. Two-level (or even multi-level) partitioning of this kind lets the database layer scale out far enough to deliver higher concurrency.

Figure: Database sharding (horizontal partitioning)

After partitioning, the physical and logical structure of the database system becomes complex because it is highly distributed, but from the perspective of SQL and the programming API nothing changes (and nothing should). This keeps the system backward compatible after its core has been partitioned. In terms of implementation, the application layer can still see a complete database and complete tables, but these are only logical concepts (for example, a view); the database kernel is responsible for accessing the partition nodes concurrently.
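
The idea of a logical table that hides its partitions can be sketched with a small facade. This is a toy model under stated assumptions, not how any database kernel is implemented: the class `ShardedTable` is hypothetical, in-memory dictionaries stand in for partition nodes, keys are assumed to be integers, and shards are visited one after another rather than concurrently.

```python
class ShardedTable:
    """Looks like one key-value table to callers; rows live in n shards."""

    def __init__(self, n_shards):
        # Each dictionary stands in for one partition node.
        self._shards = [{} for _ in range(n_shards)]

    def _shard_for(self, key):
        """Routing rule, hidden from callers (here: key modulo shard count)."""
        return self._shards[key % len(self._shards)]

    def insert(self, key, row):
        self._shard_for(key)[key] = row

    def get(self, key):
        return self._shard_for(key).get(key)

    def scan(self):
        """Return all rows, merged across shards and ordered by key."""
        merged = {}
        for shard in self._shards:
            merged.update(shard)
        return [merged[key] for key in sorted(merged)]
```

Callers use only `insert`, `get`, and `scan`, exactly as they would with an unpartitioned table; the number of shards can change without touching application code, which is the backward compatibility the paragraph describes.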

One clear advantage of partitioning over the Master-Slave and Master-Master patterns described earlier is that it greatly reduces the complexity of interaction between layers. In the Master-Slave and Master-Master patterns, the application layer usually needs explicit logic to decide which node each write and read is sent to, whereas partitioning can be completely transparent to application-layer logic.

So far we have covered distributed design for traditional databases. Newer databases, such as graph databases and some NewSQL databases, differ from SQL databases in many ways in their distributed architecture. Broadly, there are several design patterns:

Hot backup pattern: Neo4j Enterprise Edition, for example, typically uses a 3-node hot backup pattern. The advantage of this pattern is high availability, but the drawback is equally obvious: it does not increase the overall load capacity of the system.
Distributed consensus pattern: RAFT is a typical distributed consensus algorithm, but the original RAFT still has many shortcomings, so this pattern leaves a lot of room for enhancement. Within one cluster, different nodes are assigned different roles, and the system's load capacity can increase linearly as nodes are added.
Horizontally distributed systems: Google's early GFS/BigTable/MapReduce stack, as well as systems such as Spanner, follow this large-scale horizontally distributed design, and systems such as Hadoop and Spark use a similar divide-and-conquer pattern. Put simply, a cluster contains nodes with different roles, such as compute nodes, storage nodes, NameServer nodes, and ID-Server nodes. Graph databases are among the more complex cases: because graph data is highly interconnected, the traditional brute-force scaling model, whether partitioning or sharding, cannot give a partitioned graph database high performance and real-time data processing across nodes. A complete toolchain and set of logic is needed to partition the graph more intelligently. Later articles will devote a separate chapter to the architecture of horizontally distributed graph databases. Keep one point in mind: all open-source graph databases currently on the market partition the graph by brute force, whether by vertex-cut or by edge-cut, and such systems cannot run deep association queries (beyond 3 hops), let alone real-time computation.

Copyright notice
This article was written by [Ultipa]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/179/202206280238405289.html
