
Database Internals: Distributed Systems

2022-07-28 09:32:00 【_ Su Mu】

The distributed part

Chapter 8
- Distributed system abstractions
Chapter 9
- Failure detection
  - Timeout-free failure detector
  - Phi-accrual failure detector
Chapter 10
- Leader election
  - The election process
  - Election rules
Chapter 11

The second part introduces the terminology of distributed systems (Chapter 8); failures, including failure detection (Chapter 9) and anti-entropy and dissemination (Chapter 12); leader election (Chapter 10), which relies on consensus algorithms (Chapter 14); and consistency in distributed systems, covering consistency models for single operations (Chapter 11) and consistency across multiple operations, that is, distributed transactions (Chapter 13).
The key chapters are 9, 10, 11, and 14.

Distributed database systems stand in contrast to centralized database systems and result from combining database technology with network technology. More precisely, a distributed database (Distributed DataBase, DDB) is a collection of data spread across different computers in a computer network. Each node in the network can process data independently, a property known as site autonomy, so it can run local applications; at the same time, each node can also run global applications through the network communication subsystem. The software responsible for creating, querying, updating, replicating, managing, and maintaining a distributed database is called a distributed database management system (Distributed DataBase Management System, DDBMS).
Compared with centralized databases, distributed databases have the following advantages:
(1) Good robustness. Because a distributed database system consists of multiple computers in multiple locations, it can keep working in a degraded mode when individual nodes or communication links fail. If redundancy is used, it also gains a degree of fault tolerance. The system is therefore robust, meaning that its reliability and availability are good.
(2) Good scalability. Nodes can be added or removed as needs change, or the system can be reconfigured. This is much easier than replacing an existing centralized database with a larger system.
(3) Better performance. Data can be placed on the nodes according to the principles of locality and reasonable redundancy, so that most data is accessed from a nearby node. This avoids the bottleneck of a centralized database, reduces response time, improves efficiency, and lowers communication cost.
(4) Good autonomy. Data can be managed in a decentralized way under unified coordination: each node is highly autonomous in how it manipulates and exchanges data, and there is no master-slave control. A distributed database therefore suits organizations in which each department wants to own and manage its own data while still sharing the data of other departments.

Chapter 8

Understand how the steps of different processes can interleave when they execute concurrently.
Understand the fallacies of distributed computing.
A process-local queue serves the following purposes:

Decoupling: receiving and processing are separated in time and happen independently.
Pipelining: requests in different stages are handled by independent parts of the system.
Absorbing short bursts of traffic: load can vary at any time, and the queue hides that variation from the component that does the processing.

Network partition: two or more servers cannot communicate with each other.
Cascading failure: a failure spreads from one part of the system to another and amplifies the problem.
Backoff strategy: schedule retries sensibly, increasing the time between subsequent requests so that the problem is not amplified.
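The backoff strategy above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names, the default parameters, and the use of random jitter are assumptions added for the example.

```python
import random


def backoff_delays(base=0.1, factor=2.0, cap=5.0, attempts=6, jitter=True, rng=random):
    """Yield the wait before each retry: base * factor**n, capped, optionally jittered."""
    for n in range(attempts):
        delay = min(cap, base * factor ** n)
        # Jitter spreads retries out so that many clients do not retry in lockstep.
        yield rng.uniform(0, delay) if jitter else delay


def call_with_retries(operation, sleep, **backoff_args):
    """Call operation() until it succeeds or the retry budget is used up.

    `sleep` is passed in (for example time.sleep) so the caller controls waiting.
    """
    last_error = None
    for delay in backoff_delays(**backoff_args):
        try:
            return operation()
        except ConnectionError as err:
            last_error = err
            sleep(delay)  # wait longer after every failed attempt
    raise last_error
```

Capping the delay keeps a long outage from producing unbounded waits, while the growing gaps keep a struggling server from being flooded with retries.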

Distributed system abstractions

Links:
Fair-loss link: the sender cannot be sure whether a message was delivered; if the sender keeps retransmitting, the message eventually arrives; a sent message is not delivered infinitely many times; the link does not create messages by itself.
Stubborn link: the sender keeps resending, so no unrecoverable message loss occurs in transit.
Perfect link: every message sent is delivered, no message is delivered more than once, and only messages actually sent by the sender are delivered.

Retransmission
The sender retransmits a message until it learns the delivery status, but this causes duplicate messages. This is where idempotence comes in: when the operation to be performed is idempotent, handling duplicate messages is safe.
Idempotent: an operation is idempotent if it can be invoked repeatedly and the end result is the same. (Executing the same operation many times yields the result of a single execution, with no additional side effects.)
Examples include queries and a shutdown operation.
Several techniques help achieve idempotence:

Globally unique ID: generate a global ID from the business operation and its content, and use the presence of that ID to decide whether the operation has already run. If the ID does not exist, store it and perform the operation; if it exists, the operation has already been executed.
Deduplication: choose a unique identifier, build a deduplication table, and make that identifier a unique index. The table can also hold sequence numbers: the receiver places messages in a reorder buffer and checks whether the message with sequence number n has already been processed. It tracks two values: n_consecutive, the highest sequence number in the contiguous range, and n_processed, the highest sequence number processed.
Multi-version control: this suits update scenarios. For example, to update a product's name, add a version number to the update interface and use it to make the update idempotent.
State machine control: this suits flows driven by a state machine, such as order creation and payment, where an order must be created before it is paid. Design the status field as an int and compare status values to achieve idempotence, for example 0 for order created, 100 for payment succeeded, and 99 for payment failed.
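Two of the techniques above, the globally unique ID and state machine control, can be sketched together. `OrderService`, its methods, and the in-memory set standing in for a table with a unique index are hypothetical names chosen for this illustration; only the status values 0, 99, and 100 come from the text.

```python
class OrderService:
    """Hypothetical service; a set stands in for a table with a unique index."""

    CREATED, PAY_FAILED, PAID = 0, 99, 100  # status values from the text

    def __init__(self):
        self.seen_ids = set()   # IDs of requests that have already been executed
        self.status = {}        # order_id -> int status
        self.charges = 0        # counts how often the side effect really happened

    def pay(self, request_id, order_id):
        # Globally unique ID: a repeated request_id means the work is already done.
        if request_id in self.seen_ids:
            return "duplicate"
        self.seen_ids.add(request_id)
        self.status.setdefault(order_id, self.CREATED)
        return self.advance(order_id, self.PAID)

    def advance(self, order_id, new_status):
        # State machine control: only move to a strictly larger status value.
        if new_status <= self.status[order_id]:
            return "ignored"
        self.status[order_id] = new_status
        if new_status == self.PAID:
            self.charges += 1
        return "applied"
```

Either check alone would stop a retried payment from charging twice; the status comparison additionally protects against a second request that carries a different ID.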

Two problems in distributed systems: (1) guaranteeing message order; (2) exactly-once delivery.
The sequence numbers kept in the deduplication table above make it possible to guarantee message order.
Because links fail, the first attempt to deliver a message may not succeed, so most practical systems use at-least-once delivery: the sender retries until it receives an acknowledgement, and until then assumes the other side has not received the message. Another delivery semantic is at-most-once: the sender sends the message and expects no acknowledgement.
A reliable link cannot be built without sending some messages more than once. However, the receiver can process each message only once and ignore duplicates, so that from the sender's perspective delivery is exactly-once.
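The combination of at-least-once sending and receiver-side deduplication can be sketched as below. The class and function names, the in-process "network", and the `drop_first_n_acks` knob that simulates lost acknowledgements are assumptions for illustration. Note that `n_processed` here is taken to mean the highest sequence number received so far, whether buffered or delivered.

```python
class Receiver:
    """Delivers each message once and in order, despite retransmissions."""

    def __init__(self):
        self.n_consecutive = 0  # highest sequence number with no gaps below it
        self.n_processed = 0    # highest sequence number received so far
        self.buffer = {}        # reorder buffer: seq -> payload
        self.delivered = []     # what the application actually sees

    def receive(self, seq, payload):
        # Anything at or below n_consecutive, or already buffered, is a duplicate.
        if seq > self.n_consecutive and seq not in self.buffer:
            self.buffer[seq] = payload
            self.n_processed = max(self.n_processed, seq)
            # Deliver every message that is now contiguous.
            while self.n_consecutive + 1 in self.buffer:
                self.n_consecutive += 1
                self.delivered.append(self.buffer.pop(self.n_consecutive))
        return self.n_consecutive  # acknowledgement sent back to the sender


def send_at_least_once(receiver, seq, payload, drop_first_n_acks=0, max_tries=5):
    """Resend until an acknowledgement covering seq gets through."""
    for attempt in range(max_tries):
        ack = receiver.receive(seq, payload)
        if attempt >= drop_first_n_acks and ack >= seq:
            return attempt + 1  # number of transmissions that were needed
    return None
```

In this sketch the message may cross the link several times, yet the application-facing `delivered` list contains it exactly once and in sequence order.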

A correct consensus protocol must have the following three properties:
Agreement
The decision reached by the protocol must be unanimous: every process decides, and all processes decide on the same value. Otherwise, consensus has not been reached.
Validity
The agreed value must have been proposed by one of the participants, which means the system itself cannot "come up with" a value. It also means the value is nontrivial: a process cannot always decide on a predefined default.
Termination
The protocol is complete only when all processes have reached the decision state.
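One way to make the three properties above concrete is a small checker over the outcome of a single protocol run. The function name and the dictionary-based representation are assumptions for this sketch; it inspects a finished run and is not itself a consensus algorithm.

```python
def check_consensus(proposed, decided):
    """Check one run of a protocol.

    proposed: process -> value it proposed
    decided:  process -> value it decided, or None if it has not decided yet
    """
    values = [v for v in decided.values() if v is not None]
    return {
        # All processes that decided chose the same value.
        "agreement": len(set(values)) <= 1,
        # Every decided value was proposed by some participant.
        "validity": all(v in proposed.values() for v in values),
        # Every process has reached the decision state.
        "termination": len(values) == len(decided),
    }
```

For example, a run in which two processes decide different values fails the first check even if each value was legitimately proposed.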

Failure model
A failure model describes precisely how processes in a distributed system may fail, and algorithms are developed on the basis of these assumptions. The models include crash faults, omission faults, and arbitrary faults.

Chapter 9

Failure detection

A failure detector is an important component of any distributed system. It augments the model and allows the consensus problem to be solved by trading off accuracy against completeness.
Chapter 9 introduces several approaches to failure detection: heartbeats, pings, deadlines, continuous scales, and so on.
Heartbeat: a process proactively tells its peers that it is still running by sending them messages.
Ping: a process sends a message to a remote process and checks whether a response arrives within a specified period to decide whether that process is alive.
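A heartbeat-based detector with a fixed timeout can be sketched in a few lines. The class name, the explicit `now` argument (used instead of a real clock so the behavior is deterministic), and the choice to return a sorted list are assumptions for this example.

```python
class HeartbeatDetector:
    """Suspects a peer when no heartbeat has arrived within `timeout`."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # peer -> time of its most recent heartbeat

    def heartbeat(self, peer, now):
        # Called whenever a heartbeat message from `peer` arrives.
        self.last_seen[peer] = now

    def suspected(self, now):
        # Peers whose last heartbeat is older than the timeout.
        return sorted(p for p, t in self.last_seen.items() if now - t > self.timeout)
```

A short timeout detects failures quickly but produces more false suspicions on a slow network; a long timeout does the opposite.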

Timeout-free failure detector

It can operate under asynchronous system assumptions. It only counts heartbeats and lets the application detect process failures from the data in the heartbeat counter vectors. Each process maintains a list of neighbors and a counter associated with each.
The process:
A process sends heartbeat messages to its neighbors, and each message contains the path the heartbeat has travelled so far. The initial message contains the first sender in the path and a unique identifier, which can be used to prevent the same message from being broadcast multiple times. Because messages propagate through different processes and the heartbeat path aggregates information received from neighboring processes, a process can correctly be marked as alive even when the direct link to it has failed.
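The path-carrying heartbeat described above can be simulated in a single process. This is a simplified sketch: `propagate_heartbeat`, the adjacency-dictionary format, and the rule "forward only to neighbors not yet in the path" are assumptions for the example, and the unique message identifier is left out.

```python
from collections import deque


def propagate_heartbeat(links, origin):
    """Flood one heartbeat from `origin`; return each process's counter vector.

    links: process -> list of directly reachable neighbors
    """
    counters = {p: {} for p in links}
    # Each queued item is (receiver, path travelled so far).
    queue = deque((n, (origin,)) for n in links[origin])
    while queue:
        receiver, path = queue.popleft()
        # The receiver counts a heartbeat for every process on the path.
        for p in path:
            counters[receiver][p] = counters[receiver].get(p, 0) + 1
        new_path = path + (receiver,)
        # Forward only to neighbors that have not seen this heartbeat yet.
        for n in links[receiver]:
            if n not in new_path:
                queue.append((n, new_path))
    return counters
```

With links A-B and B-C but no direct A-C link, C still increments its counter for A, because A appears in the path relayed by B.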

Phi-accrual failure detector

It combines three subsystems:
monitoring, interpretation, and action.
The algorithm works as follows: collect and sample heartbeat arrival times, build a view that can be used to judge node health, and compute a value (phi) from the samples. If the value reaches the threshold, the node is marked as down. Because the threshold is adjustable, this failure detector can adapt to changing network conditions.
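The computation can be sketched as follows, assuming that heartbeat inter-arrival times are modelled with a normal distribution and that phi is the negative base-10 logarithm of the probability that a heartbeat would still arrive this late. The function name, the lower bound on the standard deviation, and the sample values are assumptions for illustration.

```python
import math
from statistics import mean, pstdev


def phi(intervals, time_since_last):
    """Suspicion level computed from sampled heartbeat inter-arrival times."""
    mu = mean(intervals)
    sigma = max(pstdev(intervals), 1e-3)  # keep the divisor away from zero
    # Probability that the next heartbeat arrives later than time_since_last.
    p_later = 0.5 * math.erfc((time_since_last - mu) / (sigma * math.sqrt(2)))
    # Clamp to avoid log10(0) when the heartbeat is extremely overdue.
    return -math.log10(max(p_later, 1e-300))


samples = [1.0, 1.1, 0.9, 1.0, 1.0]   # seconds between recent heartbeats
threshold = 8.0                        # assumed value; higher means more conservative
is_suspected = phi(samples, 1.5) > threshold
```

Unlike a fixed timeout, the value grows continuously as a heartbeat becomes more overdue, so the action taken can depend on how high it has climbed.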

Chapter 10

Leader election

Understand the Raft algorithm: in essence, Raft takes a leader-centric approach in which everything goes through the leader, reaching consensus on a series of values and keeping the logs of all nodes consistent.
Roles: leader, follower, candidate.
A follower waits until the leader's heartbeat times out and then nominates itself as a candidate.
A candidate sends vote requests to the other nodes and becomes the leader if it wins a majority of the votes.
The leader handles write requests, manages log replication, and continuously sends heartbeat messages.

The election process

Server nodes communicate through remote procedure calls (RPC): one for requesting votes (RequestVote) and one for log replication (AppendEntries).
Log replication can only be initiated by the leader.
Term numbers in the Raft algorithm:
A follower that times out waiting for the leader's heartbeat increments its term number when it nominates itself. Likewise, a server node that finds its term number smaller than another node's updates its own to the larger value.
By convention, a server node that finds its term number smaller than another node's reverts to the follower state. (For example, after recovering from a network partition, a former leader with term 3 receives a heartbeat from the new leader with term 4 and immediately becomes a follower.)
If a node receives a request carrying a smaller term number, it rejects the request outright.
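The term-number rules above can be sketched as a small state holder. `Node`, its method names, and the string-valued roles are assumptions for this illustration; log handling and the RPC layer are left out entirely.

```python
class Node:
    """Tracks only the term number and role of one server."""

    def __init__(self, term=0, role="follower"):
        self.term = term
        self.role = role

    def start_election(self):
        # A follower whose heartbeat wait timed out nominates itself.
        self.term += 1
        self.role = "candidate"

    def on_message(self, sender_term):
        """Return True if the message is accepted, False if rejected as stale."""
        if sender_term < self.term:
            return False              # smaller term: reject outright
        if sender_term > self.term:
            self.term = sender_term   # adopt the larger term
            self.role = "follower"    # and step down
        return True


old_leader = Node(term=3, role="leader")
old_leader.on_message(4)  # heartbeat from the new leader after the partition heals
```

After the last call `old_leader` holds term 4 and the follower role, matching the partition-recovery example in the text.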

Election rules

The leader sends heartbeat messages periodically to prevent new elections from being started.
If a follower receives no heartbeat from the leader within a given period, it nominates itself and starts a leader election.
A candidate that wins a majority of the votes in an election is promoted to leader.
Within a term, the leader does not change unless it fails or network delays cause other nodes to start a new election.
In an election, each server node casts at most one vote per term number, on a first-come, first-served basis.
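The voting rules above can be sketched as follows. `Voter`, `request_vote`, and `wins` are hypothetical names, and the sketch deliberately omits the check Raft also performs on how up to date the candidate's log is.

```python
class Voter:
    """Grants at most one vote per term, first come, first served."""

    def __init__(self):
        self.term = 0
        self.voted_for = None

    def request_vote(self, candidate, term):
        if term < self.term:
            return False                             # stale term: reject
        if term > self.term:
            self.term, self.voted_for = term, None   # new term: the vote is free again
        if self.voted_for in (None, candidate):
            self.voted_for = candidate               # first requester in this term wins
            return True
        return False                                 # already voted for someone else


def wins(votes, cluster_size):
    """A candidate needs a strict majority of the whole cluster."""
    return votes > cluster_size // 2
```

Because each node votes once per term and a strict majority is required, two candidates cannot both win the same term.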

Bully algorithm -> Next-in-line failover -> Candidate/ordinary optimization -> Invitation algorithm -> Ring algorithm