Project practice, redis cluster technology learning (12)
2022-07-02 10:05:00 【User 1289394】
Redis.6.2 Fault recovery
After a failed node is marked objectively offline, if it is a master node holding slots, one of its slave nodes must be chosen to replace it so that the cluster stays highly available. All slave nodes of the offline master share responsibility for fault recovery: when a slave's internal scheduled task finds that the master it replicates has entered the objectively offline state, the slave triggers the recovery process.
1. Qualification check
Each slave node checks how long it has been disconnected from the master node to decide whether it is qualified to replace the failed master. If the disconnection time exceeds cluster-node-timeout * cluster-slave-validity-factor, the slave is not eligible for failover. The parameter cluster-slave-validity-factor is the slave validity factor; its default is 10.
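The qualification rule above can be sketched in C. This is only an illustration of the simplified rule stated in the text, not the check Redis itself performs; the function name slave_is_eligible and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a Redis function): returns true when the slave
 * may still take part in failover under the simplified rule in the text.
 *   disconnect_ms   - how long the slave has been disconnected from its master
 *   node_timeout_ms - value of cluster-node-timeout, in milliseconds
 *   validity_factor - value of cluster-slave-validity-factor */
bool slave_is_eligible(int64_t disconnect_ms,
                       int64_t node_timeout_ms,
                       int64_t validity_factor)
{
    assert(disconnect_ms >= 0 && node_timeout_ms > 0 && validity_factor > 0);

    /* Longest disconnection the slave may have had and still qualify. */
    int64_t limit_ms = node_timeout_ms * validity_factor;

    /* The slave is disqualified only when the limit is exceeded. */
    return disconnect_ms <= limit_ms;
}
```

With a 15000 ms timeout and a factor of 10, for instance, a slave disconnected for up to 150 seconds would still qualify under this sketch.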
2. Preparing the election time
Once a slave node is eligible for failover, it updates the time at which the failover election will be triggered; the subsequent steps run only after that time is reached. The fields related to the election time are as follows:
struct clusterState {
    ...
    mstime_t failover_auth_time; /* Time of the previous or next failover election */
    int failover_auth_rank;      /* Current rank of this slave node */
};
The delayed trigger is used here mainly so that different slave nodes get different election delays, which gives them different priorities. (The specific pseudocode is documented elsewhere.)
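As a rough sketch of how a rank can turn into a delay: the code below assumes that the rank is the number of sibling slaves with a larger replication offset, and that the delay is a fixed 500 ms, plus a random jitter below 500 ms, plus 1000 ms per rank. These constants are modeled on the Redis cluster source as I understand it and should be treated as assumptions; slave_rank and election_time_ms are hypothetical names, and the jitter is passed in as a parameter so the result is deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rank of this slave among its siblings: the number of sibling slaves whose
 * replication offset is larger. Rank 0 means the most up-to-date data. */
int slave_rank(int64_t my_offset, const int64_t *sibling_offsets, size_t n)
{
    int rank = 0;
    for (size_t i = 0; i < n; i++) {
        if (sibling_offsets[i] > my_offset) rank++;
    }
    return rank;
}

/* Election time in milliseconds (assumed formula, see the text above):
 * fixed delay + jitter + one extra second per rank, so better-ranked
 * slaves ask for votes earlier. */
int64_t election_time_ms(int64_t now_ms, int rank, int64_t jitter_ms)
{
    assert(rank >= 0 && jitter_ms >= 0 && jitter_ms < 500);
    return now_ms + 500 + jitter_ms + (int64_t)rank * 1000;
}
```

Under these assumptions, the slave with the freshest data gets rank 0 and the shortest delay, so it usually starts its election first.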
3. Initiating the election
When the slave node's scheduled task detects that the failover election time (failover_auth_time) has been reached, it initiates the election as follows:
(1) Update the configuration epoch
The configuration epoch is an integer that only increases and never decreases. Each master node maintains its own configuration epoch (clusterNode.configEpoch), which indicates the version of that master. The configuration epochs of all master nodes are different from one another, and a slave node copies the configuration epoch of its master.
The configuration epoch is used in the following scenarios:
· A new node joins.
· Slot-to-node mapping conflict detection.
· Slave node voting conflict detection.
(2) Broadcast the election message
The slave broadcasts the election message (FAILOVER_AUTH_REQUEST) in the cluster and records that the message has been sent, which ensures that the slave can initiate only one election per configuration epoch. The message content is the same as a ping message, except that the type field is changed to FAILOVER_AUTH_REQUEST.
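The two sub-steps can be sketched together in C. The struct election_state and the function start_election are hypothetical, the actual broadcast is left out, and the sketch never resets the sent flag, so it only shows the "one request per epoch" guard described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-slave election state; field names are illustrative. */
typedef struct {
    uint64_t current_epoch;       /* latest epoch known to this slave */
    uint64_t failover_auth_epoch; /* epoch of the election it started */
    bool     failover_auth_sent;  /* request already broadcast? */
} election_state;

/* Returns true when the caller should broadcast FAILOVER_AUTH_REQUEST now. */
bool start_election(election_state *s, int64_t now_ms, int64_t auth_time_ms)
{
    assert(s != NULL);

    if (now_ms < auth_time_ms) return false;  /* election time not reached */
    if (s->failover_auth_sent) return false;  /* already asked for votes */

    s->current_epoch++;                        /* (1) update the epoch */
    s->failover_auth_epoch = s->current_epoch;
    s->failover_auth_sent = true;              /* (2) record the broadcast */
    return true;
}
```

Calling start_election repeatedly from a scheduled task would then produce at most one request in this sketch.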
4. Election voting
Only master nodes holding slots process the failover election message (FAILOVER_AUTH_REQUEST), because each slot-holding node has exactly one vote per configuration epoch.
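The one-vote-per-epoch rule can be illustrated with a small C sketch. The master_state struct and grant_vote function are hypothetical, and the checks are reduced to the two conditions mentioned in the text; a real implementation would have more of them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a master node; field names are illustrative. */
typedef struct {
    uint64_t current_epoch;   /* latest epoch known to this master */
    uint64_t last_vote_epoch; /* epoch in which it last granted a vote */
    int      owned_slots;     /* number of slots this master holds */
} master_state;

/* Returns true when the master grants its vote for request_epoch. */
bool grant_vote(master_state *m, uint64_t request_epoch)
{
    assert(m != NULL);

    if (m->owned_slots == 0) return false;  /* only slot holders vote */
    if (request_epoch <= m->last_vote_epoch)
        return false;                       /* vote for this epoch is spent */

    m->last_vote_epoch = request_epoch;     /* spend the single vote */
    if (request_epoch > m->current_epoch)
        m->current_epoch = request_epoch;   /* adopt the newer epoch */
    return true;
}
```

In this sketch, a second slave asking for a vote in the same epoch is refused, while a request carrying a later epoch can be granted again.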
The voting process is in effect a leader election. If there are N master nodes holding slots, there are N votes. Because each slot-holding master can vote for only one slave node per configuration epoch, only one slave can collect N/2+1 votes, which guarantees that a single slave is chosen.
For example, suppose there are 5 master nodes holding slots. After master b fails, 4 remain, and a slave that collects 3 votes has enough to replace the master.
The failed master node still counts toward the total number of votes. Suppose a cluster of 3 masters and 3 slaves has 2 of the masters deployed on one machine. When that machine goes down, the slaves cannot collect 3/2+1 master votes, so failover fails. The same problem applies to fault discovery.
Therefore, when deploying a cluster, the master nodes must be spread across at least 3 physical machines to avoid a single point of failure.
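The arithmetic behind both examples can be checked with a few lines of C. The function names votes_needed and can_fail_over are hypothetical, and the sketch assumes every surviving slot-holding master is reachable and willing to vote.

```c
#include <assert.h>
#include <stdbool.h>

/* Votes a slave must collect: N/2+1 with integer division, where N counts
 * all slot-holding masters, including the failed ones. */
int votes_needed(int slot_masters)
{
    assert(slot_masters > 0);
    return slot_masters / 2 + 1;
}

/* True when enough masters survive to elect a replacement. */
bool can_fail_over(int slot_masters, int failed_masters)
{
    assert(failed_masters >= 0 && failed_masters <= slot_masters);
    int reachable = slot_masters - failed_masters;
    return reachable >= votes_needed(slot_masters);
}
```

With 5 masters and 1 failure, 3 votes are needed and 4 masters can still vote. With 3 masters and 2 lost on the same machine, 2 votes are needed but only 1 master remains, which is why the failover in the second example cannot succeed.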