2022-01-27 Redis Cluster split-brain problem analysis
2022-07-03 12:55:00 【a tracer】
Contents
Redis Cluster fault detection and master-slave switchover:
Cause of the split-brain problem in Redis Cluster:
Strategies for resolving split-brain:
Abstract:
An analysis of the Redis Cluster mode and of strategies for resolving the split-brain problem.
Redis Cluster fault detection and master-slave switchover:
See the note "NDB.Redis - Data security - cluster".
Cause of the split-brain problem in Redis Cluster:
Suppose the cluster contains a shard, cluster1, made up of one master and one slave: master1 and slave1. The client is connected to master1.
Two network partitions are relevant here:
- The partition between the client and master1
- The partition between master1 and the other master nodes in the cluster
Preconditions for split-brain:
- More than half of the cluster's master nodes lose their network connection to master1
- The network connection between the client and master1 remains normal
When more than half of the cluster's master nodes lose their network connection to master1:
This leads to:
- The cluster marks master1 as FAIL
- slave1 receives the broadcast that master1 is in the FAIL state and starts requesting votes for a master-slave switchover
- slave1 receives votes from more than half of the master nodes, becomes a master, and broadcasts that it is the new master
- For the cluster as a whole, the original master1 is offline and the switchover to slave1 is complete
- For cluster1, the latest data is now served by the original slave1
- For the original master1:
  - The data written only to the original master1 after the partition will be lost
  - If it rejoins the cluster, it is demoted to a slave of the original slave1 (now the master) and makes a full copy of that node's data
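The two majority decisions above (marking master1 as FAIL, then electing slave1) can be sketched as simple counting rules. This is a minimal illustration and not Redis source code; the function names and the 3-master cluster are assumptions made for the example.

```python
def majority(total_masters: int) -> int:
    """Smallest number of masters that is more than half of the cluster."""
    return total_masters // 2 + 1


def is_marked_fail(unreachable_reports: int, total_masters: int) -> bool:
    """master1 is marked FAIL once more than half of the masters report it unreachable."""
    return unreachable_reports >= majority(total_masters)


def wins_election(votes: int, total_masters: int) -> bool:
    """slave1 is promoted once more than half of the masters vote for it."""
    return votes >= majority(total_masters)


# Hypothetical 3-master cluster: master1, master2, master3.
total = 3
print(is_marked_fail(unreachable_reports=2, total_masters=total))  # True
print(wins_election(votes=2, total_masters=total))                 # True
print(is_marked_fail(unreachable_reports=1, total_masters=total))  # False
```

With three masters, two of them are enough both to mark master1 as FAIL and to promote slave1, which is why master1 can be replaced while it is still reachable by the client.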
When the network connection between the client and master1 remains normal:
The client is in the following situation:
- The client does not re-query the cluster's routing information
- The network path between the client and master1 is normal
This leads to:
- The client keeps writing data to master1
- The cluster, however, has already marked master1 as FAIL and elected a new master
- As a result, the data the client writes to master1 is lost
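The data loss described above can be reproduced with a toy in-memory model. This is only a sketch: the `Node` class, the key names, and `simulate_split_brain` are hypothetical, and replication is reduced to copying a dictionary.

```python
from __future__ import annotations


class Node:
    """Toy stand-in for a Redis node: a name and an in-memory dataset."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: dict[str, str] = {}

    def full_resync_from(self, master: Node) -> None:
        """Discard the local dataset and copy the master's dataset."""
        self.data = dict(master.data)


def simulate_split_brain() -> tuple[dict[str, str], list[str]]:
    master1, slave1 = Node("master1"), Node("slave1")

    # Before the partition: the write reaches master1 and is replicated to slave1.
    master1.data["k1"] = "v1"
    slave1.data["k1"] = "v1"

    # Partition: the cluster marks master1 as FAIL and promotes slave1,
    # but the client still reaches master1 and keeps using its stale route.
    master1.data["k2"] = "v2"  # accepted by the old master only
    written = ["k1", "k2"]

    # Partition heals: master1 rejoins as a slave of the new master (slave1).
    master1.full_resync_from(slave1)

    lost = [key for key in written if key not in slave1.data]
    return master1.data, lost


final_master1_data, lost_keys = simulate_split_brain()
print(final_master1_data)  # {'k1': 'v1'}
print(lost_keys)           # ['k2']
```

The write to `k2` was acknowledged by master1 but exists nowhere after the resync, which is the split-brain data loss the text describes.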
Strategies for resolving split-brain:
Strategy one: the client refreshes its cluster information with a periodic heartbeat check, which shortens the time it stays connected to a master1 that has already been marked FAIL
How it works:
- The client role is played by the cluster's proxy
- Shorten the period at which the proxy queries the cluster's routing table state
- The slave node and the cluster node that the proxy queries receive master1's FAIL state change at the same time
- Requirement: proxy heartbeat check period < master-slave switchover time
- The master-slave switchover time of the slave includes:
  - One master in the cluster determines that master1 is FAIL and broadcasts the message; assume all nodes then see master1 in the FAIL state
  - Sending vote requests to all masters - time taken:
  - Receiving votes from more than half of the masters - time taken:
  - Performing the switchover - time taken:
- The time the proxy takes to query the routing table from the static node
  - ClusterServerPool::RefreshInterval configures the query period; it can be specified down to microseconds (us)
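The timing requirement for strategy one can be written as a small check. This is a sketch under stated assumptions: the `FailoverTiming` fields mirror the switchover steps listed above, and every number is a placeholder chosen for the example, not a measurement from the article.

```python
from dataclasses import dataclass


@dataclass
class FailoverTiming:
    """Durations (in seconds) of the switchover steps; all values are hypothetical."""
    fail_broadcast_s: float  # a master marks master1 FAIL and broadcasts it
    vote_request_s: float    # slave1 sends vote requests to all masters
    vote_collect_s: float    # slave1 receives votes from more than half of the masters
    promote_s: float         # slave1 performs the switchover

    def total(self) -> float:
        return (self.fail_broadcast_s + self.vote_request_s
                + self.vote_collect_s + self.promote_s)


def refresh_interval_ok(refresh_interval_s: float, timing: FailoverTiming) -> bool:
    """True if the proxy's refresh period is shorter than the whole switchover."""
    return refresh_interval_s < timing.total()


# Placeholder durations that add up to 1.5 seconds.
timing = FailoverTiming(fail_broadcast_s=0.5, vote_request_s=0.25,
                        vote_collect_s=0.25, promote_s=0.5)
print(timing.total())                    # 1.5
print(refresh_interval_ok(1.0, timing))  # True
print(refresh_interval_ok(2.0, timing))  # False
```

In practice the step durations would have to be measured on the target cluster before choosing the proxy's refresh period.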
Strategy two: master1 determines whether it has been network-isolated from the cluster and, if so, rejects writes
Principle: the master checks whether it is still exchanging data with its slave nodes. If there has been no exchange for a long time, it assumes that the cluster has taken it offline, and it rejects writes
Configuration properties in redis.conf:
- min-replicas-to-write: the minimum number of slave nodes that must be connected to the master; with fewer, the master rejects writes
- min-replicas-max-lag: the maximum time, in seconds, since the master last received a heartbeat ping from a slave; a slave beyond this lag is not counted, and the master rejects writes when too few slaves remain
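How the two settings combine can be modelled in a few lines. This is a simplified sketch of the documented rule, not Redis source code; `master_accepts_writes` and its parameter names are made up for the example, and the lags stand for the seconds since each slave's last ping.

```python
def master_accepts_writes(replica_lags_s: list[float],
                          min_replicas_to_write: int,
                          min_replicas_max_lag: int) -> bool:
    """Model of the min-replicas check performed by a master before a write.

    replica_lags_s: seconds since the last ping received from each slave.
    Setting either option to 0 disables the check.
    """
    if min_replicas_to_write == 0 or min_replicas_max_lag == 0:
        return True
    # A slave only counts as "good" if its lag is within the allowed maximum.
    good_replicas = sum(1 for lag in replica_lags_s if lag <= min_replicas_max_lag)
    return good_replicas >= min_replicas_to_write


# One slave that pinged 1 second ago: writes are accepted.
print(master_accepts_writes([1.0], min_replicas_to_write=1, min_replicas_max_lag=10))   # True
# The slave has been silent for 15 seconds: writes are rejected.
print(master_accepts_writes([15.0], min_replicas_to_write=1, min_replicas_max_lag=10))  # False
# No slave at all: writes are rejected.
print(master_accepts_writes([], min_replicas_to_write=1, min_replicas_max_lag=10))      # False
```

The last case is worth noting: a master with no slaves fails the check whenever `min-replicas-to-write` is 1 or more.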
Feasibility analysis of using min-replicas-max-lag:
Explanation: with this setting the master uses the time since it last replicated with its slaves to decide whether it may accept writes, so that clients do not keep writing to the old master after the cluster has marked it FAIL
Problems:
- The FAIL determination for a master and the master's rejection of writes follow different decision paths
  - Whether master1 enters FAIL is decided by the other master nodes in the cluster
  - Whether master1 rejects writes is decided by master1 itself, from the ping interval with its slave nodes
- When the cluster is deployed with one master and one slave, a switchover after a master failure leaves a new master that rejects writes
  - The master goes down and the slave is promoted to master; the new master now has no slave
  - Because there is no slave, the min-replicas-max-lag check times out and the new master rejects writes, so that shard of the cluster becomes unavailable
Chosen approach:
- When clients connect through the proxy, shorten the period at which the proxy queries the cluster state
- When clients connect directly to the cluster's master nodes, this is not handled for now (version 630)