Redis Core Technology and Practice, study notes (VIII): sentinel clusters and what happens when a sentinel goes down
2022-07-03 17:51:00 【Tom Kong】
Once multiple instances form a sentinel cluster, the remaining sentinels can keep working together even if one sentinel instance fails. They can still complete the master-slave switch, which includes judging whether the master is offline, selecting a new master, and notifying the slaves and clients.
When configuring a sentinel, you only need the following configuration item, which sets the IP and port of the master. No connection information for other sentinels is configured.
sentinel monitor <master-name> <ip> <redis-port> <quorum>
How a sentinel cluster forms through the pub/sub mechanism
Sentinel instances can discover each other thanks to the pub/sub mechanism provided by Redis, that is, the publish/subscribe mechanism.
Once a sentinel has established a connection to the master, it can publish messages on the master to announce its own connection information (IP and port). It can also subscribe to messages on the master to obtain the connection information published by other sentinels. When multiple sentinel instances both publish and subscribe on the master, they learn each other's IP addresses and ports.
Redis organizes messages by category in the form of channels, similar to topics in a message queue.
Only applications subscribed to the same channel can exchange information through published messages.
The "__sentinel__:hello" channel: sentinels use it to discover and communicate with each other.
Example: sentinel 1 publishes its IP (172.16.19.3) and port (26579) to the "__sentinel__:hello" channel, which sentinels 2 and 3 have subscribed to. Sentinels 2 and 3 can then get sentinel 1's IP address and port number directly from this channel.
Sentinels 2 and 3 can then establish network connections with sentinel 1. Similarly, sentinels 2 and 3 can establish a network connection with each other. This forms the sentinel cluster. Over these connections the sentinels communicate with each other to judge and negotiate whether the master is offline.
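The discovery step can be sketched with a small in-memory simulation. This is only an illustration under stated assumptions: `FakeMaster` is a hypothetical stand-in for the master's pub/sub feature (it is not a Redis client), and the addresses of sentinels 2 and 3 are made up for the example.

```python
from collections import defaultdict

HELLO_CHANNEL = "__sentinel__:hello"


class FakeMaster:
    """Hypothetical in-memory stand-in for the master's pub/sub feature."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # channel -> list of callbacks

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def publish(self, channel, message):
        # Deliver the message to every subscriber of the channel.
        for callback in list(self.subscribers[channel]):
            callback(message)


class Sentinel:
    def __init__(self, name, ip, port, master):
        self.name, self.ip, self.port = name, ip, port
        self.master = master
        self.known_peers = {}  # peer name -> (ip, port)
        master.subscribe(HELLO_CHANNEL, self.on_hello)

    def announce(self):
        # Publish this sentinel's own connection information.
        self.master.publish(HELLO_CHANNEL, (self.name, self.ip, self.port))

    def on_hello(self, message):
        name, ip, port = message
        if name != self.name:  # ignore our own announcement
            self.known_peers[name] = (ip, port)


master = FakeMaster()
sentinels = [
    Sentinel("sentinel1", "172.16.19.3", 26579, master),
    Sentinel("sentinel2", "172.16.19.4", 26579, master),  # assumed address
    Sentinel("sentinel3", "172.16.19.5", 26579, master),  # assumed address
]
for s in sentinels:
    s.announce()

for s in sentinels:
    print(s.name, "knows", sorted(s.known_peers))
```

No sentinel is configured with another sentinel's address; each one learns its peers only from what is published on the shared channel.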

Besides connecting to each other to form a cluster, sentinels also need to establish connections with the slaves. A sentinel's monitoring work requires it to check the heartbeat of both the master and the slaves, and to notify the slaves once a new master has been selected.
How does a sentinel learn the IP addresses and ports of the slaves?
It does so by sending the INFO command to the master. For example, sentinel 2 sends INFO to the master; after accepting the command, the master returns the list of slaves to the sentinel. Using the connection information in that list, the sentinel connects to each slave and continuously monitors it over that connection. Sentinels 1 and 3 establish connections with the slaves in the same way.
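As a rough sketch of how the slave list could be read from the master's reply, the function below parses the `slaveN:ip=...,port=...` lines found in the replication section of INFO output. The sample text and its addresses are made up for illustration, and `parse_slaves` is a hypothetical helper, not part of Redis or of any client library.

```python
def parse_slaves(info_text):
    """Extract (ip, port) pairs from the slaveN lines of INFO replication output."""
    slaves = []
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Keep only keys of the form slave0, slave1, ...
        if not sep or not key.startswith("slave") or not key[5:].isdigit():
            continue
        fields = dict(item.split("=", 1) for item in value.split(","))
        slaves.append((fields["ip"], int(fields["port"])))
    return slaves


# Illustrative reply; the addresses and offsets are assumptions.
SAMPLE_INFO = """\
# Replication
role:master
connected_slaves:2
slave0:ip=172.16.19.6,port=6379,state=online,offset=1288,lag=0
slave1:ip=172.16.19.7,port=6379,state=online,offset=1288,lag=1
"""

print(parse_slaves(SAMPLE_INFO))
```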

With the pub/sub mechanism, sentinels can form a cluster. With the INFO command, they obtain the connection information of the slaves, connect to them, and monitor them.
Client event notification based on the pub/sub mechanism
A sentinel is a Redis instance running in a special mode. It does not serve request operations; it only performs monitoring, master selection, and notification. Every sentinel instance therefore also provides the pub/sub mechanism, and clients can subscribe to messages from sentinels. Sentinels offer many subscription channels, and different channels carry different key events in the master-slave switch process.
| Event | Related channel |
| Master offline events | +sdown (the instance enters the subjectively down state) |
| | -sdown (the instance leaves the subjectively down state) |
| | +odown (the instance enters the objectively down state) |
| | -odown (the instance leaves the objectively down state) |
| Slave reconfiguration events | +slave-reconf-sent (the sentinel has sent the slaveof command to reconfigure the slave) |
| | +slave-reconf-inprog (the slave has been configured with the new master but has not yet synchronized) |
| | +slave-reconf-done (the slave has been configured with the new master and has synchronized with it) |
| New master switch | +switch-master (the master address has changed) |
A client reads the sentinel's address and port from the sentinel configuration file, establishes a network connection with the sentinel, and then executes subscribe commands on that connection to receive the different event messages.
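Once subscribed (for instance with `SUBSCRIBE +switch-master`), a client still has to interpret each message. The sketch below assumes the payload layout `<master-name> <old-ip> <old-port> <new-ip> <new-port>` for the +switch-master channel, which is how I understand the Redis Sentinel documentation; the master name and addresses are hypothetical.

```python
from typing import NamedTuple, Tuple


class MasterSwitch(NamedTuple):
    master_name: str
    old_addr: Tuple[str, int]
    new_addr: Tuple[str, int]


def parse_switch_master(payload):
    """Parse a +switch-master payload (assumed layout: name old-ip old-port new-ip new-port)."""
    name, old_ip, old_port, new_ip, new_port = payload.split()
    return MasterSwitch(name, (old_ip, int(old_port)), (new_ip, int(new_port)))


# Hypothetical message: master "mymaster" moved from 172.16.19.3 to 172.16.19.6.
event = parse_switch_master("mymaster 172.16.19.3 6379 172.16.19.6 6379")
print(event.master_name, "now lives at", event.new_addr)
```

A client would typically use `new_addr` to reconnect to the new master after the switch.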
Which sentinel performs the master-slave switch?
Deciding which sentinel performs the master-slave switch is similar to judging the master "objectively down": it is also a process of voting and arbitration.
Judging objective down: enough sentinels agree that the master is "subjectively down".
As soon as any sentinel judges the master "subjectively down", it sends the is-master-down-by-addr command to the other sentinel instances. Each of them replies Y or N according to the state of its own connection with the master: Y is a vote in favor, N is a vote against.
Once a sentinel has collected the number of Y votes required for arbitration, it can mark the master "objectively down". The required number of Y votes is set by the quorum configuration item.
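The tally behind the objective-down decision can be sketched as follows. This assumes that the sentinel's own subjective judgment counts as one Y vote alongside the replies of its peers; `is_objectively_down` is a hypothetical helper, not a Redis API.

```python
def is_objectively_down(own_judgment, peer_replies, quorum):
    """Return True if the Y votes reach quorum.

    own_judgment: True if this sentinel itself sees the master as subjectively down.
    peer_replies: list of "Y"/"N" replies from other sentinels.
    """
    yes_votes = (1 if own_judgment else 0) + sum(1 for r in peer_replies if r == "Y")
    return yes_votes >= quorum


# quorum = 2: one agreeing peer is enough once this sentinel has judged the master down.
print(is_objectively_down(True, ["Y", "N"], quorum=2))
print(is_objectively_down(True, ["N", "N"], quorum=2))
```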
That sentinel then sends a command to the other sentinels indicating that it wants to perform the master-slave switch itself, and asks all the other sentinels to vote. This voting process is called "Leader election", because the sentinel that finally performs the master-slave switch is called the Leader.
Leader election
To become Leader, a sentinel must meet two conditions:
- It gets more than half of the votes
- The number of votes it gets is also greater than or equal to the quorum value
Take 3 sentinels as an example and suppose quorum is set to 2. Any sentinel that wants to become Leader then only needs 2 Y votes.
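The two conditions combine into a single threshold: the larger of "more than half" and quorum. A minimal sketch, assuming `total_sentinels` counts every sentinel in the cluster, including any that are currently down (the function names are my own):

```python
def votes_needed(total_sentinels, quorum):
    """Votes a candidate needs: a strict majority, and at least quorum."""
    majority = total_sentinels // 2 + 1
    return max(majority, quorum)


def can_become_leader(yes_votes, total_sentinels, quorum):
    return yes_votes >= votes_needed(total_sentinels, quorum)


print(votes_needed(3, 2))          # the 3-sentinel example with quorum 2
print(can_become_leader(2, 3, 2))  # two Y votes in that example
```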
| Time | Sentinel 1 (S1) | Sentinel 2 (S2) | Sentinel 3 (S3) |
| T1 | Votes Y for itself; sends vote requests to S2 and S3 to become Leader | | |
| T2 | | | Votes Y for itself; sends vote requests to S1 and S2 to become Leader |
| T3 | Receives S3's request, replies N | Receives S3's request, replies Y | |
| T4 | | Receives S1's request, replies N | |
| T5 | 1 Y vote, 1 N vote | | 2 Y votes, becomes Leader |
- At T1, S1 judges the master "objectively down". It wants to become Leader, so it votes Y for itself and sends vote requests to S2 and S3.
- At T2, S3 judges the master "objectively down". It also wants to become Leader, so it votes Y for itself and sends vote requests to S1 and S2.
- At T3, S1 receives S3's Leader vote request. Because S1 has already voted Y for itself, it can no longer vote for another sentinel, so it replies N. Meanwhile, S2 receives the vote request that S3 sent at T2. S2 has not voted before; it replies Y to the first sentinel that sends it a vote request and N to any sentinel whose request arrives later. So at T3, S2 replies Y to S3, agreeing that S3 becomes Leader.
- At T4, S2 finally receives the vote request that S1 sent at T1. Because S2 already agreed to S3's request at T3, it replies N to S1 and does not agree that S1 becomes Leader.
- S3's request arrived first probably because the network between S1 and S2 happened to be congested, which delayed S1's vote request.
- At T5, S1 has 1 Y vote and 1 N vote. S3 has 2 Y votes, which is more than half and also reaches the preset quorum value (quorum is 2), so S3 becomes Leader.
- If a round of voting produces no Leader, the sentinel cluster waits for a while (twice the sentinel failover timeout) and then holds the election again.
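The timeline above can be replayed with a small simulation. It is a sketch that assumes only the rule stated in the steps: each sentinel grants its single vote to the first request it processes, its own candidacy included. The event list mirrors T1 to T4; `run_election` is a hypothetical helper.

```python
def run_election(events, total_sentinels, quorum):
    """Replay (voter, candidate) events in arrival order; each voter votes once."""
    voted_for = {}  # voter -> candidate that received its single vote
    tally = {}      # candidate -> number of Y votes
    for voter, candidate in events:
        if voter not in voted_for:  # first request wins the vote; later ones get N
            voted_for[voter] = candidate
            tally[candidate] = tally.get(candidate, 0) + 1
    needed = max(total_sentinels // 2 + 1, quorum)
    winners = [c for c, n in tally.items() if n >= needed]
    return (winners[0] if winners else None), tally


events = [
    ("S1", "S1"),  # T1: S1 votes for itself
    ("S3", "S3"),  # T2: S3 votes for itself
    ("S1", "S3"),  # T3: S1 already voted, so S3's request gets N
    ("S2", "S3"),  # T3: S2 has not voted yet, so S3 gets Y
    ("S2", "S1"),  # T4: S1's delayed request reaches S2, which replies N
]
leader, tally = run_election(events, total_sentinels=3, quorum=2)
print(leader, tally)
```

Because the threshold is always more than half of the cluster, at most one candidate can reach it in a round.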
Whether a sentinel cluster's vote succeeds depends largely on election commands being transmitted normally over the network. If the network is under heavy load or briefly congested, it may happen that no sentinel gets more than half of the votes. Once the congestion eases, the probability of a successful election rises. If the sentinel cluster has only 2 instances, a sentinel that wants to become Leader must obtain 2 votes, not 1. So if one of the two sentinels goes down, the master-slave switch cannot be completed.
To achieve master-slave switching, we introduced sentinels. To avoid being unable to switch after a single sentinel fails, and to reduce misjudgment, we introduced the sentinel cluster. The sentinel cluster in turn needs some mechanisms to support its normal operation.
Make sure the configuration of all sentinel instances is consistent, especially the subjective-down threshold down-after-milliseconds. We once fell into this pit ourselves. In one of our projects this value was configured differently on different sentinel instances, so the sentinel cluster could not reach a consensus on the failed master, the master was not switched in time, and the cluster service became unstable. So do not overlook this seemingly simple piece of experience.
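One way to catch this kind of drift is to compare the value across the configuration files before deployment. The sketch below parses `sentinel down-after-milliseconds <master-name> <milliseconds>` lines from configuration text; the master name, the values, and both helper functions are illustrative assumptions.

```python
def down_after_values(config_text):
    """Return {master_name: milliseconds} from 'sentinel down-after-milliseconds' lines."""
    values = {}
    for line in config_text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[:2] == ["sentinel", "down-after-milliseconds"]:
            values[parts[2]] = int(parts[3])
    return values


def find_mismatches(configs):
    """configs: {sentinel_label: config_text}. Return masters whose values differ."""
    per_master = {}
    for label, text in configs.items():
        for master, ms in down_after_values(text).items():
            per_master.setdefault(master, {})[label] = ms
    return {m: v for m, v in per_master.items() if len(set(v.values())) > 1}


# Hypothetical configuration contents for three sentinels.
BASE = "sentinel monitor mymaster 172.16.19.3 6379 2\n"
configs = {
    "sentinel1": BASE + "sentinel down-after-milliseconds mymaster 30000\n",
    "sentinel2": BASE + "sentinel down-after-milliseconds mymaster 30000\n",
    "sentinel3": BASE + "sentinel down-after-milliseconds mymaster 60000\n",
}
print(find_mismatches(configs))
```

An empty result means every sentinel uses the same threshold for each master.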
After-class questions
Suppose a Redis deployment has one master and four slaves, together with a sentinel cluster of 5 instances and a quorum value of 2. If 3 sentinel instances fail during operation and the Redis master then fails, can the master still be correctly judged "objectively down"? Can the master-slave switch still be performed? Is it true that the more sentinel instances, the better? Does increasing the down-after-milliseconds value reduce misjudgment?
- Objective down can still be judged. Because quorum=2, once one sentinel judges the master "subjectively down" and asks the other surviving sentinel, both of them agree, the quorum value is met, and the master is marked "objectively down".
- The master-slave switch cannot be completed. Electing a Leader requires more than half of the votes, that is 3 out of 5, and only 2 sentinels are still alive.
- Sentinels have to communicate and exchange information with other nodes both when judging the master offline and when electing the sentinel Leader. The more sentinel instances there are, the more communication is needed. Deploying sentinels on more machines also means more nodes and a higher risk of machine failure. Both affect sentinel communication and elections. When something goes wrong, the election takes longer, and so does the master-slave switch.
- Increasing down-after-milliseconds moderately can reduce the probability of misjudgment when the network between a sentinel and the master fluctuates briefly. But a larger down-after-milliseconds also means the master-slave switch takes longer and the business is affected for longer. The threshold has to be weighed against the actual scenario and set to a reasonable value.
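The first two answers come down to simple arithmetic, which can be checked as below. The sketch assumes that both surviving sentinels see the master as down and that the Leader threshold is computed over all 5 configured sentinels, not only the surviving ones.

```python
TOTAL_SENTINELS = 5
FAILED_SENTINELS = 3
QUORUM = 2

alive = TOTAL_SENTINELS - FAILED_SENTINELS  # sentinels that can still vote
majority = TOTAL_SENTINELS // 2 + 1         # more than half of all 5

# Objective down only needs quorum Y votes.
can_mark_objectively_down = alive >= QUORUM
# Leader election needs a majority and at least quorum votes.
can_elect_leader = alive >= max(majority, QUORUM)

print("alive:", alive, "majority:", majority)
print("objectively down possible:", can_mark_objectively_down)
print("leader election possible:", can_elect_leader)
```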