Redis Core Technology and Practice - Learning Notes (VIII): Sentinel Cluster, When a Sentinel Goes Down
2022-07-03 17:51:00 【Tom Kong】
Once multiple instances form a sentinel cluster, even if one sentinel instance fails, the remaining sentinels can keep working together to complete the master-slave switch: judging whether the master is offline, selecting a new master, and notifying the slaves and the clients.
When configuring a sentinel, you only need the following configuration item, which sets the master's IP and port. No connection information for the other sentinels is configured.
sentinel monitor <master-name> <ip> <redis-port> <quorum>
Forming a sentinel cluster based on the pub/sub mechanism
Sentinel instances can discover each other thanks to the pub/sub mechanism provided by Redis, that is, the publish/subscribe mechanism.
Once a sentinel has established a connection to the master, it can publish messages on the master to announce its own connection information (IP and port). It can also subscribe to messages on the master to obtain the connection information published by other sentinels. When multiple sentinel instances publish and subscribe on the same master, they learn each other's IP addresses and ports.
Redis organizes messages by category in the form of channels, similar to topics in a message queue.
Only applications subscribed to the same channel can exchange information through published messages.
"__sentinel__:hello" channel: sentinels use it to discover and communicate with each other.
Example: sentinel 1 publishes its IP (172.16.19.3) and port (26579) to the "__sentinel__:hello" channel, to which sentinels 2 and 3 are subscribed. Sentinels 2 and 3 can then read sentinel 1's IP address and port number directly from this channel.
Sentinels 2 and 3 can then establish network connections to sentinel 1. Similarly, sentinels 2 and 3 can establish a connection between themselves. This is how the sentinel cluster is formed. Over these connections the sentinels communicate with each other to judge and negotiate whether the master is offline.
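The discovery step above can be sketched as a small in-memory simulation. This is only an illustration, not real Redis code: the `HelloChannel` and `Sentinel` classes, the `ip,port` payload, and the addresses of sentinels 2 and 3 are assumptions made for the example (the real hello message carries more fields).

```python
class HelloChannel:
    """In-memory stand-in for the "__sentinel__:hello" channel on the master."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, sentinel):
        self.subscribers.append(sentinel)

    def publish(self, sender, message):
        # Deliver the message to every subscriber except the sender itself.
        for subscriber in self.subscribers:
            if subscriber is not sender:
                subscriber.on_hello(message)


class Sentinel:
    def __init__(self, name, ip, port):
        self.name, self.ip, self.port = name, ip, port
        self.known_peers = set()

    def announce(self, channel):
        # Simplified payload: only "ip,port" (an assumption for this sketch).
        channel.publish(self, f"{self.ip},{self.port}")

    def on_hello(self, message):
        ip, port = message.split(",")
        self.known_peers.add((ip, int(port)))


channel = HelloChannel()
sentinels = [
    Sentinel("sentinel1", "172.16.19.3", 26579),
    Sentinel("sentinel2", "172.16.19.4", 26579),  # hypothetical address
    Sentinel("sentinel3", "172.16.19.5", 26579),  # hypothetical address
]
for s in sentinels:
    channel.subscribe(s)
for s in sentinels:
    s.announce(channel)

for s in sentinels:
    print(s.name, sorted(s.known_peers))
```

After every sentinel has published once, each one knows the address of the other two, even though none of them was configured with a peer address.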

Besides connecting to each other to form a cluster, the sentinels also need to establish connections with the slaves, because their tasks require checking the heartbeat of both the master and the slaves, selecting a new master, and sending notifications.
How does a sentinel learn the slaves' IP addresses and ports?
It does so by sending the INFO command to the master. For example, sentinel 2 sends INFO to the master; on receiving the command, the master returns its list of slaves. Using the connection information in that list, the sentinel establishes a connection to each slave and continuously monitors the slave over that connection. Sentinels 1 and 3 connect to the slaves in the same way.
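As a rough illustration of how the slave list can be read from the master's reply, the sketch below parses the `slaveN:` lines of an `INFO replication` response. The function name `parse_slaves` and the sample addresses are assumptions for this example; a real sentinel does this internally.

```python
def parse_slaves(info_text):
    """Extract (ip, port) pairs from the slaveN lines of INFO replication output."""
    slaves = []
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Keep only keys of the form slave0, slave1, ... and skip everything else.
        if not sep or not key.startswith("slave") or not key[5:].isdigit():
            continue
        fields = dict(item.split("=", 1) for item in value.split(","))
        slaves.append((fields["ip"], int(fields["port"])))
    return slaves


# Sample reply with made-up addresses and offsets.
sample = """\
# Replication
role:master
connected_slaves:2
slave0:ip=172.16.19.5,port=6379,state=online,offset=1288,lag=0
slave1:ip=172.16.19.6,port=6379,state=online,offset=1288,lag=1
"""

slaves = parse_slaves(sample)
print(slaves)
```

Each returned pair is what the sentinel needs to open its own monitoring connection to that slave.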

With the pub/sub mechanism, the sentinels can form a cluster; with the INFO command, they obtain the slaves' connection information, connect to the slaves, and monitor them.
Client event notification based on the pub/sub mechanism
A sentinel is a Redis instance running in a special mode: it does not serve request operations and only performs the monitoring, master selection, and notification tasks. Every sentinel instance therefore also provides the pub/sub mechanism, and clients can subscribe to messages from the sentinels. The sentinels offer many subscription channels, and different channels carry different key events of the master-slave switch.
| Event | Related channels |
| Master offline events | +sdown (the instance enters the subjectively offline state) |
| | -sdown (the instance exits the subjectively offline state) |
| | +odown (the instance enters the objectively offline state) |
| | -odown (the instance exits the objectively offline state) |
| Slave reconfiguration events | +slave-reconf-sent (the sentinel has sent the slaveof command to reconfigure the slave) |
| | +slave-reconf-inprog (the slave has been configured with the new master but has not yet synchronized) |
| | +slave-reconf-done (the slave has been configured with the new master and has synchronized with it) |
| New master switch | +switch-master (the master's address has changed) |
The client reads the sentinel configuration file to get the sentinels' addresses and ports and establishes a network connection with a sentinel. It then executes subscription commands to receive the different event messages.
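A client would typically issue something like `SUBSCRIBE +switch-master` (or `PSUBSCRIBE *` for all events) on its sentinel connection. As a hedged sketch of what the client can do with such a message, the code below parses a `+switch-master` payload, assuming the layout `<master-name> <old-ip> <old-port> <new-ip> <new-port>`; the `MasterSwitch` type, the function name, and the sample addresses are made up for illustration.

```python
from typing import NamedTuple, Tuple


class MasterSwitch(NamedTuple):
    master_name: str
    old_addr: Tuple[str, int]
    new_addr: Tuple[str, int]


def parse_switch_master(payload):
    """Parse '<master-name> <old-ip> <old-port> <new-ip> <new-port>'."""
    name, old_ip, old_port, new_ip, new_port = payload.split()
    return MasterSwitch(name, (old_ip, int(old_port)), (new_ip, int(new_port)))


# Hypothetical message body as it might arrive on the +switch-master channel.
event = parse_switch_master("mymaster 172.16.19.3 6379 172.16.19.5 6379")
print(event.master_name, event.new_addr)
```

With the new address in hand, the client can reconnect to the new master without waiting for a failed request.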
Which sentinel performs the master-slave switch?
Deciding which sentinel performs the master-slave switch is similar to judging the master "objectively offline": it is also a vote-and-arbitrate process.
The master is judged objectively offline when enough sentinels consider it "subjectively offline".
As soon as any sentinel judges the master "subjectively offline", it sends the is-master-down-by-addr command to the other sentinel instances. Each of them replies Y or N according to the state of its own connection to the master: Y is a yes vote, N is a no vote.
Once a sentinel has collected the number of yes votes required for arbitration, it can mark the master "objectively offline". The required number of yes votes is set by the quorum configuration item.
That sentinel then sends a command to the other sentinels stating that it wants to perform the master-slave switch itself, and asks all of them to vote. This voting process is called Leader election, because the sentinel that finally performs the switch is called the Leader.
Leader election
To become Leader, a sentinel must satisfy two conditions:
- It obtains more than half of the votes
- The number of votes it obtains is also greater than or equal to the quorum value
Take 3 sentinels as an example, with quorum set to 2: any sentinel that wants to become Leader only needs to obtain 2 yes votes.
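The two conditions can be written down as a short check. This is a minimal sketch of the rule as stated in these notes; the function names are chosen for illustration.

```python
def votes_needed(total_sentinels, quorum):
    """Votes required to become Leader: a strict majority and at least quorum."""
    majority = total_sentinels // 2 + 1
    return max(majority, quorum)


def can_become_leader(yes_votes, total_sentinels, quorum):
    return yes_votes >= votes_needed(total_sentinels, quorum)


print(votes_needed(3, 2))          # 3 sentinels, quorum 2
print(can_become_leader(2, 3, 2))  # 2 yes votes out of 3 sentinels
print(can_become_leader(1, 3, 2))  # only the candidate's own vote
```

Note that a low quorum does not lower the bar for the election itself: the majority requirement still applies, so with 5 sentinels and quorum 2 a candidate needs 3 votes.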
| Time | Sentinel 1 (S1) | Sentinel 2 (S2) | Sentinel 3 (S3) |
| T1 | Votes Y for itself, sends S2 and S3 a request to vote for it as Leader | | |
| T2 | | | Votes Y for itself, sends S1 and S2 a request to vote for it as Leader |
| T3 | Receives S3's request, replies N | Receives S3's request, replies Y | |
| T4 | | Receives S1's request, replies N | |
| T5 | 1 vote Y, 1 vote N | | 2 votes Y, becomes Leader |
- At T1, S1 judges the master "objectively offline" and wants to become Leader. It votes Y for itself and sends S2 and S3 a request to vote for it as Leader.
- At T2, S3 judges the master "objectively offline" and wants to become Leader. It votes Y for itself and sends S1 and S2 a request to vote for it as Leader.
- At T3, S1 receives S3's Leader vote request. Because S1 has already voted Y for itself, it can no longer vote for another sentinel, so S1 replies N. Meanwhile, S2 receives the vote request that S3 sent at T2. S2 has not voted before, so it replies Y to the first sentinel whose vote request reaches it and N to any sentinel whose request arrives later. At T3, therefore, S2 replies Y to S3, agreeing that S3 becomes Leader.
- At T4, S2 finally receives the vote request that S1 sent at T1. Because S2 already agreed to S3's request at T3, it replies N to S1, refusing to let S1 become Leader. This happens because S3's request arrived first, probably because the network between S1 and S2 was congested and delayed S1's vote request.
- At T5, S1 has 1 vote Y and 1 vote N. S3 has 2 votes Y, which is more than half and also reaches the preset quorum value (quorum is 2), so S3 becomes Leader.
- If this round of voting does not produce a Leader, the sentinel cluster waits for a while (twice the sentinel failover timeout) and then holds the election again.
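The timeline above can be replayed with a small simulation. It is a simplified sketch under the assumptions stated in these notes (each sentinel grants at most one vote, and the first request to arrive wins); the function name and the way arrivals are encoded are illustrative choices, not how Redis represents them.

```python
def run_election(sentinels, quorum, candidates, arrivals):
    """Replay vote requests in arrival order and return (leader, tally).

    arrivals is a list of (candidate, voter) pairs, ordered by the time the
    voter receives the candidate's request.
    """
    # Each candidate has already voted Y for itself when it starts campaigning.
    voted_for = {candidate: candidate for candidate in candidates}
    for candidate, voter in arrivals:
        # A sentinel votes only once: the first request gets Y, later ones get N.
        voted_for.setdefault(voter, candidate)

    tally = {candidate: 0 for candidate in candidates}
    for choice in voted_for.values():
        tally[choice] += 1

    needed = max(len(sentinels) // 2 + 1, quorum)
    leader = next((c for c in candidates if tally[c] >= needed), None)
    return leader, tally


leader, tally = run_election(
    sentinels=["s1", "s2", "s3"],
    quorum=2,
    candidates=["s1", "s3"],  # s1 starts at T1, s3 at T2
    arrivals=[("s3", "s1"), ("s3", "s2"), ("s1", "s2")],  # T3, T3, T4
)
print(leader, tally)
```

Swapping the order of the last two arrivals, so that S1's request reaches S2 first, would make S1 the Leader instead, which shows how much the outcome depends on network timing.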
Whether a sentinel cluster completes an election successfully depends largely on the election messages being transmitted normally over the network. If the network is under heavy load or briefly congested, it may happen that no sentinel obtains more than half of the votes; once the congestion eases, the probability of a successful election increases. If the sentinel cluster has only 2 instances, a sentinel that wants to become Leader must obtain 2 votes, not 1, so if one of the two sentinels goes down the master-slave switch cannot be completed.
To achieve master-slave switching, we introduced the sentinel; to avoid being unable to switch after a single sentinel fails, and to reduce misjudgment, we introduced the sentinel cluster; and the sentinel cluster in turn needs some mechanisms to support its normal operation.
Make sure the configuration of all sentinel instances is consistent, especially the subjective offline threshold down-after-milliseconds. We once fell into this "pit": in one of our projects, this value was configured differently on different sentinel instances. As a result, the sentinel cluster could not reach a consensus on the failed master, the master was not switched in time, and the cluster service became unstable. Do not overlook this seemingly simple lesson.
After-class questions
Suppose a Redis deployment has one master and four slaves, together with a sentinel cluster of 5 instances and a quorum value of 2. If 3 sentinel instances fail during operation and the Redis master then fails, can the master still be correctly judged "objectively offline"? Can the master-slave switch still be performed? Is it true that the more sentinel instances, the better? If you increase down-after-milliseconds, does that reduce misjudgment?
- The master can still be judged objectively offline. Because quorum=2, when one sentinel judges the master "subjectively offline" it asks the other remaining sentinel; when both sentinels judge it "subjectively offline", the quorum value is met and the master is marked "objectively offline".
- The master-slave switch cannot be completed. Electing a Leader requires more than half of the votes (3 out of 5), and only 2 sentinels are still running.
- When judging "subjectively offline" and electing the sentinel Leader, a sentinel has to communicate and exchange information with the other nodes. The more sentinel instances there are, the more communication is needed. Sentinels are also usually deployed on different machines, so more nodes means a higher risk of machine failure. Both factors affect the sentinels' communication and election; when something goes wrong, the election takes longer and so does the master-slave switch.
- Increasing down-after-milliseconds appropriately reduces the probability of misjudgment when the network between the sentinels and the master fluctuates briefly. But a larger down-after-milliseconds value also means the master-slave switch takes longer and the business is affected for longer. You need to weigh this against the actual scenario and set a reasonable threshold.
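The first two answers can be checked with a bit of arithmetic. The sketch below assumes that every surviving sentinel agrees the master is down and that each surviving sentinel can cast one vote; the function name is made up for this example.

```python
def analyze(total, failed, quorum):
    """Return (can_mark_objectively_offline, can_complete_switch)."""
    alive = total - failed
    # Objective offline only needs quorum sentinels to agree.
    can_mark_odown = alive >= quorum
    # Leader election needs more than half of ALL sentinels, and at least quorum.
    votes_for_leader = max(total // 2 + 1, quorum)
    can_switch = can_mark_odown and alive >= votes_for_leader
    return can_mark_odown, can_switch


result = analyze(total=5, failed=3, quorum=2)
print(result)
```

With 5 sentinels, 3 of them failed, and quorum 2, the two survivors can mark the master objectively offline but cannot gather the 3 votes a Leader needs, so no switch takes place.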