Redis Sentinel: automatic master election in master-slave replication
2022-07-05 11:44:00 【We've been on the road】
One. What is Sentinel
As the name suggests, Sentinel monitors the operation of a Redis system. It has two functions:
- Monitor whether the master and slaves are working properly
- When the master fails, automatically promote a slave to master
Sentinel runs as an independent process. The structure after adding Sentinel is shown in the figure. To keep Sentinel itself highly available, we deploy it as a cluster, so the Sentinels monitor not only all Redis master and slave nodes but also each other.
Two. Configuring the sentinel cluster
Building on the earlier master-slave replication setup, add three sentinel nodes to provide automatic master election for Redis.
192.168.221.128(sentinel)
192.168.221.129(sentinel)
192.168.221.130(sentinel)
Configure each sentinel as follows:
- Copy the sentinel.conf file from the redis-6.0.9 source package to the Redis installation directory
cp /data/program/redis-6.0.9/sentinel.conf /data/program/redis/sentinel.conf
- Modify the following configuration
# mymaster is the name of the master to monitor; the name is user-defined.
# The ip and port are the master's IP address and port number.
# The final 2 is the quorum: the minimum number of sentinel nodes that must
# consider the master offline before it is treated as really offline.
sentinel monitor mymaster 192.168.221.128 6379 2
# If mymaster does not respond within 5 s, this sentinel marks it as SDOWN.
sentinel down-after-milliseconds mymaster 5000
# Failover timeout of 15 s. Among other uses, a sentinel waits twice this
# long before retrying a failover of the same master.
sentinel failover-timeout mymaster 15000
# The log file's directory must exist; create it in advance.
logfile "/data/program/redis/logs/sentinels.log"
- Start the sentinel with the following command
./redis-sentinel ../sentinel.conf
- After a successful startup, output like the following shows that the sentinel is running and has begun monitoring the cluster nodes
103323:X 13 Jul 2021 15:16:28.624 # Sentinel ID is 2e9b0ac7ffbfca08e80debff744a4541a31b3951
103323:X 13 Jul 2021 15:16:28.624 # +monitor master mymaster 192.168.221.128 6379 quorum 2
103323:X 13 Jul 2021 15:16:28.627 * +slave slave 192.168.221.129:6379 192.168.221.129 6379 @ mymaster 192.168.221.128 6379
103323:X 13 Jul 2021 15:16:28.628 * +slave slave 192.168.221.130:6379 192.168.221.130 6379 @ mymaster 192.168.221.128 6379
103323:X 13 Jul 2021 15:16:48.765 * +fix-slave-config slave 192.168.221.130:6379 192.168.221.130 6379 @ mymaster 192.168.221.128 6379
103323:X 13 Jul 2021 15:16:48.765 * +fix-slave-config slave 192.168.221.129:6379 192.168.221.129 6379 @ mymaster 192.168.221.128 6379
The other two nodes are configured exactly as above; they all monitor the same master node. Note that the master IP in sentinel.conf must not be 127.0.0.1, otherwise the other sentinel nodes cannot communicate with the master.
Once the other sentinel nodes have started, the first sentinel also logs the following, indicating that other sentinel nodes have joined.
+sentinel sentinel d760d62e190354654490e75e0b427d8ae095ac5a 192.168.221.129 26379 @ mymaster 192.168.221.128 6379
103323:X 13 Jul 2021 15:24:31.421 +sentinel sentinel dc6d874fe71e4f8f25e15946940f2b8eb087b2e8 192.168.221.130 26379 @ mymaster 192.168.221.128 6379
Three. Simulating a master node failure
We stop the master node of the Redis master-slave replication cluster directly with the ./redis-cli shutdown command, then look at the logs of the three sentinels. The first sentinel's log shows the following.
103625:X 13 Jul 2021 15:35:01.241 # +new-epoch 9
103625:X 13 Jul 2021 15:35:01.244 # +vote-for-leader d760d62e190354654490e75e0b427d8ae095ac5a 9
103625:X 13 Jul 2021 15:35:01.267 # +odown master mymaster 192.168.221.128 6379 #quorum 2/2
103625:X 13 Jul 2021 15:35:01.267 # Next failover delay: I will not start a failover before Tue Jul 13 15:35:31 2021
103625:X 13 Jul 2021 15:35:02.113 # +config-update-from sentinel d760d62e190354654490e75e0b427d8ae095ac5a 192.168.221.129 26379 @ mymaster 192.168.221.128 6379
103625:X 13 Jul 2021 15:35:02.113 # +switch-master mymaster 192.168.221.128 6379 192.168.221.130 6379
103625:X 13 Jul 2021 15:35:02.113 * +slave slave 192.168.221.129:6379 192.168.221.129 6379 @ mymaster 192.168.221.130 6379
103625:X 13 Jul 2021 15:35:02.113 * +slave slave 192.168.221.128:6379 192.168.221.128 6379 @ mymaster 192.168.221.130 6379
103625:X 13 Jul 2021 15:35:07.153 # +sdown slave 192.168.221.128:6379 192.168.221.128 6379 @ mymaster 192.168.221.130 6379
+sdown means that this sentinel subjectively considers the node (here the stopped master) to be out of service.
+odown means that the sentinels objectively consider the master to be out of service (subjective and objective are explained later). The sentinels then start recovery and pick a slave to promote to master. The log of another sentinel node looks like this.
76274:X 13 Jul 2021 15:35:01.240 # +try-failover master mymaster 192.168.221.128 6379
76274:X 13 Jul 2021 15:35:01.242 # +vote-for-leader d760d62e190354654490e75e0b427d8ae095ac5a 9
76274:X 13 Jul 2021 15:35:01.242 # d760d62e190354654490e75e0b427d8ae095ac5a voted for d760d62e190354654490e75e0b427d8ae095ac5a 9
76274:X 13 Jul 2021 15:35:01.247 # dc6d874fe71e4f8f25e15946940f2b8eb087b2e8 voted for d760d62e190354654490e75e0b427d8ae095ac5a 9
76274:X 13 Jul 2021 15:35:01.247 # 2e9b0ac7ffbfca08e80debff744a4541a31b3951 voted for d760d62e190354654490e75e0b427d8ae095ac5a 9
76274:X 13 Jul 2021 15:35:01.309 # +elected-leader master mymaster 192.168.221.128 6379
76274:X 13 Jul 2021 15:35:01.309 # +failover-state-select-slave master mymaster 192.168.221.128 6379
76274:X 13 Jul 2021 15:35:01.400 # +selected-slave slave 192.168.221.130:6379 192.168.221.130 6379 @ mymaster 192.168.221.128 6379
76274:X 13 Jul 2021 15:35:01.400 * +failover-state-send-slaveof-noone slave 192.168.221.130:6379 192.168.221.130 6379 @ mymaster 192.168.221.128 6379
76274:X 13 Jul 2021 15:35:01.477 * +failover-state-wait-promotion slave 192.168.221.130:6379 192.168.221.130 6379 @ mymaster 192.168.221.128 6379
76274:X 13 Jul 2021 15:35:02.045 # +promoted-slave slave 192.168.221.130:6379 192.168.221.130 6379 @ mymaster 192.168.221.128 6379
76274:X 13 Jul 2021 15:35:02.045 # +failover-state-reconf-slaves master mymaster 192.168.221.128 6379
76274:X 13 Jul 2021 15:35:02.115 * +slave-reconf-sent slave 192.168.221.129:6379 192.168.221.129 6379 @ mymaster 192.168.221.128 6379
76274:X 13 Jul 2021 15:35:03.070 * +slave-reconf-inprog slave 192.168.221.129:6379 192.168.221.129 6379 @ mymaster 192.168.221.128 6379
76274:X 13 Jul 2021 15:35:03.070 * +slave-reconf-done slave 192.168.221.129:6379 192.168.221.129 6379 @ mymaster 192.168.221.128 6379
76274:X 13 Jul 2021 15:35:03.133 # +failover-end master mymaster 192.168.221.128 6379
76274:X 13 Jul 2021 15:35:03.133 # +switch-master mymaster 192.168.221.128 6379 192.168.221.130 6379
76274:X 13 Jul 2021 15:35:03.133 * +slave slave 192.168.221.129:6379 192.168.221.129 6379 @ mymaster 192.168.221.130 6379
76274:X 13 Jul 2021 15:35:03.133 * +slave slave 192.168.221.128:6379 192.168.221.128 6379 @ mymaster 192.168.221.130 6379
76274:X 13 Jul 2021 15:35:08.165 # +sdown slave 192.168.221.128:6379 192.168.221.128 6379 @ mymaster 192.168.221.130 6379
+try-failover indicates that the sentinel has started fault recovery.
+failover-end indicates that the sentinel has completed fault recovery.
+slave lists the slaves of the new master. The stopped master still appears in this list: Sentinel does not remove stopped instances, because a stopped server may recover at some point, and when it does it rejoins the cluster in the slave role.
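To react to a failover from a script, the +switch-master event can be picked out of the sentinel log. The following is a minimal sketch, assuming each event sits on a single log line in the layout shown above; the function name parse_switch_master and the regular expression are illustrative, not part of Redis.

```python
import re

# Payload of a "+switch-master" event, as seen in the logs above:
# <master name> <old ip> <old port> <new ip> <new port>
SWITCH_MASTER = re.compile(r"\+switch-master (\S+) (\S+) (\d+) (\S+) (\d+)")


def parse_switch_master(line):
    """Return (name, old_addr, new_addr), or None if the line is another event."""
    match = SWITCH_MASTER.search(line)
    if match is None:
        return None
    name, old_ip, old_port, new_ip, new_port = match.groups()
    return name, (old_ip, int(old_port)), (new_ip, int(new_port))


# Sample line taken from the second sentinel's log.
sample = (
    "76274:X 13 Jul 2021 15:35:03.133 # +switch-master mymaster "
    "192.168.221.128 6379 192.168.221.130 6379"
)
print(parse_switch_master(sample))
# ('mymaster', ('192.168.221.128', 6379), ('192.168.221.130', 6379))
```

Lines carrying any other event, such as +failover-end, return None, so the function can be applied to every line of the log.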
Four. How it works
- Every Sentinel sends a PING command once per second to every master, slave, and other Sentinel instance it knows about.
- If the time since an instance's last valid reply to PING exceeds the value of the down-after-milliseconds option, that Sentinel marks the instance as subjectively offline.
- If a master is marked as subjectively offline, all Sentinels monitoring that master check, once per second, whether the master has really entered the subjectively offline state.
- When enough Sentinels (at least the quorum value specified in the configuration file) confirm within the specified time range that the master is subjectively offline, the master is marked as objectively offline.
- Normally, every Sentinel sends an INFO command to every master and slave it knows about once every 10 seconds.
- When a master is marked as objectively offline, the Sentinels send INFO to all slaves of that master once per second instead of once every 10 seconds. If not enough Sentinels agree that the master is offline, the master's objectively offline status is removed.
- If the master again returns a valid reply to a Sentinel's PING command, the master's subjectively offline status is removed.
Subjectively offline: Subjectively Down, abbreviated SDOWN, is the offline judgment that a single Sentinel instance makes about a Redis server.
Objectively offline: Objectively Down, abbreviated ODOWN, is the offline judgment about a master reached after multiple Sentinel instances have each made an SDOWN judgment and exchanged it with one another through the SENTINEL command. Failover then begins.
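The difference between the two states can be sketched in a few lines. This is a simplified model rather than Sentinel's implementation: the function names, the sentinel labels s1 to s3, and the millisecond timestamps are made up for illustration, while the 5000 ms threshold and the quorum of 2 come from the configuration earlier in this article.

```python
def is_sdown(now_ms, last_valid_reply_ms, down_after_ms):
    """One sentinel's subjective view: no valid PING reply for longer than down_after_ms."""
    return now_ms - last_valid_reply_ms > down_after_ms


def is_odown(sdown_votes, quorum):
    """Objective view: at least `quorum` sentinels currently report SDOWN."""
    return sum(1 for vote in sdown_votes if vote) >= quorum


DOWN_AFTER_MS = 5000  # sentinel down-after-milliseconds mymaster 5000
QUORUM = 2            # last argument of "sentinel monitor"

now = 20_000
# Hypothetical time of the master's last valid PING reply, per sentinel.
last_replies = {"s1": 14_000, "s2": 14_500, "s3": 19_800}

votes = [is_sdown(now, t, DOWN_AFTER_MS) for t in last_replies.values()]
print(votes, is_odown(votes, QUORUM))
# [True, True, False] True
```

Here two of the three sentinels have waited longer than 5 s, which meets the quorum, so the master counts as ODOWN even though s3 still heard from it recently.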
Five. Who performs the failover
Once the master node in Redis is judged objectively offline, a new master must be chosen from the slaves. With three sentinel nodes, which one should carry out the failover? The three sentinels must reach agreement through some mechanism, and Redis uses the Raft algorithm for this.
Each time a master fails, the Raft algorithm is triggered to elect a leader, and that leader carries out the master election in the Redis master-slave cluster.
1. Common data consistency algorithms
- Paxos: probably the earliest and most orthodox data consistency algorithm, and also the most complex.
- Raft: probably the easiest consistency algorithm to understand. It is used in components such as Nacos, Sentinel, and Consul.
- ZAB protocol: a consistency algorithm that ZooKeeper derived from Paxos.
- Distro: the Distro protocol is Alibaba's private protocol, currently used by the Nacos service management framework. It is positioned as a consistency protocol for temporary data.
2. The Raft protocol
An animated demonstration of the Raft algorithm: http://thesecretlivesofdata.com/raft/
The core idea of the Raft algorithm: first come, first served, and the minority defers to the majority.
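The "first come, first served, majority wins" idea can be sketched as follows. This is a toy model under stated assumptions, not Sentinel's actual code: the function elect_leader, the request format, and the sentinel names S1 to S3 are hypothetical, and real elections also involve epochs and timeouts. The rule that a winner needs both a majority and at least quorum votes follows the Redis Sentinel documentation.

```python
def elect_leader(requests, voters, quorum):
    """Pick a leader for one epoch.

    requests: (voter, candidate) pairs in arrival order.
    Each voter grants its single vote to the first candidate that asks it.
    """
    granted = {}
    for voter, candidate in requests:
        if voter in voters and voter not in granted:
            granted[voter] = candidate  # first come, first served

    tally = {}
    for candidate in granted.values():
        tally[candidate] = tally.get(candidate, 0) + 1

    majority = len(voters) // 2 + 1
    needed = max(majority, quorum)  # the minority defers to the majority
    for candidate, count in tally.items():
        if count >= needed:
            return candidate
    return None  # split vote: no leader this epoch, try again later


voters = {"S1", "S2", "S3"}

# S2 asks first and collects two votes before S3's request reaches S1.
requests = [("S2", "S2"), ("S1", "S2"), ("S3", "S3"), ("S3", "S2")]
print(elect_leader(requests, voters, quorum=2))  # S2

# Every sentinel votes for itself: nobody reaches a majority.
split = [("S1", "S1"), ("S2", "S2"), ("S3", "S3")]
print(elect_leader(split, voters, quorum=2))  # None
```

A split vote is resolved by simply waiting and holding a new election, which is why the logs above carry an epoch number with every vote.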
3. Failover process
How does an existing slave node become the master node?
- After the Sentinel leader is elected, it sends the slaveof no one command to the chosen node, making it an independent node.
- It then sends replicaof x.x.x.x xxxx (the address of the newly promoted node) to the other nodes, making them replicas of that node. The failover is then complete.
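The two steps can be written down as an ordered command plan. This sketch only builds the list of commands and sends nothing; the helper name failover_commands is hypothetical, and the real leader does more than this (for example, it waits for the promotion to be confirmed before reconfiguring the others, as the +failover-state-wait-promotion log entry shows).

```python
def failover_commands(promoted, others):
    """Return [(target_addr, command)] in the order the leader would issue them."""
    host, port = promoted
    # Step 1: detach the chosen node so that it becomes a standalone master.
    plan = [(promoted, "SLAVEOF NO ONE")]
    # Step 2: point every remaining node at the newly promoted master.
    for addr in others:
        plan.append((addr, f"REPLICAOF {host} {port}"))
    return plan


# Addresses taken from the failover log earlier in this article.
plan = failover_commands(("192.168.221.130", 6379), [("192.168.221.129", 6379)])
for target, command in plan:
    print(target, command)
```

For the failover logged above this yields SLAVEOF NO ONE for 192.168.221.130 and REPLICAOF 192.168.221.130 6379 for 192.168.221.129.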
How is the right slave node chosen to become master?
- Disconnection time: a slave that has been disconnected for too long, beyond a certain threshold, loses its eligibility outright.
- Priority: among eligible slaves, the one with the highest priority wins. This is set in the configuration file (replica-priority 100); the smaller the value, the higher the priority.
- Amount replicated: if priorities are equal, the slave that has copied the most data from the master wins (largest replication offset).
- Run ID: if the replication offsets are also equal, the slave with the smallest run ID is chosen.
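The selection rules amount to a filter followed by a multi-key comparison. The sketch below is an illustration, not Redis source code: the Replica class, its field names, the threshold value, and the sample data are assumptions. The last tie-breaker is modeled as the lexicographically smallest run ID, which is how the Redis Sentinel documentation describes it, and a priority of 0 is treated as "never promote", also per that documentation.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    addr: str
    priority: int         # replica-priority; a lower value wins, 0 means never promote
    repl_offset: int      # amount of data replicated from the master; higher wins
    run_id: str           # final tie-breaker; lexicographically smaller wins
    disconnected_ms: int  # how long the replica has been disconnected


def select_replica(replicas, max_disconnected_ms):
    """Apply the rules in order: eligibility, priority, offset, run ID."""
    candidates = [
        r for r in replicas
        if r.disconnected_ms <= max_disconnected_ms and r.priority != 0
    ]
    if not candidates:
        return None
    # Negate the offset so that min() prefers the largest replication offset.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


# Hypothetical state of three replicas at failover time.
replicas = [
    Replica("192.168.221.129:6379", 100, 5000, "b2", 0),
    Replica("192.168.221.130:6379", 100, 5200, "a1", 0),
    Replica("replica-3:6379", 50, 9000, "c3", 60_000),  # disconnected too long
]
chosen = select_replica(replicas, max_disconnected_ms=50_000)
print(chosen.addr)  # 192.168.221.130:6379
```

The third replica has the best priority and offset but is filtered out first, so the choice falls to the replica with the larger offset among the remaining two.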
Six. Summary of Sentinel functions
Monitoring: Sentinel checks whether the master and slave servers are running normally.
Notification: if a monitored instance has a problem, Sentinel can send a notification through an API.
Automatic failover: if the master server fails, Sentinel can start the failover process, promote a slave server to master, and send a notification.
Configuration management: clients connect to Sentinel to obtain the address of the current Redis master server.
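On the client side, configuration management might look like the sketch below. It assumes the third-party redis-py package is installed and that the three sentinels from this article listen on port 26379, as their +sentinel log entries show; the wrapper function current_master and the 0.5 s timeout are illustrative choices.

```python
# Sentinel addresses from this article; 26379 is the port in the +sentinel log lines.
SENTINELS = [
    ("192.168.221.128", 26379),
    ("192.168.221.129", 26379),
    ("192.168.221.130", 26379),
]


def current_master(sentinel_addrs, master_name="mymaster"):
    """Ask the sentinels which node is currently the master of `master_name`."""
    # Imported lazily because redis-py is a third-party package (pip install redis).
    from redis.sentinel import Sentinel

    sentinel = Sentinel(sentinel_addrs, socket_timeout=0.5)
    return sentinel.discover_master(master_name)  # (host, port) tuple


# Usage, against a running cluster:
#   host, port = current_master(SENTINELS)
```

Because the client asks Sentinel rather than hard-coding the master address, it keeps working after a failover without any configuration change. redis-py also offers Sentinel.master_for, which returns a ready-to-use client connected to the current master.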