
Redis Sentinel

2022-06-26 00:07:00 Just put a flower in heaven and earth

Introduction

Sentinel is Redis's high-availability solution. A Sentinel system made up of one or more Sentinel instances can monitor any number of master servers, along with all the slave servers under them. When a monitored master goes offline, Sentinel automatically promotes one of its slaves to be the new master, which then takes over command processing from the offline master.

Starting and initializing Sentinel

A Sentinel can be started with either of the following commands:

$ redis-sentinel /path/to/your/sentinel.conf
$ redis-server /path/to/your/sentinel.conf --sentinel

When a Sentinel starts, it goes through the following steps:

  • Initialize the server.
  • Replace the code used by an ordinary Redis server with Sentinel-specific code.
  • Initialize the Sentinel state.
  • Initialize the list of monitored master servers from the given configuration file.
  • Create network connections to the master servers.

Before going through Sentinel initialization in detail, here is a typical sentinel.conf, since the steps below refer to its contents:

# Port the Sentinel listens on
port 16000
# Disable protected mode
protected-mode no
# Run as a daemon
daemonize yes
# Working directory of the Sentinel process
dir /usr/local/redis/sentinel/
# Monitor the master named mymaster at 192.168.1.161:6379, with a quorum of 2.
# The monitor line is placed first because older Redis versions reject options
# that refer to a master that has not been declared yet.
sentinel monitor mymaster 192.168.1.161 6379 2
# Password used to authenticate with mymaster
sentinel auth-pass mymaster 111222
# If no valid PING reply arrives from mymaster within 5 s, consider it down
sentinel down-after-milliseconds mymaster 5000
# Failover timeout in ms: if the failover has not completed within this time,
# it is considered failed. In production, set this according to the data volume.
sentinel failover-timeout mymaster 300000
# Number of slaves that may sync with the new master at the same time after a
# failover. With many slaves, a smaller number means a longer overall sync and
# therefore a longer failover.
sentinel parallel-syncs mymaster 2

sentinel monitor resque 192.168.1.162 6380 4
sentinel auth-pass resque 333444
sentinel down-after-milliseconds resque 5000
sentinel failover-timeout resque 300000
sentinel parallel-syncs resque 2

In this configuration file, the option to focus on is quorum, which plays two roles in Sentinel. First: when a single Sentinel believes a master it monitors is offline, Redis marks that state as S_DOWN, or subjectively down. With quorum set to 2, once two Sentinels consider a master offline, the master is marked O_DOWN, or objectively down. A failover is performed only for a master that is objectively down. Second: suppose there are 5 Sentinels and quorum is set to 4. Declaring the master objectively down then requires agreement from 4 Sentinels. In addition, when the failover starts, one of the 5 Sentinels must be elected leader to carry it out, and that Sentinel must also win 4 votes to be elected, rather than the 3 votes of a simple majority.
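The second role of quorum can be illustrated with a small sketch. The function name leader_votes_needed is hypothetical and exists only for this example; it encodes the rule described above, that the leader needs a majority of the Sentinels and never fewer votes than quorum.

```python
def leader_votes_needed(num_sentinels: int, quorum: int) -> int:
    """Votes a Sentinel must collect to lead a failover (illustrative helper)."""
    majority = num_sentinels // 2 + 1
    # The winner needs a majority of all Sentinels and at least quorum votes.
    return max(majority, quorum)


print(leader_votes_needed(5, 4))  # 4: quorum is above the majority of 3
print(leader_votes_needed(5, 2))  # 3: the majority still applies
```

With 5 Sentinels, a quorum of 2 lowers the bar for declaring O_DOWN but not the bar for electing a leader.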

Initializing the server

Sentinel is essentially a Redis server running in a special mode, so the first step in starting Sentinel is to initialize an ordinary Redis server.

However, because Sentinel does different work from an ordinary Redis server, its initialization is not exactly the same. For example:

  • An ordinary Redis server restores its database state during initialization by loading an RDB or AOF file. Sentinel does not use the database, so it loads neither file when it initializes.
  • The code used during initialization also differs. For example, an ordinary Redis server uses the value of the redis.h/REDIS_SERVERPORT constant as its port, while Sentinel uses the value of sentinel.c/REDIS_SENTINEL_PORT. An ordinary Redis server uses redis.c/redisCommandTable as its command table, while Sentinel uses sentinel.c/sentinelcmds. This is why a server in Sentinel mode cannot execute commands such as SET or DBSIZE: they were never loaded into its command table.

Initializing the Sentinel state

Once the Sentinel-specific code is in place, the server initializes a sentinel.c/sentinelState structure (referred to below as the "Sentinel state"). This structure holds all state related to Sentinel functionality, while the server's general state is still kept in the redis.h/redisServer structure:

struct sentinelState {
    // Current epoch, used for failover
    uint64_t current_epoch;

    // All master servers monitored by this Sentinel
    // Dictionary key: the name of the master server
    // Dictionary value: a pointer to a sentinelRedisInstance structure
    dict *masters;

    // Whether TILT mode has been entered
    int tilt;

    // Number of scripts currently executing
    int running_scripts;

    // Time at which TILT mode was entered
    mstime_t tilt_start_time;

    // Time at which the timer handler last ran
    mstime_t previous_time;

    // FIFO queue of all user scripts waiting to be executed
    list *scripts_queue;
} sentinel;

Initializing the masters attribute of the Sentinel state

The masters dictionary in the Sentinel state records information about every master server this Sentinel monitors. Each key is the name of a monitored master, and each value is the sentinel.c/sentinelRedisInstance structure for that master.

typedef struct sentinelRedisInstance {
    // Flags recording the type of the instance and its current state
    int flags;

    // Name of the instance
    // A master's name is set by the user in the configuration file
    // Names of slaves and Sentinels are set automatically by Sentinel
    // in the format ip:port, for example "127.0.0.1:26379"
    char *name;

    // Run ID of the instance
    char *runid;

    // All slave servers under this master server
    dict *slaves;

    // Configuration epoch, used for failover
    uint64_t config_epoch;

    // Address of the instance
    sentinelAddr *addr;

    // Value set by the SENTINEL down-after-milliseconds option
    // Milliseconds without a response after which the instance is judged
    // subjectively down
    mstime_t down_after_period;

    // quorum parameter of SENTINEL monitor <master-name> <IP> <port> <quorum>
    // Number of votes required to judge this instance objectively down
    int quorum;

    // Value of the SENTINEL parallel-syncs <master-name> <number> option
    // Number of slaves that may sync with the new master at the same time
    // during a failover
    int parallel_syncs;

    // Value of the SENTINEL failover-timeout <master-name> <ms> option
    // Maximum time allowed for refreshing the failover state
    mstime_t failover_timeout;
    // ...
} sentinelRedisInstance;

Initializing the Sentinel state triggers initialization of the masters dictionary, which is populated from the loaded sentinel.conf configuration file.
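As a rough illustration of how a masters dictionary could be built from configuration text, here is a minimal Python sketch. MasterConfig and load_masters are hypothetical names for this example, not Redis's actual C implementation, and the parser only handles the simple "sentinel <option> <master-name> <value>" lines shown earlier.

```python
from dataclasses import dataclass


@dataclass
class MasterConfig:
    name: str
    ip: str
    port: int
    quorum: int
    # Per-master options, filled in from later "sentinel <option> <name> <value>" lines.
    options: dict


def load_masters(conf_text: str) -> dict:
    """Build a name -> MasterConfig dictionary from sentinel.conf text."""
    masters = {}
    for line in conf_text.splitlines():
        parts = line.split()
        if len(parts) < 4 or parts[0] != "sentinel":
            continue  # comments, blank lines, and non-sentinel options
        option, name = parts[1], parts[2]
        if option == "monitor" and len(parts) == 6:
            masters[name] = MasterConfig(name, parts[3], int(parts[4]), int(parts[5]), {})
        elif name in masters:
            masters[name].options[option] = parts[3]
    return masters


conf = """
sentinel monitor mymaster 192.168.1.161 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel monitor resque 192.168.1.162 6380 4
"""
masters = load_masters(conf)
print(sorted(masters))             # ['mymaster', 'resque']
print(masters["mymaster"].quorum)  # 2
```

Note that the dictionary contains only masters at this point, which matches the observation that slaves and other Sentinels are discovered later.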

A Sentinel's configuration file only lists master servers. As a result, the freshly initialized Sentinel state contains no information about the slave servers of those masters, or about the other Sentinels monitoring them. So when, and how, is the information about slaves and other Sentinels loaded? The following sections explain.

Creating network connections to the master server

The final step of initialization is to create network connections to each monitored master server. Sentinel becomes a client of the master, so it can send commands to it and read information from the replies. For each master, Sentinel creates two asynchronous network connections:

  • A command connection, used only to send commands to the master and receive the replies.
  • A subscription connection, used only to subscribe to the master's __sentinel__:hello channel.

In fact, Sentinel creates a command connection and a subscription connection not only to each master but also to every slave server. It also creates a command connection to each other Sentinel, but no subscription connection.
Note that Sentinel establishes its connections to masters, slaves, and other Sentinels at different times.

Question 1: Why does Sentinel create two network connections to the master server?
Answer: On the one hand, Redis's publish/subscribe feature does not store messages on the server. If a client that wants a message is offline or disconnected when the message is sent, that client loses it. So, to avoid missing any message on the __sentinel__:hello channel, Sentinel needs a dedicated subscription connection for receiving from that channel. On the other hand, besides subscribing to the channel, Sentinel must also send commands to the master in order to communicate with it, so it needs a command connection as well.
Because Sentinel has to hold network connections to many instances, it uses asynchronous connections.

Question 2: Why don't Sentinels create subscription connections to each other?
Answer: When connecting to a master or slave, Sentinel creates both a command connection and a subscription connection. When connecting to another Sentinel, it creates only a command connection. Sentinel discovers previously unknown Sentinels through the channel messages it receives from masters and slaves, which is why it needs subscription connections to them. Sentinels that already know about each other only need a command connection to communicate.

Sentinel timer tasks

Looking at the code in main() when Sentinel starts, the main flow only performs initialization. So when are the command and subscription connections established? The answer lies in serverCron, Redis's timer task.
Each time serverCron runs in a Sentinel, it calls the sentinelTimer() function. This function establishes connections, sends periodic heartbeats, and collects information. Its main jobs are:

  • Establish the command connection and subscription connection. Once the subscription connection is up, subscribe to the Redis server's __sentinel__:hello channel.
  • On the command connection, send an INFO command every 10 s to collect information; send a PING command every 1 s to check liveness; and publish a message every 2 s containing the Sentinel's IP, the Sentinel's port, the Sentinel's ID (a 40-byte random string), the current epoch (used for elections and master-slave switching), the master's name, the master's IP, the master's port, and the master's configuration epoch (used for elections and master-slave switching).
  • Check whether an instance is subjectively down.
  • Check whether an instance is objectively down and a master-slave switch is needed.
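The three periods in the list above can be modeled with a small scheduling sketch. This is not how sentinelTimer() is written in C; PERIODS_MS and due_tasks are hypothetical names that only illustrate the idea that each timer tick checks which periodic command is due for an instance.

```python
# Periods taken from the list above: INFO every 10 s, PING every 1 s, hello every 2 s.
PERIODS_MS = {"INFO": 10_000, "PING": 1_000, "HELLO": 2_000}


def due_tasks(now_ms: int, last_sent_ms: dict) -> list:
    """Return the commands due at now_ms and record that they were sent."""
    due = []
    for task, period in PERIODS_MS.items():
        last = last_sent_ms.get(task)
        if last is None or now_ms - last >= period:
            due.append(task)
            last_sent_ms[task] = now_ms
    return due


last_sent = {}
for tick in (0, 1_000, 2_000):
    print(tick, due_tasks(tick, last_sent))
```

On the first tick nothing has been sent yet, so all three commands are due; after that, PING fires on every 1 s tick and the hello message on every second one.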

Sending INFO commands

By default, Sentinel sends an INFO command to each monitored master over the command connection every ten seconds, and learns the master's current state by parsing the reply. The reply yields two kinds of information:

  • Information about the master itself.
    Sentinel updates the master's instance structure according to the information the master returns.
  • Information about all the slave servers under the master.
    The returned slave information is used to update the slaves dictionary in the master's instance structure, which records the master's slave servers.
    When Sentinel finds that a new slave has appeared under the master, it creates an instance structure for the new slave and also creates a command connection and a subscription connection to it.
    Once the command connection exists, Sentinel by default sends an INFO command to the slave every 10 s as well, and uses the reply (the slave's run ID, i.e. run_id; the slave's role; the master's IP and port; the slave's priority; the slave's replication offset; and so on) to update the slave's instance structure.
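To make the discovery of slaves more concrete, the following sketch parses the replication part of an INFO reply. The sample text follows the slaveN:ip=...,port=... line format that Redis prints for a master; parse_info_replication is a hypothetical helper, and a real Sentinel reads more fields than shown here.

```python
def parse_info_replication(info_text: str):
    """Return (role, [(ip, port), ...]) extracted from an INFO reply."""
    role, slaves = None, []
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue  # section headers and blank lines
        key, value = line.split(":", 1)
        if key == "role":
            role = value
        elif key.startswith("slave") and key[5:].isdigit():
            # e.g. slave0:ip=127.0.0.1,port=11111,state=online,offset=43,lag=0
            fields = dict(item.split("=", 1) for item in value.split(","))
            slaves.append((fields["ip"], int(fields["port"])))
    return role, slaves


sample = """# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=11111,state=online,offset=43,lag=0
slave1:ip=127.0.0.1,port=22222,state=online,offset=43,lag=0
"""
print(parse_info_replication(sample))
```

Each (ip, port) pair that is not yet in the master's slaves dictionary would correspond to a newly discovered slave.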

Sending __sentinel__:hello messages

By default, Sentinel publishes a message to the __sentinel__:hello channel of every monitored master and slave once every two seconds, through the command connection.

Sentinel's subscription to the __sentinel__:hello channel lasts until the Sentinel disconnects from the server.

In other words, for each server it is connected to, Sentinel both sends messages to the server's __sentinel__:hello channel through the command connection and receives messages from that channel through the subscription connection.

For multiple Sentinels monitoring the same server, a message sent by one Sentinel is received by the others. The message is used to update what the other Sentinels know about the sender, and also what they know about the monitored server.
Put another way, when a Sentinel receives a message from the __sentinel__:hello channel, it parses the message, extracts eight parameters such as the Sentinel IP address, Sentinel port, and Sentinel run ID, and performs the following checks:

  • If the Sentinel run ID in the message equals the receiving Sentinel's own run ID, the message was sent by the receiver itself. Sentinel discards it without further processing.
  • Otherwise, the message was sent by another Sentinel monitoring the same server, and the receiving Sentinel updates the instance structure of the corresponding master according to the parameters in the message.
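The two checks above can be sketched as follows. The example assumes the hello payload is a comma-separated string carrying the eight fields in the order listed in the timer-task section (Sentinel IP, port, run ID, current epoch, then master name, IP, port, and configuration epoch); Hello and parse_hello are hypothetical names used only for illustration.

```python
from typing import NamedTuple, Optional


class Hello(NamedTuple):
    sentinel_ip: str
    sentinel_port: int
    sentinel_runid: str
    current_epoch: int
    master_name: str
    master_ip: str
    master_port: int
    master_config_epoch: int


def parse_hello(payload: str, my_runid: str) -> Optional[Hello]:
    """Parse a hello payload; return None for malformed or self-sent messages."""
    parts = payload.split(",")
    if len(parts) != 8:
        return None
    hello = Hello(parts[0], int(parts[1]), parts[2], int(parts[3]),
                  parts[4], parts[5], int(parts[6]), int(parts[7]))
    if hello.sentinel_runid == my_runid:
        return None  # our own message: discard without further processing
    return hello


msg = "127.0.0.1,26379,aaaa,0,mymaster,127.0.0.1,6379,0"
print(parse_hello(msg, my_runid="aaaa"))  # None
print(parse_hello(msg, my_runid="bbbb").master_name)  # mymaster
```

A non-None result is what the receiving Sentinel would use to update its view of the sender and of the master.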

Note that the sentinels dictionary in the instance structure Sentinel creates for a master holds information about every other Sentinel that also monitors this master, excluding the Sentinel itself.

When a Sentinel discovers a new Sentinel through channel messages, it creates an instance structure for the new Sentinel in the sentinels dictionary and also creates a command connection to it. The new Sentinel likewise creates a command connection back to this Sentinel. In the end, the Sentinels monitoring the same master form a fully interconnected network.

Sending PING commands

Detecting the subjectively down state
By default, Sentinel sends a PING command once per second to every instance it has a command connection with (masters, slaves, and other Sentinels), and uses the reply to judge whether the instance is online.

The down-after-milliseconds option in the Sentinel configuration file specifies how long it takes for Sentinel to judge an instance subjectively down. If an instance keeps returning invalid replies to Sentinel for down-after-milliseconds milliseconds, Sentinel modifies the instance's structure and turns on the SRI_S_DOWN flag in its flags attribute, indicating that the instance is now subjectively down.
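A minimal sketch of this check is shown below. The set of reply prefixes treated as valid (PONG, LOADING, MASTERDOWN) is an assumption based on how Sentinel is commonly described, and record_reply and is_subjectively_down are hypothetical helpers rather than Redis functions.

```python
# Assumption: replies starting with these prefixes count as valid; anything else is invalid.
VALID_PREFIXES = ("PONG", "LOADING", "MASTERDOWN")


def record_reply(reply: str, now_ms: int, last_valid_ms: int) -> int:
    """Return the updated time of the last valid PING reply."""
    return now_ms if reply.startswith(VALID_PREFIXES) else last_valid_ms


def is_subjectively_down(now_ms: int, last_valid_ms: int, down_after_ms: int) -> bool:
    """True once no valid reply has arrived for longer than down_after_ms."""
    return now_ms - last_valid_ms > down_after_ms


last_valid = record_reply("PONG", now_ms=1_000, last_valid_ms=0)
last_valid = record_reply("ERR unknown", now_ms=2_000, last_valid_ms=last_valid)
print(is_subjectively_down(5_000, last_valid, down_after_ms=5_000))  # False
print(is_subjectively_down(6_001, last_valid, down_after_ms=5_000))  # True
```

Tracking only the time of the last valid reply is enough here: an instance that answers with errors looks the same as one that does not answer at all.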

Note 1: Scope of the down-after-milliseconds option
The value the user sets for down-after-milliseconds is used by Sentinel not only to judge whether the master is subjectively down, but also to judge all the slaves under that master and all the other Sentinels that monitor the same master.
Note 2: Different Sentinels may use different down-after-milliseconds values
Sentinels monitoring the same master may each be configured with a different down-after-milliseconds value. So when one Sentinel judges the master subjectively down, other Sentinels may still consider it online.

Detecting the objectively down state
After a Sentinel judges a master subjectively down, it asks the other Sentinels that monitor the same master whether they also consider it offline (either subjectively or objectively down), in order to confirm that the master is really offline. Once it has received enough offline judgments from the other Sentinels, the Sentinel marks the master objectively down and starts a failover for it.

Note 1: Condition for judging a master objectively down
When the number of Sentinels that consider the master offline reaches the quorum value in a Sentinel's configuration, that Sentinel considers the master objectively down.
Note 2: Different Sentinels may use different conditions for objectively down
Sentinels monitoring the same master may be configured with different conditions for judging it objectively down. When one Sentinel judges the master objectively down, the others may not.
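The objectively-down decision can be sketched as a simple vote count. The example assumes the asking Sentinel counts its own subjective judgment as one vote; is_objectively_down is a hypothetical helper, and the run IDs are made up for illustration.

```python
def is_objectively_down(self_sdown: bool, peer_reports: dict, quorum: int) -> bool:
    """peer_reports maps another Sentinel's run ID to whether it reported the master down."""
    if not self_sdown:
        return False  # only a Sentinel that already sees S_DOWN asks its peers
    # Assumption: the Sentinel's own judgment counts as one vote.
    votes = 1 + sum(1 for down in peer_reports.values() if down)
    return votes >= quorum


reports = {"sentinel-b": True, "sentinel-c": False}
print(is_objectively_down(True, reports, quorum=2))  # True
print(is_objectively_down(True, reports, quorum=3))  # False
```

The two calls use the same reports but different quorum values, which is how two Sentinels can disagree about whether the same master is objectively down.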

Electing the leader Sentinel

When a master is judged objectively down, the Sentinels monitoring it negotiate to elect a leader Sentinel, and the leader performs the failover for the offline master.
The rules and procedure for electing the leader Sentinel are:

  • Every online Sentinel is eligible to be elected leader. In other words, any of the online Sentinels monitoring the same master may become the leader.
  • After each leader election, whether or not it succeeds, the configuration epoch of every Sentinel is incremented by one. The configuration epoch is simply a counter.
  • Within one configuration epoch, every Sentinel has one chance to set some Sentinel as its local leader, and once set, the local leader cannot be changed during that epoch.
  • Every Sentinel that finds the master objectively down asks the other Sentinels to set it as their local leader.
  • When one Sentinel (the source) sends the command to another Sentinel (the target) and includes its own run ID, the source is asking the target to set the source as the target's local leader.
  • Local leaders are set on a first-come, first-served basis: the first source Sentinel whose request reaches the target becomes the target's local leader, and the target rejects all requests that arrive later.
  • After receiving the command, the target replies to the source. The leader_runid and leader_epoch parameters in the reply record the run ID and configuration epoch of the target's local leader.
  • When the source receives the reply, it checks whether leader_epoch equals its own configuration epoch. If so, it then reads leader_runid; if that value equals the source's own run ID, the target has set the source as its local leader.
  • If a Sentinel is set as local leader by more than half of the Sentinels, it becomes the leader Sentinel.
  • Because becoming leader requires the support of more than half of the Sentinels, and each Sentinel can set only one local leader per configuration epoch, there can be only one leader Sentinel in a given configuration epoch.
  • If no Sentinel is elected leader within the given time limit, the Sentinels hold another election after a while, until a leader is chosen.
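The first-come, first-served voting described above can be simulated in a few lines. This is a simplified model, not Sentinel's implementation: SentinelVoter and run_election are hypothetical names, requests are plain method calls instead of network commands, and only the more-than-half rule from the list is checked.

```python
class SentinelVoter:
    """Tracks which Sentinel this voter chose as local leader in the latest epoch."""

    def __init__(self, runid: str):
        self.runid = runid
        self.leader_runid = None
        self.leader_epoch = 0

    def request_vote(self, candidate_runid: str, epoch: int):
        # First request seen in a new epoch wins; later ones get the existing choice back.
        if epoch > self.leader_epoch:
            self.leader_epoch = epoch
            self.leader_runid = candidate_runid
        return self.leader_runid, self.leader_epoch


def run_election(candidate_runid: str, epoch: int, voters: list) -> bool:
    """Ask every voter (including the candidate itself) and count matching replies."""
    votes = 0
    for voter in voters:
        leader_runid, leader_epoch = voter.request_vote(candidate_runid, epoch)
        if leader_epoch == epoch and leader_runid == candidate_runid:
            votes += 1
    return votes > len(voters) // 2  # more than half of all Sentinels


a, b, c = SentinelVoter("A"), SentinelVoter("B"), SentinelVoter("C")
c.request_vote("B", epoch=1)                 # B's request reached C first
print(run_election("A", 1, [a, b, c]))       # True: A and B voted for A
print(run_election("B", 1, [a, b, c]))       # False: only C voted for B
```

Because each voter keeps its first choice for the whole epoch, at most one candidate can gather more than half of the votes.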

Master-slave switch

Once a leader Sentinel has been elected, it performs the failover for the offline master.

Failover

A failover consists of the following three steps:

  • Pick one of the slaves under the offline master and convert it into a master.
  • Make all the other slaves under the offline master replicate the new master.
  • Set the offline master as a slave of the new master, so that when the old master comes back online it becomes a slave of the new master.

Choosing a new master server

The leader Sentinel saves all the slaves of the offline master in a list, and then filters the list step by step according to the following rules:

  • Remove all slaves that are offline or disconnected, so that the remaining slaves are all online.
  • Remove all slaves that have not replied to the leader Sentinel's INFO command within the last five seconds, so that the remaining slaves have all communicated successfully recently.
  • Remove all slaves whose connection to the offline master had been down for more than down-after-milliseconds * 10 milliseconds. The down-after-milliseconds option specifies how long it takes to judge the master offline, so removing slaves disconnected for longer than down-after-milliseconds * 10 milliseconds ensures that none of the remaining slaves lost contact with the master too early. In other words, the data held by the remaining slaves is relatively fresh.
  • The leader Sentinel then sorts the remaining slaves by priority and selects the one with the highest priority. If several slaves share the highest priority, it selects the one with the largest replication offset. If several slaves share both the highest priority and the largest replication offset, it selects the one with the smallest run ID.
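The filtering and sorting rules above can be expressed as a short sketch. The Replica fields and select_new_master are hypothetical names for this example. The sketch also assumes that a smaller priority number means a higher priority, as with Redis's priority setting for slaves, and it compares run IDs as plain strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Replica:
    runid: str
    priority: int              # assumption: smaller number = higher priority
    repl_offset: int
    online: bool
    ms_since_info_reply: int   # time since the last INFO reply to the leader
    master_link_down_ms: int   # how long the link to the old master was down


def select_new_master(replicas: list, down_after_ms: int) -> Optional[Replica]:
    """Apply the three filters, then pick by priority, offset, and run ID."""
    candidates = [
        r for r in replicas
        if r.online
        and r.ms_since_info_reply <= 5_000
        and r.master_link_down_ms <= down_after_ms * 10
    ]
    if not candidates:
        return None
    # Highest priority first, then largest offset, then smallest run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.runid))


replicas = [
    Replica("c3", 100, 900, True, 1_000, 0),
    Replica("a1", 100, 950, True, 1_000, 0),
    Replica("b2", 100, 950, True, 1_000, 0),
    Replica("d4", 10, 999, False, 1_000, 0),   # offline, filtered out
]
print(select_new_master(replicas, down_after_ms=5_000).runid)  # a1
```

In the sample, d4 has the best priority but is offline, so the tie between a1 and b2 on offset is broken by the smaller run ID.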

Changing the replication target of the slaves

Once the new master exists, the leader Sentinel's next step is to make all the slaves of the offline master replicate the new master. This is done by sending them the SLAVEOF [server_ip] [server_port] command.

Turning the old master into a slave

The last thing a failover does is set the offline master as a slave of the new master.
Because the old master is offline, this setting is saved in the instance structure that corresponds to the old master. When the old master comes back online, Sentinel sends it a SLAVEOF command to make it a slave of the new master.

Further reading

A detailed explanation of Redis

Original site

Copyright notice
This article was written by [Just put a flower in heaven and earth]. Please include the original link when reposting. Thanks.
https://yzsam.com/2022/177/202206252118150843.html