Redis Core Technology and Practice: Learning Notes (VII), the Sentinel Mechanism

2022-07-03 17:51:00 【Tom Kong】

One. The master is down: how do we provide uninterrupted service?

When the master goes down, a new master has to take over: a replica is promoted to master. This raises three questions:

Is the master really down?

Which replica should be chosen as the new master?

How are the other replicas and the clients told about the new master?

In a Redis master-replica cluster, the sentinel is the key mechanism for switching between master and replica automatically. It answers the three failover questions above for the master-replica replication mode.

Two. The basic workflow of the sentinel mechanism

A sentinel is a Redis process running in a special mode. It runs alongside the master and replica instances.

The sentinel is responsible for three tasks: monitoring, master selection, and notification.

Monitoring

Monitoring means that the sentinel process, while running, periodically sends a PING command to every master and replica to check whether it is still online.

If a replica does not respond to the sentinel's PING within the specified time, the sentinel marks it as "offline".

If the master does not respond to the sentinel's PING within the specified time, the sentinel judges the master to be offline and starts the master selection process.
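
The monitoring rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Sentinel's actual implementation: the Instance type, its last_pong_ms field, the check_instance function, and the returned action strings are all hypothetical names, and timeout_ms stands in for the "specified time".

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    role: str          # "master" or "replica"
    last_pong_ms: int  # hypothetical: time of the last valid PING reply


def check_instance(inst: Instance, now_ms: int, timeout_ms: int) -> str:
    """Return the action a sentinel takes for one instance after a PING round."""
    if now_ms - inst.last_pong_ms <= timeout_ms:
        return "online"
    if inst.role == "replica":
        # A silent replica is only marked offline.
        return "mark-offline"
    # A silent master triggers the master selection process.
    return "start-failover"
```

With a timeout of 5000 ms, an instance whose last reply is 6000 ms old is treated as down, and what happens next depends only on its role.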

Master selection

After the master goes down, the sentinel selects one of the replicas as the new master according to a set of rules.

Notification

Notification: the sentinel sends the new master's connection information to the other replicas so that they run the replicaof command, connect to the new master, and replicate its data. At the same time, the sentinel notifies clients of the new master's connection information so that they send their requests to the new master.

The notification task is simple: the sentinel only has to send the new master's information to the replicas and the clients so that they connect to it. No decision logic is involved.
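
As a rough illustration of how little logic notification needs, the sketch below builds the REPLICAOF command that each remaining replica would be asked to run. The function name notification_commands and the (host, port) tuples are assumptions made for this example. It only formats command strings and does not talk to Redis.

```python
def notification_commands(new_master, replicas):
    """Map each remaining replica to the REPLICAOF command it should execute.

    new_master and every entry of replicas are (host, port) tuples.
    The promoted replica itself is skipped.
    """
    host, port = new_master
    return {
        replica: f"REPLICAOF {host} {port}"
        for replica in replicas
        if replica != new_master
    }
```

For example, promoting ("10.0.0.2", 6379) out of two replicas yields a single command, REPLICAOF 10.0.0.2 6379, for the other replica.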

Monitoring and master selection, by contrast, require the sentinel to make decisions:

In the monitoring task, the sentinel has to decide whether the master is offline.

In the master selection task, the sentinel has to decide which replica becomes the new master.

Three. Subjective offline and objective offline

The sentinel process uses the PING command to probe its connection to the master and the replicas, and uses the result to judge each instance's state.

If the sentinel finds that the PING response from the master or a replica has timed out, it marks that instance as "subjectively offline".

For a replica, the sentinel can simply mark it "subjectively offline", because a replica going offline has little impact and the cluster keeps serving external requests.

For the master, the sentinel cannot simply mark it "subjectively offline" and start a master-replica switch. The sentinel may have misjudged: if the master has not actually failed, the master selection and notification that follow add needless computing and communication overhead, and may also produce split brain.

What is a misjudgment?

A misjudgment means that the master is not actually offline but the sentinel thinks it is. Typical causes: the cluster network is under heavy load, the network is congested, or the master itself is under heavy load.

Once the sentinel judges the master to be offline, it starts selecting a new master and has the replicas synchronize data with it, and this process has a cost.

Selecting a new master also takes time, so misjudgments should be kept to a minimum.

The minority yields to the majority

Sentinel cluster: sentinels are deployed as a cluster of multiple instances. Having several sentinel instances judge together prevents a single sentinel with a poor network connection from misjudging the master as offline. The probability that the networks of several sentinels are all unstable at the same time is small, so a joint decision lowers the misjudgment rate.

Only when a majority of sentinel instances have judged the master "subjectively offline" is the master marked "objectively offline" (the minority yields to the majority).

"Objectively offline": with N sentinel instances, at least N/2+1 of them (using integer division) should judge the master "subjectively offline" before it is declared "objectively offline". In Redis the exact threshold is the quorum value configured for the sentinels.

This lowers the probability of misjudgment and avoids unnecessary master-replica switches caused by it.
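
The N/2+1 rule is easy to check with a small function. This is a minimal sketch assuming each sentinel reports a plain boolean vote. The name is_objectively_down is hypothetical, and a real deployment would use the configured quorum instead of a fixed majority.

```python
def is_objectively_down(votes, total_sentinels):
    """Apply the majority rule to per-sentinel 'subjectively offline' votes.

    votes is an iterable of booleans, one per sentinel that answered.
    True means that sentinel considers the master subjectively offline.
    """
    needed = total_sentinels // 2 + 1  # N/2 + 1 with integer division
    return sum(votes) >= needed
```

With 3 sentinels, 2 votes are needed. With 4 sentinels, 3 are needed, so a 2-to-2 split does not mark the master objectively offline.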

Four. Selecting the new master

Filtering + scoring

Filtering:

Check each replica's current online status and also judge its earlier network connection history.

If a replica keeps getting disconnected from the master and the number of disconnections exceeds a threshold, that replica's network is in poor condition.

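How to judge:

Use the configuration item down-after-milliseconds * 10. down-after-milliseconds is the maximum connection timeout between master and replica: if the two have had no contact within that many milliseconds, they count as disconnected.

If the number of disconnections exceeds 10, the replica's network is considered poor and the replica is not suitable as the new master. (The Redis Sentinel documentation states this rule as a duration rather than a count: a replica is skipped if its link to the master has been down for more than ten times down-after-milliseconds, plus the time the master has been unavailable from the point of view of the sentinel doing the failover.)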
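
The filtering step as described in these notes could look like the sketch below. It assumes the sentinel already tracks an online flag and a disconnection count per replica. Replica, MAX_DISCONNECTS, and filter_candidates are illustrative names, not Redis identifiers.

```python
from dataclasses import dataclass

MAX_DISCONNECTS = 10  # threshold taken from the notes above


@dataclass
class Replica:
    run_id: str
    online: bool
    disconnect_count: int  # hypothetical counter kept by the sentinel


def filter_candidates(replicas):
    """Keep only replicas that are online and have a stable link history."""
    return [
        r for r in replicas
        if r.online and r.disconnect_count <= MAX_DISCONNECTS
    ]
```

Only the replicas that survive this filter go on to the scoring rounds.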

Scoring the replicas:

Scoring runs for up to three rounds under three rules: replica priority, replica replication progress, and replica ID number. As soon as one replica scores highest in a round, it becomes the new master and the selection ends. If no single replica scores highest, the next round begins.

Round one: the replica with the highest priority scores highest

The slave-priority configuration item lets you give different replicas different priorities. For example, if two replicas have different amounts of memory, you can manually give the instance with more memory a higher priority. (In Redis a lower slave-priority value means a higher priority, and newer versions name the item replica-priority.) During selection the sentinel gives the highest score to the replica with the highest priority and picks it as the new master. If the replicas tie on priority, the second round begins.

Round two: the replica whose data is most in sync with the old master scores highest

If the replica closest to the old master is chosen as the new master, the new master holds the most recent data.

How is a replica's synchronization progress with the old master judged?

The replica whose slave_repl_offset is closest to the old master's master_repl_offset scores highest and can become the new master.

If two replicas have the same slave_repl_offset, the third round begins.

Round three: the replica with the smaller ID number scores higher

Every instance has an ID. Redis currently applies a default rule when selecting the new master: when priority and replication progress are both equal, the replica with the smallest ID number scores highest and is selected as the new master. Master selection is complete!
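
The three rounds can be written as successive tie-breakers. The sketch below is an illustration rather than Sentinel's source code: Candidate and pick_new_master are hypothetical names, IDs are compared as plain strings, and the special slave-priority value 0 (never promote) is deliberately not handled.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    run_id: str
    slave_priority: int  # in Redis, a lower value means a higher priority
    repl_offset: int     # the replica's slave_repl_offset


def pick_new_master(candidates, master_repl_offset):
    """Run the three scoring rounds and return the winning candidate."""
    if not candidates:
        raise ValueError("no candidate replicas left after filtering")

    # Round 1: highest priority, i.e. the lowest slave_priority value.
    best_priority = min(c.slave_priority for c in candidates)
    pool = [c for c in candidates if c.slave_priority == best_priority]
    if len(pool) == 1:
        return pool[0]

    # Round 2: replication offset closest to the old master's offset.
    best_gap = min(abs(master_repl_offset - c.repl_offset) for c in pool)
    pool = [c for c in pool if abs(master_repl_offset - c.repl_offset) == best_gap]
    if len(pool) == 1:
        return pool[0]

    # Round 3: smallest ID wins.
    return min(pool, key=lambda c: c.run_id)
```

Each round only runs when the previous one ended in a tie, which mirrors the "go on to the next round" rule in the text.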

After-class question:

The sentinel mechanism switches between master and replica automatically, which is the key to providing uninterrupted service. The switch takes some time, though. Can clients still perform their requests normally while it is in progress, and how can the switch be made imperceptible to the application?

If the client uses read/write separation, read requests can still be served by the replicas and are not affected. But the master is down and the sentinel has not yet chosen a new one, so write requests fail during this window. Failure time = time for the sentinel to switch master and replica + time for the client to become aware of the new master.

Option 1:

The client caches failed write requests, or writes them to message queue middleware, and executes them after the sentinel has finished the switch. This only suits services that are not sensitive to the return value of write requests, and the business layer has to be adapted for it. If the switch takes too long, too many write requests pile up in the client or the message queue, and replaying them after the switch takes a long time.
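
A client-side version of this idea might look like the sketch below. It is a simplified illustration under several assumptions: WriteBuffer is a hypothetical class, send is any callable that raises ConnectionError while the master is unreachable, and max_pending is an arbitrary cap added so that a long failover cannot grow the buffer without bound.

```python
from collections import deque


class WriteBuffer:
    """Queue writes that fail during failover and replay them afterwards."""

    def __init__(self, max_pending=10_000):
        self.pending = deque()
        self.max_pending = max_pending

    def write(self, send, command):
        """Try to send a command. On connection failure, keep it for later."""
        try:
            send(command)
            return True
        except ConnectionError:
            if len(self.pending) >= self.max_pending:
                raise  # buffer is full, so surface the failure to the caller
            self.pending.append(command)
            return False

    def replay(self, send):
        """Resend buffered commands in order and return how many were sent."""
        replayed = 0
        while self.pending:
            send(self.pending[0])  # if this raises, the command stays queued
            self.pending.popleft()
            replayed += 1
        return replayed
```

The caller never sees the real result of a buffered write, which is why this approach only fits services that do not depend on the return value.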

Option 2:

How long the master must be unresponsive before the sentinel starts a switch is configurable with the down-after-milliseconds parameter.

The shorter the configured time, the more sensitive the sentinel is, which may lead to misjudgments; but when the master really fails, the switch happens promptly and the impact on the business is smallest. The longer the configured time, the more conservative the sentinel is, which lowers the probability of misjudgment; but when the master fails, business writes also fail for longer.

Option 3:

The sentinel notifies the client so that the client learns of the master change promptly, writes the cached write requests to the new master, and subsequent requests are not affected.

After the sentinel has promoted a replica to the new master, it writes the new master's address to the pubsub of its own instance. Clients subscribe to this pubsub channel, and when data arrives, the latest master address is pushed to the client, which then sends its write requests to the new master. In this mechanism the sentinel actively notifies the client.

If the client misses the pushed notification for some reason, it also needs to actively fetch the latest master address from the sentinel.
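
In Redis Sentinel the master change is published on the +switch-master channel, with a payload of the form "master-name old-ip old-port new-ip new-port". The sketch below parses such a payload. MasterSwitch and parse_switch_master are hypothetical helper names, and subscribing to the channel is left to whatever client library is in use.

```python
from typing import NamedTuple, Tuple


class MasterSwitch(NamedTuple):
    name: str
    old_addr: Tuple[str, int]
    new_addr: Tuple[str, int]


def parse_switch_master(payload: str) -> MasterSwitch:
    """Split a +switch-master payload into the old and new master addresses."""
    name, old_ip, old_port, new_ip, new_port = payload.split()
    return MasterSwitch(name, (old_ip, int(old_port)), (new_ip, int(new_port)))
```

A client would call this from its subscription callback and point its write connection at new_addr.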

When a client accesses the master and replicas, it must not hard-code their addresses. It has to get the latest address from the sentinel cluster (the sentinel get-master-addr-by-name command), so that after a switch the client can obtain the latest instance address from the sentinels.
Redis SDKs generally provide a way to look up the instance address through the sentinels and then access the instance, so we can use that directly instead of implementing the logic ourselves. With plain master-replica instances, the client has to cooperate with the sentinels; in sharded cluster mode, all of this logic can live in a proxy layer, so the client does not need to care about it.
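
The lookup that an SDK performs can be pictured roughly as follows. This is a sketch, not any particular SDK's code: resolve_master is a hypothetical name, and query is an assumed callable that sends SENTINEL get-master-addr-by-name to one sentinel and returns the [ip, port] reply, or None if that sentinel does not know the master.

```python
def resolve_master(sentinels, master_name, query):
    """Ask each sentinel in turn and return (host, port) from the first answer."""
    for sentinel in sentinels:
        try:
            reply = query(sentinel, master_name)
        except OSError:
            continue  # this sentinel is unreachable, so try the next one
        if reply:
            host, port = reply
            return host, int(port)
    raise LookupError(f"no sentinel returned an address for {master_name!r}")
```

Calling this again after a connection error lets the client pick up the new master even if it missed the pushed notification.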

Original site

Copyright notice
This article was written by [Tom Kong]. Please include the original link when reposting. Thanks.
https://yzsam.com/2022/02/202202150325488893.html