Redis core technology and practice - learning notes (VII) sentinel mechanism
2022-07-03 17:51:00 【Tom Kong】
One. The master is down: how do we keep serving without interruption?
When the master goes down, a new master has to take over: one of the replicas is promoted to master. This raises three questions:
Has the master really gone down?
Which replica should be chosen as the new master?
How are the replicas and the clients told about the new master?
In a Redis master-replica cluster, the sentinel is the key mechanism for automatic master-replica switching. It addresses the three failover questions above under the master-replica replication mode.
Two. The basic workflow of the sentinel mechanism
A sentinel is a Redis process running in a special mode; it runs alongside the master and replica instances.
The sentinel is responsible for three tasks: monitoring, master selection, and notification.
Monitoring
While it is running, the sentinel process periodically sends the PING command to all masters and replicas to check whether they are still online.
If a replica does not respond to the sentinel's PING within the specified time, the sentinel marks it as "offline";
If the master does not respond to the sentinel's PING within the specified time, the sentinel judges the master to be offline and starts the master-switching process.
Master selection
After the master goes down, the sentinel picks one replica as the new master according to certain rules.
Notification
The sentinel sends the new master's connection information to the other replicas so that they run the replicaof command, connect to the new master, and replicate its data. At the same time, the sentinel notifies the clients of the new master's connection information so that they send their requests to the new master.
The notification task is relatively simple: the sentinel only has to pass the new master's information to the replicas and the clients so that they connect to it. No decision logic is involved.
Monitoring and master selection, by contrast, require the sentinel to make decisions:
In the monitoring task, the sentinel must decide whether the master is offline;
In the master-selection task, the sentinel must decide which replica becomes the new master.
Three. Subjective offline and objective offline
The sentinel process uses the PING command to check its network connection to the master and the replicas, and uses the result to judge each instance's status;
If a PING to the master or to a replica times out, the sentinel first marks that instance as "subjectively offline".
For a replica, the sentinel can simply mark it as "subjectively offline", because a replica going offline usually has little impact and the cluster keeps serving requests.
For the master, the sentinel cannot simply mark it as "subjectively offline" and start a master-replica switch. The sentinel may have misjudged: the master may not be faulty at all, and once a switch starts, the subsequent master selection and notification add extra computation and communication overhead. A wrong switch may also cause split brain.
What is a misjudgment?
The master has not actually gone offline, but the sentinel thinks it has. Misjudgments typically happen when the cluster network is under heavy load or congested, or when the master itself is under heavy load.
Once the sentinel judges the master to be offline, it starts selecting a new master and has the replicas synchronize data with it, and this process carries overhead.
Selecting a new master also takes time, so misjudgments need to be kept to a minimum.
The minority yields to the majority
Sentinel cluster: sentinels are usually deployed as a cluster of multiple instances. Having several sentinel instances judge together avoids the case where a single sentinel, because of its own poor network, misjudges the master as offline. It is also unlikely that the networks of several sentinels are all unstable at the same time, so a joint decision lowers the misjudgment rate.
Only when a majority of sentinel instances have judged the master "subjectively offline" is the master marked "objectively offline": the minority yields to the majority.
"Objectively offline": with N sentinel instances, at least N/2+1 of them (integer division) should judge the master "subjectively offline" before it is declared "objectively offline".
This lowers the probability of misjudgment and avoids unnecessary master-replica switches caused by it.
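The two-level judgment above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in these notes, not Redis's actual implementation: the function names and the millisecond timestamps are assumptions made for the example, and in a real deployment the required number of agreeing sentinels comes from the sentinel's configured quorum rather than a fixed N/2+1.

```python
from typing import List


def is_subjectively_down(last_pong_ms: int, now_ms: int, down_after_ms: int) -> bool:
    """One sentinel's own view: no PING reply for longer than the timeout."""
    return now_ms - last_pong_ms > down_after_ms


def majority(n_sentinels: int) -> int:
    """Number of agreeing sentinels required by the N/2+1 rule (integer division)."""
    return n_sentinels // 2 + 1


def is_objectively_down(votes: List[bool]) -> bool:
    """votes[i] is True when sentinel i considers the master subjectively offline."""
    return sum(votes) >= majority(len(votes))


# Three sentinels: two of them have timed out on the master, one still sees it.
votes = [
    is_subjectively_down(last_pong_ms=0, now_ms=40_000, down_after_ms=30_000),
    is_subjectively_down(last_pong_ms=5_000, now_ms=40_000, down_after_ms=30_000),
    is_subjectively_down(last_pong_ms=39_000, now_ms=40_000, down_after_ms=30_000),
]
master_down = is_objectively_down(votes)
```

With three sentinels the threshold is two, so a single sentinel with a bad network cannot trigger a switch on its own.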
Four. Selecting the new master
Filtering + scoring
Filtering:
Check each replica's current online status, and also judge its past network connection status.
If a replica has repeatedly lost its connection to the master and the number of disconnections exceeds a certain threshold, that replica's network is considered poor.
How this is judged:
Use the configuration item down-after-milliseconds * 10. down-after-milliseconds is the maximum connection timeout used to decide that the master-replica connection is broken.
If the number of disconnections exceeds 10, the replica's network is considered poor and the replica is not suitable as the new master.
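As a rough sketch of the filtering step, assuming the sentinel has already collected an online flag and a disconnection count for each replica (the `Replica` type and its field names are hypothetical, chosen only for this example):

```python
from dataclasses import dataclass
from typing import List

# Threshold taken from the notes: more than 10 disconnections means a poor network.
MAX_DISCONNECTS = 10


@dataclass
class Replica:
    name: str
    online: bool           # current online status
    disconnect_count: int  # how often the link to the master was judged broken


def filter_candidates(replicas: List[Replica]) -> List[Replica]:
    """Keep replicas that are online and have not exceeded the disconnection threshold."""
    return [r for r in replicas if r.online and r.disconnect_count <= MAX_DISCONNECTS]


replicas = [
    Replica("replica-1", online=True, disconnect_count=2),
    Replica("replica-2", online=False, disconnect_count=0),
    Replica("replica-3", online=True, disconnect_count=11),
]
candidates = filter_candidates(replicas)
```

Only the replicas that survive this filter go on to the scoring rounds.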
Scoring the replicas:
Up to three rounds of scoring are run under three rules: replica priority, replica replication progress, and replica ID. As soon as one replica scores highest in a round, it becomes the new master and the selection ends. If no single replica scores highest, the next round begins.
Round one: the replica with the highest priority scores highest
The slave-priority configuration item lets you give different replicas different priorities. For example, if two replicas have different memory sizes, you can manually give the one with more memory a higher priority. During selection, the sentinel gives the highest-priority replica the highest score and picks it as the new master. If several replicas share the highest priority, the second round starts.
Round two: the replica most closely synchronized with the old master scores highest
Choosing the replica closest to the old master means the new master holds the most recent data.
How is a replica's synchronization progress with the old master judged?
The replica whose slave_repl_offset is closest to the old master's master_repl_offset scores highest and can become the new master.
If two replicas have the same slave_repl_offset, a third round is needed.
Round three: the replica with the smallest ID scores highest
Every instance has an ID. When Redis selects the new master, it currently applies a default rule: if priority and replication progress are equal, the replica with the smallest ID scores highest and is selected as the new master. Master selection is then complete.
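The three rounds amount to a lexicographic comparison, which can be sketched as a single sort key. The `Candidate` type and the function name are hypothetical. The sketch also assumes that a smaller slave-priority value means a higher priority, which is my understanding of how Redis interprets that setting, and that IDs are compared as strings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    run_id: str       # instance ID, compared as a string (assumption)
    priority: int     # slave-priority value; smaller is assumed to mean higher priority
    repl_offset: int  # the replica's slave_repl_offset


def pick_new_master(candidates: List[Candidate], master_repl_offset: int) -> Candidate:
    """Round 1: priority. Round 2: closest replication offset. Round 3: smallest ID."""
    if not candidates:
        raise ValueError("no replica passed the filtering step")
    return min(
        candidates,
        key=lambda c: (
            c.priority,                               # round one
            abs(master_repl_offset - c.repl_offset),  # round two
            c.run_id,                                 # round three
        ),
    )


pool = [
    Candidate("bbb", priority=100, repl_offset=1000),
    Candidate("aaa", priority=100, repl_offset=990),
    Candidate("abc", priority=100, repl_offset=1000),
]
winner = pick_new_master(pool, master_repl_offset=1000)
```

Here all three candidates tie in round one, "bbb" and "abc" tie in round two, and round three picks "abc" because it has the smaller ID.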
After-class question:
The sentinel mechanism enables automatic switching between master and replica, which is key to uninterrupted service. But the switch takes some time. Can clients keep issuing requests normally during the switch, and how can the switch be made imperceptible to the application?
If clients use read/write separation, read requests can still be served by the replicas and are not affected. But the master is down and the sentinel has not yet chosen a new one, so write requests fail during this window. Failure window = time the sentinel takes to switch master and replica + time the client takes to learn of the new master.
Option 1:
The client caches failed write requests locally or writes them to message-queue middleware, then replays them once the sentinel has finished the switch. This only suits services that are insensitive to the return value of write requests, and the business layer has to be adapted for it. If the switch takes too long, the client or the message queue accumulates too many buffered writes, and replaying them after the switch takes too long.
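A minimal in-memory sketch of Option 1 might look like the following. It is not tied to any real Redis client: `send` is a hypothetical callable supplied by the caller that raises ConnectionError while the master is unavailable, and `max_pending` is an assumed cap that guards against the unbounded backlog mentioned above.

```python
from collections import deque
from typing import Callable, Deque


class BufferedWriter:
    """Buffers writes that fail during a switch and replays them afterwards."""

    def __init__(self, send: Callable[[str], None], max_pending: int = 1000) -> None:
        self._send = send  # hypothetical: raises ConnectionError when the master is down
        self._pending: Deque[str] = deque()
        self._max_pending = max_pending

    def write(self, command: str) -> bool:
        """Return True if the command was sent now, False if it was buffered."""
        if self._pending:
            # Earlier writes are still waiting; queue behind them to keep the order.
            self._enqueue(command)
            return False
        try:
            self._send(command)
            return True
        except ConnectionError:
            self._enqueue(command)
            return False

    def _enqueue(self, command: str) -> None:
        if len(self._pending) >= self._max_pending:
            raise OverflowError("too many buffered writes; the switch is taking too long")
        self._pending.append(command)

    def replay(self) -> int:
        """Resend buffered commands in order; a command is dropped only after it succeeds."""
        replayed = 0
        while self._pending:
            self._send(self._pending[0])
            self._pending.popleft()
            replayed += 1
        return replayed
```

The caller would invoke `replay()` once it learns that the switch has finished. Because the caller of `write` no longer sees the real return value of a buffered command, this only fits the return-value-insensitive services described above.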
Option 2:
How long the master must be unresponsive before the sentinel starts a master-replica switch is configurable through the down-after-milliseconds parameter;
The shorter the configured time, the more sensitive the sentinel is and the more likely it is to misjudge; but when the master really fails, the switch happens promptly and the impact on the business is smallest. The longer the configured time, the more conservative the sentinel is: misjudgments become less likely, but when the master fails, business writes fail for longer.
Option 3:
The sentinel notifies the clients so that they learn of the master change promptly and send their buffered write requests to the new master, ensuring that later write requests are not affected.
After promoting a replica to master, the sentinel publishes the new master's address to pub/sub on its own instance. Clients subscribe to this channel; when a message arrives, the latest master address is pushed to the client, which then sends its writes to the new master. In this mechanism the sentinel actively notifies the clients.
If a client misses the push notification for some reason, it also needs to be able to actively pull the latest master address from the sentinel.
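On the client side, handling the pushed notification mostly comes down to parsing the message. As far as I know, Sentinel announces a completed switch on the `+switch-master` channel with a payload of the form `<master-name> <old-ip> <old-port> <new-ip> <new-port>`. The sketch below only parses such a payload; subscribing to the channel is left to whatever client library is in use, and the `MasterSwitch` type and function name are hypothetical.

```python
from typing import NamedTuple, Tuple


class MasterSwitch(NamedTuple):
    master_name: str
    old_addr: Tuple[str, int]
    new_addr: Tuple[str, int]


def parse_switch_master(payload: str) -> MasterSwitch:
    """Parse '<master-name> <old-ip> <old-port> <new-ip> <new-port>'."""
    parts = payload.split()
    if len(parts) != 5:
        raise ValueError(f"unexpected switch-master payload: {payload!r}")
    name, old_ip, old_port, new_ip, new_port = parts
    return MasterSwitch(name, (old_ip, int(old_port)), (new_ip, int(new_port)))


# Example payload as a subscriber might receive it (addresses are made up).
event = parse_switch_master("mymaster 10.0.0.1 6379 10.0.0.2 6380")
new_host, new_port = event.new_addr
```

After parsing, the client would reconnect to `new_host:new_port` and direct subsequent writes there.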
When accessing the master and the replicas, the client must not hard-code their addresses; it should fetch the latest address from the sentinel cluster (the sentinel get-master-addr-by-name command). Then, when the sentinel switches masters, the client can obtain the latest instance address from the sentinel cluster.
Redis SDKs generally provide a way to obtain the instance address through the sentinel and then access the instance, so we can use that directly instead of implementing the logic ourselves. With only master-replica instances, the client has to cooperate with the sentinel; in sharded cluster mode, all of this logic can be handled in the proxy layer, so the client does not need to care about it.