Redis cache avalanche, breakdown and penetration
2022-06-30 04:12:00 【Winter dream chaser】
Text
When Redis comes up, whether in an interview or in day-to-day development, cache avalanche, penetration and breakdown are probably familiar terms. Even if you don't know the details, you have surely heard of them. What is the difference between the three, and how do we prevent them? Let's bring in our next victim.
The interview begins
A paunchy middle-aged man in a plaid shirt walks toward you carrying a scratched-up Mac. Looking at his balding head, you think: this must be some top-tier architect! But we are well read and carry ourselves with confidence, so we are not nervous in the slightest.

Young man, I see Redis on your resume, so let's get straight to the point and go through a few common big questions. Do you know about Redis cache avalanche?
Hello, handsome and charming interviewer. Yes, I do. E-commerce home pages and hot data are generally cached. The cache is usually refreshed by a scheduled task, or updated after a cache miss, and refreshing by scheduled task has a problem.
A simple example: suppose every home-page key has a 12-hour expiration and was refreshed at 12 noon. I run a flash sale at midnight and a flood of users arrives, say 6,000 requests per second. The cache could normally absorb 5,000 of those requests per second, but at that moment every key has expired. All 6,000 requests per second now fall through to the database, which cannot take it. It raises an alarm, and in reality it may go down before the DBA even reacts. If there is no specific plan for this failure, the anxious DBA restarts the database, only for it to be killed immediately by new traffic. That is what I understand as a cache avalanche.
I deliberately looked back at the projects I have worked on, and even the best of them could not tolerate that many QPS hitting the DB directly. With no slow SQL, plus database sharding and large tables split up, it might just about hold, but there is still a big gap compared with using Redis.

When keys expire en masse at the same time, Redis might as well not exist for that moment, and requests of that magnitude hitting the database directly are almost catastrophic. Think about it: if the database that goes down belongs to the user service, nearly every interface that depends on it will start returning errors. Without circuit breaking or a similar strategy, a whole swath of services goes down in an instant. However many times you restart, user traffic knocks you over again. By the time a restart finally sticks, the users have long gone to bed and lost faith in your product: "what a garbage product."
The interviewer strokes his hair. Hmm, not bad. So what do you do in that situation? How do you deal with it?
Handling a cache avalanche is simple: when writing data to Redis in bulk, add a random value to each key's expiration time. That way the data will not expire en masse at the same moment, and I believe Redis can still handle that amount of traffic.
setRedis(key, value, time + Math.random() * 10000); // base expiry plus a random offset in [0, 10000)
If Redis is deployed as a cluster, distributing hot data evenly across different Redis instances also avoids everything expiring at once. That said, when I worked with clusters in production, each service mapped to a single Redis shard to make data management easier, which carries the same risk of mass expiry, so randomizing the expiration time is still a good strategy.
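The randomized expiration described above can be sketched in Java as follows. This is a minimal illustration, not the article's actual helper: `JitteredTtl` and `withJitter` are hypothetical names, and the caller is assumed to pass the resulting duration to whatever Redis client it uses.

```java
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper: spreads key expirations so they do not all land at the same instant. */
final class JitteredTtl {
    private JitteredTtl() {}

    /** Returns baseTtl plus a uniformly random extra delay in [0, maxJitter]. */
    static Duration withJitter(Duration baseTtl, Duration maxJitter) {
        long maxMillis = maxJitter.toMillis();
        if (maxMillis <= 0) {
            return baseTtl; // no jitter requested
        }
        // nextLong(bound) excludes the bound, so add 1 to make maxJitter itself reachable.
        long extraMillis = ThreadLocalRandom.current().nextLong(maxMillis + 1);
        return baseTtl.plusMillis(extraMillis);
    }
}
```

For example, `JitteredTtl.withJitter(Duration.ofHours(12), Duration.ofMinutes(10))` would give each home-page key a lifetime somewhere between 12 hours and 12 hours 10 minutes, so a batch written together no longer expires together.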
Alternatively, set hot data to never expire and refresh the cache whenever there is an update (for example, when operations staff update the home-page products, just refresh the cache and do not set an expiration time). This also works for e-commerce home-page data and is the safe option.
Do you know about cache penetration and cache breakdown? How do they differ from an avalanche?
Yes, I do. Let me start with cache penetration. Cache penetration refers to requests for data that exists in neither the cache nor the database, which users keep sending. Our database ids auto-increment starting from 1, so examples are requests for id -1 or for a very large id that does not exist. Such a user is most likely an attacker, and the attack puts excessive pressure on the database, bringing it down in severe cases.
A small single-machine system can basically be killed with Postman alone, for example the Alibaba Cloud server I bought for myself.

If you do not validate parameters here, then since every database id is greater than 0, I can keep sending requests with an id below 0. Each request bypasses Redis and hits the database directly, the database finds nothing either, and this repeats every time. With somewhat higher concurrency, the system collapses easily.
As for cache breakdown, it resembles a cache avalanche but differs in one respect. An avalanche brings down the DB because a large portion of the cache expires at once. Breakdown is when a single key is extremely hot and constantly absorbs heavy concurrency concentrated on that one point. The moment that key expires, the sustained concurrency breaks through the cache and hits the database directly, like punching a hole in an otherwise intact bucket.

The interviewer gives an approving look. So how do you solve each of them?
For cache penetration, I add validation at the interface layer, such as user authentication and parameter checks, and return immediately in code when a parameter is invalid. For example, do basic validation on id and reject id <= 0 outright.
One thing I want to mention: we should always develop with a "distrustful" mindset, meaning do not trust any caller. If you expose an API with certain parameters, then as the callee you should consider and validate every possible parameter case, because you cannot trust whoever calls you and do not know what parameters they will send.
A simple example: your interface does paginated queries, but you do not cap the page-size parameter. If a caller asks for Integer.MAX_VALUE rows in one go, a single request costs you several seconds, and a handful of concurrent ones will take you down. If a colleague makes that call, fine, you notice and fix it. But what about a hacker or a competitor? I do not need to spell out what happens if they hit that interface on Double 11 (Singles' Day). A former leader told me this, and I think everyone should know it.
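The two checks discussed above, rejecting impossible ids and capping the page size, might look like this in Java. It is only a sketch: `RequestGuard` is a hypothetical class, and the cap of 100 rows is an assumed value, not one given in the article.

```java
/** Hypothetical request guard; the limit below is illustrative, not taken from the article. */
final class RequestGuard {
    static final int MAX_PAGE_SIZE = 100; // assumed upper bound for one page

    private RequestGuard() {}

    /** Ids auto-increment from 1, so anything <= 0 cannot exist and is rejected up front. */
    static boolean isValidId(long id) {
        return id > 0;
    }

    /** Clamps a caller-supplied page size into the range [1, MAX_PAGE_SIZE]. */
    static int clampPageSize(int requested) {
        return Math.max(1, Math.min(requested, MAX_PAGE_SIZE));
    }
}
```

With this in place, a request carrying `Integer.MAX_VALUE` as its page size is quietly reduced to 100 rows, and a request for id -1 can be answered without touching the cache or the database. Whether to clamp or to reject an oversized page outright is a design choice that depends on the API contract.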
When data is found in neither the cache nor the database, you can also write a placeholder value for that key, such as null, "location error", or "try again later". Which value to use is a question for the product team or depends on the scenario. Keep the cache expiration short, say 30 seconds (setting it too long means the key stays unusable even once the situation is back to normal).
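A minimal sketch of caching "not found" results with a short expiration is shown below. It uses an in-memory map as a stand-in for Redis so that it runs on its own; `NullCachingLookup`, the `database` function, and the injected clock are all hypothetical names introduced for illustration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** In-memory stand-in for a cache that also remembers "not found" results for a short time. */
final class NullCachingLookup {
    private static final class Entry {
        final Optional<String> value; // Optional.empty() is the cached "no such row" marker
        final long expiresAtMillis;

        Entry(Optional<String> value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<Long, Entry> cache = new ConcurrentHashMap<>();
    private final Function<Long, Optional<String>> database; // hypothetical DB lookup
    private final LongSupplier clock;                        // current time in milliseconds
    private final long hitTtlMillis;                         // lifetime of real values
    private final long missTtlMillis;                        // short lifetime of "not found"

    NullCachingLookup(Function<Long, Optional<String>> database, LongSupplier clock,
                      long hitTtlMillis, long missTtlMillis) {
        this.database = database;
        this.clock = clock;
        this.hitTtlMillis = hitTtlMillis;
        this.missTtlMillis = missTtlMillis;
    }

    Optional<String> get(long id) {
        long now = clock.getAsLong();
        Entry cached = cache.get(id);
        if (cached != null && cached.expiresAtMillis > now) {
            return cached.value; // served from cache, including cached misses
        }
        Optional<String> loaded = database.apply(id);
        long ttl = loaded.isPresent() ? hitTtlMillis : missTtlMillis;
        cache.put(id, new Entry(loaded, now + ttl));
        return loaded;
    }
}
```

Constructed with a miss lifetime of 30,000 ms, repeated requests for the same nonexistent id reach the database at most once every 30 seconds. Note that this only helps when the attacker reuses ids; a stream of distinct nonexistent ids still gets through.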
This stops an attacker from brute-forcing with the same id over and over. We should also recognize that a normal user will not send that many requests within a single second. As I recall, Nginx at the gateway layer has configuration options that let the ops team blacklist any IP whose requests per second exceed a threshold.
Do you have any other approaches?
I also remember an advanced Redis usage, the Bloom filter, which is also good at preventing cache penetration. The principle is simple: use an efficient data structure and algorithm to quickly decide whether your key exists in the database. If it does not, just return. If it does (a Bloom filter can give false positives but never false negatives, so this really means "may exist"), query the DB, refresh the KV entry, and then return.
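To make the idea concrete, here is a tiny self-contained Bloom filter in Java. It is a teaching sketch under stated assumptions, not what Redis runs internally: `SimpleBloomFilter` is a hypothetical class, the bit positions come from double hashing over `String.hashCode()` and an FNV-1a hash, and the sizes used are arbitrary.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Tiny Bloom filter sketch: may wrongly say "possibly present", never wrongly says "absent". */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;      // number of bits
    private final int hashCount; // number of bit positions set per key

    SimpleBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    /** False means the key was definitely never added; true means it may have been. */
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    /** Derives the i-th bit position from two base hashes (double hashing). */
    private int index(String key, int i) {
        long combined = (long) key.hashCode() + (long) i * fnv1a(key);
        return (int) Math.floorMod(combined, (long) size);
    }

    /** 32-bit FNV-1a hash over the key's UTF-8 bytes. */
    private static int fnv1a(String key) {
        int hash = 0x811c9dc5;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 0x01000193;
        }
        return hash;
    }
}
```

In use, every existing id would be added when the data is loaded or inserted, and a lookup would call `mightContain` before touching the cache or the database. A `false` answer lets the request return immediately; a `true` answer still needs the normal cache-then-database path, because it can be a false positive.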
Some readers will ask: what if a hacker attacks from many IPs at the same time? I have never fully figured this one out, but an ordinary hacker does not control that many compromised machines, a normally sized Redis cluster can withstand that level of traffic, and I doubt attackers would be interested in small companies. Get the system's high availability right and a cluster holds up very well.
For cache breakdown, set hot data to never expire, or add a mutex and the problem is solved.
Being the thoughtful guy I am, of course I have the code ready for you.
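A minimal sketch of the mutex approach follows. It is not necessarily the author's code: `MutexCacheLoader` is a hypothetical class, an in-memory map stands in for Redis, and the lock is an in-process one, so it only coordinates threads inside a single JVM.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Sketch: only one thread per key rebuilds a missing entry; the others wait and reuse it. */
final class MutexCacheLoader {
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Map<String, Object> locks = new ConcurrentHashMap<>(); // one lock object per key
    private final Function<String, String> database;                     // hypothetical DB lookup

    MutexCacheLoader(Function<String, String> database) {
        this.database = database;
    }

    String get(String key) {
        String value = cache.get(key);
        if (value != null) {
            return value; // fast path: cache hit, no locking
        }
        Object lock = locks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            // Re-check: another thread may have rebuilt the entry while this one waited.
            value = cache.get(key);
            if (value == null) {
                value = database.apply(key);
                if (value != null) {
                    cache.put(key, value);
                }
            }
            return value;
        }
    }

    /** Simulates the hot key expiring. */
    void expire(String key) {
        cache.remove(key);
    }
}
```

When the hot key expires and many threads ask for it at once, only the first one queries the database; the rest block briefly and then read the rebuilt value. Across several application instances the in-process lock would have to be replaced by a distributed one, commonly built on the Redis `SET` command with the `NX` option and an expiry.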

Summary
Joking aside, do not treat interviews as a joke.
This article briefly introduced Redis cache avalanche, breakdown, and penetration. The three are similar but differ in some respects. They are must-ask questions whenever caching comes up in an interview, so do not mix them up. Avalanche, penetration, and breakdown are the biggest problems with caching: they may never occur, but when they do they are fatal, which is why the interviewer will ask about them.
Be sure to understand how they happen, how to avoid them, and how to recover after they happen. You do not need deep knowledge, but you cannot have given it no thought at all. Sometimes an interview tests your attitude rather than the breadth of your knowledge. If your thinking is clear and you know both the what and the why, that is great; if you also know how to prevent it, come start work.
Finally, here is a short technical summary from me:
To avoid the situations above, we generally analyze three phases:
Before the event: Redis high availability (master-replica + Sentinel, or Redis Cluster) to avoid a total collapse.
During the event: local ehcache cache + Hystrix rate limiting and degradation, to keep MySQL from being killed.
After the event: Redis persistence (RDB + AOF). On restart, data is loaded automatically from disk, quickly restoring the cached data.
I will cover all of these points in the Redis installments of this series and should be able to finish the Redis part this month. With a rate-limiting component you can set how many requests per second are allowed through. What about the remaining requests that do not get through? Degrade them! Return some default values, a friendly message, or a blank value.
Benefits:
The database will never die, because the rate-limiting component guarantees that only so many requests get through per second. As long as the database stays up, then from the user's perspective, say 3/5 of requests can be handled. As long as 3/5 of requests are handled, your system is not dead. For a user, the page may fail to load on a few clicks, but after a few more it loads.
This is the most common practice at today's mainstream internet giants. Have you ever wondered why, when some celebrity news breaks, Weibo shows you a blank page however much you refresh while other people get straight in, and after a few more refreshes it loads for you too? Now you know: that is degradation, sacrificing some users' experience to keep the servers safe. Not bad, right?
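The rate limiting plus degradation described above can be sketched with a simple fixed-window limiter. This is an illustration only and does not use or imitate the Hystrix API: `DegradingLimiter` is a hypothetical class, and the clock is injected so the behavior is easy to check.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Fixed-window limiter sketch: at most limitPerSecond calls per second reach the backend. */
final class DegradingLimiter {
    private final int limitPerSecond;
    private final LongSupplier clockMillis; // current time in milliseconds
    private long windowStart;
    private int usedInWindow;

    DegradingLimiter(int limitPerSecond, LongSupplier clockMillis) {
        this.limitPerSecond = limitPerSecond;
        this.clockMillis = clockMillis;
        this.windowStart = clockMillis.getAsLong();
    }

    /** Takes one permit from the current one-second window if any are left. */
    private synchronized boolean tryAcquire() {
        long now = clockMillis.getAsLong();
        if (now - windowStart >= 1000) {
            windowStart = now; // start a new window
            usedInWindow = 0;
        }
        if (usedInWindow >= limitPerSecond) {
            return false;
        }
        usedInWindow++;
        return true;
    }

    /** Runs backend if a permit is available, otherwise returns the degraded fallback value. */
    <T> T call(Supplier<T> backend, T fallback) {
        return tryAcquire() ? backend.get() : fallback;
    }
}
```

With a limit of 3 and five requests arriving in the same second, three reach the backend and two receive the fallback, which matches the 3/5 picture above: the database sees a bounded load, and the unlucky users get a default value or a friendly message instead of an error.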