当前位置:网站首页>PHP solves the problems of cache avalanche, cache penetration and cache breakdown of redis
PHP solves the problems of cache avalanche, cache penetration and cache breakdown of redis
2022-07-05 10:18:00 【Emma'】
One : Preface
Design a caching system , The question that has to be considered is : Cache penetration 、 Avalanche effect in cache breakdown and failure .
Two : Cache penetration
Cache penetration refers to querying a certain nonexistent data , Because the cache is written passively on Miss , And for the sake of fault tolerance , If no data can be found from the storage tier, it will not be written to the cache , This will cause the non-existent data to be queried by the storage layer every time it is requested , It loses the meaning of caching . A large amount of requested data is not obtained from the cache , Cause the database to go , It's possible to bring down the database , Paralyze the whole service . When the flow is large , Probably DB It's gone , If someone takes advantage of something that doesn't exist key Attack our applications frequently , This is the loophole .
For example, the article list , Generally, our primary key ID Are unsigned self increasing types , Some people want to destroy your database , Use negative numbers for each request ID, and ID Negative records don't exist in the database at all .
Solution
There are many ways to effectively solve the problem of cache penetration , The most common is the use of bloon filter , Hash all possible data to a large enough bitmap in , A certain nonexistent data will be This bitmap Intercept , Thus, the query pressure on the underlying storage system is avoided . There's also a simpler and rougher way ( This is what we use ), If the data returned by a query is empty ( No matter how many There is no such thing as , Or a system failure ), We still cache this empty result , But its expiration time will be very short , Up to five minutes .
- For things like ID Illegal requests with negative numbers are directly filtered out , Use of Blum filter (Bloom Filter). The bloon filter :
<?php
class Bloom {
// The number of hash functions
protected $hashNum = 3;
// The size of the digit group
protected $bitArrayCount = 1024*10;
// Digit group
protected $bitArray = [];
public function __construct()
{
// Build default digit group , All set to false
$this->bitArray = array_pad([], $this->bitArrayCount, false);
}
/** * obtain hash function ; That is, in the bit array , Need to be changed to true The index of * @param string $key Elements * @return array */
protected function getIndexes($key)
{
$indexes = [];
for ($i = 0; $i < $this->hashNum; $i ++) {
$index = sprintf('%u', crc32($key . $i)); // Use crc32 hash
$index = $index % $this->bitArrayCount; // obtain In digit group Location
$indexes[] = $index;
}
return $indexes;
}
/** * Add elements to the filter * @param string $key Elements to add */
public function addItem($key)
{
$indexes = $this->getIndexes($key);
// take hash The corresponding bit of the result is modified to true
foreach ($indexes as $index) {
$this->bitArray[$index] = true;
}
}
/** * Whether this element exists in the filter ; true Indicates that it is likely to exist ,false It must not exist * @param string $key Elements * @return array */
public function mightExist($key)
{
$indexes = $this->getIndexes($key);
foreach ($indexes as $index) {
if (! $this->bitArray[$index]) {
return false;
}
}
return true;
}
}
class Test
{
public function run()
{
$bloom = new Bloom();
// Add... To the filter 1000 Elements
for ($i = 0; $i < 100; $i ++) {
$bloom->addItem($i);
}
// test Filter judgment result
for ($i = 90; $i < 110; $i ++) {
$mightExist = $bloom->mightExist($i);
if ($mightExist) {
echo "might exist ", $i, PHP_EOL;
} else {
echo "not exist ", $i, PHP_EOL;
}
}
}
}
(new Test())->run();
- For records that cannot be found in the database , We still store the empty data in the cache , Of course, a shorter expiration time is usually set .
// Set up articles ID by -10000 The cache of is empty
$id = -10000;
$redis->set('article_content_' . $id, '', 60);
var_dump($redis->get('article_content_' . $id));
3、 ... and : Cache avalanche
Cache avalanche is when we set up the cache with the same expiration time , Causes the cache to fail at the same time at a certain time , Forward all requests to DB,DB Avalanche with excessive instantaneous pressure .
The reason for invalidating the cache set :
- redis The server is down .
- Set the same expiration time for cached data , This causes the cache set to fail in a certain period of time .
Solution
The avalanche effect of cache failure has a terrible impact on the underlying system . Most system designers consider using lock or queue to guarantee the single line of cache cheng ( process ) Write , So as to avoid a large number of concurrent requests falling on the underlying storage system in case of failure . Here's a simple solution to spread the cache failure time , For example, we can add a random value to the original failure time , such as 1-5 Minutes at random , In this way, the repetition rate of each cache expiration time will be reduced , It's hard to trigger a collective failure .
How to solve cache set invalidation :
- For the reason 1, Can achieve redis High availability ,Redis Cluster perhaps Redis Sentinel( sentry ) And so on .
- For the reason 2, Add a random value when setting the cache expiration time , Avoid cache expiration at the same time .
<?php
$redis = new Redis();
$redis->connect('127.0.0.1', 6379, 60);
$redis->auth('');
// Set the expiration time plus a random value
$redis->set('article_content_1', ' Article content ', 60 + mt_rand(1, 60));
$redis->set('article_content_2', ' Article content ', 60 + mt_rand(1, 60));
- Use a dual cache strategy , Set up two caches , Raw cache and standby cache , When the original cache expires , Access alternate cache , Set the standby cache expiration time to be longer .
// Raw cache
$redis->set('article_content_2', ' Article content ', 60);
// Set the standby cache , Set the expiration time longer
$redis->set('article_content_backup_2', ' Article content ', 1800);
Four : Cache breakdown
For some with expiration set key, If these key It may be accessed at some point in time with super high concurrency , It's a very “ hotspot ” The data of . This is the time , A question needs to be considered : The cache is “ breakdown ” The problem of , The difference between this and cache avalanche is that this is for a certain key cache , The former is a lot of key.
The difference between cache breakdown and cache avalanche is that this is for a certain hot key cache , The avalanche is aimed at the centralized invalidation of a large number of caches .
When the cache expires at a certain point in time , Right at this point in time Key There are a lot of concurrent requests coming , These requests usually find that the cache is expired from the back end DB Load data and reset to cache , At this time, a large number of concurrent requests may instantly put the back end DB Overwhelmed .
Solution
1、 Make this popular key Your cache never expires .
there “ Never expire ” It has two meanings :
(1) from redis Look up , It's true that the expiration time is not set , That's the guarantee , There will be no hot spots key Overdue problem , That is to say “ Physics ” Not overdue .
(2) functionally , If it doesn't expire , That's static ? So we keep the expiration date key Corresponding value in , If it's found to be overdue , Build the cache through a background asynchronous thread , That is to say “ Logic ” Be overdue
From the perspective of actual combat , This method is very performance friendly , The only drawback is when building the cache , The rest of the threads ( Threads that are not building the cache ) Maybe it's old data , But for general Internet functions, this is tolerable .
2、 Use mutexes , adopt redis Of setnx Implement mutually exclusive lock .
A common practice in the industry , It's using mutex. To put it simply , When the cache fails ( Judge that the value is empty ), Not immediately load db, Instead, use some operations with the return value of the successful operation of the caching tool first ( such as Redis Of SETNX perhaps Memcache Of ADD) Go to set One mutex key, When the operation returns success , Proceed again load db And reset the cache ; otherwise , Just try the whole thing again get Caching method .
SETNX, yes 「SET if Not eXists」 Abbreviation , That is, it can only be set when it does not exist , It can be used to achieve the lock effect .
<?php
function getRedis()
{
$redis = new Redis();
$redis->connect('127.0.0.1', 6379, 60);
return $redis;
}
// Lock
function lock($key, $random)
{
$redis = getRedis();
// Set the lock timeout , Avoid lock release failure ,del() operation failed , Generate deadlock .
$ret = $redis->set($key, $random, ['nx', 'ex' => 3 * 60]);
return $ret;
}
// Unlock
function unLock($key, $random)
{
$redis = getRedis();
// The function of random numbers here is , Prevent the update cache operation from taking too long , The effective time of the lock has been exceeded , Cause other requests to get the lock .
// But after the last request to update the cache , Delete the lock without judgment , The locks created by other requests will be deleted by mistake .
if ($redis->get($key) == $random) {
$redis->del($key);
}
}
// Get article data from cache
function getArticleInCache($id)
{
$redis = getRedis();
$key = 'article_content_' . $id;
$ret = $redis->get($key);
if ($ret === false) {
// Generate lock key
$lockKey = $key . '_lock';
// Generate random number , Used to set the value of the lock , It will be used when releasing the lock later
$random = mt_rand();
// Get the mutex
if (lock($lockKey, $random)) {
// This is pseudo code , Means to get article data from the database
$value = $db->getArticle($id);
// Update cache , The expiration time can be adjusted according to the situation
$redis->set($key, $value, 2 * 60);
// Release the lock
unLock($lockKey, $random);
} else {
// wait for 200 millisecond , Then get the cache value again , Let other processes that get the lock get the data and set the cache
usleep(200);
getArticleInCache($id);
}
} else {
return $ret;
}
}
3、" advance " Use mutexes (mutex key):
stay value Internal settings 1 Timeout value (timeout1), timeout1 More than practical memcache timeout(timeout2) Small . When from cache Read timeout1 When it's found out it's overdue , Extend now timeout1 And reset to cache. Then load the data from the database and set it to cache in .
4、 resource protection :
use netflix Of hystrix, It can be used as a resource isolation and protection main thread pool , If you apply this to the construction of cache, it's not too bad .
Four solutions : There is no best but the best
8、 ... and : summary
For business systems , It's always a case by case analysis , No best , Only the most suitable . Last , For the cache system common cache full and data loss problems , It needs to be analyzed according to the specific business , Usually we use LRU Policy handling overflow ,Redis Of RDB and AOF Persistence strategy to ensure the data security under certain circumstances .
边栏推荐
- 面试:List 如何根据对象的属性去重?
- Unity particle special effects series - the poison spray preform is ready, and the unitypackage package is directly used - on
- Design and Simulation of fuzzy PID control system for liquid level of double tank (matlab/simulink)
- [论文阅读] KGAT: Knowledge Graph Attention Network for Recommendation
- 《天天数学》连载58:二月二十七日
- 请问大佬们 有遇到过flink cdc mongdb 执行flinksql 遇到这样的问题的么?
- Mysql80 service does not start
- SLAM 01.人类识别环境&路径的模型建立
- How to write high-quality code?
- How did automated specification inspection software develop?
猜你喜欢
Wechat applet - simple diet recommendation (4)
Hard core, have you ever seen robots play "escape from the secret room"? (code attached)
Fluent generates icon prompt logo widget
. Net delay queue
Implementation of smart home project
What is the most suitable book for programmers to engage in open source?
A high density 256 channel electrode cap for dry EEG
Cut off 20% of Imagenet data volume, and the performance of the model will not decline! Meta Stanford et al. Proposed a new method, using knowledge distillation to slim down the data set
【小技巧】获取matlab中cdfplot函数的x轴,y轴的数值
uniapp + uniCloud+unipay 实现微信小程序支付功能
随机推荐
@JsonAdapter注解使用
Activity jump encapsulation
WorkManager的学习二
小程序中自定义行内左滑按钮,类似于qq和wx消息界面那种
Events and bubbles in the applet of "wechat applet - Basics"
pytorch输出tensor张量时有省略号的解决方案(将tensor完整输出)
《天天数学》连载58:二月二十七日
AtCoder Beginner Contest 258「ABCDEFG」
学习笔记6--卫星定位技术(上)
能源势动:电力行业的碳中和该如何实现?
Write double click event
Wechat applet - simple diet recommendation (3)
最全是一次I2C总结
AtCoder Beginner Contest 254「E bfs」「F st表维护差分数组gcd」
Zblogphp breadcrumb navigation code
Cent7 Oracle database installation error
字节跳动面试官:一张图片占据的内存大小是如何计算
橫向滾動的RecycleView一屏顯示五個半,低於五個平均分布
[论文阅读] KGAT: Knowledge Graph Attention Network for Recommendation
Comment obtenir le temps STW du GC (collecteur d'ordures)?