Frequently asked questions about Redis
2022-06-26 00:07:00 【Just put a flower in heaven and earth】
Cache breakdown
What is cache breakdown
Cache breakdown occurs when a request for data misses the cache but the data does exist in the database.
Typically this happens because a cached entry has expired. If the entry is a hot key, many concurrent requests arrive at the same moment, all miss the cache, and all query the database at once to fetch the data, causing a surge in database traffic and a sudden spike in load.
In other words, a piece of data is normally served quickly from the cache on every request, but at some point the cache entry expires and a request finds nothing in the cache; we say such a request has "broken down" the cache.
How to solve the cache breakdown problem
There are roughly three ideas for solving cache breakdown:

- Release only one request to the database and let it rebuild the cache
  Use a Redis SET with NX (for example: SET mykey Redis EX 1000 NX) to set a flag bit. The request that sets the flag successfully is released; requests that fail to set it wait and poll. The released request then goes on to rebuild the cache.
- Background renewal
  The idea of this scheme is to run a scheduled task in the background that proactively refreshes data that is about to expire.
- Never expire
  Why is the cache broken down? Because an expiration time is set and the entry is evicted once it expires. So simply set the key never to expire. Simple and brute-force!
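The first idea (one request rebuilds the cache while the others poll) can be sketched as follows. This is a minimal in-process sketch, not the article's code: the dicts standing in for Redis and the database, and the `threading.Lock` standing in for the `SET ... EX ... NX` flag, are illustrative assumptions.

```python
import threading
import time

cache = {}                    # stands in for Redis
db = {"user:1": "Alice"}      # stands in for the database
flag_lock = threading.Lock()  # stands in for the SET flag EX .. NX flag bit

def get_with_mutex(key):
    if key in cache:
        return cache[key]
    # Only one request acquires the "flag"; the others poll the cache.
    if flag_lock.acquire(blocking=False):   # like a successful SET .. NX
        try:
            value = db[key]                 # read through to the database
            cache[key] = value              # rebuild the cache
            return value
        finally:
            flag_lock.release()             # like DEL flag
    while key not in cache:                 # losers wait until the cache is rebuilt
        time.sleep(0.01)
    return cache[key]
```

The waiters here busy-poll for simplicity; in production the flag would carry an expiration time so a crashed rebuilder cannot block everyone forever.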
Cache penetration
What is cache penetration
Cache penetration refers to requests for data that exists neither in the cache nor in the database, sent by users at high density over a short period of time. Every such request falls through to the database service, putting pressure on the database.
Generally such requests are malicious: the sender knows the data does not exist (neither cached nor stored), but keeps sending requests anyway. Enough of them can easily overwhelm the database.
How to solve the problem of cache penetration
There are generally two solutions: caching empty objects and the Bloom filter.
Caching empty objects
When the storage layer misses, the returned empty object is cached anyway, with an expiration time set. Subsequent accesses to this data then hit the cache, and the back-end data source is protected.
Drawbacks: caching null values means the cache needs more space to store more keys, because there may be many empty keys. And even with an expiration time on the null value, the cache layer and the storage layer will be inconsistent for a while, which affects business that requires consistency. How can this be mitigated?
- Cache empty query results too, but with a shorter expiration time, or clean up the cached null once data for that key is actually inserted.
- Filter out keys that definitely cannot exist: put all possible keys into a large bitmap and filter queries through that bitmap.
The Bloom filter
The article "Bloom filter details" shows how to implement a simple Bloom filter with Redis. So how does a Bloom filter solve cache penetration?
It is very simple: add all of our Redis keys to the Bloom filter in advance, then check whether each requested key is in the set. If it is not, the key can be considered illegal and the request ended immediately. Most bad requests are filtered out up front, protecting the database.
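As a sketch of the idea, here is a minimal pure-Python Bloom filter. In practice the bit array would live in Redis (via bitmaps), as the linked article describes; the sizes `m` and `k` here are illustrative assumptions.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter sketch: k hash positions over an m-bit array."""
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = bytearray(m // 8)

    def _positions(self, key):
        # Derive k positions by salting the key; SHA-256 is an arbitrary choice.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        # False means definitely absent; True means probably present.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))
```

A key the filter rejects can be turned away without touching the database; a key it accepts may still be a false positive, which is why the filter protects the database rather than replacing it.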
Cache avalanche
What is a cache avalanche
A cache avalanche occurs when a large amount of cached data reaches its expiration time within a very short window while many requests are querying that data. All the requests go straight to the database, database traffic surges, and the database may crash. At that moment the cache holds nothing, while the database still has the data.
How to solve the problem of cache avalanche
The solutions are as follows:

Data preheating
To prevent a cache avalanche, hot data can be preheated in advance, so that when the project goes live a flood of requests does not find an empty cache. Data preheating means walking through the data likely to be accessed before formal deployment and manually triggering the loading of the various cache keys, so that the data likely to be accessed is already loaded into the cache.

Staggered expiration
Staggering expiration is the simplest way to prevent a cache avalanche: when setting a key's expiration time, add a short random offset so that cache expiration times are spread out as evenly as possible. This avoids the avalanche caused by a large number of keys expiring at the same moment.

Second-level cache
A1 is the original cache and A2 a copy. When A1 misses, A2 can be consulted; A1's expiration time is set short and A2's long.

Rate limiting and degradation
The idea of this scheme is: after a cache failure, control the number of threads that read the database and write the cache, by locking or queuing. For example, allow only one thread per key to query the data and write the cache; the other threads wait or are degraded directly.

Redis high availability
There is an extreme case of cache avalanche in which no Redis server can serve requests at all. In that case, what we must do is improve Redis's availability: since a Redis instance can go down, add more instances so that the others keep working when one fails. In practice this means building a cluster.
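The staggered-expiration idea above can be sketched as follows; the base TTL and jitter values are illustrative assumptions.

```python
import random

BASE_TTL = 3600    # base expiration in seconds (illustrative value)
MAX_JITTER = 300   # up to 5 minutes of random spread (illustrative value)

def ttl_with_jitter():
    """Return a TTL with a random offset so keys do not all expire at once."""
    return BASE_TTL + random.randint(0, MAX_JITTER)

# With a real Redis client this would be used as, e.g.:
# r.set(key, value, ex=ttl_with_jitter())
```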
Redis and DB Data double write
What is data double writing
When both a cache and a database hold copies of the same data, every update must be written to both places. This is data double writing, and it raises the problem of keeping the two copies consistent.
How to solve the data double write problem
Using a cache improves performance and relieves database pressure, but it can also lead to data inconsistency. To keep the cache and the database consistent, there are generally three classic patterns:
- Cache-Aside pattern
The cache-aside (bypass cache) pattern aims to reduce inconsistency between cache and database as much as possible. It consists of a reading flow and a writing flow.
Reading flow: on a read, check the cache first. On a cache hit, return the data directly; on a miss, read the database, put the data into the cache, and return the response.
Writing flow: on an update, update the database first, then delete the cache.
- Read-Through / Write-Through
In the Read/Write-Through pattern, the cache is treated as the main data store: the application interacts with the database and cache only through an abstract cache layer. It too consists of a reading flow and a writing flow.
Reading flow: read the data from the cache and return it if found. If not found, load it from the database, write it to the cache, and return the response. This is very similar to cache-aside; in fact Read-Through is just cache-aside with an extra Cache Provider layer of encapsulation, which makes the code more concise and reduces load on the data source.
Writing flow: in Write-Through mode, when a write request arrives, the cache abstraction layer updates both the data source and the cached data.
- Write-Behind
Write-Behind is similar to Read-Through/Write-Through in that the Cache Provider is responsible for reading and writing both the cache and the database. The big difference: Read/Write-Through updates the cache and the database synchronously, while Write-Behind only updates the cache and does not write the database directly; the database is updated asynchronously, in batches.
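The cache-aside reading and writing flows described above can be sketched like this, with plain dicts standing in for Redis and the database (illustrative stand-ins, not a real client):

```python
cache = {}                    # stands in for Redis
db = {"user:1": "Alice"}      # stands in for the database

def read(key):
    # Cache-aside read: cache first, fall back to the DB, then populate the cache.
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value

def write(key, value):
    # Cache-aside write: update the database first, then delete the cache.
    db[key] = value
    cache.pop(key, None)
```

The next read after a write repopulates the cache with the fresh value, which is why deleting (rather than updating) the cache is enough.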
A comparison of the three patterns:
- Cache-Aside is relatively simple to implement, but you must maintain two data stores: the cache and the database (repository).
- Read/Write-Through only requires the application to deal with one data store (the cache), but it is somewhat more complex to implement.
- Write-Behind is similar to Read/Write-Through, except that Write-Behind persists data asynchronously while Read/Write-Through persists it synchronously.
- Write-Behind's advantage is that operating directly on memory is fast, and multiple operations can be merged before being persisted to the database. Its drawback is that data can be lost, for example if the system loses power.
Problems with Cache-Aside
In practice, cache-aside is the pattern used most in real development, so let's analyze the problems it can run into.
Question 1: when updating data, should cache-aside delete the cache or update the cache?
The answer is to delete the cache, not update it. Consider an example: suppose two threads A and B write at the same time. Thread A updates the database first, but because of network delays thread B then updates the database and also updates the cache, and only afterwards does thread A update the cache. In this interleaving the database and the cache end up inconsistent. If we delete the cache instead of updating it, this dirty-data problem does not arise.
Compared with deleting the cache, updating it has two further drawbacks:
- If the cached value is the result of a complex computation and the cache is updated frequently, performance is wasted.
- In write-heavy, read-light workloads, data is often updated again before it is ever read, which also wastes performance. (In fact, write-heavy scenarios are not very cost-effective for caching in the first place.)
Question 2: when double writing, should we operate on the database first or the cache first?
The answer is the database first. For example, suppose there are two requests: request A performs an update, and request B performs a read. If the cache is operated on first, the two threads may interleave as follows:
- Thread A starts the write and first deletes the cache
- Thread B starts the read and finds nothing in the cache
- Thread B reads the DB and gets the old data
- Thread B then puts the old data into the cache
- Thread A writes the latest data to the DB
Now the database holds new data but the cache holds old data: data inconsistency.
A scheme to ensure data double write consistency
There are roughly three schemes for keeping Redis and the DB consistent: the delayed double delete strategy, the delete-cache retry mechanism, and reading the binlog to delete the cache asynchronously. These schemes complement the ideas discussed above; each embodies a different design idea, but they lean toward concrete implementations.
Delayed double delete strategy
The so-called delayed double delete works like this: delete the cache first, then update the database, sleep for a while, then delete the cache again.
Drawbacks: combined with a cache expiration time, the worst case is data being inconsistent during the timeout window, and the strategy also increases the latency of write requests.
Question 1: how do you determine the sleep time?
Answer: base the sleep time on how long your project's read-side business logic takes. The point is to make sure any in-flight read request finishes before the write request completes, so the second delete can remove the dirty cache data a concurrent read may have written.
A possible failure scenario: write request A deletes the cache first, but before A updates the database, a read request B arrives. B reads the old data from the database, and A then updates the database. If A's sleep is very short, A performs its second delete quickly. But B, after reading the data, still needs a batch of computation; by the time B finishes and writes the (old) data into the cache, A may already have completed the second delete, and the data ends up inconsistent.
Question 2: what if you use a read-write-splitting architecture?
Answer: you can still use delayed double delete; just base the sleep time on the master-slave replication delay, plus a few hundred ms.
Question 3: this elimination strategy is synchronous, so what about the reduced throughput?
Answer: perform the second delete asynchronously, i.e. start a separate thread to delete the key, so the write request can return without sleeping. Doing this increases throughput.
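Putting questions 1 and 3 together, a delayed double delete with an asynchronous second delete can be sketched as follows. The dicts standing in for Redis and the database, and the assumed read-path duration, are illustrative stand-ins:

```python
import threading
import time

cache = {"stock": 30}      # stands in for Redis
db = {"stock": 30}         # stands in for the database
READ_LOGIC_COST = 0.05     # assumed duration of the read-side logic, seconds

def delayed_double_delete(key, value):
    cache.pop(key, None)                  # 1. delete the cache
    db[key] = value                       # 2. update the database

    def second_delete():
        time.sleep(READ_LOGIC_COST)       # 3. sleep past in-flight reads
        cache.pop(key, None)              # 4. delete the cache again

    # Run the second delete on a separate thread so the write returns quickly.
    t = threading.Thread(target=second_delete)
    t.start()
    return t
```

Even if a concurrent read repopulates the cache with stale data between steps 2 and 4, the second delete clears it.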
Delete cache retry mechanism
Whether you use delayed double delete or cache-aside's update-database-then-delete-cache, what if the second step, deleting the cache, fails? A failed delete also leaves the data inconsistent. In that case we can introduce a delete-cache retry mechanism. The rough steps:
- The write request updates the database
- The cache delete fails for some reason
- The key that failed to delete is put on a message queue
- A consumer reads the message from the queue and gets the key
- The consumer retries the cache delete operation
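The retry steps above can be sketched with an in-process queue standing in for the message queue; the dicts and the simulated delete failure are illustrative assumptions:

```python
import queue

cache = {"user:1": "Alice"}   # stands in for Redis
db = {}                       # stands in for the database
retry_q = queue.Queue()       # stands in for the message queue

def delete_cache(key, fail=False):
    """Attempt to delete; `fail` simulates a transient Redis error."""
    if fail:
        return False
    cache.pop(key, None)
    return True

def write(key, value, delete_fails=False):
    db[key] = value                       # 1. update the database
    if not delete_cache(key, fail=delete_fails):
        retry_q.put(key)                  # 3. enqueue the key whose delete failed

def retry_consumer():
    # 4-5. consume keys from the queue and retry the cache delete
    while not retry_q.empty():
        key = retry_q.get()
        delete_cache(key)
```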
Read the binlog and delete the cache asynchronously
The delete-cache retry mechanism solves the inconsistency caused by failed cache deletes, but it intrudes into a lot of business code. Alternatively, the database's binlog can drive cache eviction. Taking MySQL as an example: use Alibaba's Canal to collect the binlog and send it to an MQ queue, then write a simple subscriber that consumes the binlog messages and deletes the corresponding cache entries based on each update log, confirming each processed update through the MQ's ACK mechanism to guarantee cache consistency.
In fact, it is easy to see that reading the binlog to delete the cache asynchronously is just another way of implementing the delete-cache retry mechanism.
One thing to watch for here is master-slave delay. If a read request arrives after the write request has updated the master and deleted the cache, but replication lag means the slave still holds old data, the read will fetch the stale data from the slave and put it back in the cache, which again causes inconsistency. The fix in this situation is to read the binlog from the slave. Some readers may ask: if one master has several slaves, which slave should we subscribe to? To answer that, it helps to know how Canal works: Canal disguises itself as a slave and subscribes to MySQL's binlog; it is middleware for data synchronization.
The Redis contention concurrency problem
What is the Redis contention concurrency problem
This is also a very common problem in production. Multiple clients concurrently write the same key, and data that should have arrived first arrives later, leaving the wrong version; or multiple clients read the same key at once, modify the value, and write it back, and as soon as the order is wrong the data is wrong. These scenarios are collectively called the "Redis contention concurrency problem". Note that it usually appears when Redis faces a heavy request load, say tens of thousands of reads and writes per second; applications with low request volume will not hit it.
How to solve the Redis contention concurrency problem
As the description above shows, contention arises in many scenarios, and different scenarios call for different solutions.
Scene one
Multiple clients need to operate on keyA at the same time, but keyA's value must be queried from the database and then written to the cache. Two points need attention: first, the database query and the cache write should be atomic; second, the ordering of the clients' requests matters.
Scene two
Multiple requests decrement the stock of some commodity: 1. read the current stock value; 2. compute the new stock value; 3. write the new stock value.
But a wrong interleaving can occur: request A reads stock 30, request B also reads 30, then A subtracts 5 leaving 25, and B subtracts 5 also leaving 25. No ordering of the updates is required here, yet the final stock value clearly differs from what we expect.
Using distributed locks + timestamps
This scheme suits scenarios that require ordering, such as scene one.
Distributed locks plus timestamps solve it: the distributed lock addresses the first point (atomicity), and timestamps provide the ordering. Before performing its set operation, each request checks the timestamp; if the existing value's timestamp is greater than its own, it abandons the modification, and if it is smaller, it writes directly.
However, this scheme requires all systems' clocks to be consistent, otherwise the timestamps are meaningless; a version number can be used instead.
Using Redis's WATCH
This scheme suits scenarios that do not require ordering, such as scene two.
Be careful not to use WATCH in a sharded (partitioned) cluster.
For details, see the article "In-depth understanding of Redis transactions".
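WATCH gives optimistic locking: EXEC aborts if a watched key changed between WATCH and EXEC, and the client retries. Below is a minimal in-process sketch of that check-and-retry idea for the stock scenario; the dict and explicit version counter standing in for a real Redis client are illustrative assumptions.

```python
store = {"stock": 30}      # stands in for Redis
versions = {"stock": 0}    # bumped on every write, like WATCH detecting a change

def compare_and_set(key, expected_version, new_value):
    # Like EXEC: apply the write only if nobody touched the key since we read it.
    if versions[key] != expected_version:
        return False
    store[key] = new_value
    versions[key] += 1
    return True

def decrement_stock(key, amount):
    while True:                   # retry loop, as after an aborted WATCH transaction
        v = versions[key]         # like WATCH: remember the state we read under
        new_value = store[key] - amount
        if compare_and_set(key, v, new_value):
            return new_value
```

With two concurrent decrements, the loser's compare-and-set fails and it retries against the fresh value, so no update is lost even though no order is imposed.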
Using message queues
This approach is a general solution in high-concurrency scenarios; it can be used whether or not ordering is required.
When concurrency is too high, requests can be handled through message middleware, serializing the parallel reads and writes: put the Redis set operations into a queue, so they are executed one by one as required.
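The serialization idea can be sketched with a single worker thread draining a queue; the dict standing in for Redis and the shutdown sentinel are illustrative choices:

```python
import queue
import threading

cache = {}                # stands in for Redis
write_q = queue.Queue()   # stands in for the message middleware

def worker():
    # Single consumer: writes are applied one by one, in arrival order.
    while True:
        item = write_q.get()
        if item is None:      # shutdown sentinel
            break
        key, value = item
        cache[key] = value    # the serialized "Redis.set"
        write_q.task_done()

t = threading.Thread(target=worker)
t.start()

# Producers (possibly many threads) just enqueue; the worker serializes them.
for v in range(3):
    write_q.put(("counter", v))
write_q.put(None)
t.join()
```

The trade-off is throughput for correctness: every write passes through one consumer, so contention disappears but the queue becomes the bottleneck.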