Solve the problem of cache and database double write data consistency
2022-07-05 12:17:00 【Xujunsheng】
The role of caching
In a company's early days, when business volume is small, the usual practice is to serve read and write requests directly from the database. As the business grows and user requests increase, the database alone can no longer keep up and performance problems appear. The common industry practice is to introduce a cache.
Caching improves performance and relieves pressure on the database, but it also introduces a new problem: the data in the cache and the data in the database can become inconsistent.
If the data is inconsistent, the application can read stale data from the cache instead of the latest data, which is obviously unacceptable. Before solving this problem, we first need to understand why double writes to the cache and the database become inconsistent.
Why cache and database double writes become inconsistent
Let's first define what consistency between the cache and the database means:
- If the data is in the cache, the cached value is the same as the value in the database;
- If the data is not in the cache, the value in the database is the latest value.
Any situation that violates these two conditions is a cache and database inconsistency.
When a client sends a request to modify data, we have to modify not only the database but also the cache (by updating or deleting the cached entry). This raises an ordering question: do we operate on the database first, or on the cache first?
Let's take an application that modifies or deletes data in MySQL as an example and analyze how the inconsistency arises.
Leaving concurrency aside for the moment: under normal conditions, either order keeps the two stores consistent, so what we need to focus on is the abnormal case.
Suppose the application needs to modify the data in both the database and the cache (deletion behaves similarly, so for convenience only modification is described below).
There are two scenarios. Let's look at them in turn:
- Modify the cache first, then modify the database
- Modify the database first, then modify the cache
Assume we modify the cache first, then the database. If the cache modification succeeds but the database operation fails, the cache holds the new value while the database still holds the old one. Reads served from the cache see the new value, but once the cached entry expires or the cache goes down, the application falls back to the database and reads the old value.
If we update the database first and then update the cache, does that solve the problem? Let's keep analyzing.
If the application completes the database update but fails to update the cache, the database holds the new value and the cache holds the old value, so the two are clearly inconsistent.
If other concurrent requests then access the data, they follow the normal read path and query the cache first, so they read the old value.
So, when operating on the database and updating the cache, whichever of the two operations runs first, as long as the second step fails, clients can end up reading an old value.
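The failure of the second step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: two plain dictionaries stand in for the database and the cache, and the `CacheDown` exception and the `cache_fails` flag simulate the failed cache write.

```python
class CacheDown(Exception):
    """Raised by the stand-in cache to simulate a failed second step."""


db = {"x": 1}      # stand-in for the database
cache = {"x": 1}   # stand-in for the cache


def update_cache(key, value, fail=False):
    if fail:
        raise CacheDown("cache write failed")
    cache[key] = value


def write_db_then_cache(key, value, cache_fails=False):
    db[key] = value                                   # step 1 succeeds
    try:
        update_cache(key, value, fail=cache_fails)    # step 2 may fail
    except CacheDown:
        pass  # without a retry, the cache keeps the old value


def read(key):
    # Readers check the cache first and fall back to the database on a miss.
    return cache[key] if key in cache else db[key]


write_db_then_cache("x", 2, cache_fails=True)
stale = read("x")   # served from the cache, which still holds the old value
```

After this run the database holds 2 while `read("x")` still returns 1, which is exactly the inconsistency described above.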
Besides the failure of the second operation, what else can affect data consistency? Concurrency.
Consistency problems caused by concurrency
Here are all the possible strategies, with update and delete operations on the cache discussed separately:
- Update the database first, then update the cache
- Update the cache first, then update the database
- Update the database first, then delete the cache
- Delete the cache first, then update the database
Update the database first, then update the cache
Suppose we use the "update the database first, then update the cache" scheme, and both steps execute successfully. What happens under concurrency?
Suppose two threads, A and B, both need to update the same piece of data x. The following sequence can occur:
- Thread A updates the database (x = 1)
- Thread B updates the database (x = 2)
- Thread B updates the cache (x = 2)
- Thread A updates the cache (x = 1)
In the end, x is 2 in the database but 1 in the cache, which is clearly inconsistent.
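The interleaving above can be replayed deterministically. The following is only a sketch: dictionaries stand in for the database and the cache, and the `schedule` list is a hypothetical device that fixes the order in which the two threads' steps run.

```python
db = {}
cache = {}


def update_db(value):
    db["x"] = value


def update_cache(value):
    cache["x"] = value


# Each entry is (thread name, step, value), in the order from the scenario.
schedule = [
    ("A", update_db, 1),
    ("B", update_db, 2),
    ("B", update_cache, 2),
    ("A", update_cache, 1),
]

for thread_name, step, value in schedule:
    step(value)
```

Both writers succeed at every step, yet the database ends with x = 2 and the cache with x = 1, because thread A's cache write lands last.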
In addition, this scheme is generally not recommended. For business reasons, the value written to the cache is often not a direct copy of the database value: it may only be produced after a series of calculations. If many requests write to the database but few requests read, updating the cache on every write wastes a lot of performance.
For example, suppose x = 1 in the database and 10 requests each add one to it. If no read arrives during this period, updating the database first still triggers 10 cache updates, and the cache fills up with cold data.
The "update the cache first, then update the database" scheme has the same problems, so we will not discuss it further.
Whether the cache is modified first or second, cache utilization is low and machine resources are wasted. So we need to consider another approach: deleting the cache.
Delete the cache first, then update the database
Suppose there are two threads: thread A (updates x) and thread B (reads x). The following sequence can occur:
- Thread A deletes x from the cache, then starts updating the database;
- Thread B reads x at this moment, finds that it is not in the cache, queries the database, and writes the result into the cache;
- At this point, thread A's transaction has not yet been committed, so the value thread B cached is the old one.
So with "delete the cache first, then update the database", the database and the cache can still become inconsistent.
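A small threaded sketch can reproduce this race. It assumes dictionaries as stand-ins for the database and the cache, and uses two `threading.Event` objects (hypothetical names `cache_deleted` and `cache_refilled`) to force the unlucky ordering that would otherwise depend on timing.

```python
import threading

db = {"x": 1}
cache = {"x": 1}
cache_deleted = threading.Event()
cache_refilled = threading.Event()


def writer_a():
    cache.pop("x", None)       # 1. delete the cached entry
    cache_deleted.set()
    cache_refilled.wait()      # simulate a slow transaction: B runs before the commit
    db["x"] = 2                # 3. the transaction finally commits


def reader_b():
    cache_deleted.wait()
    if "x" not in cache:       # 2. cache miss: read the old value and cache it
        cache["x"] = db["x"]
    cache_refilled.set()


threads = [threading.Thread(target=writer_a), threading.Thread(target=reader_b)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

When both threads finish, the database holds 2 but the cache has been repopulated with the old value 1.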
Update the database first, then delete the cache
Again we use two threads as an example: thread A (updates x) and thread B (reads x).
- Thread A needs to update x from 1 to 2, and first updates the database successfully;
- Thread B needs to read x, but thread A has not yet deleted the cached value;
- Thread B therefore reads the old value 1 from the cache.
However, this window is very short, because thread A deletes the cached value right after updating the database. When another thread reads x afterwards, a cache miss occurs and the latest value is loaded from the database. So this situation has little impact on the business.
We can therefore adopt this scheme to avoid, as far as possible, consistency problems between the database and the cache under concurrency.
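As a minimal sketch of this scheme, the read and write paths might look like the following. The dictionaries standing in for the database and the cache, and the function names `read` and `write`, are assumptions for illustration only.

```python
db = {"x": 1}   # stand-in for the database
cache = {}      # stand-in for the cache


def read(key):
    """Read path: try the cache, fall back to the database and repopulate."""
    if key in cache:
        return cache[key]
    value = db[key]
    cache[key] = value
    return value


def write(key, value):
    """Write path: update the database first, then delete the cached entry."""
    db[key] = value
    cache.pop(key, None)


first = read("x")    # cache miss, loads 1 from the database
write("x", 2)        # database updated, cached entry removed
second = read("x")   # cache miss again, loads the new value 2
```

Deleting instead of updating means the cache is only repopulated when a reader actually asks for the key, which avoids filling it with values nobody reads.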
Next, let's return to the case where the second operation fails. What should we do?
How to ensure double write consistency
How do we keep double writes consistent when the second operation fails?
As analyzed earlier, whether we update the cache or delete it, as long as the second step fails, the database and the cache become inconsistent.
The key is how to guarantee that the second step eventually succeeds.
The first method is a retry mechanism based on a message queue.
Message queue based retry mechanism
Specifically, the failed cache operation (or database operation) is temporarily stored in a queue, and the request is processed again when the queue is consumed.
The process is as follows:
- Request A updates the database first;
- It then tries to delete the entry in Redis, and the deletion fails;
- The Redis delete operation is sent to the message queue as the message body;
- The system receives the message from the message queue and performs the Redis delete operation again.
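The retry flow above can be sketched as follows. This is not a real Redis or message queue client: `FlakyCache` is a hypothetical stand-in whose delete fails a fixed number of times, and an in-process `queue.Queue` stands in for a durable message queue.

```python
import queue


class FlakyCache:
    """Stand-in cache whose delete fails a fixed number of times (demo assumption)."""

    def __init__(self, data, failures):
        self.data = dict(data)
        self.failures = failures

    def delete(self, key):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("cache delete failed")
        self.data.pop(key, None)


db = {"x": 1}
cache = FlakyCache({"x": 1}, failures=2)
retry_queue = queue.Queue()   # stand-in for a durable message queue


def write(key, value):
    db[key] = value
    try:
        cache.delete(key)
    except ConnectionError:
        retry_queue.put(key)   # hand the failed delete over to the queue


def consume_once():
    """Process one message; put it back if the delete fails again."""
    key = retry_queue.get_nowait()
    try:
        cache.delete(key)
    except ConnectionError:
        retry_queue.put(key)   # the message is only dropped after a successful delete
        return False
    return True


write("x", 2)                  # the first delete fails, so the key is enqueued
attempts = 0
while not retry_queue.empty():
    attempts += 1
    consume_once()
```

In a real deployment the consumer would normally add a backoff and an upper bound on retries, which this sketch leaves out.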
Two properties of message queues match what we need for retries:
- Reliability: a message written to the queue is not lost until it has been consumed successfully (so a restart of the application is not a concern);
- Guaranteed delivery: the downstream consumer pulls messages from the queue, and a message is removed only after it has been consumed successfully; otherwise it keeps being redelivered to the consumer (which fits our scenario).
Problems introduced by the queue:
- It is quite intrusive to the business code and increases maintenance cost;
- Writing to the queue can itself fail.
Regarding these two problems: first, most projects already use a message queue, so the maintenance cost does not increase much. Second, the probability that the queue write and the cache operation both fail at the same time is very small.
If you really do not want to handle retries with a queue inside the application, there is another popular solution: subscribe to the database change log and then operate on the cache. After MySQL applies an update, the corresponding operation can be found in the binlog, so we can subscribe to the MySQL binlog and use it to drive cache operations.
For subscribing to change logs, there is mature open source middleware available, such as Alibaba's canal.
The general process is as follows:
- The system modifies the database, which generates a binlog entry;
- canal subscribes to the binlog, obtains the specific operation data, and posts it to the message queue;
- A consumer of the message queue deletes the corresponding data from the cache.
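The consumer side of this flow might look like the sketch below. The event format (dictionaries with `table`, `type`, and `pk` fields) and the `table:pk` cache key layout are assumptions made for illustration; they are not canal's actual message format, and the cache is a plain dictionary.

```python
# Stand-in cache keyed as "<table>:<primary key>" (a hypothetical layout).
cache = {"item:42": {"price": 10}, "item:7": {"price": 3}}


def cache_key(event):
    return f"{event['table']}:{event['pk']}"


def handle_change_event(event):
    """Delete the cached entry for any row that was updated or deleted."""
    if event["type"] in {"UPDATE", "DELETE"}:
        cache.pop(cache_key(event), None)


# Events as they might arrive from the message queue, in binlog order.
events = [
    {"table": "item", "type": "UPDATE", "pk": 42},
    {"table": "item", "type": "INSERT", "pk": 99},
]

for event in events:
    handle_change_event(event)
```

The business code that writes to the database no longer touches the cache at all, which is the main appeal of this approach.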
In summary: we recommend the "update the database first, then delete the cache" scheme, combined with a message queue or a change log subscription, to keep the database and the cache consistent.
How to ensure data consistency in concurrent scenarios
As analyzed earlier, in a concurrent scenario the "delete the cache first, then update the database" scheme can leave the data inconsistent, for example because of network delay.
The core of the problem is that an old value has been planted in the cache.
The most effective fix is to delete the cached entry again. It cannot be deleted immediately, though; the deletion has to be delayed. This is the solution commonly used in the industry: the delayed double delete strategy.
After thread A has updated the value in the database, it sleeps for a short time and then performs another cache delete.
The reason for the sleep is to give thread B time to read the data from the database and write the missing entry to the cache, so that thread A's second delete removes it.
That raises a question: how long should the delay before the second delete be?
Thread A's sleep time needs to be longer than the time thread B takes to read the data and write it to the cache. This has to be estimated from how the actual business behaves at runtime.
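A delayed double delete can be sketched with `threading.Timer`. The dictionaries, the helper names, and the 0.05 second delay are all assumptions for illustration. The steps of thread A and thread B are replayed one after another in a single script so that the outcome is deterministic; in a real system it is the delay itself that has to outlast thread B's read and cache write.

```python
import threading

db = {"x": 1}
cache = {"x": 1}


def delete_cache(key):
    cache.pop(key, None)


def schedule_delayed_delete(key, delay_seconds):
    """Run the second delete on a background timer after delay_seconds."""
    timer = threading.Timer(delay_seconds, delete_cache, args=(key,))
    timer.start()
    return timer


delete_cache("x")                   # A: first delete
stale = db["x"]                     # B: cache miss, reads the old value 1
db["x"] = 2                         # A: database update commits
cache["x"] = stale                  # B: plants the old value in the cache
planted = cache["x"]                # the cache is inconsistent at this point

timer = schedule_delayed_delete("x", delay_seconds=0.05)   # A: delayed second delete
timer.join()                        # wait until the second delete has run
```

Once the timer has fired, the stale entry is gone, and the next read misses the cache and loads 2 from the database.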
There is one more scenario that also leads to inconsistency. If the database uses a read/write splitting architecture, master-slave synchronization lags by some amount of time, which can also cause inconsistency:
- Thread A updates the master, x = 2 (original value x = 1);
- Thread A deletes the cache;
- Thread B queries the cache, misses, queries the slave, and gets the old value (x = 1 on the slave);
- The slave finishes synchronizing (x = 2 on both master and slave);
- Thread B writes the old value to the cache (x = 1).
In the end, x in the cache is the old value 1, while the master and the slave both hold the new value 2. The data is inconsistent.
To solve this, queries like thread B's can be forced to read from the master, or the delayed delete strategy described above can be used.
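The effect of replication lag, and of forcing the read to the master, can be sketched as follows. Three dictionaries stand in for the master (`primary`), the slave (`replica`), and the cache; `replicate` and the `force_primary` flag are hypothetical names used only for this illustration.

```python
primary = {"x": 1}   # stand-in for the master
replica = {"x": 1}   # stand-in for the slave, updated only by replicate()
cache = {"x": 1}


def replicate():
    """Simulate master-slave synchronization catching up."""
    replica.update(primary)


def write(key, value):
    primary[key] = value
    cache.pop(key, None)


def read(key, force_primary=False):
    if key in cache:
        return cache[key]
    source = primary if force_primary else replica
    cache[key] = source[key]
    return cache[key]


write("x", 2)                            # master holds 2, slave still holds 1
stale = read("x")                        # cache miss, reads 1 from the lagging slave
replicate()                              # slave catches up to 2
cached_after_sync = cache["x"]           # the cache still holds the old value 1

write("x", 3)                            # slave lags again and still holds 2
fresh = read("x", force_primary=True)    # a read forced to the master sees 3
```

Forcing every cache-filling read to the master gives up part of the benefit of read/write splitting, so it is usually reserved for data where staleness matters.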
These approaches only keep the data consistent as far as possible. In extreme cases, inconsistency can still occur.
So in practice, the "update the database first, then delete the cache" scheme is still the recommended one. At the same time, keep master-slave replication lag as low as possible to reduce the probability of problems.
Summary
- A cache is introduced to improve application performance.
- Once a cache is introduced, double write consistency between the cache and the database has to be considered. The options are "update the database + update the cache" and "update the database + delete the cache".
- With either option, data consistency cannot be guaranteed if the second step fails. This can be addressed by retrying through a message queue.
- The "update the database + update the cache" option cannot guarantee consistency between the cache and the database under concurrency, and it wastes both cache resources and machine performance, so it is generally not recommended.
- Within the "update the database + delete the cache" option, "delete the cache first, then update the database" still produces inconsistent data under concurrency. The fix is a delayed double delete, but the delay is hard to estimate, so "update the database first, then delete the cache" is recommended.
- With "update the database first, then delete the cache", both steps must succeed. This is achieved with a message queue or a change log subscription; in essence, retries are what guarantee consistency.
- With "update the database first, then delete the cache", read/write splitting combined with master-slave lag also causes inconsistency between the cache and the database. The fix is either to force reads to the master or to use a delayed double delete: send a delayed message to the queue, with the delay chosen from experience, and delete the cache when it is consumed. Master-slave lag should also be kept under control to minimize the probability of inconsistency.