How to ensure cache and database consistency
2022-07-26 16:52:00 【Tao Ge is still fly】
Association between cache and database
As the business grows, the number of requests to your project keeps increasing. If every read goes to the database, performance problems are bound to appear.
The usual practice at this stage is to introduce a cache to improve read performance. But how should it be used?
The figure shows the general process of our request

The specific process:
- Write requests still write only to the database
- Read requests read the cache first; if the data is not in the cache, they read it from the database and rebuild the cache
- When writing data to the cache, set an expiration time
This way, infrequently accessed data in the cache gradually expires and is evicted over time. What finally remains in the cache is frequently accessed hot data, so cache utilization is maximized.
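The read path above can be sketched roughly as follows. This is only a minimal illustration: `TTLCache` and `read_with_cache` are hypothetical names, the cache is an in-memory stand-in rather than a real cache server, and the database is assumed to be a plain dict.

```python
import time


class TTLCache:
    """A tiny in-memory cache with per-key expiration (stand-in for a real cache)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are evicted lazily, on access.
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)


def read_with_cache(key, cache, database, ttl_seconds=60):
    """Read the cache first; on a miss, read the database and rebuild the cache."""
    value = cache.get(key)
    if value is not None:
        return value
    value = database.get(key)  # `database` is assumed to be a dict here
    if value is not None:
        cache.set(key, value, ttl_seconds)
    return value
```

Note that because writes go only to the database in this flow, a cached value can stay stale until its expiration time passes.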
Data consistency issues (non-concurrent)
When data is updated, we have to operate on the cache as well as the database. Specifically, when a piece of data is modified, we update the database and also update the cache.
But updating the database and the cache raises a question of order. There are 2 corresponding schemes:
- Update the cache first, then update the database
- Update the database first, then update the cache
Leaving concurrency aside for now: under normal circumstances, whichever comes first, the two stay consistent. What we need to focus on are the abnormal cases.
Because the operation is split into two steps, it is quite possible for the first step to succeed and the second step to fail.
Update the cache first, then update the database
If the cache update succeeds but the database update fails, the cache holds the latest value while the database holds the old value.
Read requests can still hit the cache at this point and get the correct value. But once the cache entry expires, the old value is read from the database, and the cache is rebuilt with the old value too.
Users then find that the data they modified has changed back, which affects the business.
Update the database first, then update the cache
If the database update succeeds but the cache update fails, the database holds the latest value while the cache holds the old value.
Subsequent read requests get the old data. Only after the cache entry expires can the correct value be read from the database.
Users then find that the data they just modified has not changed, and only after a while does the change show up. This also affects the business.
Ensuring that the second step succeeds is the key to solving the problem
Retry after a failure until it succeeds. To avoid tying up too many resources, use asynchronous retry: write the retry request to a message queue and let a dedicated consumer retry until it succeeds. Or, more directly, to sidestep a failure in the second step, we can put the cache operation itself straight into the message queue and let the consumer operate on the cache.
- The message queue guarantees reliability: a message written to the queue is not lost before it is successfully consumed (so a project restart is not a concern)
- The message queue guarantees delivery: downstream consumers pull messages from the queue, and a message is deleted only after it has been consumed successfully; otherwise it keeps being redelivered to the consumer (which fits our scenario)
As for failures when writing to the queue and the maintenance cost of the message queue:
- Queue write failure: the probability that operating the cache and writing to the message queue both fail at the same time is actually very small
- Maintenance cost: message queues are already commonly used in our projects, so the maintenance cost does not increase much
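The asynchronous retry idea can be sketched like this. It is a simplified illustration under several assumptions: the standard-library `queue.Queue` stands in for a real message queue (it is in-process and not durable), `FlakyCache` is a hypothetical cache client that fails a fixed number of times, and the function names are made up for the example.

```python
import queue


class FlakyCache:
    """Hypothetical cache client whose first `failures` writes raise ConnectionError."""

    def __init__(self, failures):
        self.failures = failures
        self.data = {}

    def set(self, key, value):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("cache unavailable")
        self.data[key] = value


def update_with_async_retry(key, value, database, cache, retry_queue):
    """Update the database, then the cache; enqueue a retry if the cache step fails."""
    database[key] = value
    try:
        cache.set(key, value)
    except ConnectionError:
        retry_queue.put((key, value))


def drain_retries(cache, retry_queue, max_attempts=5):
    """Consumer: re-apply queued cache operations; return how many succeeded."""
    applied = 0
    while True:
        try:
            key, value = retry_queue.get_nowait()
        except queue.Empty:
            return applied
        for _ in range(max_attempts):
            try:
                cache.set(key, value)
            except ConnectionError:
                continue
            applied += 1
            break
        else:
            # Not consumed successfully: keep the message for a later run.
            retry_queue.put((key, value))
            return applied
```

In a real deployment the consumer would run as a separate process and the queue would persist messages across restarts; the sketch only shows the control flow.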
Another approach: subscribe to the database change log, then operate on the cache
Take MySQL as an example. When a piece of data is modified, MySQL generates a change log (Binlog). We can subscribe to this log, obtain the specific data that was changed, and then delete the corresponding cache entry based on that data.
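A subscriber of this kind might look roughly like the sketch below. The event shape (a dict with `table`, `primary_key`, and `type`) and the `table:id` cache key scheme are assumptions made purely for illustration; they are not the Binlog format, and a real setup would obtain events from a Binlog subscription tool rather than from a Python list.

```python
def cache_key_for(event):
    """Build the cache key for a changed row (hypothetical "table:id" scheme)."""
    return f"{event['table']}:{event['primary_key']}"


def apply_change_events(events, cache):
    """Delete the cache entry for every row-level change event.

    `events` is any iterable of dicts such as
    {"table": "user", "primary_key": 42, "type": "update"}.
    Returns the list of cache keys that were invalidated.
    """
    invalidated = []
    for event in events:
        if event["type"] in ("insert", "update", "delete"):
            key = cache_key_for(event)
            cache.pop(key, None)  # deleting a missing key is harmless
            invalidated.append(key)
    return invalidated
```

With this approach the application code only writes to the database; cache invalidation is driven entirely by the change log.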
Consistency problems under concurrency
Suppose we adopt the "update the database first, then update the cache" scheme, and both steps always execute successfully. What happens under concurrency?
Thread A and thread B both need to update the same piece of data. The following can happen:
- Thread A updates the database (X = 1)
- Thread B updates the database (X = 2)
- Thread B updates the cache (X = 2)
- Thread A updates the cache (X = 1)
In the end, X is 1 in the cache and 2 in the database, so the two are inconsistent.
In other words, although A started before B, B spent less time operating on the database and the cache than A did, so the execution order was interleaved and the final result is not what was expected. The "update the cache first, then update the database" scheme has a similar problem.
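The interleaving of threads A and B can be replayed deterministically to see the outcome. This is only a toy model: `replay` is a hypothetical helper, and each step simply overwrites the value held by the database or the cache.

```python
def replay(steps):
    """Apply (thread, target, value) steps in order and return the final state."""
    state = {"database": None, "cache": None}
    for _thread, target, value in steps:
        state[target] = value
    return state


# The interleaved order from the list above.
interleaved = [
    ("A", "database", 1),
    ("B", "database", 2),
    ("B", "cache", 2),
    ("A", "cache", 1),
]

# The order that would result if A finished completely before B started.
serial = [
    ("A", "database", 1),
    ("A", "cache", 1),
    ("B", "database", 2),
    ("B", "cache", 2),
]

print(replay(interleaved))  # database and cache disagree
print(replay(serial))       # database and cache agree
```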
In addition, if the cache is updated every time the data changes but the cached data is not necessarily read soon afterwards, the cache ends up holding a lot of infrequently accessed data, which wastes cache resources.
So at this point we need to consider another scheme: deleting the cache.
Can deleting the cache ensure consistency?
There are also 2 schemes for deleting the cache:
- Delete the cache first, then update the database
- Update the database first, then delete the cache
As before, if the second step fails, the data becomes inconsistent.
So here we look at the concurrent case:
Delete the cache first, then update the database
If 2 threads read and write the data concurrently, the following scenario can occur:
- Thread A wants to update X = 2 (original value X = 1)
- Thread A deletes the cache first
- Thread B reads the cache, finds it empty, and reads the old value from the database (X = 1)
- Thread A writes the new value to the database (X = 2)
- Thread B writes the old value to the cache (X = 1)
In the end, X is 1 in the cache (the old value) and 2 in the database (the new value), so the two are inconsistent.
So with "delete the cache first, then update the database", data inconsistency still occurs when a read and a write run concurrently.
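The read + write interleaving above can be written out step by step in a single thread, which makes the outcome easy to check. The function name is hypothetical, and dicts stand in for the database and the cache.

```python
def delete_then_update_race():
    """Replay the A/B interleaving and return (database value, cache value)."""
    database = {"X": 1}
    cache = {"X": 1}

    # Thread A wants to set X = 2 and deletes the cache first.
    cache.pop("X", None)

    # Thread B misses the cache and reads the old value from the database.
    b_value = cache.get("X")
    if b_value is None:
        b_value = database["X"]

    # Thread A writes the new value to the database.
    database["X"] = 2

    # Thread B rebuilds the cache with the value it read earlier.
    cache["X"] = b_value

    return database["X"], cache["X"]
```

Calling `delete_then_update_race()` returns `(2, 1)`: the database has the new value and the cache has the old one.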
Update the database first, then delete the cache
- X does not exist in the cache (database X = 1)
- Thread A reads the database and gets the old value (X = 1)
- Thread B updates the database (X = 2)
- Thread B deletes the cache
- Thread A writes the old value to the cache (X = 1)
In the end, X is 1 in the cache (the old value) and 2 in the database (the new value), so inconsistency occurs here too.
For this to happen, three conditions must hold at the same time:
- The cache entry has just expired
- A read request and a write request run concurrently
- Updating the database plus deleting the cache (steps 3-4) takes less time than reading the database plus writing the cache (steps 2 and 5)
In practice the probability of this is very low, because writing to the database usually requires taking a lock first, so a database write usually takes longer than a database read.
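Outside that narrow window, the "update the database first, then delete the cache" scheme behaves as in the following sketch. The `write` and `read` helpers are hypothetical, and dicts again stand in for the database and the cache.

```python
def write(key, value, database, cache):
    """Write path: update the database first, then delete the cache entry."""
    database[key] = value
    cache.pop(key, None)


def read(key, database, cache):
    """Read path: use the cache on a hit; on a miss, read the database and rebuild."""
    if key in cache:
        return cache[key]
    value = database[key]
    cache[key] = value
    return value


database = {"X": 1}
cache = {}

first = read("X", database, cache)   # miss: loads X = 1 and caches it
write("X", 2, database, cache)       # database now holds 2, cache entry removed
second = read("X", database, cache)  # miss again: loads X = 2 and caches it
```

The next read after a write always misses the cache and reloads from the database, which is why the stale value can only appear when a reader's write-back lands after the writer's delete.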
Master-slave replication delay and delayed double deletion
Question 1: In both cases above, an old value can be written back into the cache.
Question 2: With the "update the database first, then delete the cache" scheme, read/write separation plus master-slave replication delay can also lead to inconsistency:
- Thread A updates the master (X = 2, original value X = 1)
- Thread A deletes the cache
- Thread B queries the cache, misses, and reads the old value from the slave (slave X = 1)
- The slave finishes synchronizing (master and slave X = 2)
- Thread B writes the old value to the cache (X = 1)
In the end, X is 1 in the cache (the old value) and 2 in both master and slave (the new value), so inconsistency occurs here too.
Solving the first problem: after thread A deletes the cache and updates the database, it sleeps for a while and then deletes the cache again.
Solving the second problem: thread A produces a delayed message and writes it to the message queue, and a consumer deletes the cache after the delay.
Both approaches aim to clear the cache, so that the next read fetches the latest value from the database and writes it to the cache.
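Delayed double deletion can be sketched as below. This is an illustration under assumptions: `threading.Timer` stands in for the sleep or the delayed message, dicts stand in for the database and the cache, and the function name and the `delay_seconds` default are made up. The delay presumably needs to be longer than the time a concurrent reader takes to read the old value and write it back (and longer than the replication lag when reading from a slave).

```python
import threading


def update_with_delayed_double_delete(key, value, database, cache, delay_seconds=0.5):
    """Delete the cache, update the database, then delete the cache again later.

    Returns the timer so callers can wait for (join) or cancel the second delete.
    """
    cache.pop(key, None)
    database[key] = value

    # Second delete: clears any old value a concurrent reader wrote back meanwhile.
    timer = threading.Timer(delay_seconds, cache.pop, args=(key, None))
    timer.daemon = True
    timer.start()
    return timer
```

A caller could simulate the stale write-back by setting `cache[key]` to the old value right after the call and then calling `timer.join()`; once the timer has fired, the entry is gone and the next read reloads the latest value from the database.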