A Redis distributed lock failed, and I could barely keep my temper
2022-07-02 21:36:00 【Hollis Chuang】
Source: https://c1n.cn/OZvGN
Contents
Background
Problem analysis
Solution
Summary
Background
The WeCom (Enterprise WeChat) alert group kept pushing error alerts from the production environment. The core of the error was:
redis setNX error java.lang.NumberFormatException: For input string: "null"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:589)
at java.lang.Long.parseLong(Long.java:631)
......
Tracing the exception showed that it came from the project's custom Redis distributed lock. It appeared suddenly after a recent requirement went live, and along with it some of the business data touched by that requirement became inconsistent.
Problem analysis
As usual, here is the code involved first:
// Aspect
public class RedisLockAspect {
    public void around(ProceedingJoinPoint pjp) throws Throwable {
        String key = "...";
        try {
            // Block until the lock is acquired
            while (!JedisUtil.lock(key, timeOut)) {
                Thread.sleep(10);
            }
            // Execute business logic
            pjp.proceed();
        } finally {
            JedisUtil.unLock(key);
        }
    }
}
That is the aspect for the custom Redis distributed lock. Leaving the details aside, the overall logic looks fine.
Now for the actual locking method:
public class JedisUtil {
    public static boolean lock(String key, long timeOut) {
        long currentTimeMillis = System.currentTimeMillis();
        long newExpireTime = currentTimeMillis + timeOut;
        RedisConnection connection = null;
        try {
            connection = getRedisTemplate().getConnectionFactory().getConnection();
            Boolean setNxResult = connection.setNX(
                    key.getBytes(StandardCharsets.UTF_8),
                    String.valueOf(newExpireTime).getBytes(StandardCharsets.UTF_8));
            // Position 1
            if (setNxResult) {
                expire(key, timeOut, TimeUnit.MILLISECONDS);
                return true;
            }
            // Position 2
            Object objVal = getRedisTemplate().opsForValue().get(key);
            String currentValue = String.valueOf(objVal);
            // Position 3: the exception is thrown by Long.parseLong(currentValue)
            // in the condition below, where currentValue is the string "null"
            if (currentValue != null && Long.parseLong(currentValue) < currentTimeMillis) {
                String oldExpireTime = (String) getAndSet(key, String.valueOf(newExpireTime));
                if (oldExpireTime != null && oldExpireTime.equals(currentValue)) {
                    return true;
                }
            }
        } // catch/finally blocks are not shown in the original listing
        return false;
    }

    public static void unLock(String key) {
        getRedisTemplate().delete(key);
    }
}
Experienced engineers will probably want to swear when they see this code, but set that aside for now and look at where the error occurs first.
The exception shows that currentValue is the string "null". That means objVal in String.valueOf(objVal) was null; in other words, the key had no value in Redis.
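The conversion pitfall can be reproduced without Redis. The following is a minimal sketch, not the project's code: the class and method names (NullStringDemo, faultyParse, safeParse) are made up for illustration, and safeParse is only one possible way to guard the read.

```java
import java.util.OptionalLong;

final class NullStringDemo {

    // Mirrors the faulty check: String.valueOf(Object) never returns null,
    // so the null guard always passes and parseLong receives the text "null".
    static long faultyParse(Object redisValue) {
        String currentValue = String.valueOf(redisValue);
        if (currentValue != null) {
            return Long.parseLong(currentValue); // throws NumberFormatException for "null"
        }
        return -1L;
    }

    // Hypothetical null-safe variant: test the raw object before converting it.
    static OptionalLong safeParse(Object redisValue) {
        if (redisValue == null) {
            return OptionalLong.empty(); // key missing: deleted or expired
        }
        return OptionalLong.of(Long.parseLong(redisValue.toString()));
    }

    public static void main(String[] args) {
        System.out.println("safeParse(null) present: " + safeParse(null).isPresent());
        try {
            faultyParse(null);
        } catch (NumberFormatException e) {
            System.out.println("faultyParse(null) failed: " + e.getMessage());
        }
    }
}
```

Guarding the parse only removes the exception. It does not make the lock correct, which is the point of the rest of the analysis.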
A key has no value in only two cases:
the key was actively deleted
the key expired
Tracing the code upward, the setNX command runs first, and its result setNxResult indicates whether the key was set.
Normally, when setNxResult is false, locking has failed and the code should not continue. In this code, however, it carries on!
I asked the colleagues responsible, who said it was meant to make the lock reentrant... (A quiet grumble: that is not how a reentrant lock works.)
At this point the cause of the failure is already clear. It is the two cases above: the key being actively deleted, or the key expiring.
Suppose two threads, A and B, lock the same key. The two cases play out as follows:
① The key is actively deleted. Thread A finishes the locked logic and the finally block of RedisLockAspect calls unLock, which deletes the key. If that happens between thread B's failed setNX (position 1) and its read (position 2), thread B reads null.
② The key expires. Thread A acquires the lock and sets the expiration, but its business code runs longer than the lock's expiration time and the lock is never renewed, so Redis removes the key. Thread B, again between its failed setNX and its read, gets null.
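The first interleaving can be replayed deterministically in a single thread. This is a sketch under stated assumptions: a plain HashMap stands in for Redis, putIfAbsent stands in for setNX, and the key name, clock value, and class name LockRaceReplay are invented for the example. Scenario ② is the same replay with the removal caused by expiry instead of unLock.

```java
import java.util.HashMap;
import java.util.Map;

final class LockRaceReplay {

    // Replays scenario 1 step by step and returns the resulting error message.
    static String replay() {
        Map<String, String> redis = new HashMap<>(); // in-memory stand-in for Redis
        String key = "demo:lock";                    // hypothetical key
        long now = 1_000L;                           // fixed clock for reproducibility
        long timeOut = 500L;

        // Thread A: setNX succeeds, so A holds the lock.
        boolean aLocked = redis.putIfAbsent(key, String.valueOf(now + timeOut)) == null;

        // Thread B: setNX fails because the key exists, yet the code keeps going.
        boolean bLocked = redis.putIfAbsent(key, String.valueOf(now + timeOut)) == null;

        // Thread A: business logic finishes and the finally block deletes the key.
        redis.remove(key);

        // Thread B: reads the key it just failed to set, which is gone by now.
        Object objVal = redis.get(key);
        String currentValue = String.valueOf(objVal); // the text "null"
        try {
            Long.parseLong(currentValue);
            return "no error (aLocked=" + aLocked + ", bLocked=" + bLocked + ")";
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(replay());
    }
}
```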
Solution
As the code above shows, this is no longer just a Long.parseLong("null") problem. The whole Redis distributed lock implementation is at fault.
The lock is used widely across the project, so the problem is serious. Fixing only Long.parseLong("null") would be like scratching an itch through a boot: it achieves nothing.
In general, custom Redis distributed locks are prone to the following problems:
Releasing the lock after setNX
Atomicity of setNX and expire
Lock expiration
Releasing a lock held by another thread
Reentrancy
Spinning when many acquisitions fail
Lock data synchronization under a master-slave architecture
Measured against the faulty code above, the project's implementation accounts for almost none of them.
The main problems and their solutions are:
Atomicity of setNX and expire: use a Lua script that runs setNX and expire in a single script call, which guarantees atomicity.
Lock expiration: to keep the lock from expiring while it is still in use, renew its expiration time periodically before it expires.
Reentrancy: reentrancy has to be tracked at thread granularity, for example by storing a unique thread id in the lock.
Spinning: following the AQS design in the JDK, enforce a maximum waiting time when acquiring the lock.
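Three of these ideas can be sketched together: writing value and expiry in one atomic step, releasing only a lock you own, and giving up after a maximum wait. The code below is an in-memory model, not a Redis client. FakeRedis, its methods, and the simulated clock are assumptions made for illustration. On a real server the acquire step corresponds to the command SET key token NX PX ttl, and the release step to the widely used Lua script kept here as a string constant.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

final class OwnerTokenLockSketch {

    // Owner-checked release script for a real Redis server (shown only, not executed here).
    static final String RELEASE_SCRIPT =
            "if redis.call('get', KEYS[1]) == ARGV[1] then "
          + "return redis.call('del', KEYS[1]) else return 0 end";

    // In-memory stand-in for Redis. Methods are synchronized to model the
    // atomicity that a single Redis command or Lua script provides.
    static final class FakeRedis {
        private final Map<String, String> values = new HashMap<>();
        private final Map<String, Long> expiresAt = new HashMap<>();

        // Models SET key value NX PX ttlMillis: value and expiry are written together.
        synchronized boolean setIfAbsent(String key, String value, long ttlMillis, long now) {
            evictIfExpired(key, now);
            if (values.containsKey(key)) {
                return false;
            }
            values.put(key, value);
            expiresAt.put(key, now + ttlMillis);
            return true;
        }

        // Models the Lua script: delete only if the stored token matches the caller's.
        synchronized boolean deleteIfEquals(String key, String token, long now) {
            evictIfExpired(key, now);
            if (!token.equals(values.get(key))) {
                return false;
            }
            values.remove(key);
            expiresAt.remove(key);
            return true;
        }

        private void evictIfExpired(String key, long now) {
            Long deadline = expiresAt.get(key);
            if (deadline != null && deadline <= now) {
                values.remove(key);
                expiresAt.remove(key);
            }
        }
    }

    // Bounded wait: retry every 10 ms of simulated time until maxWaitMillis has
    // passed, then give up. Returns the owner token, or null if the lock was not acquired.
    static String tryLock(FakeRedis redis, String key, long ttlMillis,
                          long maxWaitMillis, long startMillis) {
        String token = UUID.randomUUID().toString();
        for (long now = startMillis; now <= startMillis + maxWaitMillis; now += 10) {
            if (redis.setIfAbsent(key, token, ttlMillis, now)) {
                return token;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        FakeRedis redis = new FakeRedis();
        String a = tryLock(redis, "demo:lock", 100, 0, 0);   // A acquires at t=0
        String b = tryLock(redis, "demo:lock", 100, 50, 0);  // B gives up after 50 ms
        String c = tryLock(redis, "demo:lock", 100, 50, 60); // C waits past A's expiry
        System.out.println("b acquired: " + (b != null));
        System.out.println("a can release c's lock: " + redis.deleteIfEquals("demo:lock", a, 110));
        System.out.println("c releases own lock: " + redis.deleteIfEquals("demo:lock", c, 110));
    }
}
```

The random token is what prevents a thread whose lock has already expired from deleting the lock of whoever acquired it next. Renewal and reentrancy are left out of this sketch.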
Each of these problems and its solution is well covered online (a Baidu search turns up plenty of references), so I won't go into them here.
The more mature all-in-one solution today is the Redisson client. Here is a simple pseudocode demo:
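import java.util.concurrent.TimeUnit;
import org.aspectj.lang.ProceedingJoinPoint;
import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;
import org.springframework.beans.factory.annotation.Autowired;

public class RedisLockAspect {
    @Autowired
    private RedissonClient redisson;

    public Object around(ProceedingJoinPoint pjp) throws Throwable {
        String key = "...";
        long waitTime = 3000L;
        // Get the lock
        RLock lock = redisson.getLock(key);
        boolean lockSuccess = false;
        try {
            // Wait at most waitTime ms, which prevents infinite spinning. No lease time
            // is given, so the watchdog is enabled by default (the lock is renewed automatically)
            lockSuccess = lock.tryLock(waitTime, TimeUnit.MILLISECONDS);
            if (!lockSuccess) {
                // Do not run the business logic without the lock
                throw new IllegalStateException("Failed to acquire lock: " + key);
            }
            // Execute business logic
            return pjp.proceed();
        } finally {
            // Unlock only a lock this thread holds, so another thread's lock is never released
            if (lockSuccess && lock.isHeldByCurrentThread()) {
                lock.unlock();
            }
        }
    }
}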
Redisson quickly fixes the distributed lock problems in the current project. It also provides an implementation of RedLock, the solution to lock problems caused by data synchronization under a Redis master-slave architecture.
See the official documentation for more:
https://github.com/liulongbiao/redisson-doc-cn
Summary
Redis is far from the only way to implement a distributed lock. ZooKeeper and etcd, for example, can serve as well.
The goal is the same in every case. What matters is using these solutions safely and correctly so that the business keeps running normally.
For an R&D team facing problems like this, engineers need training and continuous technical growth, and code review deserves more attention so that risks are caught early and costly failures are avoided (repairing the dirty data from this incident took more than a week).
Respect the technology, stay true to the business.