[Distributed Systems, Advanced] Let's fill in the pitfalls of Redis distributed locks
2022-08-04 18:30:00 【TodoCoder】
Hi everyone, I'm Coder哥. I went quiet for a long while preparing for interviews; now that they are over, let's talk today about the pitfalls of Redis-based locks. This article is a fairly thorough analysis, so remember to like and bookmark it!
In distributed systems development, the distributed lock is a basic skill we have to master. There are many ways to implement one: Redis, ZooKeeper, MySQL, etcd, and so on. The most common is Redis, because it is fast and convenient. But many of the Redis-based implementations I have seen are flawed to some degree, so today we'll go through the pitfalls of implementing a distributed lock with Redis.
Taking the most common Redis implementation as the example, this article looks at the code level and the distributed architecture level in turn, and walks step by step through the pitfalls in a distributed lock implementation. It should help you avoid them, teach you the thinking behind distributed locks, and let you handle interview questions on the topic with ease.
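Code implementation level
Distributed lock evolution: stage 1
We use Redis's setnx command to acquire the distributed lock. setnx(key) means:
- If the key does not exist, setnx stores the key in Redis (lock acquired) and returns true.
- If the key already exists, someone has already locked it (lock not acquired), and it returns false.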
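The two setnx outcomes above can be tried without a Redis server. The following is a minimal sketch, assuming an in-memory map as a stand-in for Redis: `SetNxSketch` is a hypothetical class written for illustration, and `ConcurrentHashMap.putIfAbsent` plays the role of setnx because it also stores a value only when the key is absent.

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustrative in-memory stand-in for Redis (an assumption for this sketch, not a Redis client).
class SetNxSketch {
    private final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();

    // Mirrors SETNX: stores the value and returns true only if the key is absent.
    boolean setIfAbsent(String key, String value) {
        return store.putIfAbsent(key, value) == null;
    }

    // Mirrors DEL: returns true if a key was actually removed.
    boolean delete(String key) {
        return store.remove(key) != null;
    }

    public static void main(String[] args) {
        SetNxSketch redis = new SetNxSketch();
        System.out.println(redis.setIfAbsent("good1_lock", "a")); // true: first caller gets the lock
        System.out.println(redis.setIfAbsent("good1_lock", "b")); // false: the key already exists
        redis.delete("good1_lock");
        System.out.println(redis.setIfAbsent("good1_lock", "b")); // true: free again after release
    }
}
```

The lock is nothing more than the presence of the key, which is why everything that follows is about who may remove the key and when.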
Let's look at the code:
public String redis1() {
    // Every instance locks the same key "good1_lock"; the value is randomly generated
    String uuid = UUID.randomUUID().toString();
    // Claim the lock with SETNX. This happens before the try block so that a failed
    // attempt never reaches the finally block and deletes another instance's lock.
    Boolean good_lock = stringRedisTemplate.opsForValue().setIfAbsent("good1_lock", uuid);
    if (!Boolean.TRUE.equals(good_lock)) {
        return "Failed to acquire lock";
    }
    try {
        // Lock acquired, run the business logic:
        // read the stock of product 1 and decrement it
        String goods1 = stringRedisTemplate.opsForValue().get("good1");
        Integer goods1num = Strings.isNullOrEmpty(goods1) ? 0 : Integer.parseInt(goods1);
        if (goods1num > 0) {
            int realNum = goods1num - 1;
            stringRedisTemplate.opsForValue().set("good1", String.valueOf(realNum));
            return "Purchase succeeded";
        }
        return "Out of stock, purchase failed";
    } finally {
        // Release the lock
        stringRedisTemplate.delete("good1_lock");
    }
}
What is wrong with this code? Imagine this scenario: a service acquires the lock and then loses power while executing the business logic. The lock is never released, so the key stays in Redis and every instance that later tries to acquire the same lock will never get it.
Problem: setnx has claimed the lock, but the process goes down (for example, a power failure) during the business logic and never runs the delete-lock code, which causes a deadlock.
Solution: give the lock an expiration time, and make claiming the lock and setting the expiration a single atomic step. Redis supports this with SET key value NX EX seconds.
Distributed lock evolution: stage 2
In the code above, if the machine suddenly goes down while the program is running, execution never reaches the finally block. In other words, the lock was not deleted before the crash, so unlocking cannot be guaranteed. We therefore need to give the key an expiration time. There are two ways to set an expiration time in Redis:
stringRedisTemplate.expire("good1_lock", 30, TimeUnit.SECONDS)
stringRedisTemplate.opsForValue().setIfAbsent("good1_lock", uuid, 30, TimeUnit.SECONDS)
The first way sets the expiration time separately: you first call setIfAbsent("good1_lock", uuid) and then set the expiration. That is two steps, so it is not atomic and can still go wrong. For example, if the process goes down right after setting the lock and before setting the expiration, the lock again stays in Redis forever.
The second way sets the expiration time in the same operation that takes the lock, so it does not have this problem. We use this approach here. The code changes as follows:
// Claim the lock with SETNX
Boolean good_lock = stringRedisTemplate.opsForValue().setIfAbsent("good1_lock", uuid);
becomes
// Claim the lock and set a 10-second expiration in one atomic command
Boolean good_lock = stringRedisTemplate.opsForValue().setIfAbsent("good1_lock", uuid, 10, TimeUnit.SECONDS);
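To make the difference between the two ways concrete, here is a small sketch that replays a crash in each. It assumes an in-memory model of Redis with a manually advanced clock; `ExpirySketch`, `nowSeconds`, and the method names are hypothetical and exist only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory model of Redis keys with optional expiry (an illustrative stand-in, not a Redis client).
class ExpirySketch {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>();
    long nowSeconds = 0; // manually advanced clock

    private void evictIfExpired(String key) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && nowSeconds >= deadline) {
            values.remove(key);
            expiresAt.remove(key);
        }
    }

    // Step 1 of the two-step approach: SETNX with no expiration.
    boolean setIfAbsent(String key, String value) {
        evictIfExpired(key);
        return values.putIfAbsent(key, value) == null;
    }

    // Step 2 of the two-step approach: EXPIRE.
    void expire(String key, long ttlSeconds) {
        if (values.containsKey(key)) {
            expiresAt.put(key, nowSeconds + ttlSeconds);
        }
    }

    // One-step approach: claim the key and set its expiration in a single call.
    boolean setIfAbsent(String key, String value, long ttlSeconds) {
        if (!setIfAbsent(key, value)) {
            return false;
        }
        expiresAt.put(key, nowSeconds + ttlSeconds);
        return true;
    }

    public static void main(String[] args) {
        ExpirySketch twoStep = new ExpirySketch();
        twoStep.setIfAbsent("good1_lock", "instance-1"); // crash here: expire() is never called
        twoStep.nowSeconds = 60;
        System.out.println(twoStep.setIfAbsent("good1_lock", "instance-2")); // false: the lock is stuck

        ExpirySketch oneStep = new ExpirySketch();
        oneStep.setIfAbsent("good1_lock", "instance-1", 10); // crash after this call
        oneStep.nowSeconds = 60;
        System.out.println(oneStep.setIfAbsent("good1_lock", "instance-2")); // true: the lock expired
    }
}
```

The model is single-threaded, so "atomic" here simply means that no crash point exists between claiming the key and setting its expiration.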
This solves the problem of the lock never being deleted when the server goes down. But look at the code again and there is still a problem. Consider this scenario: instance 1 acquires the lock good1_lock first, and its business logic takes a little long, say 15s. At the 10s mark Redis expires the lock, and instance 2 then acquires good1_lock. When instance 1 finishes, it runs its release step and deletes good1_lock, but that lock now belongs to instance 2. In other words, because instance 1 ran too long, it released instance 2's lock, which is a serious problem.
Problem: because it ran too long, instance 1 released instance 2's lock.
Solution: only whoever took the lock may delete it. We can check this with the uuid.
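The timeline above (lock at 0s, expiry at 10s, instance 1 finishing at 15s) can be replayed step by step. This is a sketch under the assumption of an in-memory Redis model with a manual clock; `WrongReleaseSketch` and `replay` are hypothetical names, and the `checkOwner` flag switches between the unconditional delete and the "only the owner deletes" rule.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory model of a lock key with a TTL (illustrative only, not a Redis client).
class WrongReleaseSketch {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>();
    long nowSeconds = 0; // manually advanced clock

    private void evictIfExpired(String key) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && nowSeconds >= deadline) {
            values.remove(key);
            expiresAt.remove(key);
        }
    }

    boolean tryLock(String key, String owner, long ttlSeconds) {
        evictIfExpired(key);
        if (values.putIfAbsent(key, owner) != null) {
            return false;
        }
        expiresAt.put(key, nowSeconds + ttlSeconds);
        return true;
    }

    String get(String key) {
        evictIfExpired(key);
        return values.get(key);
    }

    void delete(String key) {
        values.remove(key);
        expiresAt.remove(key);
    }

    // Replays the scenario from the text and returns who holds the lock after instance 1 finishes.
    static String replay(boolean checkOwner) {
        WrongReleaseSketch redis = new WrongReleaseSketch();
        redis.tryLock("good1_lock", "instance-1", 10); // t = 0s: instance 1 locks
        redis.nowSeconds = 10;                          // t = 10s: the TTL elapses
        redis.tryLock("good1_lock", "instance-2", 10); // instance 2 locks
        redis.nowSeconds = 15;                          // t = 15s: instance 1 finishes
        if (!checkOwner || "instance-1".equals(redis.get("good1_lock"))) {
            redis.delete("good1_lock");
        }
        return redis.get("good1_lock");
    }

    public static void main(String[] args) {
        System.out.println(replay(false)); // null: instance 2's lock was deleted
        System.out.println(replay(true));  // instance-2: the owner check protects it
    }
}
```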
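Distributed lock evolution: stage 3
To fix the problem in the stage 2 code, we use the uuid to make sure that only whoever took the lock can delete it.
The code is as follows: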
public String redis3() {
    // Every instance locks the same key "good1_lock"; the value is randomly generated
    String uuid = UUID.randomUUID().toString();
    try {
        // Acquire the lock with a 10-second expiration
        Boolean good_lock = stringRedisTemplate.opsForValue().setIfAbsent("good1_lock", uuid, 10, TimeUnit.SECONDS);
        if (!Boolean.TRUE.equals(good_lock)) {
            return "Failed to acquire lock";
        }
        // Lock acquired, run the business logic:
        // read the stock of product 1 and decrement it
        String goods1 = stringRedisTemplate.opsForValue().get("good1");
        Integer goods1num = Strings.isNullOrEmpty(goods1) ? 0 : Integer.parseInt(goods1);
        if (goods1num > 0) {
            int realNum = goods1num - 1;
            stringRedisTemplate.opsForValue().set("good1", String.valueOf(realNum));
            return "Purchase succeeded";
        }
        return "Out of stock, purchase failed";
    } finally {
        // Release the lock: only whoever took the lock may delete it
        if (uuid.equals(stringRedisTemplate.opsForValue().get("good1_lock"))) {
            stringRedisTemplate.delete("good1_lock");
        }
    }
}
The code above ensures that only whoever took the lock can delete it. But the check in the finally block and the del operation are not one atomic operation, so under concurrency there is still a consistency problem. For example, the check has just confirmed that the value is ours, and right as we are about to delete the lock it expires and another instance acquires it; the delete then removes someone else's lock. So the check and the delete must be atomic.
Problem: the check and the delete are two separate operations, not atomic, so there is a consistency problem.
Solution: Redis executes a Lua script atomically, so we can put the check and the delete into one Lua script to make the two steps a single atomic operation.
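The gap between the check and the delete can be shown with a small in-memory sketch. It is an analogy, not Redis: `CompareAndDeleteSketch` is a hypothetical class, the `betweenSteps` hook models another client acting inside the gap, and `ConcurrentHashMap.remove(key, value)` stands in for the Lua script because it also removes the key only if it still holds the expected value, in one atomic step.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory analogy for "check then delete" versus an atomic compare-and-delete.
class CompareAndDeleteSketch {
    final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();

    // Non-atomic release: GET, then DEL. betweenSteps runs in the gap to model another client.
    boolean releaseInTwoSteps(String key, String owner, Runnable betweenSteps) {
        if (!owner.equals(store.get(key))) {
            return false;
        }
        betweenSteps.run();
        return store.remove(key) != null;
    }

    // Atomic release: removes the key only if it still maps to owner.
    boolean releaseAtomically(String key, String owner) {
        return store.remove(key, owner);
    }

    public static void main(String[] args) {
        CompareAndDeleteSketch redis = new CompareAndDeleteSketch();
        redis.store.put("good1_lock", "instance-1");
        // Between GET and DEL, the lock expires and instance 2 acquires it.
        redis.releaseInTwoSteps("good1_lock", "instance-1",
                () -> redis.store.put("good1_lock", "instance-2"));
        System.out.println(redis.store.get("good1_lock")); // null: instance 2's lock was deleted

        redis.store.put("good1_lock", "instance-2");
        System.out.println(redis.releaseAtomically("good1_lock", "instance-1")); // false
        System.out.println(redis.store.get("good1_lock")); // instance-2
    }
}
```

With the atomic variant there is no point at which another client can slip in between the comparison and the removal, which is the property the Lua script provides on the Redis side.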
Distributed lock evolution: stage 4
Building on the stage 3 code, we make one improvement: deleting the lock must be atomic. We do this with Redis plus a Lua script. The code is as follows:
public String redis4() {
    // Every instance locks the same key "good1_lock"; the value is randomly generated
    String uuid = UUID.randomUUID().toString();
    try {
        // Acquire the lock with a 10-second expiration
        Boolean good_lock = stringRedisTemplate.opsForValue().setIfAbsent("good1_lock", uuid, 10, TimeUnit.SECONDS);
        if (!Boolean.TRUE.equals(good_lock)) {
            return "Failed to acquire lock";
        }
        // Lock acquired, run the business logic:
        // read the stock of product 1 and decrement it
        String goods1 = stringRedisTemplate.opsForValue().get("good1");
        Integer goods1num = Strings.isNullOrEmpty(goods1) ? 0 : Integer.parseInt(goods1);
        if (goods1num > 0) {
            int realNum = goods1num - 1;
            stringRedisTemplate.opsForValue().set("good1", String.valueOf(realNum));
            return "Purchase succeeded";
        }
        return "Out of stock, purchase failed";
    } finally {
        try {
            // Release the lock: only whoever took the lock may delete it.
            // The check and the delete run atomically inside one Lua script.
            String script =
                    "if redis.call('get', KEYS[1]) == ARGV[1] then " +
                    "return redis.call('del', KEYS[1]) " +
                    "else " +
                    "return 0 " +
                    "end";
            // The script returns an integer, so the result type is declared as Long
            RedisScript<Long> redisScript = new DefaultRedisScript<>(script, Long.class);
            // execute takes the keys as a list and the script arguments as varargs
            Long delResult = stringRedisTemplate.execute(redisScript,
                    Collections.singletonList("good1_lock"), uuid);
            if (Long.valueOf(1L).equals(delResult)) {
                System.out.println("del redis lock success !");
            } else {
                System.out.println("del redis lock fail !");
            }
        } catch (Exception e) {
            // Do not swallow the failure silently; the lock will still expire on its own
            e.printStackTrace();
        }
    }
}
By stage 4 we have made both locking and unlocking atomic, and made sure that only the lock's owner can delete it. But there are still scenarios to consider, for example:
After locking, if the expiration time is reached, Redis removes the lock automatically. If the business logic has not finished by then, the lock has been released early, which is also a problem.
Distributed lock evolution: stage 5
The simple measures used so far cannot handle this scenario, so let's think about how to deal with it:
The key point of the problem is that the lock is removed automatically, so the way to solve it is to renew the lock automatically. Concretely, we can start a thread that monitors the lock's expiration: when the lock is about to expire, it checks whether the business logic has released it. If the lock has been released, it does nothing; if not, it extends the lock's expiration time. That solves the problem. This logic is hard to implement yourself, but Redisson, a Java client for Redis, handles it for us, so we can use it directly. The code is as follows:
@Autowired
private StringRedisTemplate stringRedisTemplate;
@Autowired
private Redisson redisson;
/**
 * Use Redisson to handle all of the problems above
 * @return
 */
public String redis5() {
    RLock lock = redisson.getLock("good1_lock");
    // Acquire the lock without an explicit lease time so that the watchdog keeps renewing it.
    // lock.lock(10, TimeUnit.SECONDS) would fix the lease at 10 seconds and disable the renewal.
    lock.lock();
    try {
        // Lock acquired, run the business logic:
        // read the stock of product 1 and decrement it
        String goods1 = stringRedisTemplate.opsForValue().get("good1");
        Integer goods1num = Strings.isNullOrEmpty(goods1) ? 0 : Integer.parseInt(goods1);
        if (goods1num > 0) {
            int realNum = goods1num - 1;
            stringRedisTemplate.opsForValue().set("good1", String.valueOf(realNum));
            return "Purchase succeeded";
        }
        return "Out of stock, purchase failed";
    } finally {
        // Release the lock
        lock.unlock();
    }
}
Redisson does this through its watchdog mechanism: when a lock is acquired without an explicit lease time, Redisson starts a background watchdog task that monitors the lock in Redis. If the lock has been released, it does nothing; if not, it extends the lock's expiration time. I won't go into Redisson in more detail here; if you're interested, look it up.
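The renewal idea can be sketched in plain Java. This is a hypothetical illustration of lease renewal, not Redisson's implementation: `WatchdogSketch`, its injected clock, and the choice to renew every third of the lease are all assumptions made for the example.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical lease-renewal sketch; the lock lives in memory instead of Redis.
class WatchdogSketch {
    private final LongSupplier clockMillis;
    private final long leaseMillis;
    private String owner; // null when the lock was never taken or was released
    private long expiresAtMillis;

    WatchdogSketch(LongSupplier clockMillis, long leaseMillis) {
        this.clockMillis = clockMillis;
        this.leaseMillis = leaseMillis;
    }

    synchronized boolean tryLock(String candidate) {
        if (owner != null && clockMillis.getAsLong() < expiresAtMillis) {
            return false; // still held and not yet expired
        }
        owner = candidate;
        expiresAtMillis = clockMillis.getAsLong() + leaseMillis;
        return true;
    }

    // One watchdog tick: extend the lease only if candidate still holds an unexpired lock.
    synchronized boolean renewIfHeld(String candidate) {
        if (!candidate.equals(owner) || clockMillis.getAsLong() >= expiresAtMillis) {
            return false;
        }
        expiresAtMillis = clockMillis.getAsLong() + leaseMillis;
        return true;
    }

    synchronized void unlock(String candidate) {
        if (candidate.equals(owner)) {
            owner = null;
        }
    }

    // Runs renewIfHeld every leaseMillis / 3 until the returned future is cancelled.
    ScheduledFuture<?> startWatchdog(ScheduledExecutorService scheduler, String candidate) {
        long period = Math.max(1, leaseMillis / 3);
        return scheduler.scheduleAtFixedRate(
                () -> renewIfHeld(candidate), period, period, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        WatchdogSketch lock = new WatchdogSketch(System::currentTimeMillis, 300);
        lock.tryLock("instance-1");
        ScheduledFuture<?> watchdog = lock.startWatchdog(scheduler, "instance-1");
        Thread.sleep(900); // business logic that outlives the 300 ms lease
        System.out.println("instance-2 acquired while busy: " + lock.tryLock("instance-2"));
        watchdog.cancel(false);
        lock.unlock("instance-1");
        scheduler.shutdown();
        System.out.println("instance-2 acquired after unlock: " + lock.tryLock("instance-2"));
    }
}
```

Renewing well before the lease runs out leaves room for a late tick. If the process holding the lock dies, the ticks stop with it and the lock still expires on its own, so the deadlock from stage 1 does not come back.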
Distributed architecture level
We are almost at the end. We started from the basic implementation and analyzed it step by step up to its final form. Do you think everything is fine now and you can rest easy? Not so fast. Everything above works at the code level, and a problem that code can solve is not really a problem. As programmers who like to dig deeper, we also need to look at distributed scenarios from the architecture side, as follows:
As is well known, Redis follows the AP model, and replication within a cluster is asynchronous, which can leave data inconsistent. For example: the master node has just set good1_lock and goes down before it can sync the key to the other nodes, so the lock is lost.
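The failover scenario can be modeled with two maps. This is a deliberately simplified sketch, assuming one master and one replica held in memory; `AsyncReplicationSketch`, `replicate`, and `failover` are hypothetical names and do not correspond to Redis commands.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of asynchronous master-to-replica replication (not a Redis client).
class AsyncReplicationSketch {
    private final Map<String, String> master = new HashMap<>();
    private final Map<String, String> replica = new HashMap<>();

    // The write is acknowledged as soon as the master has it; the replica is updated later.
    boolean setIfAbsent(String key, String value) {
        return master.putIfAbsent(key, value) == null;
    }

    // The asynchronous sync step, which may or may not run before a crash.
    void replicate() {
        replica.putAll(master);
    }

    // The master is lost and the replica is promoted with whatever data it has.
    void failover() {
        master.clear();
        master.putAll(replica);
    }

    public static void main(String[] args) {
        AsyncReplicationSketch redis = new AsyncReplicationSketch();
        boolean first = redis.setIfAbsent("good1_lock", "instance-1");
        redis.failover(); // the master dies before replicate() runs
        boolean second = redis.setIfAbsent("good1_lock", "instance-2");
        System.out.println(first + " " + second); // true true: both instances think they hold the lock
    }
}
```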
How do we deal with this? Let's think about it:
Since this is a problem at the distributed level, it is hard to solve in code. Recall the CAP principle of distributed systems (consistency, availability, partition tolerance): Redis is an AP system (availability, partition tolerance), so to handle this scenario we need a strongly consistent CP system (consistency, partition tolerance). That gives us two options:
- Use an algorithm to make Redis behave like a CP system, so that we can keep using Redis.
- Switch directly to a CP system, such as ZooKeeper.
Option 1: the RedLock algorithm
Redisson supports this. You only need to set up a few more independent Redis nodes; we use three as the example.
It uses its own consistency algorithm, Redlock, to guarantee consistency.
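RedissonRedLock acquires the lock as follows:
- Get the information for all Redisson nodes and try to lock each of them in turn. Let the number of nodes be N; in the example N is 3.
- If at least N/2 + 1 of the N nodes are locked successfully, the RedissonRedLock as a whole is acquired.
- If fewer than N/2 + 1 of the N nodes are locked successfully, the RedissonRedLock as a whole fails.
- If at any point the total time spent locking the nodes is greater than or equal to the configured maximum wait time, it returns failure immediately.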
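As the steps above show, the Redlock algorithm does solve the problem, in a multi-instance setup, of the distributed lock becoming invalid when a master node goes down.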
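The four acquisition steps can be sketched as a simplified model. This is not Redisson's code: `RedlockQuorumSketch` and its in-memory `Node` are hypothetical, the clock is injected so the timeout can be simulated, and details of the real algorithm such as lock validity time and clock drift are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

// Simplified model of the majority rule described above (illustrative only).
class RedlockQuorumSketch {
    // Hypothetical in-memory node; up == false models a crashed Redis instance.
    static class Node {
        boolean up = true;
        String holder;

        boolean tryLock(String owner) {
            if (!up || holder != null) {
                return false;
            }
            holder = owner;
            return true;
        }

        void unlock(String owner) {
            if (up && owner.equals(holder)) {
                holder = null;
            }
        }
    }

    // N/2 + 1 with integer division: 2 of 3, 3 of 5.
    static int quorum(int nodeCount) {
        return nodeCount / 2 + 1;
    }

    static boolean acquire(List<Node> nodes, String owner, long maxWaitMillis, LongSupplier clockMillis) {
        long start = clockMillis.getAsLong();
        int locked = 0;
        boolean timedOut = false;
        for (Node node : nodes) {
            if (node.tryLock(owner)) {
                locked++;
            }
            if (clockMillis.getAsLong() - start >= maxWaitMillis) {
                timedOut = true; // total time spent reached the maximum wait time
                break;
            }
        }
        boolean success = !timedOut && locked >= quorum(nodes.size());
        if (!success) {
            for (Node node : nodes) {
                node.unlock(owner); // roll back any partial locks
            }
        }
        return success;
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            nodes.add(new Node());
        }
        nodes.get(0).up = false; // one of three nodes is down
        System.out.println(acquire(nodes, "instance-1", 1000, System::currentTimeMillis));
        System.out.println(acquire(nodes, "instance-2", 1000, System::currentTimeMillis));
    }
}
```

With one of three nodes down, the first caller still reaches the quorum of two, and the second caller cannot, because the two live nodes are already held.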
This approach also has some drawbacks:
- The resource cost is relatively high.
- Several locks have to be taken, which adds latency and reduces concurrency.
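Option 2: switch to ZooKeeper, a CP system
In ZooKeeper's data synchronization between cluster nodes, the leader does not report success to the client as soon as it receives the data. It first synchronizes the data to the followers, and only notifies the client of success once more than half of the nodes have completed the sync. And if the leader goes down, the new leader elected under ZooKeeper's Zab protocol (ZooKeeper Atomic Broadcast) is guaranteed to be a node that has already synced successfully.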
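The "more than half of the nodes" rule can be written down as a small worked example. This is only a toy model of the acknowledgement count, not the Zab protocol: `QuorumWriteSketch` and `commit` are hypothetical names, and each array entry says whether one follower finished syncing the write.

```java
// Toy model of a majority-acknowledged write (illustrative only, not the Zab protocol).
class QuorumWriteSketch {
    // The write is reported as successful only if more than half of all nodes have it.
    static boolean commit(boolean[] followerSynced) {
        int clusterSize = followerSynced.length + 1; // the followers plus the leader
        int synced = 1; // the leader already has the data
        for (boolean ok : followerSynced) {
            if (ok) {
                synced++;
            }
        }
        return synced > clusterSize / 2;
    }

    public static void main(String[] args) {
        // Three nodes, one follower synced: 2 of 3 is a majority.
        System.out.println(commit(new boolean[] {true, false}));  // true
        // Three nodes, no follower synced: 1 of 3 is not a majority.
        System.out.println(commit(new boolean[] {false, false})); // false
    }
}
```

Because every acknowledged write is on a majority of nodes, losing the leader alone cannot lose an acknowledged lock, which is the property the Redis master-replica setup lacks.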
So to solve the problem above we need to switch to ZooKeeper for the distributed lock. I won't write out the implementation here; there are plenty online.
Summary
How should we choose between Redis and ZooKeeper?
Let's look at the characteristics of the two:
- ZooKeeper guarantees consistency and may sacrifice some availability, and its capacity for concurrency is far lower than Redis's.
- Redis guarantees availability, so data can be inconsistent in extreme cases, but it handles high concurrency well and is convenient to use.
So in a single-node deployment, where there is no data consistency problem, Redis is the first choice. In a cluster, if the business needs high concurrency, Redis is still recommended; if the concurrency requirement is modest but the data consistency requirement is high, choose ZooKeeper.
Finally
Most of our business scenarios really come down to CAP thinking: we have to trade consistency off against availability. Distributed locks are like this; we cannot have it all, so it is either high concurrency or high fault tolerance. The same goes for other things, such as distributed transactions.
Thanks for reading to the end. I hope this article helps. If you have comments or suggestions, leave a message and we can discuss; I'll reply as soon as I see it. I'd also appreciate a like, since your likes are what motivate me to write. Thanks again.