Interviewer: In a product iteration, the product manager raises a new requirement: send a greeting SMS at 10 a.m. on each user's birthday. How would you implement this feature?
Candidate: This is a typical scheduled-task scenario. A scheduled task scans for the qualifying users at the specified time and calls the SMS-sending interface for each of them in a loop.
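The scan-and-send loop the candidate describes can be sketched as follows. This is only an illustration: the `User` fields, the greeting text, and the `send_sms(phone, text)` callback are hypothetical names, and the scheduler that triggers the function at 10 a.m. is left out. Birthdays on February 29 are not given special treatment here.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class User:
    # Hypothetical user record; field names are assumptions for this sketch.
    name: str
    phone: str
    birthday: date


def users_with_birthday(users, today):
    """Return the users whose birthday month and day match `today`."""
    return [
        u for u in users
        if (u.birthday.month, u.birthday.day) == (today.month, today.day)
    ]


def send_birthday_greetings(users, today, send_sms):
    """Call `send_sms(phone, text)` once per matching user; return the count."""
    matched = users_with_birthday(users, today)
    for user in matched:
        send_sms(user.phone, f"Happy birthday, {user.name}!")
    return len(matched)
```

If two service instances run this function at the same moment, every matching user receives two messages, which is the problem the rest of the interview addresses.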
Interviewer: Good. To avoid a single point of failure, this scheduled-task service will be deployed as at least two instances. How do you prevent duplicate messages when two instances send to the same user at the same time?
Candidate: In essence, only one instance may run the scheduled task at any given time. That is a typical distributed-lock scenario.
Interviewer: So why do we need distributed locks?
Candidate: A distributed lock extends the single-machine lock to distributed scenarios. Before explaining why distributed locks are needed, let me briefly introduce the concept of a lock. A lock is a basic operating-system primitive for concurrency control. In a multi-CPU, multithreaded environment, it ensures that only one thread can enter the critical section at any point in time, which keeps the data operated on in the critical section consistent. When the scenario extends to a distributed environment, that is, across machines and processes, a distributed lock solves the same problem of access to a critical section, but between processes. In the scheduled task above, the code that sends the SMS is the critical section.
Interviewer: What properties does a reasonably complete distributed lock need?
Candidate: To implement a distributed lock, first decide where the lock is stored. For a single-machine lock, different values of an integer in memory can represent the locked and unlocked states. A distributed lock must be accessible to processes on different machines, so it is generally kept in shared storage such as a relational database or a distributed cache. Once the storage location is settled, consider the core properties of a distributed lock, which are mainly:
Timeout mechanism: The lock service and the services requesting the lock run on different machines and communicate over the network. A timeout is needed so that if the node holding the lock fails or hits a network exception, the lock it holds does not stay unreleased forever and cause a deadlock.
Fairness: Depending on the implementation, a lock is either fair or unfair. Suppose three threads compete for the same lock. Thread A acquires it; threads B and C fail and block waiting for A to release it, with B having started waiting before C. If, after A releases the lock, it goes to B, the thread that has waited longest, on a first-come, first-served basis, the lock is fair. Otherwise it is unfair.
Complete lock interface: This is the interface definition of the lock. Locking should offer both a blocking interface, lock, and a non-blocking interface, tryLock; unlocking should offer a release interface.
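To make the timeout mechanism and the lock, tryLock, and release interface concrete, here is a minimal sketch. It assumes an in-memory dictionary as a stand-in for the shared storage, so it only illustrates the semantics; the class name `InMemoryLockStore` and its parameters are hypothetical, and fairness is not implemented (waiters simply poll).

```python
import threading
import time


class InMemoryLockStore:
    """Stand-in for shared storage (a database or Redis in a real deployment)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock               # injectable so expiry can be tested
        self._guard = threading.Lock()    # protects the dictionary itself
        self._locks = {}                  # lock name -> (owner, expiry time)

    def try_lock(self, name, owner, ttl):
        """Non-blocking acquire: succeeds if the lock is free or has expired."""
        with self._guard:
            now = self._clock()
            holder = self._locks.get(name)
            if holder is not None and holder[1] > now:
                return False
            self._locks[name] = (owner, now + ttl)
            return True

    def lock(self, name, owner, ttl, wait_timeout, poll_interval=0.01):
        """Blocking acquire: poll try_lock until success or wait_timeout elapses."""
        deadline = time.monotonic() + wait_timeout
        while True:
            if self.try_lock(name, owner, ttl):
                return True
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval)

    def release(self, name, owner):
        """Release only if `owner` still holds an unexpired lock."""
        with self._guard:
            holder = self._locks.get(name)
            if (holder is not None and holder[0] == owner
                    and holder[1] > self._clock()):
                del self._locks[name]
                return True
            return False
```

Because every lock carries an expiry time, a holder that crashes without calling `release` blocks the others only until the TTL runs out.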
Interviewer: About the timeout mechanism you mentioned: if the node holding the lock takes longer to run the critical section than the lock's timeout, the lock is released before the critical section finishes. Other nodes can then acquire the lock and run the critical section too, so mutual exclusion fails. How do you solve this?
Candidate: Lock renewal solves this: a separate thread keeps extending the lock's timeout through a heartbeat mechanism.
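A rough sketch of such a renewal heartbeat follows. The `Lease` class stands in for a lock record with an expiry time in shared storage, and `Watchdog` is the background thread that renews it; both names and the chosen intervals are assumptions made for illustration, not taken from any particular library.

```python
import threading
import time


class Lease:
    """A lock record with an expiry time, standing in for a key in shared storage."""

    def __init__(self, owner, ttl):
        self.owner = owner
        self.ttl = ttl
        self._guard = threading.Lock()
        self._expires_at = time.monotonic() + ttl

    def renew(self, owner):
        """Push the expiry forward by ttl, but only for the unexpired owner."""
        with self._guard:
            if owner != self.owner or time.monotonic() >= self._expires_at:
                return False
            self._expires_at = time.monotonic() + self.ttl
            return True

    def is_held(self):
        with self._guard:
            return time.monotonic() < self._expires_at


class Watchdog:
    """Background thread that renews a lease every `interval` seconds."""

    def __init__(self, lease, owner, interval):
        self._lease = lease
        self._owner = owner
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait returns False on timeout, so the body runs once per interval.
        while not self._stop.wait(self._interval):
            if not self._lease.renew(self._owner):
                break  # the lease was lost; stop renewing

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()
```

Wrapping the critical section in `with Watchdog(lease, owner, interval):` keeps the lease alive while the work runs. If the process dies, the heartbeat stops with it and the lease still expires after its TTL, so the timeout protection is preserved.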
Interviewer: Good. How do you make the lock reentrant?
Candidate: Reentrancy means the same thread can acquire the same lock repeatedly. So after the lock is acquired, we record the node ID plus thread ID of the holder and bind that combination to the lock as a unique identifier. Before running the locking logic, we add a check: if the requesting node ID plus thread ID matches the current holder, return success directly; otherwise run the normal locking logic.
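The holder check can be sketched like this. The hold counter is an addition that the answer above does not mention; it is included on the assumption that each successful acquisition should be matched by one release before the lock becomes free. The class and parameter names are hypothetical.

```python
import threading


class ReentrantLockRecord:
    """Lock record keyed by (node id, thread id), with a hold counter."""

    def __init__(self):
        self._guard = threading.Lock()
        self._owner = None   # (node_id, thread_id) of the current holder
        self._count = 0      # how many times the holder has acquired the lock

    def try_lock(self, node_id, thread_id):
        caller = (node_id, thread_id)
        with self._guard:
            if self._owner == caller:
                self._count += 1                      # re-entry by the holder
                return True
            if self._owner is None:
                self._owner, self._count = caller, 1  # normal acquisition
                return True
            return False                              # held by someone else

    def release(self, node_id, thread_id):
        with self._guard:
            if self._owner != (node_id, thread_id):
                return False
            self._count -= 1
            if self._count == 0:
                self._owner = None                    # fully released
            return True
```

The thread ID alone is not enough as an identifier, because two machines can easily have threads with the same ID; that is why the node ID is part of the key.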
Interviewer: What are the ways to implement a distributed lock?
Candidate: There are three mainstream ways to implement a distributed lock:
Relational database (for example MySQL): Create a table that records shared-resource information, with a uniqueness constraint on the critical resource. Insert a record to lock a resource and delete the record to release the lock.
Distributed cache (Redis): Use the Redis commands SETNX and EXPIRE. Because the two commands are not atomic when issued separately, a Lua script can set the lock and its expiration time atomically. Since Redis 2.6.12, the SET command accepts the NX and EX/PX options, so a single SET can also be used directly. SETNX returns 1 if the lock represented by the key was acquired and 0 if acquisition failed.
Distributed coordination service (ZooKeeper): Under a persistent node such as shared_lock, each process creates an ephemeral sequential node and then checks which node has the lowest sequence number. The lowest number means that node was created first, so its process gets the lock; the other processes wait for the lowest-numbered node to release the lock.
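As an illustration of the database approach, the sketch below uses SQLite from the Python standard library in place of MySQL so that it runs on its own. The table name `resource_lock` and its columns are hypothetical; the primary key plays the role of the uniqueness constraint.

```python
import sqlite3


def create_lock_table(conn):
    # The primary key is the uniqueness constraint that enforces mutual exclusion.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS resource_lock ("
        "resource TEXT PRIMARY KEY, owner TEXT NOT NULL)"
    )
    conn.commit()


def try_lock(conn, resource, owner):
    """Insert a row for the resource; a constraint violation means it is locked."""
    try:
        conn.execute(
            "INSERT INTO resource_lock (resource, owner) VALUES (?, ?)",
            (resource, owner),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False


def release(conn, resource, owner):
    """Delete the row, but only if this owner inserted it."""
    cur = conn.execute(
        "DELETE FROM resource_lock WHERE resource = ? AND owner = ?",
        (resource, owner),
    )
    conn.commit()
    return cur.rowcount == 1
```

This version has no expiry column, so a holder that crashes before calling `release` leaves the row behind. That is the deadlock risk usually attributed to the database approach.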
Interviewer: What are the advantages, disadvantages, and suitable use cases of these three approaches?
Candidate: The database approach is simple to implement; its drawbacks are a single point of failure, the risk of deadlock, and low performance and reliability. The Redis approach offers high performance and can be deployed as a cluster, so it has no single point of failure; its drawbacks are that controlling the lock's expiration time is unreliable and that it is less reliable than the ZooKeeper approach. The ZooKeeper approach has no single point of failure or deadlock problem and is highly reliable; its drawback is lower performance than Redis. In terms of use cases, the database approach suits systems with low concurrency and modest performance requirements; Redis suits high-concurrency, high-performance scenarios; ZooKeeper suits most scenarios, except those that demand extremely high performance.
Interviewer: In a Redis cluster, data is replicated to the other nodes asynchronously. If the master crashes after a client acquires the lock on it but before the lock is replicated, another client can acquire the same lock on the newly elected master. Multiple application instances then hold the lock at once and mutual exclusion fails. How do you solve this?
Candidate: That problem does exist, so for distributed locks on a Redis cluster the RedLock algorithm is generally recommended; the open-source Redisson library implements it. The client tries to acquire the lock on each of the N independent Redis nodes in turn, with a timeout on every request; if a request times out, that node is treated as unavailable. The lock is acquired when the client has obtained it on more than half of the nodes (at least N/2+1, where N is the number of nodes) and the total time spent acquiring is less than the lock's expiration time. Once acquired, the lock's effective validity time is recalculated as the original expiration time minus the time spent acquiring it. If acquisition fails, the client still releases the lock on the nodes where it was obtained.
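The quorum rule and the validity calculation can be sketched as follows. This is not Redisson's code or a real Redis client: `FakeRedisNode` is an in-memory stand-in whose two methods imitate an atomic set-if-absent with expiry and a compare-and-delete unlock, TTLs are in seconds, and the clock-drift allowance of the full algorithm is left out.

```python
import time
import uuid


class FakeRedisNode:
    """In-memory stand-in for one independent Redis node (not a real client)."""

    def __init__(self, available=True):
        self.available = available
        self._data = {}  # key -> (value, expiry time)

    def set_if_absent(self, key, value, ttl):
        """Imitates an atomic set-if-absent with expiry; raises if the node is down."""
        if not self.available:
            raise ConnectionError("node unavailable")
        now = time.monotonic()
        current = self._data.get(key)
        if current is not None and current[1] > now:
            return False
        self._data[key] = (value, now + ttl)
        return True

    def delete_if_equal(self, key, value):
        """Imitates a compare-and-delete unlock."""
        if not self.available:
            raise ConnectionError("node unavailable")
        current = self._data.get(key)
        if current is not None and current[0] == value:
            del self._data[key]


def release_everywhere(nodes, key, token):
    """Try to release on every node, skipping the ones that cannot be reached."""
    for node in nodes:
        try:
            node.delete_if_equal(key, token)
        except ConnectionError:
            pass


def redlock_acquire(nodes, key, ttl):
    """Return (token, remaining validity in seconds) on success, else None."""
    token = uuid.uuid4().hex          # unique value identifying this client
    quorum = len(nodes) // 2 + 1      # more than half of the nodes
    start = time.monotonic()
    acquired = 0
    for node in nodes:
        try:
            if node.set_if_absent(key, token, ttl):
                acquired += 1
        except ConnectionError:
            pass                      # an unreachable node counts as a failure
    validity = ttl - (time.monotonic() - start)
    if acquired >= quorum and validity > 0:
        return token, validity
    release_everywhere(nodes, key, token)  # undo any partial acquisition
    return None
```

With five nodes the quorum is three, so the lock can still be acquired while two nodes are down, and a crash of any single node cannot hand the same lock to a second client.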
Interviewer: Besides the scheduled tasks we discussed, what other common use cases are there for distributed locks?
Candidate: In flash sales, a distributed lock can be used to prevent overselling inventory.
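The oversell problem comes from the check-then-decrement on the stock count. The sketch below shows the idea with threads inside one process, where `threading.Lock` stands in for the distributed lock; across several service instances the same `with` block would need a real distributed lock instead. The names `Inventory` and `run_flash_sale` are hypothetical.

```python
import threading


class Inventory:
    """Stock counter whose check-and-decrement runs inside a lock."""

    def __init__(self, stock, lock):
        self.stock = stock
        self._lock = lock   # any lock object usable in a `with` statement

    def purchase(self):
        """Return True if one unit was sold, False if the item is sold out."""
        with self._lock:
            if self.stock <= 0:
                return False
            self.stock -= 1
            return True


def run_flash_sale(stock, buyers):
    """Simulate `buyers` concurrent purchases; return (units sold, stock left)."""
    inventory = Inventory(stock, threading.Lock())
    results = []

    def buy():
        results.append(inventory.purchase())

    threads = [threading.Thread(target=buy) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results.count(True), inventory.stock
```

However many buyers compete, the number of successful purchases never exceeds the initial stock, because only one buyer at a time can run the check and the decrement.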
Aside: There is a good open-source implementation of distributed locks, based on Spring AOP, that offers declarative and programmatic locking. It supports RedisTemplate, Redisson, ZooKeeper and others, and can be extended with other distributed lock implementations.