当前位置:网站首页>Distributed global ID generation scheme
Distributed global ID generation scheme
2022-07-07 05:40:00 【Qin Tian】
Catalog
One 、 Distributed ID Characteristics of
Two 、 Distributed ID The generation scheme of
4、 Take the current milliseconds
One 、 Distributed ID Characteristics of
- Uniqueness : Make sure that the generated ID It is the only one in the whole network .
- Order and increment : Make sure that the generated ID For a certain user or business, it increases in order according to a certain number .
- High availability : Make sure that you can correctly generate... At any time ID.
- Take time :ID It contains time , A glance at the past will tell which day of the transaction .
Two 、 Distributed ID The generation scheme of
1、 Database autoincrement ID
Using the database id Self increasing strategy , Such as MySQL Of auto_increment. And you can use two databases to set different steps , Generation does not repeat ID To achieve high availability .
advantage :
(1) Simple , Use the existing functions of the database
(2) Can guarantee uniqueness
(3) Database generated id Can guarantee the incremental
(4) Fixed step size
shortcoming :
(1) Availability is hard to guarantee : The common database architecture is one master and many slaves + Read / write separation , Generate auto increment ID It's a request , You can't play when the main library hangs up
(2) Expandability of , There's an upper limit to performance : Because writing is a single point , The write performance of the main database determines ID The upper bound of the generation performance , And it's hard to expand
How to improve :
(1) Add the main library , Avoid writing single dots
(2) Data horizontal segmentation , Ensure that each main library generates ID No repetition
As shown above , from 1 A write library becomes 3 Write Libraries , Each write library has a different auto_increment Initial value , And the same growth step , To ensure that each database generates ID Is different ( The library in the above figure 0 Generate 0,3,6,9…, library 1 Generate 1,4,7,10, library 2 Generate 2,5,8,11…)
The improved architecture ensures usability , But the disadvantage is :
(1) Lost ID Generated “ Absolute increasement ”: First visit the Library 0 Generate 0,3, Visit the library again 1 Generate 1, It can lead to... In a very short time ,ID Generation is not absolutely incremental ( It's not a big problem , Our goal is to increase the trend , It's not an absolute increase )
(2) The database is still under a lot of pressure , Each generation ID Access to the database
To solve these two problems , The following common solutions are introduced
2、 Batch build ID
Generate multiple on-demand batches at a time ID, Every student needs to access the database , Modify the database to the largest ID value , And record the current value and maximum value in memory .
Distributed systems are difficult , One of the important reasons is that “ There is no global clock , It's hard to guarantee absolute timing ”, To ensure absolute timing , Or can we only use single point service , Use the local clock to guarantee “ Absolute timing ”. Database writing pressure is high , Because every time you generate ID All accessed the database , You can use batch method to reduce the pressure of database writing .
As shown above , The database uses dual master Guaranteed availability , Only the current ID The maximum of , for example 0.ID The generation Service assumes that each batch pull 6 individual ID, Services access databases , Will the current ID The maximum value of is changed to 5, In this way, application access ID Generate service request ID,ID The build service does not need to access the database every time , You can distribute them in turn 0,1,2,3,4,5 these ID 了 , When ID After sending , then ID The maximum value of is changed to 11, And then it can be distributed again 6,7,8,9,10,11 these ID 了 , So the pressure on the database is reduced to the original 1/6 了 .
advantage :
(1) To ensure the ID Absolute increasing order of generation
(2) Greatly reduce the pressure of the database ,ID It can generate tens of thousands of them per second
shortcoming :
(1) The service is still single point
(2) If the service hangs up , After the service is restarted , Continue to build ID It could be discontinuous , There's a hole in the middle ( The service memory keeps 0,1,2,3,4,5, In the database max-id yes 5, Assigned to 3 when , The service is restarted , Next time I will 6 Assigned starting ,4 and 5 It's empty , But it's not a big problem )
(3) Although it can generate hundreds of thousands per second ID, But after all, there is a performance limit , No horizontal expansion
How to improve :
The common high availability optimization scheme for single point service is “ Standby service ”, Also called “ Shadow service ”, So we can optimize these disadvantages in the following ways (1):
Pictured above , The external service is the main service , There is a shadow service in standby at all times , When the main service is down, the shadow service is on top . This switching process is transparent to the caller , Can be done automatically , The common technique is vip+keepalived, The details are not here .
3、UUID
The core idea of the algorithm is to combine the network card of the machine 、 Local time 、 A random number to generate UUID.
UUID Generate... For the local ID Methods , High performance , Low delay .uuid Is a common solution :string ID =GenUUID();
advantage :
(1) Locally generated ID, There is no need to make a remote call , Low latency , There is no high availability risk
(2) Good scalability , Basically, there is no upper performance limit
shortcoming :
(1) Increasing trend cannot be guaranteed
(2)uuid Too long , It's often represented as a string , It is inefficient to build index as primary key , The common optimization scheme is “ Into two uint64 Integer storage ” perhaps “ Half storage ”( Half can't guarantee uniqueness )
4、 Take the current milliseconds
uuid It's a local algorithm , High generation performance , But there is no guarantee that the trend will increase , And as a string ID Low retrieval efficiency , Is there a local algorithm that can guarantee increment ?
Taking the current number of milliseconds is a common solution :uint64 ID = GenTimeMS();
advantage :
(1) Locally generated ID, There is no need to make a remote call , Low latency
(2) Generated ID The trend is increasing
(3) Generated ID Is an integer , After the index is established, the query efficiency is high
shortcoming :
(1) If the concurrency exceeds 1000, Duplicate will be generated ID
I went to , This shortcoming is fatal , There is no guarantee ID Uniqueness . Of course , Using microseconds can reduce the probability of conflict , But it can only generate at most per second 1000000 individual ID, If there are more, there will be conflicts , So using microseconds doesn't fundamentally solve the problem .
5、Redis Generate ID
Redis All command operations of are single threaded , It provides the image incr and increby Such a self increasing atomic order , So it's guaranteed that ID It must be the only orderly .
- advantage : Database independent , Flexible and convenient , And its performance is better than that of database ; Numbers ID Natural sequencing , It's very helpful for paging or sorting results .
- shortcoming : If there is no Redis, New components need to be introduced , Increase system complexity ; The amount of coding and configuration required is large .
Considering the performance bottleneck of single node , have access to Redis Cluster to get higher throughput . If a cluster has 5 platform Redis. Each can be initialized Redis The values of 1, 2, 3, 4, 5, And then the steps are 5. each Redis Generated ID by :
A:1, 6, 11, 16, 21
B:2, 7, 12, 17, 22
C:3, 8, 13, 18, 23
D:4, 9, 14, 19, 24
E:5, 10, 15, 20, 25
Decide which machine you want to load , It's hard to make changes in the future . The step size and initial value must be determined in advance . Use Redis Clusters can also be used for single point of failure problems .
in addition , It's more suitable for Redis To generate every day from 0 The serial number of the beginning . Like the order number = date + Since the growth of the day . Can be in every day Redis Generate a Key , Use INCR Add up .
6、 class snowflake Algorithm
snowflake yes twitter Open source distributed ID generating algorithm , The central idea is this : One long Type ID, Use among them 41bit As milliseconds ,10bit As machine number ,12bit As a sequence number in milliseconds . Theoretically, this algorithm can generate at most 1000*(2^12), That is to say 400W Of ID, Fully meet the needs of the business .
reference snowflake Thought , Combine the business logic and concurrency of each company , You can implement your own distributed ID generating algorithm .
give an example , Suppose a company ID The requirements of the generator service are as follows :
(1) The peak concurrency of a single machine is less than 1W, Is expected in the future 5 The annual peak concurrency of single machine is less than 10W
(2) Yes 2 A computer room , Is expected in the future 5 The annual number of machine rooms is less than 4 individual
(3) The number of machines in each machine room is less than 100 platform
(4) There are 5 Business lines have ID Generating requirements , It is estimated that the number of business lines in the future will be less than 10 individual
(5)…
The analysis process is as follows :
(1) High order fetching 2016 year 1 month 1 The number of milliseconds from today ( Suppose the system ID The generator service goes online after this time ), Assume that the system is running at least 10 year , At least 10 year *365 God *24 Hours *3600 second *1000 millisecond =320*10^9, Almost reserved 39bit Give milliseconds
(2) The peak concurrency of a single machine per second is less than 10W, That is, the average peak concurrency of a single machine per millisecond is less than 100, Almost reserved 7bit Give the serial number every millisecond
(3)5 The number of machine rooms in the year is less than 4 individual , reserve 2bit Mark the machine room
(4) Each machine room is less than 100 Taiwan machine , reserve 7bit Identify the server in each machine room
(5) The line of business is less than 10 individual , reserve 4bit Identify the business line
Designed like this 64bit identification , Can guarantee :
(1) Each line of business 、 Every computer room 、 Generated by each machine ID It's all different
(2) The same machine , Generated in every millisecond ID It's all different
(3) The same machine , In the same millisecond , Distinguish guaranteed generated by serial number ID Is different
(4) Put the number of milliseconds in the highest order , Guaranteed to be generated ID It's an increasing trend
shortcoming :
(1) because “ There is no global clock ”, Each server is assigned ID It's absolutely increasing , But on the whole , Generated ID It's just an increasing trend ( Some servers are early , Some servers are late )
The last one that's easy to ignore :
Generated ID, for example message-id/ order-id/ tiezi-id, When the amount of data is large, it is often necessary to divide the database and table , these ID It is often used as the basis for taking module, dividing library and dividing table , In order to divide the database and table, the data is even ,ID Generation often has “ Randomness of moduli ” The needs of , So we usually put the sequence number per second in ID At the bottom of the list , Guaranteed to be generated ID Is random .
And if , We're crossing milliseconds , The serial number always goes to 0, It makes the serial number 0 Of ID More , Resulting in ID Uneven after mold removal . The solution is , The serial number doesn't always go to 0, It's one 0 To 9 The random number , This place .
边栏推荐
- 分布式事务解决方案之TCC
- 4. 对象映射 - Mapping.Mapster
- 淘宝店铺发布API接口(新),淘宝oAuth2.0店铺商品API接口,淘宝商品发布API接口,淘宝商品上架API接口,一整套发布上架店铺接口对接分享
- Preliminary practice of niuke.com (9)
- K6EL-100漏电继电器
- 淘宝商品详情页API接口、淘宝商品列表API接口,淘宝商品销量API接口,淘宝APP详情API接口,淘宝详情API接口
- 纪念下,我从CSDN搬家到博客园啦!
- Most commonly used high number formula
- Design, configuration and points for attention of network unicast (one server, multiple clients) simulation using OPNET
- 论文阅读【Open-book Video Captioning with Retrieve-Copy-Generate Network】
猜你喜欢
随机推荐
Jhok-zbg2 leakage relay
The navigation bar changes colors according to the route
In memory, I moved from CSDN to blog park!
利用OPNET进行网络指定源组播(SSM)仿真的设计、配置及注意点
阿里云的神龙架构是怎么工作的 | 科普图解
实现网页内容可编辑
Talk about mvcc multi version concurrency controller?
DOM-节点对象+时间节点 综合案例
How Alibaba cloud's DPCA architecture works | popular science diagram
Vector and class copy constructors
集群、分布式、微服務的區別和介紹
[Oracle] simple date and time formatting and sorting problem
Common skills and understanding of SQL optimization
Taobao Commodity details page API interface, Taobao Commodity List API interface, Taobao Commodity sales API interface, Taobao app details API interface, Taobao details API interface
K6EL-100漏电继电器
Paper reading [open book video captioning with retrieve copy generate network]
How can professional people find background music materials when doing we media video clips?
How can project managers counter attack with NPDP certificates? Look here
Unity让摄像机一直跟随在玩家后上方
JHOK-ZBG2漏电继电器