Nine kinds of distributed primary key ID generation schemes of sub database and sub table are quite comprehensive

《sharding-jdbc Sub database and sub table 4 Seed fragmentation strategy 》 We introduced sharding-jdbc 4 The usage scenarios of the fragmentation strategy , Can meet the basic fragmentation function development , Let's look at the sub database table , How to generate a globally unique primary key for a partitioned table ID.

There are risks in introducing any technology , Database and table are no exception , Unless the library 、 The amount of table data continues to increase , To a certain extent , As a result, the existing high availability architecture has been unable to support , Otherwise, we do not recommend that we do the sub database table , Because after slicing the data , You'll find yourself on a path of stepping into a pit , And distributed primary keys ID It's the first pit that we met .

It is a difficult problem to generate a global unique primary key between different data nodes , A logic table t_order Split into multiple real tables t_order_n, And then it's distributed to different libraries db_0、db_1... , Since the auto increment keys of the real tables cannot be perceived by each other, they will produce duplicate primary keys , At this time, the database itself adds its own primary key , It can't meet the requirement of global uniqueness of primary key in sub database and sub table .

 db_0--
    |-- t_order_0
    |-- t_order_1
    |-- t_order_2
 db_1--
    |-- t_order_0
    |-- t_order_1
    |-- t_order_2

Although we can be strictly bound , Each partition table automatically increases the primary key Initial value and step To solve the problem ID Repetitive questions , But this will make the operation and maintenance cost increase sharply , And the scalability is very poor , Once you want to expand the number of split tables , The data in the original table changed a lot , So it's not very desirable .

  step  step =  The number of separate sheets 

 db_0--
    |-- t_order_0  ID: 0、6、12、18...
    |-- t_order_1  ID: 1、7、13、19...
    |-- t_order_2  ID: 2、8、14、20...
 db_1--
    |-- t_order_0  ID: 3、9、15、21...
    |-- t_order_1  ID: 4、10、16、22...
    |-- t_order_2  ID: 5、11、17、23...

There are many third-party solutions that can solve this problem perfectly , For example, based on UUID、SNOWFLAKE Algorithm 、segment Paragraph number , Generate non repeating keys using a specific algorithm , Or directly reference the primary key generation service , Like the US regiment （Leaf） and sound of dripping water （TinyId） etc. .

and sharding-jdbc Two distributed primary key generation schemes are built in ,UUID、SNOWFLAKE, Not only that, it also takes away the interface of the distributed primary key generator , In order to facilitate developers to implement a custom primary key generator , In the future, we will connect to the custom generator sound of dripping water （TinyId） Primary key generation service for .

As mentioned earlier in sharding-jdbc To automatically generate a primary key for a field ID, Only need application.properties Do the following configuration in the file :

#  Primary key field 
spring.shardingsphere.sharding.tables.t_order.key-generator.column=order_id
#  Primary key ID  Generation scheme 
spring.shardingsphere.sharding.tables.t_order.key-generator.type=UUID
#  Work the machine  id
spring.shardingsphere.sharding.tables.t_order.key-generator.props.worker.id=123

key-generator.column Represents the primary key field ,key-generator.type Primary key ID Generation scheme （ Built in or custom ）,key-generator.props.worker.id For the machine ID, Set the primary key generation scheme to SNOWFLAKE Time machine ID Will participate in bit operations .

In the use of sharding-jdbc Two things should be noticed when distributed primary key is distributed ：

once insert The primary key field in the entity object of the insert operation has been assigned , Then even if the primary key generation scheme is configured, it will fail , Last SQL The data executed will be based on the value assigned .
Don't set auto increment attribute to primary key field , Otherwise, the primary key ID By default SNOWFLAKE How to generate . such as ： use mybatis plus Of @TableId Annotate fields order_id Auto increment primary key is set , What kind of solution to configure at this time , Always follow the snowflake algorithm .

Let's analyze it from source sharding-jdbc Built in primary key generation scheme UUID、SNOWFLAKE How did it happen .

UUID

open UUID The primary key of type generates the implementation class UUIDShardingKeyGenerator Source code discovery for , Its generation rules are only UUID.randomUUID() Such a line of code , forehead ~ There is a word in my heart Oh my god .

UUID Although we can achieve global uniqueness , But it is still not recommended as a primary key , Because we do business in real life, whether it is user_id still order_id Primary keys are mostly integers , and UUID It's a 32 Bit string .

Its storage and query to MySQL High performance consumption , and MySQL Officials have also made clear their recommendations , The primary key should be as short as possible , As a database primary key UUID The disorder of the data also leads to frequent changes in data locations , Seriously affect performance .

public final class UUIDShardingKeyGenerator implements ShardingKeyGenerator {
    private Properties properties = new Properties();

    public UUIDShardingKeyGenerator() {
    }

    public String getType() {
        return "UUID";
    }

    public synchronized Comparable<?> generateKey() {
        return UUID.randomUUID().toString().replaceAll("-", "");
    }

    public Properties getProperties() {
        return this.properties;
    }

    public void setProperties(Properties properties) {
        this.properties = properties;
    }
}

SNOWFLAKE

SNOWFLAKE（ Snowflake algorithm ） Is the default primary key generation scheme , Generate a 64bit Long integer （Long） data .

sharding-jdbc The primary key generated by snowflake algorithm is mainly composed of 4 Part of it is made up of ,1bit Sign bit 、41bit Time stamp bit 、10bit Working process bit and 12bit Serial number position .

Snowflake algorithm ID form

Sign bit （1bit position ）

Java in Long The highest bit of a type is the sign bit , A positive number is 0, Negative number is 1, General generation ID All positive , So the default is 0

Time stamp bit （41bit）

41 The number of milliseconds a bit's timestamp can hold is 2 Of 41 The next power , And the total number of milliseconds a year is 1000L * 60 * 60 * 24 * 365, The calculation time is about 69 year , forehead ~, I have enough for my life .

Math.pow(2, 41) / (365 * 24 * 60 * 60 * 1000L) = = 69 year

Work progress bit （10bit）

Represents a unique work process id, The default value is 0, It can be done by key-generator.props.worker.id Property settings .

spring.shardingsphere.sharding.tables.t_order.key-generator.props.worker.id=0000

Serial number position （12bit）

Different... Are generated in the same millisecond ID.

Clock back

Understand the primary key of snowflake algorithm ID It's not hard to find out , This is an algorithm that depends heavily on server time , And those who rely on server time will encounter a thorny problem ： Clock back .

Why is there a clock back ？

There is a network time protocol in the Internet ntp Full name (Network Time Protocol) , Specifically used to synchronize 、 Calibrate the time of each computer in the network .

That's why , Our phones don't have to manually calibrate the time now , But everyone's cell phone time is the same .

Our hardware clock may become inaccurate for various reasons （ fast or slow ）, At this point, we need ntp Service to do time calibration , When you do the calibration, the server clock will happen jumping perhaps Back to dial the The problem of .

How to solve clock callback by snowflake algorithm

Server clock callback can cause duplicate ID,SNOWFLAKE In the scheme, the original snowflake algorithm is improved , Added a maximum tolerated clock callback in milliseconds .

If the time of clock callback exceeds the maximum tolerance threshold of milliseconds , The program reports an error directly ; If it's within tolerance , Default distributed primary key generator , It will wait for the clock to synchronize to the time of the last primary key generation before continuing to work .

Maximum number of clock callback milliseconds tolerated , The default value is 0, You can use the properties max.tolerate.time.difference.milliseconds Set up .

#  Maximum number of clock callback milliseconds tolerated 
spring.shardingsphere.sharding.tables.t_order.key-generator.max.tolerate.time.difference.milliseconds=5

Here is the source code implementation class SnowflakeShardingKeyGenerator, The core process is as follows ：

When the primary key was last generated lastMilliseconds And current time currentMilliseconds compare , If lastMilliseconds > currentMilliseconds It means that the clock is back .

Then judge the difference between the two times （timeDifferenceMilliseconds） Whether the maximum tolerance time threshold is set max.tolerate.time.difference.milliseconds Inside , The thread sleep difference time is within the threshold value Thread.sleep(timeDifferenceMilliseconds), Otherwise, if it is greater than the difference value, it will report an exception directly .

 
/**
 * @author xiaofu
 */
public final class SnowflakeShardingKeyGenerator implements ShardingKeyGenerator{
    @Getter
    @Setter
    private Properties properties = new Properties();
    
    public String getType() {
        return "SNOWFLAKE";
    }
    
    public synchronized Comparable<?> generateKey() {
        /**
         *  Current system time in milliseconds  
         */ 
        long currentMilliseconds = timeService.getCurrentMillis();
        /**
         *  Judge whether the waiting time is poor , if necessary , Then the waiting time goes by , And then get the current system time  
         */ 
        if (waitTolerateTimeDifferenceIfNeed(currentMilliseconds)) {
            currentMilliseconds = timeService.getCurrentMillis();
        }
        /**
         *  If the last millisecond and   The current system time is the same in milliseconds , In the same millisecond  
         */
        if (lastMilliseconds == currentMilliseconds) {
            /**
             * & Bit and operator ： Both numbers are binary , If all the corresponding bits are 1, The result is 1, Otherwise 0
             *  When the sequence is 4095 when ,4095+1 After the new sequence and mask bit and operation, the result is 0
             *  When the sequence is other values , Neither the bits nor the result of the operation will be 0
             *  That is, the maximum value has been used in this millisecond sequence 4096, At this time, you need to take down a millisecond time value 
             */
            if (0L == (sequence = (sequence + 1) & SEQUENCE_MASK)) {
                currentMilliseconds = waitUntilNextTime(currentMilliseconds);
            }
        } else {
            /**
             *  The last millisecond has passed , Reset the sequence value to 1 
             */
            vibrateSequenceOffset();
            sequence = sequenceOffset;
        }
        lastMilliseconds = currentMilliseconds;
        
        /**
         * XX......XX XX000000 00000000 00000000     Time difference  XX
         *          XXXXXX XXXX0000 00000000     machine ID XX
         *                     XXXX XXXXXXXX     Serial number  XX
         *   Three parts | Bitwise OR operation ： If all the corresponding bits are 0, The result is 0, Otherwise 1
         */
        return ((currentMilliseconds - EPOCH) << TIMESTAMP_LEFT_SHIFT_BITS) | (getWorkerId() << WORKER_ID_LEFT_SHIFT_BITS) | sequence;
    }
    
    /**
     *  Judge whether the waiting time is poor 
     */
    @SneakyThrows
    private boolean waitTolerateTimeDifferenceIfNeed(final long currentMilliseconds) {
        /**
         *  If you get ID The last time in MS is less than or equal to the current system time Ms , It's normal , You don't have to wait  
         */
        if (lastMilliseconds <= currentMilliseconds) {
            return false;
        }
        /**
         * ===> When the clock goes back （ The time to generate the sequence is longer than that of the current system ）, Need to wait time difference  
         */
        /**
         *  obtain ID The time difference between the last millisecond of the current system time minus the number of milliseconds of the current system time  
         */
        long timeDifferenceMilliseconds = lastMilliseconds - currentMilliseconds;
        /**
         *  Time difference is less than the maximum tolerance time difference , That is, it is still within the time difference of clock callback  
         */
        Preconditions.checkState(timeDifferenceMilliseconds < getMaxTolerateTimeDifferenceMilliseconds(), 
                "Clock is moving backwards, last time is %d milliseconds, current time is %d milliseconds", lastMilliseconds, currentMilliseconds);
        /**
         *  Thread sleep time difference  
         */
        Thread.sleep(timeDifferenceMilliseconds);
        return true;
    }
    
    //  Configured machine ID
    private long getWorkerId() {
        long result = Long.valueOf(properties.getProperty("worker.id", String.valueOf(WORKER_ID)));
        Preconditions.checkArgument(result >= 0L && result < WORKER_ID_MAX_VALUE);
        return result;
    }
    
    private int getMaxTolerateTimeDifferenceMilliseconds() {
        return Integer.valueOf(properties.getProperty("max.tolerate.time.difference.milliseconds", String.valueOf(MAX_TOLERATE_TIME_DIFFERENCE_MILLISECONDS)));
    }
    
    private long waitUntilNextTime(final long lastTime) {
        long result = timeService.getCurrentMillis();
        while (result <= lastTime) {
            result = timeService.getCurrentMillis();
        }
        return result;
    }
}

But from SNOWFLAKE The primary key generated by the scheme ID Look at ,order_id It's a 18 A long integer number of bits , Did you find it too long , to want to MySQL That kind of 0 How to realize the incremental self increasing primary key ？ Don't worry. , The solution will be given in the back ！

SNOWFLAKE Primary key ID

Customize

sharding-jdbc utilize SPI Full name （ Service Provider Interface） Mechanism to expand the primary key generation rules , This is a service discovery mechanism , By scanning the project path META-INF/services The files under the , And automatically loads the classes defined in the file .

In fact, it is relatively simple to implement a custom primary key generator , There are only two steps .

First step , Realization ShardingKeyGenerator Interface , And rewrite its internal methods , among getType() The method is a user-defined primary key production scheme type 、generateKey() The method is to generate the rules of primary key .

The following code uses AtomicInteger To simulate and implement an ordered self increasing ID Generate .

/**
 * @Author: xiaofu
 * @Description:  Custom key generator 
 */
@Component
public class MyShardingKeyGenerator implements ShardingKeyGenerator {


    private final AtomicInteger count = new AtomicInteger();

    /**
     *  Custom generation scheme type 
     */
    @Override
    public String getType() {
        return "XXX";
    }

    /**
     *  The core approach - Generate primary key ID
     */
    @Override
    public Comparable<?> generateKey() {
        return count.incrementAndGet();
    }

    @Override
    public Properties getProperties() {
        return null;
    }

    @Override
    public void setProperties(Properties properties) {

    }
}

The second step , Because of the use of SPI The mechanism realizes the function expansion , We will be having META-INF/services The file is configured with a user-defined primary key generator class .

com.xiaofu.sharding.key.MyShardingKeyGenerator

Custom primary key SPI To configure

We'll test these things , Configure defined primary key generation type XXX, And insert a few pieces of data to see the effect .

spring.shardingsphere.sharding.tables.t_order.key-generator.column=order_id
spring.shardingsphere.sharding.tables.t_order.key-generator.type=XXX

Through the console SQL Parsing logs found ,order_id The field has been inserted into the record in an ordered auto increment manner , It shows that the configuration is OK .

Take one against nine

Since you can customize the generation scheme , So there are many ways to realize distributed primary keys , I think of this one I wrote before 《9 Kind of Distributed ID Generation scheme 》, Discover that it can be perfectly compatible with , Here's a selection of sound of dripping water （Tinyid） Let's practice , Because it's a separate distributed ID Build service , So we have to build the environment first .

Tinyid Service provision of Http and Tinyid-client Two access modes , Use below Tinyid-client Way to use quickly , Read more details in this article , It's been introduced too many times .

Tinyid Service establishment

First pull the source code https://github.com/didi/tinyid.git.

Because it is distributed based on the segment pattern ID, So it depends on the database , To create the corresponding table tiny_id_info 、tiny_id_token And insert the default data .


CREATE TABLE `tiny_id_info` (
    `id` BIGINT (20) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT ' Since the primary key ',
    `biz_type` VARCHAR (63) NOT NULL DEFAULT '' COMMENT ' Business types , only ',
    `begin_id` BIGINT (20) NOT NULL DEFAULT '0' COMMENT ' Start id, Record only the initial value , No other meaning . On initialization begin_id and max_id It should be the same ',
    `max_id` BIGINT (20) NOT NULL DEFAULT '0' COMMENT ' At present, the biggest id',
    `step` INT (11) DEFAULT '0' COMMENT ' step ',
    `delta` INT (11) NOT NULL DEFAULT '1' COMMENT ' Every time id The incremental ',
    `remainder` INT (11) NOT NULL DEFAULT '0' COMMENT ' remainder ',
    `create_time` TIMESTAMP NOT NULL DEFAULT '2010-01-01 00:00:00' COMMENT ' Creation time ',
    `update_time` TIMESTAMP NOT NULL DEFAULT '2010-01-01 00:00:00' COMMENT ' Update time ',
    `version` BIGINT (20) NOT NULL DEFAULT '0' COMMENT ' Version number ',
    PRIMARY KEY (`id`),
    UNIQUE KEY `uniq_biz_type` (`biz_type`)
) ENGINE = INNODB AUTO_INCREMENT = 1 DEFAULT CHARSET = utf8 COMMENT 'id Information sheet ';

CREATE TABLE `tiny_id_token` (
    `id` INT (11) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT ' Self increasing id',
    `token` VARCHAR (255) NOT NULL DEFAULT '' COMMENT 'token',
    `biz_type` VARCHAR (63) NOT NULL DEFAULT '' COMMENT ' this token Accessible business type ID ',
    `remark` VARCHAR (255) NOT NULL DEFAULT '' COMMENT ' remarks ',
    `create_time` TIMESTAMP NOT NULL DEFAULT '2010-01-01 00:00:00' COMMENT ' Creation time ',
    `update_time` TIMESTAMP NOT NULL DEFAULT '2010-01-01 00:00:00' COMMENT ' Update time ',
    PRIMARY KEY (`id`)
) ENGINE = INNODB AUTO_INCREMENT = 1 DEFAULT CHARSET = utf8 COMMENT 'token Information sheet ';

INSERT INTO `tiny_id_token` (`id`, `token`, `biz_type`, `remark`, `create_time`, `update_time`) VALUES ('1', '0f673adf80504e2eaa552f5d791b644c', 'order', '1', '2017-12-14 16:36:46', '2017-12-14 16:36:48');

INSERT INTO `tiny_id_info` (`id`, `biz_type`, `begin_id`, `max_id`, `step`, `delta`, `remainder`, `create_time`, `update_time`, `version`) VALUES ('1', 'order', '1', '1', '100000', '1', '0', '2018-07-21 23:52:58', '2018-07-22 23:19:27', '1');

And in Tinyid Configure the data source information of the upper table in the service

datasource.tinyid.primary.url=jdbc:mysql://47.93.6.e:3306/ds-0?autoReconnect=true&useUnicode=true&characterEncoding=UTF-8
datasource.tinyid.primary.username=root
datasource.tinyid.primary.password=root

The final project maven install , Right click TinyIdServerApplication Start the service , Tinyid Distributed ID The build service is finished .

Customize Tinyid Primary key type

Tinyid After the service is built, we will introduce it into the project , Create a new tinyid_client.properties Add... To the file tinyid.server and tinyid.token attribute ,token Before SQL Pre inserted user data .

# tinyid  Distributed ID
#  Service address 
tinyid.server=127.0.0.1:9999
#  Business token
tinyid.token=0f673adf80504e2eaa552f5d791b644c

Get in code ID It's simpler , Just one line of code , Business types order Before SQ L Pre inserted data .

Long id = TinyId.nextId("order");

We started customizing Tinyid Implementation class of primary key generation type TinyIdShardingKeyGenerator .

/**
 * @Author: xiaofu
 * @Description:  Custom key generator 
 */
@Component
public class TinyIdShardingKeyGenerator implements ShardingKeyGenerator {
    
    /**
     *  Custom generation scheme type 
     */
    @Override
    public String getType() {
        return "tinyid";
    }

    /**
     *  The core approach - Generate primary key ID
     */
    @Override
    public Comparable<?> generateKey() {
        
        Long id = TinyId.nextId("order");
        
        return id;
    }

    @Override
    public Properties getProperties() {
        return null;
    }

    @Override
    public void setProperties(Properties properties) {

    }
}

And enable... In the configuration file Tinyid Primary key generation type , This is the end of the configuration , Test it quickly .

#  Primary key field 
spring.shardingsphere.sharding.tables.t_order.key-generator.column=order_id
#  Primary key ID  Generation scheme 
spring.shardingsphere.sharding.tables.t_order.key-generator.type=tinyid

test Tinyid Primary key

Insert the order record into the database and test it , Primary key ID Field order_id It's increasing for the trend , Tinyid The service was successfully connected to , perfect ！

Insert picture description here

summary

The following eight generation methods are referred to 《9 Kind of Distributed ID Generation scheme 》 Access on demand , It is relatively simple as a whole, and it is not implemented in turn here .