当前位置:网站首页>Nine kinds of distributed primary key ID generation schemes of sub database and sub table are quite comprehensive

Nine kinds of distributed primary key ID generation schemes of sub database and sub table are quite comprehensive

2020-11-09 12:10:00 Programmer's internal affairs

《sharding-jdbc Sub database and sub table 4 Seed fragmentation strategy 》 We introduced sharding-jdbc 4 The usage scenarios of the fragmentation strategy , Can meet the basic fragmentation function development , Let's look at the sub database table , How to generate a globally unique primary key for a partitioned table ID.

There are risks in introducing any technology , Database and table are no exception , Unless the library 、 The amount of table data continues to increase , To a certain extent , As a result, the existing high availability architecture has been unable to support , Otherwise, we do not recommend that we do the sub database table , Because after slicing the data , You'll find yourself on a path of stepping into a pit , And distributed primary keys ID It's the first pit that we met .

It is a difficult problem to generate a global unique primary key between different data nodes , A logic table t_order Split into multiple real tables t_order_n, And then it's distributed to different libraries db_0db_1... , Since the auto increment keys of the real tables cannot be perceived by each other, they will produce duplicate primary keys , At this time, the database itself adds its own primary key , It can't meet the requirement of global uniqueness of primary key in sub database and sub table .

 db_0--
    |-- t_order_0
    |-- t_order_1
    |-- t_order_2
 db_1--
    |-- t_order_0
    |-- t_order_1
    |-- t_order_2

Although we can be strictly bound , Each partition table automatically increases the primary key Initial value and step To solve the problem ID Repetitive questions , But this will make the operation and maintenance cost increase sharply , And the scalability is very poor , Once you want to expand the number of split tables , The data in the original table changed a lot , So it's not very desirable .

  step  step =  The number of separate sheets 

 db_0--
    |-- t_order_0  ID: 0、6、12、18...
    |-- t_order_1  ID: 1、7、13、19...
    |-- t_order_2  ID: 2、8、14、20...
 db_1--
    |-- t_order_0  ID: 3、9、15、21...
    |-- t_order_1  ID: 4、10、16、22...
    |-- t_order_2  ID: 5、11、17、23...

There are many third-party solutions that can solve this problem perfectly , For example, based on UUIDSNOWFLAKE Algorithm 、segment Paragraph number , Generate non repeating keys using a specific algorithm , Or directly reference the primary key generation service , Like the US regiment (Leaf) and sound of dripping water (TinyId) etc. .

and sharding-jdbc Two distributed primary key generation schemes are built in ,UUIDSNOWFLAKE, Not only that, it also takes away the interface of the distributed primary key generator , In order to facilitate developers to implement a custom primary key generator , In the future, we will connect to the custom generator sound of dripping water (TinyId) Primary key generation service for .

As mentioned earlier in sharding-jdbc To automatically generate a primary key for a field ID, Only need application.properties Do the following configuration in the file :

#  Primary key field 
spring.shardingsphere.sharding.tables.t_order.key-generator.column=order_id
#  Primary key ID  Generation scheme 
spring.shardingsphere.sharding.tables.t_order.key-generator.type=UUID
#  Work the machine  id
spring.shardingsphere.sharding.tables.t_order.key-generator.props.worker.id=123

key-generator.column Represents the primary key field ,key-generator.type Primary key ID Generation scheme ( Built in or custom ),key-generator.props.worker.id For the machine ID, Set the primary key generation scheme to SNOWFLAKE Time machine ID Will participate in bit operations .

In the use of sharding-jdbc Two things should be noticed when distributed primary key is distributed :
  • once insert The primary key field in the entity object of the insert operation has been assigned , Then even if the primary key generation scheme is configured, it will fail , Last SQL The data executed will be based on the value assigned .
  • Don't set auto increment attribute to primary key field , Otherwise, the primary key ID By default SNOWFLAKE How to generate . such as : use mybatis plus Of @TableId Annotate fields order_id Auto increment primary key is set , What kind of solution to configure at this time , Always follow the snowflake algorithm .

Let's analyze it from source sharding-jdbc Built in primary key generation scheme UUIDSNOWFLAKE How did it happen .

UUID

open UUID The primary key of type generates the implementation class UUIDShardingKeyGenerator Source code discovery for , Its generation rules are only UUID.randomUUID() Such a line of code , forehead ~ There is a word in my heart Oh my god .

UUID Although we can achieve global uniqueness , But it is still not recommended as a primary key , Because we do business in real life, whether it is user_id still order_id Primary keys are mostly integers , and UUID It's a 32 Bit string .

Its storage and query to MySQL High performance consumption , and MySQL Officials have also made clear their recommendations , The primary key should be as short as possible , As a database primary key UUID The disorder of the data also leads to frequent changes in data locations , Seriously affect performance .

public final class UUIDShardingKeyGenerator implements ShardingKeyGenerator {
    private Properties properties = new Properties();

    public UUIDShardingKeyGenerator() {
    }

    public String getType() {
        return "UUID";
    }

    public synchronized Comparable<?> generateKey() {
        return UUID.randomUUID().toString().replaceAll("-", "");
    }

    public Properties getProperties() {
        return this.properties;
    }

    public void setProperties(Properties properties) {
        this.properties = properties;
    }
}

SNOWFLAKE

SNOWFLAKE( Snowflake algorithm ) Is the default primary key generation scheme , Generate a 64bit Long integer (Long) data .

sharding-jdbc The primary key generated by snowflake algorithm is mainly composed of 4 Part of it is made up of ,1bit Sign bit 、41bit Time stamp bit 、10bit Working process bit and 12bit Serial number position .

 Snowflake algorithm ID form

Sign bit (1bit position )

Java in Long The highest bit of a type is the sign bit , A positive number is 0, Negative number is 1, General generation ID All positive , So the default is 0

Time stamp bit (41bit)

41 The number of milliseconds a bit's timestamp can hold is 2 Of 41 The next power , And the total number of milliseconds a year is 1000L * 60 * 60 * 24 * 365, The calculation time is about 69 year , forehead ~, I have enough for my life .

Math.pow(2, 41) / (365 * 24 * 60 * 60 * 1000L) = = 69 year  

Work progress bit (10bit)

Represents a unique work process id, The default value is 0, It can be done by key-generator.props.worker.id Property settings .

spring.shardingsphere.sharding.tables.t_order.key-generator.props.worker.id=0000

Serial number position (12bit)

Different... Are generated in the same millisecond ID.

Clock back

Understand the primary key of snowflake algorithm ID It's not hard to find out , This is an algorithm that depends heavily on server time , And those who rely on server time will encounter a thorny problem : Clock back .

Why is there a clock back ?

There is a network time protocol in the Internet ntp Full name (Network Time Protocol) , Specifically used to synchronize 、 Calibrate the time of each computer in the network .

That's why , Our phones don't have to manually calibrate the time now , But everyone's cell phone time is the same .

Our hardware clock may become inaccurate for various reasons ( fast or slow ), At this point, we need ntp Service to do time calibration , When you do the calibration, the server clock will happen jumping perhaps Back to dial the The problem of .

How to solve clock callback by snowflake algorithm

Server clock callback can cause duplicate ID,SNOWFLAKE In the scheme, the original snowflake algorithm is improved , Added a maximum tolerated clock callback in milliseconds .

If the time of clock callback exceeds the maximum tolerance threshold of milliseconds , The program reports an error directly ; If it's within tolerance , Default distributed primary key generator , It will wait for the clock to synchronize to the time of the last primary key generation before continuing to work .

Maximum number of clock callback milliseconds tolerated , The default value is 0, You can use the properties max.tolerate.time.difference.milliseconds Set up .

#  Maximum number of clock callback milliseconds tolerated 
spring.shardingsphere.sharding.tables.t_order.key-generator.max.tolerate.time.difference.milliseconds=5

Here is the source code implementation class SnowflakeShardingKeyGenerator, The core process is as follows :

When the primary key was last generated lastMilliseconds And current time currentMilliseconds compare , If lastMilliseconds > currentMilliseconds It means that the clock is back .

Then judge the difference between the two times (timeDifferenceMilliseconds) Whether the maximum tolerance time threshold is set max.tolerate.time.difference.milliseconds Inside , The thread sleep difference time is within the threshold value Thread.sleep(timeDifferenceMilliseconds), Otherwise, if it is greater than the difference value, it will report an exception directly .

 
/**
 * @author xiaofu
 */
public final class SnowflakeShardingKeyGenerator implements ShardingKeyGenerator{
    @Getter
    @Setter
    private Properties properties = new Properties();
    
    public String getType() {
        return "SNOWFLAKE";
    }
    
    public synchronized Comparable<?> generateKey() {
        /**
         *  Current system time in milliseconds  
         */ 
        long currentMilliseconds = timeService.getCurrentMillis();
        /**
         *  Judge whether the waiting time is poor , if necessary , Then the waiting time goes by , And then get the current system time  
         */ 
        if (waitTolerateTimeDifferenceIfNeed(currentMilliseconds)) {
            currentMilliseconds = timeService.getCurrentMillis();
        }
        /**
         *  If the last millisecond and   The current system time is the same in milliseconds , In the same millisecond  
         */
        if (lastMilliseconds == currentMilliseconds) {
            /**
             * & Bit and operator : Both numbers are binary , If all the corresponding bits are 1, The result is 1, Otherwise 0
             *  When the sequence is 4095 when ,4095+1 After the new sequence and mask bit and operation, the result is 0
             *  When the sequence is other values , Neither the bits nor the result of the operation will be 0
             *  That is, the maximum value has been used in this millisecond sequence 4096, At this time, you need to take down a millisecond time value 
             */
            if (0L == (sequence = (sequence + 1) & SEQUENCE_MASK)) {
                currentMilliseconds = waitUntilNextTime(currentMilliseconds);
            }
        } else {
            /**
             *  The last millisecond has passed , Reset the sequence value to 1 
             */
            vibrateSequenceOffset();
            sequence = sequenceOffset;
        }
        lastMilliseconds = currentMilliseconds;
        
        /**
         * XX......XX XX000000 00000000 00000000     Time difference  XX
         *          XXXXXX XXXX0000 00000000     machine ID XX
         *                     XXXX XXXXXXXX     Serial number  XX
         *   Three parts | Bitwise OR operation : If all the corresponding bits are 0, The result is 0, Otherwise 1
         */
        return ((currentMilliseconds - EPOCH) << TIMESTAMP_LEFT_SHIFT_BITS) | (getWorkerId() << WORKER_ID_LEFT_SHIFT_BITS) | sequence;
    }
    
    /**
     *  Judge whether the waiting time is poor 
     */
    @SneakyThrows
    private boolean waitTolerateTimeDifferenceIfNeed(final long currentMilliseconds) {
        /**
         *  If you get ID The last time in MS is less than or equal to the current system time Ms , It's normal , You don't have to wait  
         */
        if (lastMilliseconds <= currentMilliseconds) {
            return false;
        }
        /**
         * ===> When the clock goes back ( The time to generate the sequence is longer than that of the current system ), Need to wait time difference  
         */
        /**
         *  obtain ID The time difference between the last millisecond of the current system time minus the number of milliseconds of the current system time  
         */
        long timeDifferenceMilliseconds = lastMilliseconds - currentMilliseconds;
        /**
         *  Time difference is less than the maximum tolerance time difference , That is, it is still within the time difference of clock callback  
         */
        Preconditions.checkState(timeDifferenceMilliseconds < getMaxTolerateTimeDifferenceMilliseconds(), 
                "Clock is moving backwards, last time is %d milliseconds, current time is %d milliseconds", lastMilliseconds, currentMilliseconds);
        /**
         *  Thread sleep time difference  
         */
        Thread.sleep(timeDifferenceMilliseconds);
        return true;
    }
    
    //  Configured machine ID
    private long getWorkerId() {
        long result = Long.valueOf(properties.getProperty("worker.id", String.valueOf(WORKER_ID)));
        Preconditions.checkArgument(result >= 0L && result < WORKER_ID_MAX_VALUE);
        return result;
    }
    
    private int getMaxTolerateTimeDifferenceMilliseconds() {
        return Integer.valueOf(properties.getProperty("max.tolerate.time.difference.milliseconds", String.valueOf(MAX_TOLERATE_TIME_DIFFERENCE_MILLISECONDS)));
    }
    
    private long waitUntilNextTime(final long lastTime) {
        long result = timeService.getCurrentMillis();
        while (result <= lastTime) {
            result = timeService.getCurrentMillis();
        }
        return result;
    }
}

But from SNOWFLAKE The primary key generated by the scheme ID Look at ,order_id It's a 18 A long integer number of bits , Did you find it too long , to want to MySQL That kind of 0 How to realize the incremental self increasing primary key ? Don't worry. , The solution will be given in the back !

SNOWFLAKE  Primary key ID

Customize

sharding-jdbc utilize SPI Full name ( Service Provider Interface) Mechanism to expand the primary key generation rules , This is a service discovery mechanism , By scanning the project path META-INF/services The files under the , And automatically loads the classes defined in the file .

In fact, it is relatively simple to implement a custom primary key generator , There are only two steps .

First step , Realization ShardingKeyGenerator Interface , And rewrite its internal methods , among getType() The method is a user-defined primary key production scheme type 、generateKey() The method is to generate the rules of primary key .

The following code uses AtomicInteger To simulate and implement an ordered self increasing ID Generate .

/**
 * @Author: xiaofu
 * @Description:  Custom key generator 
 */
@Component
public class MyShardingKeyGenerator implements ShardingKeyGenerator {


    private final AtomicInteger count = new AtomicInteger();

    /**
     *  Custom generation scheme type 
     */
    @Override
    public String getType() {
        return "XXX";
    }

    /**
     *  The core approach - Generate primary key ID
     */
    @Override
    public Comparable<?> generateKey() {
        return count.incrementAndGet();
    }

    @Override
    public Properties getProperties() {
        return null;
    }

    @Override
    public void setProperties(Properties properties) {

    }
}

The second step , Because of the use of SPI The mechanism realizes the function expansion , We will be having META-INF/services The file is configured with a user-defined primary key generator class .

com.xiaofu.sharding.key.MyShardingKeyGenerator

 Custom primary key  SPI  To configure

We'll test these things , Configure defined primary key generation type XXX, And insert a few pieces of data to see the effect .

spring.shardingsphere.sharding.tables.t_order.key-generator.column=order_id
spring.shardingsphere.sharding.tables.t_order.key-generator.type=XXX

Through the console SQL Parsing logs found ,order_id The field has been inserted into the record in an ordered auto increment manner , It shows that the configuration is OK .

Take one against nine

Since you can customize the generation scheme , So there are many ways to realize distributed primary keys , I think of this one I wrote before 《9 Kind of Distributed ID Generation scheme 》, Discover that it can be perfectly compatible with , Here's a selection of sound of dripping water (Tinyid) Let's practice , Because it's a separate distributed ID Build service , So we have to build the environment first .

Tinyid Service provision of Http and Tinyid-client Two access modes , Use below Tinyid-client Way to use quickly , Read more details in this article , It's been introduced too many times .

Tinyid Service establishment

First pull the source code https://github.com/didi/tinyid.git.

Because it is distributed based on the segment pattern ID, So it depends on the database , To create the corresponding table tiny_id_infotiny_id_token And insert the default data .


CREATE TABLE `tiny_id_info` (
    `id` BIGINT (20) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT ' Since the primary key ',
    `biz_type` VARCHAR (63) NOT NULL DEFAULT '' COMMENT ' Business types , only ',
    `begin_id` BIGINT (20) NOT NULL DEFAULT '0' COMMENT ' Start id, Record only the initial value , No other meaning . On initialization begin_id and max_id It should be the same ',
    `max_id` BIGINT (20) NOT NULL DEFAULT '0' COMMENT ' At present, the biggest id',
    `step` INT (11) DEFAULT '0' COMMENT ' step ',
    `delta` INT (11) NOT NULL DEFAULT '1' COMMENT ' Every time id The incremental ',
    `remainder` INT (11) NOT NULL DEFAULT '0' COMMENT ' remainder ',
    `create_time` TIMESTAMP NOT NULL DEFAULT '2010-01-01 00:00:00' COMMENT ' Creation time ',
    `update_time` TIMESTAMP NOT NULL DEFAULT '2010-01-01 00:00:00' COMMENT ' Update time ',
    `version` BIGINT (20) NOT NULL DEFAULT '0' COMMENT ' Version number ',
    PRIMARY KEY (`id`),
    UNIQUE KEY `uniq_biz_type` (`biz_type`)
) ENGINE = INNODB AUTO_INCREMENT = 1 DEFAULT CHARSET = utf8 COMMENT 'id Information sheet ';

CREATE TABLE `tiny_id_token` (
    `id` INT (11) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT ' Self increasing id',
    `token` VARCHAR (255) NOT NULL DEFAULT '' COMMENT 'token',
    `biz_type` VARCHAR (63) NOT NULL DEFAULT '' COMMENT ' this token Accessible business type ID ',
    `remark` VARCHAR (255) NOT NULL DEFAULT '' COMMENT ' remarks ',
    `create_time` TIMESTAMP NOT NULL DEFAULT '2010-01-01 00:00:00' COMMENT ' Creation time ',
    `update_time` TIMESTAMP NOT NULL DEFAULT '2010-01-01 00:00:00' COMMENT ' Update time ',
    PRIMARY KEY (`id`)
) ENGINE = INNODB AUTO_INCREMENT = 1 DEFAULT CHARSET = utf8 COMMENT 'token Information sheet ';

INSERT INTO `tiny_id_token` (`id`, `token`, `biz_type`, `remark`, `create_time`, `update_time`) VALUES ('1', '0f673adf80504e2eaa552f5d791b644c', 'order', '1', '2017-12-14 16:36:46', '2017-12-14 16:36:48');

INSERT INTO `tiny_id_info` (`id`, `biz_type`, `begin_id`, `max_id`, `step`, `delta`, `remainder`, `create_time`, `update_time`, `version`) VALUES ('1', 'order', '1', '1', '100000', '1', '0', '2018-07-21 23:52:58', '2018-07-22 23:19:27', '1');

And in Tinyid Configure the data source information of the upper table in the service

datasource.tinyid.primary.url=jdbc:mysql://47.93.6.e:3306/ds-0?autoReconnect=true&useUnicode=true&characterEncoding=UTF-8
datasource.tinyid.primary.username=root
datasource.tinyid.primary.password=root

The final project maven install , Right click TinyIdServerApplication Start the service , Tinyid Distributed ID The build service is finished .

Customize Tinyid Primary key type

Tinyid After the service is built, we will introduce it into the project , Create a new tinyid_client.properties Add... To the file tinyid.server and tinyid.token attribute ,token Before SQL Pre inserted user data .

# tinyid  Distributed ID
#  Service address 
tinyid.server=127.0.0.1:9999
#  Business token
tinyid.token=0f673adf80504e2eaa552f5d791b644c

Get in code ID It's simpler , Just one line of code , Business types order Before SQ L Pre inserted data .

Long id = TinyId.nextId("order");

We started customizing Tinyid Implementation class of primary key generation type TinyIdShardingKeyGenerator .

/**
 * @Author: xiaofu
 * @Description:  Custom key generator 
 */
@Component
public class TinyIdShardingKeyGenerator implements ShardingKeyGenerator {
    
    /**
     *  Custom generation scheme type 
     */
    @Override
    public String getType() {
        return "tinyid";
    }

    /**
     *  The core approach - Generate primary key ID
     */
    @Override
    public Comparable<?> generateKey() {
        
        Long id = TinyId.nextId("order");
        
        return id;
    }

    @Override
    public Properties getProperties() {
        return null;
    }

    @Override
    public void setProperties(Properties properties) {

    }
}

And enable... In the configuration file Tinyid Primary key generation type , This is the end of the configuration , Test it quickly .

#  Primary key field 
spring.shardingsphere.sharding.tables.t_order.key-generator.column=order_id
#  Primary key ID  Generation scheme 
spring.shardingsphere.sharding.tables.t_order.key-generator.type=tinyid

test Tinyid Primary key

Insert the order record into the database and test it , Primary key ID Field order_id It's increasing for the trend , Tinyid The service was successfully connected to , perfect !

 Insert picture description here

summary

The following eight generation methods are referred to 《9 Kind of Distributed ID Generation scheme 》 Access on demand , It is relatively simple as a whole, and it is not implemented in turn here .

Case study GitHub Address : https://github.com/chengxy-nd...

版权声明
本文为[Programmer's internal affairs]所创,转载请带上原文链接,感谢