ShardingSphere Practice (6): Elastic Scaling
2022-06-10 06:51:00 [wzy0623]
Catalog
I. Feature Details
1. Background
2. Core Concepts
3. Usage Constraints
II. Implementation Details
1. Principle
2. Execution Phases
3. Rate Limiting and Circuit Breaking
III. Use Case Testing
1. Data Migration
(1) Prepare the test environment
(2) Create a data migration job
(3) Cutover
2. Database Node Scale-Out
(1) Prepare the test environment
(2) Create a data migration job
(3) Cutover
I. Feature Details
1. Background
For a system running on a single database, safely and simply migrating its data to a horizontally sharded database has always been a pressing need. For users already on ShardingSphere, rapid changes in business scale may also require scaling an existing sharded cluster out or in.
ShardingSphere gives users a great degree of freedom in choosing sharding algorithms, but that freedom makes elasticity hard. Finding an approach that supports custom sharding algorithms while still scaling data nodes efficiently is the first challenge of elastic scaling.
Meanwhile, the scaling process should not disturb running business. Minimizing the window during which data is unavailable, and ideally making scaling completely imperceptible to users, is the second challenge.
Finally, elastic scaling must not corrupt existing data. Guaranteeing data correctness is the third challenge.
The elastic scaling process is shown in the figure below.

Supporting custom sharding algorithms, minimizing the business impact of data scaling and migration, and providing a one-stop, general-purpose elastic scaling solution are the main design goals of ShardingSphere's elastic scaling. ShardingSphere-Scaling is that general-purpose data migration and elastic scaling solution. It has been available to users since version 4.1.0 and is currently in the alpha stage.
2. Core Concepts
- Elastic scaling job: the complete process of migrating data from the old rules to the new rules in one pass.
- Inventory data: data that already exists in the data nodes before the scaling job starts.
- Incremental data: new data generated by the business system while the scaling job is running.
3. Usage Constraints
Supported:
- Migrating data from an external database into a database managed by ShardingSphere.
- Scaling the data nodes of a ShardingSphere cluster out or in.
Not supported:
- Scaling tables that have no primary key.
- Scaling tables with a composite primary key.
- Migrating in place on the current storage nodes; a new database cluster must be prepared as the migration target.
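Given the first two restrictions, it can be worth pre-checking candidate tables before starting a job. The query below is a sketch of such a check for MySQL, assuming the source schema is named migrating_db (an assumption; substitute your own schema). It lists every table whose primary-key column count is not exactly 1, i.e. tables with no primary key or with a composite key:
-- Tables whose PK column count != 1 cannot be scaled:
--   0 columns  -> no primary key
--   2+ columns -> composite primary key
select t.table_name,
       count(k.column_name) as pk_columns
from information_schema.tables t
left join information_schema.key_column_usage k
       on  k.table_schema = t.table_schema
       and k.table_name   = t.table_name
       and k.constraint_name = 'PRIMARY'
where t.table_schema = 'migrating_db'
  and t.table_type = 'BASE TABLE'
group by t.table_name
having pk_columns <> 1;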
II. Implementation Details
1. Principle
Considering the flexibility ShardingSphere allows in its sharding modules, the current elastic scaling solution is to temporarily run two database clusters and switch over once scaling completes, as shown in the figure below.

This implementation has the following advantages:
- The original data is not affected at any point during scaling.
- A failed scaling job carries no risk, since the original cluster stays intact.
- It is not constrained by the sharding strategy.
At the same time, it has some drawbacks:
- Two copies of the data coexist on the servers for a period of time, requiring extra storage.
- All of the data has to be moved.
The elastic scaling module parses the old sharding rules, extracts the data sources, data nodes, and other information from the configuration, and then creates a scaling job workflow that breaks one elastic scaling run into four main phases:
- Preparation phase.
- Inventory data migration phase.
- Incremental data synchronization phase.
- Rule switching phase.
The elastic scaling workflow is shown in the figure below.

2. Execution Phases
(1) Preparation phase
In the preparation phase, the elastic scaling module verifies data source connectivity and permissions, counts the inventory data and records the log position, and finally splits the work into tasks according to the data volume and the parallelism configured by the user.
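As a rough illustration (not the module's exact internal queries), the preparation work amounts to statements like the following against a MySQL source; the key-range arithmetic in the comments is a hypothetical example:
-- Count inventory data and find the key range to split into tasks
select min(order_id), max(order_id), count(*) from t_order;
-- e.g. with parallelism 4, each inventory task might cover one quarter
-- of the order_id range
-- Record the current binlog position as the starting point for incremental sync
show master status;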
(2) Inventory data migration phase
This phase executes the inventory migration tasks split out during preparation. It uses JDBC queries to read data directly from the source data nodes and writes it into the new cluster according to the new rules.
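Conceptually, each inventory task behaves like the following pair of statements, repeated batch by batch (a simplified sketch; the actual implementation streams batches over JDBC using the configured batch_size):
-- Read one batch from the source data node (old rules)
select * from t_order where order_id > ? order by order_id limit 1000;
-- Route each row with the new rules and write it to the matching target shard,
-- e.g. under hash_mod with sharding-count 8, a row with order_id % 8 = 3 goes to:
insert into t_order_3 (order_id, order_datetime, user_id, order_amount)
values (?, ?, ?, ?);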
(3) Incremental data synchronization phase
Because inventory migration takes time, depending on data volume and parallelism, the data written by the business during that window must also be synchronized. The technical details differ by database, but they are generally change data capture built on the replication protocol or WAL logs:
- MySQL: subscribe to and parse the binlog;
- PostgreSQL: use official logical replication with test_decoding.
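For reference, the underlying capture facilities can be inspected by hand. These are standard MySQL and PostgreSQL statements, shown only to illustrate the mechanisms, not part of ShardingSphere's API:
-- MySQL: incremental capture relies on the binlog (ROW format)
show variables like 'binlog_format';
show master status;  -- current binlog file and position
-- PostgreSQL: test_decoding is consumed through a logical replication slot
select pg_create_logical_replication_slot('scaling_demo', 'test_decoding');
select * from pg_logical_slot_peek_changes('scaling_demo', null, null);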
The elastic scaling module likewise writes this captured incremental data into the new data nodes according to the new rules. Once the incremental data is essentially caught up (it can never be fully caught up while the business keeps writing), the job moves to the rule switching phase.
(4) Rule switching phase
In this phase there may be a short read-only window for the business: by setting the databases read-only or by using ShardingSphere's circuit-breaking mechanism, the data in the old data nodes is held static briefly so that incremental synchronization can fully catch up.
The window can be as short as a few seconds or as long as several minutes, depending on the data volume and on whether the user needs strong data verification. Once everything is confirmed, ShardingSphere changes its configuration through the config center, directing the business to the cluster with the new rules, and elastic scaling is complete.
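In DistSQL terms, this phase corresponds to the cutover statements demonstrated in the use cases later in this article, where <jobId> stands for the scaling job identifier:
stop scaling source writing <jobId>;               -- circuit-break writes to the source
check scaling <jobId> by type (name=crc32_match);  -- optional strong verification
apply scaling <jobId>;                             -- switch metadata to the new rules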
3. Rate Limiting and Circuit Breaking
Under overload traffic, circuit-breaking and rate-limiting individual nodes so that the database cluster as a whole keeps running tests how well a distributed system can control a single node. Circuit breaking means blocking a ShardingSphere node's connections to the database: once a node exceeds its load, it stops accessing the database, so the database keeps enough resources to serve the other nodes. Rate limiting means throttling requests under overload so that at least some requests still get a high-quality response.
Rate limiting is applied during data migration or scale-out to cap traffic on the source or target side. The following table shows the circuit-breaking statements currently provided.
| Statement | Description | Example |
| --- | --- | --- |
| [ENABLE / DISABLE] READWRITE_SPLITTING (READ)? resourceName [FROM databaseName] | Enable/disable a read data source | ENABLE READWRITE_SPLITTING READ resource_0 |
| [ENABLE / DISABLE] INSTANCE instanceId | Enable/disable a Proxy instance | DISABLE INSTANCE instance_1 |
| SHOW INSTANCE LIST | Query Proxy instance information | SHOW INSTANCE LIST |
| SHOW READWRITE_SPLITTING (READ)? resourceName [FROM databaseName] | Query the status of read data sources | SHOW READWRITE_SPLITTING READ RESOURCES |
Instance-level circuit breaking is meant for deployments with multiple Proxy instances, enabling or disabling a particular Proxy instance. Data-source-level circuit breaking is mainly used in read/write-splitting scenarios to enable or disable read data sources.
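For example, to take one Proxy instance out of service during maintenance and bring it back afterwards, using the statements from the table above:
show instance list;           -- find the instance id
disable instance instance_1;  -- circuit-break that Proxy instance
-- ... perform maintenance ...
enable instance instance_1;   -- restore it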
III. Use Case Testing
This section demonstrates two examples: example 1 migrates data from a MySQL instance into ShardingSphere-Proxy; example 2 scales out the data nodes of an existing sharded setup. Both examples use ShardingSphere-Scaling, and the procedures are similar.
ShardingSphere-Scaling is not an independent product at present; it is a capability configured inside ShardingSphere-Proxy. If the backend connects to one of the following databases, download the corresponding JDBC driver jar and put it in the ${shardingsphere-proxy}/lib directory.
| Database | JDBC driver |
| --- | --- |
| MySQL | mysql-connector-java |
| PostgreSQL | postgresql |
# Copy in the JDBC driver
cp ~/mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar ~/apache-shardingsphere-5.1.1-shardingsphere-proxy-bin/lib/
# Restart Proxy
/root/apache-shardingsphere-5.1.1-shardingsphere-proxy-bin/bin/stop.sh
/root/apache-shardingsphere-5.1.1-shardingsphere-proxy-bin/bin/start.sh

1. Data Migration
Requirement: a MySQL database is in production use, and two of its tables need to be migrated to data nodes under Proxy. Both tables change in real time, and the business-impact window must be kept as short as possible.
Source: 172.18.26.198:3306/migrating_db
Target: 172.18.10.66:3306/db1, 172.18.10.66:3306/db2, 172.18.18.102:3306/db1, 172.18.18.102:3306/db2
Proxy: 172.18.10.66:3307, 172.18.18.102:3307, Cluster run mode
(1) Prepare the test environment
On targets 66 and 102, execute:
drop database if exists db1;
drop database if exists db2;
create database db1;
create database db2;

On source 198, execute:
-- Create the database
drop database if exists migrating_db;
create database migrating_db;
use migrating_db;
-- Create the tables
create table t_order (
order_id bigint auto_increment primary key,
order_datetime datetime not null,
user_id bigint not null,
order_amount decimal(10,2) not null default 0,
key idx_order_datetime (order_datetime),
key idx_user_id (user_id));
create table t_order_item (
order_item_id bigint auto_increment primary key,
order_id bigint not null,
item_id int null,
item_quantity int not null default 0,
key idx_order_id (order_id));
-- Create a stored procedure that simulates load
delimiter //
create procedure sp_generate_order_data(p_seconds int)
begin
set @start_ts := now();
set @start_date := unix_timestamp('2022-03-01');
set @end_date := unix_timestamp('2022-06-01');
while timestampdiff(second,@start_ts,now()) <= p_seconds do
start transaction;
set @order_datetime := from_unixtime(@start_date + rand() * (@end_date - @start_date));
set @user_id := floor(1 + rand() * 100000000);
set @order_amount := round((10 + rand() * 2000),2);
insert into t_order (order_datetime, user_id, order_amount)
values (@order_datetime, @user_id, @order_amount);
set @order_id := last_insert_id();
set @quantity := floor(1 + rand() * 50);
set @i := 1;
while @i <= @quantity do
set @item_id := floor(1 + rand() * 10000);
set @item_quantity := floor(1 + rand() * 20);
insert into t_order_item (order_id, item_id, item_quantity) values (@order_id, @item_id, @item_quantity);
set @i := @i + 1;
end while;
commit;
end while;
end
//
delimiter ;
-- Execute the stored procedure
call sp_generate_order_data(1800);

The steps below are performed while the stored procedure is running (half an hour), to simulate a real online migration.
(2) Create a data migration job
Creating the data migration job involves the following steps:
- Create a logical database.
- Add the source resource.
- Convert the single-table rules into sharding rules.
- Create the sharding scaling rule.
- Add the target resources.
- Modify the sharding rules to trigger the migration.
- Monitor the migration job.
Connect to Proxy:
mysql -u root -h 172.18.10.66 -P 3307 -p123456

Create the logical database:
drop database if exists migrating_db;
create database migrating_db;
use migrating_db;

Add the source resource:
add resource
resource_source (host=172.18.26.198, port=3306, db=migrating_db, user=wxy, password=mypass);
show schema resources\G

Confirm that single-table rules were created automatically:
count schema rules;

Convert the single-table rules into sharding rules:
create sharding table rule
t_order (datanodes("resource_source.t_order")),
t_order_item (datanodes("resource_source.t_order_item"));Preview the current shard rule :
preview select count(1) from t_order;
preview select count(1) from t_order_item;

Execute SQL to confirm the data can be queried correctly:
select count(1) from t_order;
select count(1) from t_order_item;

Create a manual-mode scaling rule:
create sharding scaling rule scaling_manual (
input(
worker_thread=40,
batch_size=1000
),
output(
worker_thread=40,
batch_size=1000
),
stream_channel(type(name=memory, properties("block-queue-size"=10000))),
data_consistency_checker(type(name=data_match, properties("chunk-size"=1000)))
);

View the sharding scaling rules:
show sharding scaling rules\G

Parameter description:
- worker_thread: size of the thread pool that ingests the full data from the source / writes it to the target.
- batch_size: maximum number of records returned by one query.
- stream_channel: the data channel connecting producers and consumers, used by both the input and output sides. type selects the algorithm (MEMORY is available); the block-queue-size property sets the size of the blocking queue.
- data_consistency_checker: the data consistency check algorithm. type can be DATA_MATCH or CRC32_MATCH; the chunk-size property sets the maximum number of records returned by one query.
The available data_consistency_checker types can be queried with show scaling check algorithms:
mysql> show scaling check algorithms;
+-------------+----------------------------+--------------------------------------------------------------+----------------+
| type | description | supported_database_types | provider |
+-------------+----------------------------+--------------------------------------------------------------+----------------+
| CRC32_MATCH | Match CRC32 of records. | MySQL | ShardingSphere |
| DATA_MATCH | Match raw data of records. | SQL92,openGauss,PostgreSQL,MySQL,MariaDB,H2,Oracle,SQLServer | ShardingSphere |
+-------------+----------------------------+--------------------------------------------------------------+----------------+
2 rows in set (0.01 sec)

DATA_MATCH supports all databases but is not the fastest; CRC32_MATCH supports only MySQL but performs better.
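Either algorithm can also be named explicitly when checking a job, for example (a sketch, with <jobId> standing in for a real job id; the cutover below uses crc32_match):
check scaling <jobId> by type (name=data_match);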
Two configuration issues remain open: completion_detector and rateLimiter.
Besides manual mode, the official documentation also describes an automatic-mode configuration:
create sharding scaling rule scaling_auto (
input(
worker_thread=40,
batch_size=1000
),
output(
worker_thread=40,
batch_size=1000
),
stream_channel(type(name=memory, properties("block-queue-size"=10000))),
completion_detector(type(name=idle, properties("incremental-task-idle-seconds-threshold"=1800))),
data_consistency_checker(type(name=data_match, properties("chunk-size"=1000)))
);

completion_detector specifies the algorithm that detects whether the job is nearly complete. If it is not configured, the subsequent steps cannot proceed automatically and must be driven manually via DistSQL. type selects the algorithm (IDLE is available); the incremental-task-idle-seconds-threshold property means that if the incremental sync task has been inactive for that many seconds, it is considered nearly complete (applies to algorithm type IDLE).
However, when I configured automatic mode and triggered the migration, an error occurred:
[ERROR] 2022-06-06 06:35:21.727 [0130317c30317c3054317c7363616c696e675f6462_Worker-1] o.a.s.e.e.h.g.LogJobErrorHandler - Job '0130317c30317c3054317c7363616c696e675f6462' exception occur in job processing
java.lang.IllegalArgumentException: incremental task idle threshold can not be null.

The second open problem is the rate-limiting configuration. The Proxy 5.1.1 documentation describes the scaling configuration item as follows:
rateLimiter: # Rate-limiting algorithm. If not configured, no limit is applied.
  type: # Algorithm type. Options:
  props: # Algorithm properties

As you can see, the documentation does not list the available algorithm types or their properties, and DistSQL has no rateLimiter configuration item either.
Add the target resources:
add resource
resource_1 (host=172.18.10.66, port=3306, db=db1, user=wxy, password=mypass),
resource_2 (host=172.18.10.66, port=3306, db=db2, user=wxy, password=mypass),
resource_3 (host=172.18.18.102, port=3306, db=db1, user=wxy, password=mypass),
resource_4 (host=172.18.18.102, port=3306, db=db2, user=wxy, password=mypass);
show schema resources\G

Modify the sharding rules to trigger the migration:
alter sharding table rule
t_order (
resources(resource_1,resource_2,resource_3,resource_4),
sharding_column=order_id,type(name=hash_mod,properties("sharding-count"=8)),
key_generate_strategy(column=order_id,type(name=snowflake))),
t_order_item (
resources(resource_1,resource_2,resource_3,resource_4),
sharding_column=order_id,type(name=hash_mod,properties("sharding-count"=8)),
key_generate_strategy(column=order_item_id,type(name=snowflake)));

At present the migration can only be triggered by executing ALTER SHARDING TABLE RULE in DistSQL. In this example, the sharding rules of the two tables change from a single data node to 8 shards across four data sources, which triggers the migration.
Monitor the migration job:
mysql> show scaling list;
+------------------------------------------------+----------------------+----------------------+--------+---------------------+-----------+
| id | tables | sharding_total_count | active | create_time | stop_time |
+------------------------------------------------+----------------------+----------------------+--------+---------------------+-----------+
| 0130317c30317c3054317c6d6967726174696e675f6462 | t_order,t_order_item | 1 | true | 2022-06-06 18:19:47 | NULL |
+------------------------------------------------+----------------------+----------------------+--------+---------------------+-----------+
1 row in set (0.01 sec)
mysql> show scaling status 0130317c30317c3054317c6d6967726174696e675f6462;
+------+-----------------+------------------------+--------+-------------------------------+--------------------------+
| item | data_source | status | active | inventory_finished_percentage | incremental_idle_seconds |
+------+-----------------+------------------------+--------+-------------------------------+--------------------------+
| 0 | resource_source | EXECUTE_INVENTORY_TASK | true | 80 | 0 |
+------+-----------------+------------------------+--------+-------------------------------+--------------------------+
1 row in set (0.00 sec)

SHOW SCALING LIST queries migration jobs. It is a global DistSQL command: it returns scaling jobs in every state, not only those belonging to the current logical database. SHOW SCALING STATUS queries the progress of one migration job.
(3) Cutover
The cutover steps are:
- The application stops writing to the source database, to avoid losing data.
- Check the migration job progress.
- Stop writes on Proxy (circuit breaking).
- Run the data consistency check.
- Switch metadata.
- Confirm the target sharding rules have taken effect.
- Create binding table rules (optional).
- Confirm the migration job has finished.
- Point the application at Proxy to access the database.
Stop application writes: Ctrl+C or kill the running stored procedure.
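If the session running sp_generate_order_data is not in the foreground, it can be terminated from another MySQL session; the connection id below is a hypothetical example:
show processlist;  -- find the connection executing the stored procedure
kill 42;           -- terminate that connection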
Query the migration job progress:
mysql> show scaling status 0130317c30317c3054317c6d6967726174696e675f6462;
+------+-----------------+--------------------------+--------+-------------------------------+--------------------------+
| item | data_source | status | active | inventory_finished_percentage | incremental_idle_seconds |
+------+-----------------+--------------------------+--------+-------------------------------+--------------------------+
| 0 | resource_source | EXECUTE_INCREMENTAL_TASK | true | 100 | 5 |
+------+-----------------+--------------------------+--------+-------------------------------+--------------------------+
1 row in set (0.00 sec)

Once status reaches EXECUTE_INCREMENTAL_TASK, the inventory (full) migration has completed and the job is in the incremental phase. inventory_finished_percentage is the completion percentage of the inventory data; incremental_idle_seconds is how long the incremental sync has been idle, which indicates whether it is nearly caught up.
Stop writes on Proxy. Pick an off-peak business period and stop writes to the source databases:
stop scaling source writing 0130317c30317c3054317c6d6967726174696e675f6462;

Run the data consistency check. Depending on the data volume, this step can take a long time:
mysql> check scaling 0130317c30317c3054317c6d6967726174696e675f6462 by type (name=crc32_match);
+--------------+----------------------+----------------------+-----------------------+-------------------------+
| table_name | source_records_count | target_records_count | records_count_matched | records_content_matched |
+--------------+----------------------+----------------------+-----------------------+-------------------------+
| t_order | 57599 | 57599 | true | true |
| t_order_item | 1468140 | 1468140 | true | true |
+--------------+----------------------+----------------------+-----------------------+-------------------------+
2 rows in set (2.17 sec)

Switch metadata:
apply scaling 0130317c30317c3054317c6d6967726174696e675f6462;

Preview whether the target sharding rules have taken effect:
mysql> preview select count(1) from t_order;
+------------------+-------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+-------------------------------------------------------------------------+
| resource_1 | select count(1) from t_order_0 UNION ALL select count(1) from t_order_4 |
| resource_2 | select count(1) from t_order_1 UNION ALL select count(1) from t_order_5 |
| resource_3 | select count(1) from t_order_2 UNION ALL select count(1) from t_order_6 |
| resource_4 | select count(1) from t_order_3 UNION ALL select count(1) from t_order_7 |
+------------------+-------------------------------------------------------------------------+
4 rows in set (0.01 sec)
mysql> preview select count(1) from t_order_item;
+------------------+-----------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+-----------------------------------------------------------------------------------+
| resource_1 | select count(1) from t_order_item_0 UNION ALL select count(1) from t_order_item_4 |
| resource_2 | select count(1) from t_order_item_1 UNION ALL select count(1) from t_order_item_5 |
| resource_3 | select count(1) from t_order_item_2 UNION ALL select count(1) from t_order_item_6 |
| resource_4 | select count(1) from t_order_item_3 UNION ALL select count(1) from t_order_item_7 |
+------------------+-----------------------------------------------------------------------------------+
4 rows in set (0.00 sec)

You can see the data has been sharded across the new database resources.
Create the binding table rule:
create sharding binding table rules (t_order,t_order_item);

Confirm the binding table rule has taken effect:
mysql> preview select i.* from t_order o join t_order_item i on o.order_id=i.order_id where o.order_id in (10, 11);
+------------------+---------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+---------------------------------------------------------------------------------------------------------+
| resource_3 | select i.* from t_order_2 o join t_order_item_2 i on o.order_id=i.order_id where o.order_id in (10, 11) |
| resource_4 | select i.* from t_order_3 o join t_order_item_3 i on o.order_id=i.order_id where o.order_id in (10, 11) |
+------------------+---------------------------------------------------------------------------------------------------------+
2 rows in set (0.06 sec)

Confirm the migration job has finished:
mysql> show scaling status 0130317c30317c3054317c6d6967726174696e675f6462;
+------+-----------------+----------+--------+-------------------------------+--------------------------+
| item | data_source | status | active | inventory_finished_percentage | incremental_idle_seconds |
+------+-----------------+----------+--------+-------------------------------+--------------------------+
| 0 | resource_source | FINISHED | false | 100 | 251 |
+------+-----------------+----------+--------+-------------------------------+--------------------------+
1 row in set (0.00 sec)

2. Database Node Scale-Out
Requirement: two sharded tables, each with 8 shards, are in production use, and their data nodes need to be scaled out to 16 shards. Both tables change in real time, and the business-impact window must be kept as short as possible.
Source: 172.18.26.198:3306/db1, 172.18.26.198:3306/db2
Target: 172.18.10.66:3306/db1, 172.18.10.66:3306/db2, 172.18.18.102:3306/db1, 172.18.18.102:3306/db2
Proxy: 172.18.10.66:3307, 172.18.18.102:3307, Cluster run mode
(1) Prepare the test environment
On source 198 and targets 66 and 102, execute:
drop database if exists db1;
drop database if exists db2;
create database db1;
create database db2;

Connect to Proxy:
mysql -u root -h 172.18.10.66 -P 3307 -p123456

Create the logical database:
drop database if exists scaling_db;
create database scaling_db;
use scaling_db;

Add the source resources:
add resource
resource_source1 (host=172.18.26.198, port=3306, db=db1, user=wxy, password=mypass),
resource_source2 (host=172.18.26.198, port=3306, db=db2, user=wxy, password=mypass);
show schema resources\G

Create the rules:
-- Sharding table rules
create sharding table rule
t_order (
resources(resource_source1,resource_source2),
sharding_column=order_id,type(name=hash_mod,properties("sharding-count"=8)),
key_generate_strategy(column=order_id,type(name=snowflake))),
t_order_item (
resources(resource_source1,resource_source2),
sharding_column=order_id,type(name=hash_mod,properties("sharding-count"=8)),
key_generate_strategy(column=order_item_id,type(name=snowflake)));
-- Binding table
create sharding binding table rules (t_order,t_order_item);

Create the tables:
create table t_order (
order_id bigint auto_increment primary key,
order_datetime datetime not null,
user_id bigint not null,
order_amount decimal(10,2) not null default 0,
key idx_order_datetime (order_datetime),
key idx_user_id (user_id));
create table t_order_item (
order_item_id bigint auto_increment primary key,
order_id bigint not null,
item_id int null,
item_quantity int not null default 0,
key idx_order_id (order_id));

Preview whether the rules have taken effect:
mysql> preview select count(1) from t_order;
+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| resource_source1 | select count(1) from t_order_0 UNION ALL select count(1) from t_order_2 UNION ALL select count(1) from t_order_4 UNION ALL select count(1) from t_order_6 |
| resource_source2 | select count(1) from t_order_1 UNION ALL select count(1) from t_order_3 UNION ALL select count(1) from t_order_5 UNION ALL select count(1) from t_order_7 |
+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
mysql> preview select count(1) from t_order_item;
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| resource_source1 | select count(1) from t_order_item_0 UNION ALL select count(1) from t_order_item_2 UNION ALL select count(1) from t_order_item_4 UNION ALL select count(1) from t_order_item_6 |
| resource_source2 | select count(1) from t_order_item_1 UNION ALL select count(1) from t_order_item_3 UNION ALL select count(1) from t_order_item_5 UNION ALL select count(1) from t_order_item_7 |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
mysql> preview select i.* from t_order o join t_order_item i on o.order_id=i.order_id where o.order_id in (10, 11);
+------------------+---------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+---------------------------------------------------------------------------------------------------------+
| resource_source1 | select i.* from t_order_2 o join t_order_item_2 i on o.order_id=i.order_id where o.order_id in (10, 11) |
| resource_source2 | select i.* from t_order_3 o join t_order_item_3 i on o.order_id=i.order_id where o.order_id in (10, 11) |
+------------------+---------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

Add inventory data:
insert into t_order (order_id, order_datetime, user_id, order_amount)
values
(1, now(), 1, 100),(2, now(), 2, 200),(3, now(), 3, 300),(4, now(), 4, 400),
(5, now(), 5, 500),(6, now(), 6, 600),(7, now(), 7, 700),(8, now(), 8, 800);
insert into t_order_item (order_item_id, order_id, item_id, item_quantity)
values
(1,1,1,10),(2,1,2,20),(3,2,3,30),(4,2,4,40),(5,3,1,10),(6,3,2,20),(7,4,3,30),(8,4,4,40),
(9,5,1,10),(10,5,2,20),(11,6,3,30),(12,6,4,40),(13,7,1,10),(14,7,2,20),(15,8,3,30),(16,8,4,40);

ShardingSphere-Proxy supports MySQL stored functions and stored procedures only on non-sharded tables, so plain INSERT statements are used here.
(2) Create a data migration job
Creating the data migration job involves the following steps:
- Create a logical database.
- Add the source resources.
- Configure the existing tables into sharding rules.
- Create the sharding scaling rule.
- Add the target resources.
- Modify the sharding rules to trigger the migration.
- Monitor the migration jobs.
In this case, because the source and target use the same Proxy cluster, steps 1-3 were already completed above in "Prepare the test environment".
Create manual mode scaling The rules :
create sharding scaling rule scaling_manual (
input(
worker_thread=40,
batch_size=1000
),
output(
worker_thread=40,
batch_size=1000
),
stream_channel(type(name=memory, properties("block-queue-size"=10000))),
data_consistency_checker(type(name=data_match, properties("chunk-size"=1000)))
);

Add the target resources:
add resource
resource_1 (host=172.18.10.66, port=3306, db=db1, user=wxy, password=mypass),
resource_2 (host=172.18.10.66, port=3306, db=db2, user=wxy, password=mypass),
resource_3 (host=172.18.18.102, port=3306, db=db1, user=wxy, password=mypass),
resource_4 (host=172.18.18.102, port=3306, db=db2, user=wxy, password=mypass);
show schema resources\G

Modify the sharding rules to trigger the migration. Note that bound tables can only be migrated together:
alter sharding table rule
t_order (
resources(resource_1,resource_2,resource_3,resource_4),
sharding_column=order_id,type(name=hash_mod,properties("sharding-count"=16)),
key_generate_strategy(column=order_id,type(name=snowflake))),
t_order_item (
resources(resource_1,resource_2,resource_3,resource_4),
sharding_column=order_id,type(name=hash_mod,properties("sharding-count"=16)),
key_generate_strategy(column=order_item_id,type(name=snowflake)));

Monitor the migration jobs:
mysql> show scaling list;
+------------------------------------------------+----------------------+----------------------+--------+---------------------+---------------------+
| id | tables | sharding_total_count | active | create_time | stop_time |
+------------------------------------------------+----------------------+----------------------+--------+---------------------+---------------------+
| 0130317c30317c3054317c6d6967726174696e675f6462 | t_order,t_order_item | 1 | false | 2022-06-06 18:19:47 | 2022-06-06 18:22:56 |
| 0130317c30317c3054317c7363616c696e675f6462 | t_order,t_order_item | 2 | true | 2022-06-07 07:13:30 | NULL |
+------------------------------------------------+----------------------+----------------------+--------+---------------------+---------------------+
2 rows in set (0.01 sec)
mysql> show scaling status 0130317c30317c3054317c6d6967726174696e675f6462;
+------+-----------------+----------+--------+-------------------------------+--------------------------+
| item | data_source | status | active | inventory_finished_percentage | incremental_idle_seconds |
+------+-----------------+----------+--------+-------------------------------+--------------------------+
| 0 | resource_source | FINISHED | false | 100 | 46550 |
+------+-----------------+----------+--------+-------------------------------+--------------------------+
1 row in set (0.01 sec)
mysql> show scaling status 0130317c30317c3054317c7363616c696e675f6462;
+------+------------------+--------------------------+--------+-------------------------------+--------------------------+
| item | data_source | status | active | inventory_finished_percentage | incremental_idle_seconds |
+------+------------------+--------------------------+--------+-------------------------------+--------------------------+
| 0 | resource_source1 | EXECUTE_INCREMENTAL_TASK | true | 100 | 215 |
| 1 | resource_source2 | EXECUTE_INCREMENTAL_TASK | true | 100 | 215 |
+------+------------------+--------------------------+--------+-------------------------------+--------------------------+
2 rows in set (0.01 sec)

Add incremental data:
insert into t_order (order_id, order_datetime, user_id, order_amount)
values (9, now(), 1, 100),(10, now(), 2, 200);
insert into t_order_item (order_item_id, order_id, item_id, item_quantity)
values
(17,9,1,10),(18,9,2,20),(19,9,3,30),(20,9,4,40),(21,10,1,10),(22,10,2,20),(23,10,3,30),(24,10,4,40);

(3) Cutover
The cutover steps are:
- The application stops writing to the source database, to avoid losing data.
- Check the migration job progress.
- Stop writes on Proxy (circuit breaking).
- Run the data consistency check.
- Switch metadata.
- Confirm the target sharding rules have taken effect.
- Confirm the migration job has finished.
- Point the application at Proxy to access the database.
Query the migration job progress:
mysql> show scaling status 0130317c30317c3054317c7363616c696e675f6462;
+------+------------------+--------------------------+--------+-------------------------------+--------------------------+
| item | data_source | status | active | inventory_finished_percentage | incremental_idle_seconds |
+------+------------------+--------------------------+--------+-------------------------------+--------------------------+
| 0 | resource_source1 | EXECUTE_INCREMENTAL_TASK | true | 100 | 51 |
| 1 | resource_source2 | EXECUTE_INCREMENTAL_TASK | true | 100 | 51 |
+------+------------------+--------------------------+--------+-------------------------------+--------------------------+
2 rows in set (0.01 sec)

Stop writes on Proxy. Pick an off-peak business period and stop writes to the source databases:
stop scaling source writing 0130317c30317c3054317c7363616c696e675f6462;

Run the data consistency check. Depending on the data volume, this step can take a long time:
mysql> check scaling 0130317c30317c3054317c7363616c696e675f6462 by type (name=crc32_match);
+--------------+----------------------+----------------------+-----------------------+-------------------------+
| table_name | source_records_count | target_records_count | records_count_matched | records_content_matched |
+--------------+----------------------+----------------------+-----------------------+-------------------------+
| t_order | 10 | 10 | true | true |
| t_order_item | 24 | 24 | true | true |
+--------------+----------------------+----------------------+-----------------------+-------------------------+
2 rows in set (0.46 sec)

Switch metadata:
apply scaling 0130317c30317c3054317c7363616c696e675f6462;

Preview whether the sharding rules have taken effect:
mysql> preview select count(1) from t_order;
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| resource_1 | select count(1) from t_order_0 UNION ALL select count(1) from t_order_4 UNION ALL select count(1) from t_order_8 UNION ALL select count(1) from t_order_12 |
| resource_2 | select count(1) from t_order_1 UNION ALL select count(1) from t_order_5 UNION ALL select count(1) from t_order_9 UNION ALL select count(1) from t_order_13 |
| resource_3 | select count(1) from t_order_2 UNION ALL select count(1) from t_order_6 UNION ALL select count(1) from t_order_10 UNION ALL select count(1) from t_order_14 |
| resource_4 | select count(1) from t_order_3 UNION ALL select count(1) from t_order_7 UNION ALL select count(1) from t_order_11 UNION ALL select count(1) from t_order_15 |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.01 sec)
mysql> preview select count(1) from t_order_item;
+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| resource_1 | select count(1) from t_order_item_0 UNION ALL select count(1) from t_order_item_4 UNION ALL select count(1) from t_order_item_8 UNION ALL select count(1) from t_order_item_12 |
| resource_2 | select count(1) from t_order_item_1 UNION ALL select count(1) from t_order_item_5 UNION ALL select count(1) from t_order_item_9 UNION ALL select count(1) from t_order_item_13 |
| resource_3 | select count(1) from t_order_item_2 UNION ALL select count(1) from t_order_item_6 UNION ALL select count(1) from t_order_item_10 UNION ALL select count(1) from t_order_item_14 |
| resource_4 | select count(1) from t_order_item_3 UNION ALL select count(1) from t_order_item_7 UNION ALL select count(1) from t_order_item_11 UNION ALL select count(1) from t_order_item_15 |
+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.01 sec)
mysql> preview select i.* from t_order o join t_order_item i on o.order_id=i.order_id where o.order_id in (10, 11);
+------------------+-----------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql |
+------------------+-----------------------------------------------------------------------------------------------------------+
| resource_3 | select i.* from t_order_10 o join t_order_item_10 i on o.order_id=i.order_id where o.order_id in (10, 11) |
| resource_4 | select i.* from t_order_11 o join t_order_item_11 i on o.order_id=i.order_id where o.order_id in (10, 11) |
+------------------+-----------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

You can see the data has been sharded across the new database resources. Confirm the migration job has finished:
mysql> show scaling status 0130317c30317c3054317c7363616c696e675f6462;
+------+------------------+----------+--------+-------------------------------+--------------------------+
| item | data_source | status | active | inventory_finished_percentage | incremental_idle_seconds |
+------+------------------+----------+--------+-------------------------------+--------------------------+
| 0 | resource_source1 | FINISHED | false | 100 | 276 |
| 1 | resource_source2 | FINISHED | false | 100 | 276 |
+------+------------------+----------+--------+-------------------------------+--------------------------+
2 rows in set (0.01 sec)
References:
https://shardingsphere.apache.org/document/current/cn/features/scaling/
https://shardingsphere.apache.org/document/current/cn/reference/scaling/
https://shardingsphere.apache.org/document/current/cn/user-manual/shardingsphere-proxy/scaling/
https://shardingsphere.apache.org/document/current/cn/user-manual/shardingsphere-proxy/distsql/syntax/ral/#%E5%BC%B9%E6%80%A7%E4%BC%B8%E7%BC%A9