
Hot and cold data separation for massive data: design and practice

2022-07-03 01:32:00 【ByteDance technical team】


Background

The financial payment business is growing quickly, and order volume is expected to keep rising, so online storage faces increasing pressure and has to be planned ahead. The core financial payment services currently store their data in MySQL (InnoDB). Historical orders are accessed infrequently yet occupy a large share of database storage, so we want to separate historical data from recent production transaction data: the current database keeps only the most recent period as the hot store, while historical transactions are moved to a separate database with compressed storage (RocksDB) as the cold store. This is hot and cold data separation. It substantially lowers database hardware cost and reduces the service downtime caused by expanding online storage when space runs short. The case study below is based on the current state of the unified transaction system for financial payment and is offered for reference only.

Solution

Technology selection

Architecture diagram

Scheme analysis

Because the business scenarios are complex, working through them one by one would make the workload grow exponentially. Seen from another dimension, database operations come down to query, insert, and update. As long as the database interaction layer guarantees that these basic operations behave the same after hot and cold separation is introduced, the business logic above it is unaffected. The financial payment code follows a unified layering convention, and all database operations are consolidated in the database interaction layer, which makes the change relatively easy. Without expanding capacity, the hot store is expected to keep the most recent X days of data (X is adjustable), and data older than X days is archived to the cold store.

Option comparison

Option 1: fully relieves database storage pressure, but places high performance demands on the cold store. It fits when the inserts, updates, and queries involved can filter by a time derived from the order number, which reduces the dependence on the cold store.
Option 2: fits when the cold store performs poorly and most inserts, updates, and queries cannot filter by a time derived from the order number. A hot-store archive table is needed as an intermediate filter.
Option 3: if the scenarios in the system are simple and historical orders are never changed afterwards, archiving can be done per scenario.

Option selection

Transaction: the transaction table records the mapping between merchant orders and internal financial payment orders, the transaction amount, the buyer and seller, and other key information. Its most important function is to prevent duplicate transactions. The cold store performs worse than the hot store, and merchant order numbers follow no fixed rule, so no time filter can be derived from the order number to relieve the cold store. Hot-store CPU usage is very low, so computation on the hot database is not a bottleneck. Option 2 is therefore chosen for transaction. The main purpose of the transaction archive table is to reduce the dependence of online transactions on the cold store.

Payment: the payment table records the payment methods used by a transaction order, how much each payment method deducts, where the funds are deducted from and where they go, and related information. For the order queries, updates, and inserts involved, the time can be determined from the transaction number or the payment number, which reduces cold-store queries. Option 1 is therefore chosen for payment.

Basic principles

To guarantee zero incidents and zero asset loss, the following basic principles were set during the design. Development, testing, and code review all check against them layer by layer, which effectively prevents production incidents.

Data insertion uniqueness:

All unique keys of the hot-store archive table must match those of the hot-store table being archived.
If the hot-store archive table records an order, the cold store must hold the corresponding data:
- Cold-store insert: insert into the cold store first, and only after that succeeds insert into the hot-store archive table.
- Cold-store update: to refresh cold-store data, delete and then re-insert the cold-store record within the same transaction.
- Hot-store delete: when deleting hot-store data, use the cold-store record as the WHERE conditions. The deletion succeeds only if every hot-store field (including ID) matches.

Data update consistency:

Business updates are never applied to the cold store. All updates must be carried out in the hot store. If the data to be updated exists only in the cold store, it is first synchronized back to the hot store and then updated there.
When a record exists in both the cold store and the hot store, the hot-store record is authoritative. The only source of cold-store data is hot-store data synchronized to the cold store.
When data is synchronized from the cold store to the hot store, the operations on the archive table and the transaction table must be completed in the same transaction, and the queries involved must read from the primary (write) database.
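The update rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the team's implementation: two in-memory SQLite databases stand in for the MySQL hot store and the RocksDB cold store, and the table layout and the names `make_stores` and `update_status` are hypothetical. The archive states HOT and COLD are the state names used in this article.

```python
import sqlite3

# Hypothetical table layout, for illustration only.
ORDER_DDL = ("CREATE TABLE orders "
             "(id INTEGER PRIMARY KEY, order_no TEXT UNIQUE, status TEXT)")


def make_stores():
    """Create stand-ins for the hot store (orders + archive table) and the cold store."""
    hot = sqlite3.connect(":memory:")
    cold = sqlite3.connect(":memory:")
    hot.execute(ORDER_DDL)
    hot.execute("CREATE TABLE archive (order_no TEXT PRIMARY KEY, state TEXT)")
    cold.execute(ORDER_DDL)
    return hot, cold


def update_status(hot, cold, order_no, new_status):
    """Apply an update in the hot store only, syncing the row back from cold if needed."""
    # Read from the same (primary) connection that will perform the write.
    row = hot.execute("SELECT id FROM orders WHERE order_no = ?",
                      (order_no,)).fetchone()
    if row is None:
        cold_row = cold.execute(
            "SELECT id, order_no, status FROM orders WHERE order_no = ?",
            (order_no,)).fetchone()
        if cold_row is None:
            return False  # the order exists in neither store
        # Restore the row and flip the archive state in ONE hot-store transaction.
        with hot:
            hot.execute("INSERT INTO orders (id, order_no, status) VALUES (?, ?, ?)",
                        cold_row)
            hot.execute("UPDATE archive SET state = 'HOT' WHERE order_no = ?",
                        (order_no,))
    # The business update itself always runs against the hot store.
    with hot:
        cur = hot.execute("UPDATE orders SET status = ? WHERE order_no = ?",
                          (new_status, order_no))
    return cur.rowcount == 1
```

The cold copy is left untouched by the update, so it is stale until the next archiving pass writes the hot row back; this matches the rule that the hot store is authoritative whenever both copies exist.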

Data query accuracy:

Single query: when the record is not found in the hot store, query the cold store (if the order date can be derived from the order number, add a date filter first to reduce cold-store queries).
Batch query: when a record exists in both the cold store and the hot store, return the hot-store record.
Batch query: after merging cold-store and hot-store data, check whether the original query interface guarantees an order. If it does, sort again after merging.
Reduce cold-store pressure: the cold store performs poorly, so online real-time transactions should query and depend on it as little as possible (filter by the date in the transaction number or by the archive table).
Separate day limits: the database interaction layer uses a threshold of n days and the archiving task uses a threshold of m days, with m > n required. For example, the model layer queries the cold store when it judges that an order is older than n days, while the archiving task only archives data older than m days. Controlling the two separately prevents records from becoming unfindable when the archiving threshold is adjusted.
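A minimal sketch of these query rules, under explicit assumptions: plain dicts stand in for the hot and cold stores, order numbers are assumed to start with a `yyyymmdd` date (the article does not give the real order-number format), and the thresholds and function names are hypothetical.

```python
from datetime import date, datetime, timedelta

HOT_DAYS_N = 90       # hypothetical n: interaction-layer threshold
ARCHIVE_DAYS_M = 100  # hypothetical m: archiving-task threshold
assert ARCHIVE_DAYS_M > HOT_DAYS_N  # the article requires m > n


def order_date(order_no):
    """Return the date encoded in the order number, or None if there is none.

    Assumes a yyyymmdd prefix, which is an illustration only.
    """
    try:
        return datetime.strptime(order_no[:8], "%Y%m%d").date()
    except ValueError:
        return None


def may_be_cold(order_no, today, n_days=HOT_DAYS_N):
    """True if the order could already have been archived."""
    d = order_date(order_no)
    if d is None:
        return True  # no date available, so the cold store cannot be ruled out
    return d < today - timedelta(days=n_days)


def find_one(order_no, hot, cold, today):
    """Hot store first; fall back to cold only when the date filter allows it."""
    row = hot.get(order_no)
    if row is not None:
        return row  # the hot copy always wins
    if may_be_cold(order_no, today):
        return cold.get(order_no)
    return None


def find_many(order_nos, hot, cold, today, sort_key=None):
    """Merge hot and cold results, then re-sort if the caller needs an order."""
    rows = []
    for no in order_nos:
        row = find_one(no, hot, cold, today)
        if row is not None:
            rows.append(row)
    if sort_key is not None:
        rows.sort(key=sort_key)
    return rows
```

Because the lookup threshold n is smaller than the archiving threshold m, any order old enough to have been archived is also old enough to pass the date filter, so the filter never hides an archived order.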

Implementation details

Archive table structure

Archive table status flow

Consistent deletion

Use all fields of the cold-store record (including the auto-increment id) as the WHERE conditions for deleting the hot-store row. Deleting the hot-store row and updating the hot-store archive state to COLD must happen in one transaction, which is rolled back if either step fails.
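The consistent deletion can be sketched like this. It is an illustration under assumptions: an in-memory SQLite database stands in for the MySQL hot store, and the tables, columns, and the function `consistent_delete` are hypothetical.

```python
import sqlite3


def make_hot():
    """Create a stand-in hot store with an order table and an archive table."""
    hot = sqlite3.connect(":memory:")
    hot.execute("CREATE TABLE orders "
                "(id INTEGER PRIMARY KEY, order_no TEXT, amount INTEGER, status TEXT)")
    hot.execute("CREATE TABLE archive (order_no TEXT PRIMARY KEY, state TEXT)")
    return hot


def consistent_delete(hot, cold_row):
    """Delete the hot row only if it equals the cold copy, field by field.

    cold_row is (id, order_no, amount, status) as read from the cold store.
    Returns True on success; on any mismatch both statements are rolled back.
    """
    try:
        with hot:  # one transaction: commit on success, rollback on exception
            cur = hot.execute(
                "DELETE FROM orders "
                "WHERE id = ? AND order_no = ? AND amount = ? AND status = ?",
                cold_row)
            if cur.rowcount != 1:
                raise LookupError("hot row differs from the cold copy")
            cur = hot.execute(
                "UPDATE archive SET state = 'COLD' "
                "WHERE order_no = ? AND state = 'PROCESSING'",
                (cold_row[1],))
            if cur.rowcount != 1:
                raise LookupError("archive record is not in PROCESSING")
        return True
    except LookupError:
        return False
```

Putting every field in the WHERE clause turns the delete into a compare-and-delete: if the hot row changed after it was copied to the cold store, nothing is removed. Note that `=` never matches NULL in SQL, so nullable columns would need a NULL-safe comparison in a real schema.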

Transaction and payment tasks (archiving, deletion, fallback compensation)

Archiving task

Query the hot-store order table for orders older than X days (X is adjustable), synchronize them to the cold store, insert a record into the hot-store archive table with the state PROCESSING, and send a delayed deletion MQ message.

Archive deletion task

A resident service task consumes the deletion MQ messages and calls, via RPC, the deletion interface provided by transaction and payment. It supports local rate limiting.

Fallback compensation task

Main function: query the hot-store archive table for records that have stayed in PROCESSING for longer than the specified time since their last modification, and force the deletion. This guards against MQ failures and occasional lost messages: the fallback task picks up archive records that are stuck in PROCESSING.

Execution logic

 Data archiving task (started once a day)
{
    initialize the query time range and paging position
    for {
        query transaction orders older than X days in the current time range, limit 1000 (ordered by index, rolling time window)
        if records exist {
            for each record {
                // processed by x concurrent workers
                idempotently write the order to the cold store (not guaranteed to be the latest version; only guarantees that a cold copy exists)
                idempotently write the archive record (state PROCESSING; updated to COLD once the hot row is deleted; an existing record in state HOT is updated to PROCESSING)
                send a delayed MQ message to delete the hot-store row after X min (configurable)
            }
        }
        if record count = 1000 {
            continue   // the current time range may hold more rows, fetch the next page
        }
        // fewer than 1000 records: the current time range is exhausted
        move the time range forward
        if the end time exceeds the specified time {
            break   // leave the loop, end of the task
        }
        record the current query criteria in redis so that a later run can resume
    }
}

 Delete hot-store data (MQ consumer)
 for each consumed MQ message {
     query the cold store
     consistent deletion (begin transaction; conditionally delete the hot-store row; update the archive record state to COLD; commit)
     if the consistent deletion fails: synchronize the hot-store row to the cold store, then retry the consistent deletion
}

 Compensation task (started every 30 minutes)
for {
    query archive records in state PROCESSING whose modification time is more than X + Y min ago, limit 1000
    if none exist {
        break
    }
    for each record {
        query the cold store
        consistent deletion (begin transaction; conditionally delete the hot-store row; update the archive record state to COLD; commit)
        if the consistent deletion fails: synchronize the hot-store row to the cold store, then retry the consistent deletion
    }
}
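The deletion and compensation flow above can be sketched in runnable form. This is an illustration under assumptions: dicts stand in for the hot store, the cold store, and the archive record table, and the names `delete_hot_copy` and `compensate` are hypothetical. In a real system the compare-and-delete step would be a single database transaction.

```python
from datetime import datetime, timedelta


def delete_hot_copy(order_no, hot, cold, archive):
    """Remove the hot row once the cold copy is identical; mark the archive COLD."""
    hot_row = hot.get(order_no)
    if hot_row is None:
        return False  # nothing to delete in this sketch
    if cold.get(order_no) != hot_row:
        # Consistent deletion would fail: the hot row is authoritative,
        # so refresh the cold copy from it before trying again.
        cold[order_no] = dict(hot_row)
    if cold[order_no] == hot.get(order_no):
        del hot[order_no]
        archive[order_no]["state"] = "COLD"
        return True
    return False


def compensate(hot, cold, archive, now, min_age, limit=1000):
    """Force deletion for records stuck in PROCESSING for at least min_age."""
    stale = [no for no, rec in archive.items()
             if rec["state"] == "PROCESSING"
             and now - rec["modified_at"] >= min_age][:limit]
    return [no for no in stale if delete_hot_copy(no, hot, cold, archive)]
```

Here `min_age` plays the role of X + Y min: the delay X of the MQ message plus a grace period Y, so the compensation task only touches records the regular consumer should already have handled.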

Rolling time window for archive queries: the first window starts at a fixed date (the earliest financial payment order date) and ends at the specified date. Each following window starts where the previous one ended and ends at the previous end time plus the specified window length. Each time the task moves to the next window, it saves its state in redis: the specified date, the time range of the day's task, and the paging position.

Concurrent archiving: the archiving task must support sharding across multiple tasks and concurrent processing.

Raising the daily archiving volume: to avoid affecting online transactions, the 24 hours of the day are split into three kinds of periods (peak trading, off-peak, and normal), and the archiving speed differs for each.
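The rolling window can be sketched as a small generator. This is an illustration under assumptions: a dict stands in for the redis checkpoint, and the names `rolling_windows`, `run_archive`, and the key `next_start` are hypothetical.

```python
from datetime import datetime, timedelta


def rolling_windows(start, stop, step):
    """Yield consecutive [window_start, window_end) ranges up to stop.

    Each window starts exactly where the previous one ended.
    """
    window_start = start
    while window_start < stop:
        window_end = min(window_start + step, stop)
        yield window_start, window_end
        window_start = window_end


def run_archive(checkpoint, first_day, stop, step, process):
    """Process windows in order, saving progress so a later run can resume."""
    start = checkpoint.get("next_start", first_day)
    for window_start, window_end in rolling_windows(start, stop, step):
        process(window_start, window_end)
        # Persist only after the window is fully processed.
        checkpoint["next_start"] = window_end
```

Saving the checkpoint after each completed window means an interrupted run repeats at most one window, which is safe because the writes to the cold store and the archive table are idempotent.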
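One way to express period-dependent archiving speed is a lookup from the hour of day to a batch quota. The hour ranges and quotas below are hypothetical placeholders, since the article does not give the real values, and the function names are assumptions as well.

```python
# Hypothetical hour ranges and quotas; the article does not state the real ones.
PEAK_HOURS = range(10, 22)   # peak trading
LOW_HOURS = range(1, 6)      # off-peak
ORDERS_PER_MINUTE = {"peak": 1_000, "normal": 5_000, "low": 20_000}


def trading_period(hour):
    """Classify an hour of the day (0-23) as peak, low, or normal."""
    if hour in PEAK_HOURS:
        return "peak"
    if hour in LOW_HOURS:
        return "low"
    return "normal"


def archive_quota(hour):
    """Orders the archiving task may process per minute during this hour."""
    return ORDERS_PER_MINUTE[trading_period(hour)]
```

The archiving loop would consult `archive_quota` before each batch and sleep once the quota for the current minute is used up, so archiving slows down during peak trading and speeds up off-peak.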

Transaction: with an archive table (query, insert, update)

Characteristics: the unique key includes an external order number. Order numbers are effectively random, so the time cannot be derived from them, and an archive table is therefore required.

Query

The logic is handled uniformly in the database interaction layer.

The following cases can be handled specially to reduce the dependence on the cold store.

Single query:
- Query by external order number: if the query QPS is high, check the archive table first and query the cold store only when needed.
- Query by transaction number: if the time can be derived from the order number, filter by the time range derived from it before querying the cold store.
Batch query: some admin back-office features use paged queries. When adding cold-store query logic to features that need a wide query range, the start of the requested time range can be used to decide whether to query the cold store at all. When a record exists in both stores, the hot-store record is kept (only records with the same order number within the same page are deduplicated). If a result is disputed, the order can be queried again by order number to fetch and confirm the latest data. Also confirm with product and operations whether querying only the hot store would be acceptable.
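A minimal sketch of the two filters described above, under assumptions: dicts stand in for the stores, a set stands in for the archive table, and all names are hypothetical. `CountingStore` exists only to show how many cold-store lookups are avoided.

```python
from datetime import date, timedelta


class CountingStore(dict):
    """A dict that counts lookups, standing in for the slow cold store."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lookups = 0

    def get(self, key, default=None):
        self.lookups += 1
        return super().get(key, default)


def find_by_merchant_no(merchant_no, hot, archive, cold):
    """Look up an external order number without a date to filter on."""
    row = hot.get(merchant_no)
    if row is not None:
        return row
    if merchant_no not in archive:
        # Every archived order has an archive record, so an order without
        # one cannot be in the cold store: skip the expensive lookup.
        return None
    return cold.get(merchant_no)


def page_needs_cold(range_start, today, hot_days_n):
    """A paged query touches the cold store only if its range starts early enough."""
    return range_start < today - timedelta(days=hot_days_n)
```

The archive check is what makes duplicate-transaction detection cheap: an unknown merchant order number is rejected or accepted using hot-store tables alone.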

Update

The logic is handled uniformly in the database interaction layer.

Insert

The logic is implemented in the database interaction layer.

Payment: without an archive table (query, insert, update)

Characteristics: the unique key is an internal number, and for the existing main queries the time can be derived from the order number, so no archive table is needed. This fully resolves the storage problem of the hot database.

Query

The logic is handled uniformly in the database interaction layer.

The following cases can be handled specially to reduce the dependence on the cold store.

Single query:
- Query by payment order number: if the time can be derived from the order number, filter by the time range derived from it before querying the cold store.
Batch query:
- Query by transaction number: if the time can be derived from the order number, filter by the time range derived from it before querying the cold store.
- Some admin back-office features use paged queries. When adding cold-store query logic to features that need a wide query range, the start of the requested time range can be used to decide whether to query the cold store at all. When a record exists in both stores, the hot-store record is kept (only records with the same order number within the same page are deduplicated). If a result is disputed, the order can be queried again by order number to fetch and confirm the latest data. Also confirm with product and operations whether querying only the hot store would be acceptable.

Update

The logic is handled uniformly in the database interaction layer.

Insert

The logic is implemented in the database interaction layer.

Summary

Payment has no archive table, so its database storage pressure is fully resolved and a large amount of database storage is saved.
Transaction adds a new archive table, so the storage pressure on its hot database is greatly postponed rather than removed. This buys extra buffer time before the transaction database needs expansion and leaves enough time to optimize transaction further and solve its storage problem.

Results

The storage pressure on the payment database is fully resolved, and the storage pressure on the hot store of the transaction database is effectively relieved.
The number of days retained in the hot store is adjustable and can be tuned to future order volume.

Shortcomings

Option 2 adds an archive table that keeps a record for the full set of archived data, so it only eases the storage pressure on the transaction and payment databases and cannot fully resolve the database storage problem.
The free space (datafree) released in the transaction table by deletions cannot be used by the archive table. It can only be reused by the transaction table itself, so the transaction table's datafree space has to be reclaimed from time to time.
