Design and Implementation of Redis 7.0 Multi Part AOF

2022-07-04 05:44:00 Alibaba cloud yunqi

Introduction: This article explains in detail some shortcomings of the existing Redis AOF mechanism, and the design and implementation details of the Multi Part AOF introduced in Redis 7.0.

Redis is a very popular in-memory database. By keeping its data in memory, Redis achieves extremely high read and write performance. But once the process exits, all of the data in Redis is lost.

To solve this problem, Redis provides two persistence schemes, RDB and AOF, which save the in-memory data to disk to avoid data loss. This article focuses on the AOF persistence scheme and some of its problems, and discusses the design and implementation details of Multi Part AOF in Redis 7.0 (RC1 has been released). Multi Part AOF is referred to below as MP-AOF; the feature was contributed by the Alibaba Cloud database Tair team.

AOF

AOF (append only file) persistence records every write command in a separate log file, and Redis replays the commands in the AOF file at startup to recover the data.

Because AOF records every Redis write command by appending, the AOF file keeps growing as Redis processes more write commands, and the time needed to replay the commands grows with it. To solve this problem, Redis introduced the AOF rewrite mechanism (referred to below as AOFRW). AOFRW removes the redundant write commands from the AOF and rewrites the data in an equivalent form into a new AOF file, thereby reducing the size of the AOF file.

AOFRW

Figure 1 shows how AOFRW works. When AOFRW is triggered, Redis first calls fork to create a child process that performs the rewrite in the background. The child process writes a snapshot of all Redis data, as of the moment of the fork, into a temporary AOF file named temp-rewriteaof-bg-pid.aof.

Because the rewrite is performed by the child process in the background, the main process can still serve user commands normally while the AOF is being rewritten. So that the child process can eventually obtain the incremental changes produced by the main process during the rewrite, the main process writes each executed write command to aof_buf and also writes a copy to the aof_rewrite_buf cache. In the later stage of the child's rewrite, the main process sends the data accumulated in aof_rewrite_buf to the child process through a pipe, and the child process appends this data to the temporary AOF file.

When the main process is handling heavy write traffic, a large amount of data may pile up in aof_rewrite_buf, so that the child process cannot consume all of it during the rewrite. In that case, the data remaining in aof_rewrite_buf is handled by the main process when the rewrite ends.

When the child process finishes the rewrite and exits, the main process handles the follow-up work in backgroundRewriteDoneHandler. First, it appends the data in aof_rewrite_buf that was not consumed during the rewrite to the temporary AOF file. Second, once everything is ready, Redis uses a rename operation to atomically rename the temporary AOF file to server.aof_filename, at which point the original AOF file is overwritten. This completes the whole AOFRW process.

Figure 1: How AOFRW works

Problems with AOFRW

Memory overhead

As Figure 1 shows, during AOFRW the main process also writes the data changes made after the fork into aof_rewrite_buf. Most of the content of aof_rewrite_buf and aof_buf is identical, so this introduces redundant memory overhead.

The aof_rewrite_buffer_length field in Redis INFO shows how much memory aof_rewrite_buf is using at the moment. As shown below, under heavy write traffic aof_rewrite_buffer_length takes up almost as much memory as aof_buffer_length, so nearly twice the necessary memory is consumed.

aof_pending_rewrite:0
aof_buffer_length:35500
aof_rewrite_buffer_length:34000
aof_pending_bio_fsync:0

When the memory used by aof_rewrite_buf exceeds a certain threshold, the following messages appear in the Redis log. In this example, aof_rewrite_buf occupied 100 MB of memory, and 2135 MB of data was transferred between the main process and the child process (the child process also incurs memory overhead for its internal read buffer when reading this data through the pipe). For an in-memory database such as Redis, this is not a small cost.

3351:M 25 Jan 2022 09:55:39.655 * Background append only file rewriting started by pid 6817
3351:M 25 Jan 2022 09:57:51.864 * AOF rewrite child asks to stop sending diffs.
6817:C 25 Jan 2022 09:57:51.864 * Parent agreed to stop sending diffs. Finalizing AOF...
6817:C 25 Jan 2022 09:57:51.864 * Concatenating 2135.60 MB of AOF diff received from parent.
3351:M 25 Jan 2022 09:57:56.545 * Background AOF buffer size: 100 MB

The memory overhead of AOFRW can cause Redis memory usage to suddenly reach the maxmemory limit, which affects the writing of normal commands. It can even trigger the operating system's OOM killer, which kills the process and leaves Redis unable to serve requests.

CPU overhead

The CPU overhead comes from three main places, explained below.

During AOFRW, the main process spends CPU time writing data to aof_rewrite_buf, and uses the event loop to send the data in aof_rewrite_buf to the child process:

/* Append data to the AOF rewrite buffer, allocating new blocks if needed. */
void aofRewriteBufferAppend(unsigned char *s, unsigned long len) {
    //  Other details are omitted here ...
  
    /* Install a file event to send data to the rewrite child if there is
     * not one already. */
    if (!server.aof_stop_sending_diff &&
        aeGetFileEvents(server.el,server.aof_pipe_write_data_to_child) == 0)
    {
        aeCreateFileEvent(server.el, server.aof_pipe_write_data_to_child,
            AE_WRITABLE, aofChildWriteDiffData, NULL);
    } 
  
    //  Other details are omitted here ...
}

In the later stage of its rewrite, the child process repeatedly reads the incremental data sent by the main process through the pipe and appends it to the temporary AOF file:

int rewriteAppendOnlyFile(char *filename) {
    //  Other details are omitted here ...
  
    /* Read again a few times to get more data from the parent.
     * We can't read forever (the server may receive data from clients
     * faster than it is able to send data to the child), so we try to read
     * some more data in a loop as soon as there is a good chance more data
     * will come. If it looks like we are wasting time, we abort (this
     * happens after 20 ms without new data). */
    int nodata = 0;
    mstime_t start = mstime();
    while(mstime()-start < 1000 && nodata < 20) {
        if (aeWait(server.aof_pipe_read_data_from_parent, AE_READABLE, 1) <= 0)
        {
            nodata++;
            continue;
        }
        nodata = 0; /* Start counting from zero, we stop on N *contiguous*
                       timeouts. */
        aofReadDiffFromParent();
    }
    //  Other details are omitted here ...
}

After the child process completes the rewrite, the main process finishes the job in backgroundRewriteDoneHandler. One of its tasks is to write the data in aof_rewrite_buf that was not consumed during the rewrite to the temporary AOF file. If a lot of data remains in aof_rewrite_buf, this also consumes CPU time.

void backgroundRewriteDoneHandler(int exitcode, int bysignal) {
    //  Other details are omitted here ...
  
    /* Flush the differences accumulated by the parent to the rewritten AOF. */
    if (aofRewriteBufferWrite(newfd) == -1) {
        serverLog(LL_WARNING,
                "Error trying to flush the parent diff to the rewritten AOF: %s", strerror(errno));
        close(newfd);
        goto cleanup;
     }
    
     //  Other details are omitted here ...
}

The CPU overhead of AOFRW can cause response time (RT) jitter when Redis executes commands, and can even lead to client timeouts.

Disk I/O overhead

As mentioned earlier, during AOFRW the main process writes each executed write command to aof_buf and also writes a copy to aof_rewrite_buf. The data in aof_buf is eventually written to the old AOF file currently in use, which generates disk I/O. The data in aof_rewrite_buf is likewise written to the new AOF file produced by the rewrite, which generates disk I/O again. As a result, the same data causes two rounds of disk I/O.

Code complexity

Redis uses the six pipes shown below for data transfer and control interaction between the main process and the child process, which makes the whole AOFRW logic more complex and harder to understand.

/* AOF pipes used to communicate between parent and child during rewrite. */
 int aof_pipe_write_data_to_child;
 int aof_pipe_read_data_from_parent;
 int aof_pipe_write_ack_to_parent;
 int aof_pipe_read_ack_from_child;
 int aof_pipe_write_ack_to_child;
 int aof_pipe_read_ack_from_parent;

MP-AOF Implementation

Design overview

As the name suggests, MP-AOF splits the original single AOF file into multiple AOF files. In MP-AOF, we divide AOFs into three types:

  • BASE: the base AOF. It is generally produced by the child process through rewriting, and there is at most one such file.
  • INCR: an incremental AOF. It is generally created when AOFRW starts executing, and there may be more than one such file.
  • HISTORY: a historical AOF, which a BASE or INCR AOF turns into. Each time AOFRW completes successfully, the BASE and INCR AOFs that preceded this AOFRW become HISTORY. HISTORY AOFs are deleted automatically by Redis.

To manage these AOF files, we introduced a manifest file that tracks them. In addition, to make AOF backup and copying easier, we put all AOF files and the manifest file in a separate directory, whose name is determined by the appenddirname configuration item (new in Redis 7.0).

Figure 2: How rewriting works in MP-AOF

Figure 2 shows the general flow of one AOFRW in MP-AOF. At the start we still fork a child process to perform the rewrite. In the main process, we open a new INCR AOF file at the same time, and all data changes made while the child process is rewriting are written to this newly opened INCR AOF. The child's rewrite is completely independent: during the rewrite it has no data or control interaction with the main process, and in the end it produces one BASE AOF. The newly generated BASE AOF together with the newly opened INCR AOF represents the full data set of Redis at the current moment. When AOFRW ends, the main process updates the manifest file: it adds the information of the new BASE AOF and INCR AOF, and marks the previous BASE AOF and INCR AOFs as HISTORY (these HISTORY AOFs are deleted asynchronously by Redis). Once the manifest file has been updated, the whole AOFRW process is complete.

As Figure 2 shows, AOFRW no longer needs aof_rewrite_buf, so the corresponding memory consumption is gone. There is also no data transfer or control interaction between the main process and the child process, so the corresponding CPU overhead is gone as well. Accordingly, the six pipes mentioned above and the code that goes with them have been deleted, which makes the AOFRW logic simpler and clearer.

Key implementation points

Manifest

In-memory representation

MP-AOF depends heavily on the manifest file. In memory, the manifest is represented by the following structures, where:

  • aofInfo: the information of one AOF file; currently it contains only the file name, the file sequence number and the file type
  • base_aof_info: the information of the BASE AOF; this field is NULL when no BASE AOF exists
  • incr_aof_list: holds the information of all INCR AOF files; the INCR AOFs are ordered by the time they were opened
  • history_aof_list: holds the information of the HISTORY AOFs; every element in history_aof_list was moved here from base_aof_info or incr_aof_list

typedef struct {
    sds           file_name;  /* file name */
    long long     file_seq;   /* file sequence */
    aof_file_type file_type;  /* file type */
} aofInfo;
typedef struct {
    aofInfo     *base_aof_info;       /* BASE file information. NULL if there is no BASE file. */
    list        *incr_aof_list;       /* INCR AOFs list. We may have multiple INCR AOF when rewrite fails. */
    list        *history_aof_list;    /* HISTORY AOF list. When the AOFRW success, The aofInfo contained in
                                         `base_aof_info` and `incr_aof_list` will be moved to this list. We
                                         will delete these AOF files when AOFRW finish. */
    long long   curr_base_file_seq;   /* The sequence number used by the current BASE file. */
    long long   curr_incr_file_seq;   /* The sequence number used by the current INCR file. */
    int         dirty;                /* 1 Indicates that the aofManifest in the memory is inconsistent with
                                         disk, we need to persist it immediately. */
} aofManifest;

To make atomic modification and rollback easier, the redisServer structure refers to the aofManifest through a pointer.

struct redisServer {
    //  Other details are omitted here ...
    aofManifest *aof_manifest;       /* Used to track AOFs. */
    //  Other details are omitted here ...
};

On-disk representation

The manifest is essentially a text file containing multiple lines of records. Each line holds the information of one AOF file, expressed as key/value pairs, which makes it easy for Redis to process and easy for people to read and modify. Below is a possible manifest file:

file appendonly.aof.1.base.rdb seq 1 type b
file appendonly.aof.1.incr.aof seq 1 type i
file appendonly.aof.2.incr.aof seq 2 type i

The manifest format itself needs to be extensible so that other features can be added or supported in the future. For example, it can easily accommodate new key/value pairs and annotations (similar to the annotations in the AOF), which gives better forward compatibility:

file appendonly.aof.1.base.rdb seq 1 type b newkey newvalue
file appendonly.aof.1.incr.aof type i seq 1 
# this is annotations
seq 2 type i file appendonly.aof.2.incr.aof
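
To make the key/value format concrete, here is a minimal sketch of parsing one manifest line. It is not the Redis parser: the manifest_record type and parse_manifest_line function are hypothetical names introduced for illustration, and the sketch assumes values contain no spaces. It accepts keys in any order, skips lines starting with '#', and ignores unknown keys, which is one way the extensibility described above could be honored.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type for this sketch; not the Redis aofInfo struct. */
typedef struct {
    char name[128];   /* value of the "file" key */
    long long seq;    /* value of the "seq" key */
    char type;        /* value of the "type" key: 'b', 'i' or 'h' */
} manifest_record;

/* Parse one manifest line made of space-separated key/value pairs.
 * Returns 1 if a record was filled in, 0 for blank or '#' comment lines,
 * and -1 if a required key is missing or a key has no value.
 * Unknown keys are skipped, so a newer writer can add keys later. */
int parse_manifest_line(const char *line, manifest_record *out) {
    assert(line != NULL && out != NULL);
    char buf[512];
    int have_file = 0, have_seq = 0, have_type = 0;

    if (strlen(line) >= sizeof(buf)) return -1;
    strcpy(buf, line);                 /* strtok_r modifies its input */

    char *save = NULL;
    char *key = strtok_r(buf, " \t\r\n", &save);
    if (key == NULL || key[0] == '#') return 0;

    while (key != NULL) {
        char *val = strtok_r(NULL, " \t\r\n", &save);
        if (val == NULL) return -1;    /* key without a value */
        if (strcmp(key, "file") == 0) {
            if (strlen(val) >= sizeof(out->name)) return -1;
            strcpy(out->name, val);
            have_file = 1;
        } else if (strcmp(key, "seq") == 0) {
            out->seq = strtoll(val, NULL, 10);
            have_seq = 1;
        } else if (strcmp(key, "type") == 0) {
            out->type = val[0];
            have_type = 1;
        }                              /* any other key is ignored */
        key = strtok_r(NULL, " \t\r\n", &save);
    }
    return (have_file && have_seq && have_type) ? 1 : -1;
}
```

With this sketch, the line "seq 2 type i file appendonly.aof.2.incr.aof" and the same record written in "file ... seq ... type ..." order produce identical records.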

File naming rules

Before MP-AOF, the AOF file name was set by the appendfilename parameter (appendonly.aof by default).

In MP-AOF, we name the multiple AOF files using the pattern basename.suffix. The value of the appendfilename configuration is used as the basename part, and the suffix consists of three parts in the form seq.type.format, where:

  • seq is the sequence number of the file. It starts at 1 and increases monotonically; BASE and INCR files have independent sequence numbers
  • type is the type of the AOF, indicating whether this AOF file is a BASE or an INCR
  • format indicates the internal encoding of this AOF. Because Redis supports the RDB preamble mechanism, a BASE AOF may be encoded in either RDB format or AOF format:

#define BASE_FILE_SUFFIX           ".base"
#define INCR_FILE_SUFFIX           ".incr"
#define RDB_FORMAT_SUFFIX          ".rdb"
#define AOF_FORMAT_SUFFIX          ".aof"
#define MANIFEST_NAME_SUFFIX       ".manifest"

Therefore, with the default appendfilename configuration, the possible names of the BASE and INCR files are as follows (the manifest file takes the .manifest suffix, giving appendonly.aof.manifest):

appendonly.aof.1.base.rdb // RDB preamble enabled
appendonly.aof.1.base.aof // RDB preamble disabled
appendonly.aof.1.incr.aof
appendonly.aof.2.incr.aof
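
The naming rule can be sketched as a small formatting helper. The function name make_aof_name and its parameters are assumptions made for this illustration, not Redis functions; the suffix strings are the ones from the #define list above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<basename>.<seq>.<type>.<format>" into buf.
 * is_base selects ".base" or ".incr"; rdb_format selects ".rdb" or ".aof"
 * and only matters for BASE files, because the INCR names shown in the
 * article always end in ".aof".
 * Returns 0 on success, -1 if buf is too small. Hypothetical helper. */
int make_aof_name(char *buf, size_t buflen, const char *basename,
                  long long seq, int is_base, int rdb_format) {
    assert(buf != NULL && basename != NULL);
    const char *type = is_base ? ".base" : ".incr";
    const char *format = (is_base && rdb_format) ? ".rdb" : ".aof";
    int n = snprintf(buf, buflen, "%s.%lld%s%s", basename, seq, type, format);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Because BASE and INCR files keep independent sequence numbers, a caller would track two counters and pass the appropriate one as seq.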

Compatibility with upgrades from old versions

Because MP-AOF depends heavily on the manifest file, Redis loads the AOF files strictly according to the manifest at startup. But when upgrading from an old Redis version (any version before Redis 7.0) to Redis 7.0, there is no manifest file yet. Redis must therefore be able to recognize that an upgrade is in progress and load the old AOF correctly and safely.

Recognizing the upgrade is the first step in this process. Before actually loading the AOF files, we check whether an AOF file named server.aof_filename exists in the Redis working directory. If it does, we may be upgrading from an old Redis version. We then check further, and treat this as an upgrade start when any one of the following three conditions holds:

  • The appenddirname directory does not exist
  • The appenddirname directory exists, but it contains no corresponding manifest file
  • The appenddirname directory exists and contains a manifest file, the manifest holds only BASE AOF information, the name of this BASE AOF is the same as server.aof_filename, and the appenddirname directory does not contain a file named server.aof_filename

/* Load the AOF files according the aofManifest pointed by am. */
int loadAppendOnlyFiles(aofManifest *am) {
    //  Other details are omitted here ...
  
    /* If the 'server.aof_filename' file exists in dir, we may be starting
     * from an old redis version. We will use enter upgrade mode in three situations.
     *
     * 1. If the 'server.aof_dirname' directory not exist
     * 2. If the 'server.aof_dirname' directory exists but the manifest file is missing
     * 3. If the 'server.aof_dirname' directory exists and the manifest file it contains
     *    has only one base AOF record, and the file name of this base AOF is 'server.aof_filename',
     *    and the 'server.aof_filename' file not exist in 'server.aof_dirname' directory
     * */
    if (fileExist(server.aof_filename)) {
        if (!dirExists(server.aof_dirname) ||
            (am->base_aof_info == NULL && listLength(am->incr_aof_list) == 0) ||
            (am->base_aof_info != NULL && listLength(am->incr_aof_list) == 0 &&
             !strcmp(am->base_aof_info->file_name, server.aof_filename) && !aofFileExist(server.aof_filename)))
        {
            aofUpgradePrepare(am);
        }
    }
  
    //  Other details are omitted here ...
  }
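
The three conditions can also be read as a pure predicate over a few facts collected at startup. The sketch below restates the check from loadAppendOnlyFiles in that form; the startup_state struct and is_upgrade_start function are hypothetical names used only for this illustration. The third case plausibly covers a crash that happened after the manifest was persisted but before the old AOF was moved into the directory.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Facts gathered at startup; hypothetical struct for this sketch. */
typedef struct {
    bool old_aof_exists;       /* server.aof_filename exists in the data dir */
    bool aof_dir_exists;       /* the appenddirname directory exists */
    const char *base_name;     /* BASE name from the manifest, NULL if none */
    int incr_count;            /* number of INCR entries in the manifest */
    bool old_name_in_aof_dir;  /* a file named aof_filename is inside the dir */
} startup_state;

/* Returns true when startup should be treated as an upgrade from a
 * pre-7.0 layout, following the three cases listed above. */
bool is_upgrade_start(const startup_state *s, const char *aof_filename) {
    assert(s != NULL && aof_filename != NULL);
    if (!s->old_aof_exists) return false;
    if (!s->aof_dir_exists) return true;                         /* case 1 */
    if (s->base_name == NULL && s->incr_count == 0) return true; /* case 2 */
    return s->base_name != NULL && s->incr_count == 0 &&         /* case 3 */
           strcmp(s->base_name, aof_filename) == 0 &&
           !s->old_name_in_aof_dir;
}
```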

Once this is recognized as an upgrade start, we use the aofUpgradePrepare function to prepare for the upgrade.

The preparation consists of three main steps:

  • Construct the information of a BASE AOF, using server.aof_filename as its file name
  • Persist this BASE AOF information to the manifest file
  • Use rename to move the old AOF file into the appenddirname directory

void aofUpgradePrepare(aofManifest *am) {
    //  Other details are omitted here ...
  
    /* 1. Manually construct a BASE type aofInfo and add it to aofManifest. */
    if (am->base_aof_info) aofInfoFree(am->base_aof_info);
    aofInfo *ai = aofInfoCreate();
    ai->file_name = sdsnew(server.aof_filename);
    ai->file_seq = 1;
    ai->file_type = AOF_FILE_TYPE_BASE;
    am->base_aof_info = ai;
    am->curr_base_file_seq = 1;
    am->dirty = 1;
    /* 2. Persist the manifest file to AOF directory. */
    if (persistAofManifest(am) != C_OK) {
        exit(1);
    }
    /* 3. Move the old AOF file to AOF directory. */
    sds aof_filepath = makePath(server.aof_dirname, server.aof_filename);
    if (rename(server.aof_filename, aof_filepath) == -1) {
        sdsfree(aof_filepath);
        exit(1);
    }
  
    //  Other details are omitted here ...
}

The upgrade preparation is crash safe: if a crash occurs during any of the three steps above, the next startup correctly recognizes the situation and retries the whole upgrade.

Multi-file loading and progress calculation

While loading the AOF, Redis records the loading progress and exposes it through the loading_loaded_perc field of Redis INFO. In MP-AOF, the loadAppendOnlyFiles function loads the AOF files according to the aofManifest passed in. Before loading, we need to compute in advance the total size of all AOF files to be loaded and pass it to the startLoading function; loadSingleAppendOnlyFile then reports the loading progress continuously.

Next, loadAppendOnlyFiles loads the BASE AOF and then the INCR AOFs in order, according to the aofManifest. Once all AOF files have been loaded, stopLoading is called to end the loading state.

int loadAppendOnlyFiles(aofManifest *am) {
    //  Other details are omitted here ...
    /* Here we calculate the total size of all BASE and INCR files in
     * advance, it will be set to `server.loading_total_bytes`. */
    total_size = getBaseAndIncrAppendOnlyFilesSize(am);
    startLoading(total_size, RDBFLAGS_AOF_PREAMBLE, 0);
    /* Load BASE AOF if needed. */
    if (am->base_aof_info) {
        aof_name = (char*)am->base_aof_info->file_name;
        updateLoadingFileName(aof_name);
        loadSingleAppendOnlyFile(aof_name);
    }
    /* Load INCR AOFs if needed. */
    if (listLength(am->incr_aof_list)) {
        listNode *ln;
        listIter li;
        listRewind(am->incr_aof_list, &li);
        while ((ln = listNext(&li)) != NULL) {
            aofInfo *ai = (aofInfo*)ln->value;
            aof_name = (char*)ai->file_name;
            updateLoadingFileName(aof_name);
            loadSingleAppendOnlyFile(aof_name);
        }
    }
  
    server.aof_current_size = total_size;
    server.aof_rewrite_base_size = server.aof_current_size;
    server.aof_fsync_offset = server.aof_current_size;
    stopLoading();
    
    //  Other details are omitted here ...
}
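
The reason the total is computed up front is that progress is reported against the combined size of all files, not against the file currently being read. A minimal sketch of that arithmetic follows; total_load_size and loaded_percent are hypothetical helpers for illustration, not the functions Redis uses to fill loading_loaded_perc.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the sizes of every file that will be loaded (BASE first, then INCRs). */
long long total_load_size(const long long *sizes, size_t n) {
    assert(sizes != NULL || n == 0);
    long long total = 0;
    for (size_t i = 0; i < n; i++) total += sizes[i];
    return total;
}

/* Percentage loaded so far, measured against the combined size.
 * loaded_bytes counts bytes consumed across all files read so far. */
double loaded_percent(long long loaded_bytes, long long total_bytes) {
    if (total_bytes <= 0) return 0.0;
    return (double)loaded_bytes * 100.0 / (double)total_bytes;
}
```

For example, with one BASE of 600 bytes and two INCR files of 300 and 100 bytes, finishing the BASE corresponds to 60 percent, and finishing the first INCR to 90 percent.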

AOFRW Crash Safety

When the child process completes the rewrite, it has produced a temporary AOF file named temp-rewriteaof-bg-pid.aof. At this point the file is still invisible to Redis, because it has not been added to the manifest file. For Redis to recognize it and load it correctly at startup, we also need to rename it according to the naming rules described above and add its information to the manifest file.

Renaming the AOF file and modifying the manifest file are two separate operations, but together they must behave atomically so that Redis loads the right AOFs at startup. MP-AOF relies on two design decisions to achieve this:

  • The name of a BASE AOF contains the file sequence number, which guarantees that a newly created BASE AOF never conflicts with a previous BASE AOF
  • The AOF rename is performed first, and the manifest file is modified afterwards

For illustration, assume that before AOFRW starts, the manifest file contains:

file appendonly.aof.1.base.rdb seq 1 type b
file appendonly.aof.1.incr.aof seq 1 type i

After AOFRW starts executing, the manifest file contains:

file appendonly.aof.1.base.rdb seq 1 type b
file appendonly.aof.1.incr.aof seq 1 type i
file appendonly.aof.2.incr.aof seq 2 type i

After the child process finishes rewriting, the main process renames temp-rewriteaof-bg-pid.aof to appendonly.aof.2.base.rdb and adds it to the manifest, and at the same time marks the previous BASE and INCR AOFs as HISTORY. The manifest file then contains:

file appendonly.aof.2.base.rdb seq 2 type b
file appendonly.aof.1.base.rdb seq 1 type h
file appendonly.aof.1.incr.aof seq 1 type h
file appendonly.aof.2.incr.aof seq 2 type i

At this point the result of this AOFRW is visible to Redis, and the HISTORY AOFs are cleaned up asynchronously by Redis.

The backgroundRewriteDoneHandler function implements the logic above in seven steps:

  • Before modifying the in-memory server.aof_manifest, first dup a temporary manifest structure; all subsequent changes are made to this temporary manifest. The advantage is that if a later step fails, we can simply destroy the temporary manifest to roll back the whole operation, without polluting the global server.aof_manifest structure
  • Obtain the new BASE AOF file name from the temporary manifest (call it new_base_filename), and mark the previous BASE AOF (if any) as HISTORY
  • Rename the temporary file temp-rewriteaof-bg-pid.aof produced by the child process to new_base_filename
  • Mark all INCR AOFs from before this rewrite in the temporary manifest structure as HISTORY
  • Persist the information in the temporary manifest to disk (persistAofManifest guarantees internally that the manifest itself is modified atomically)
  • If all the steps above succeed, we can safely point server.aof_manifest at the temporary manifest structure (and free the previous manifest structure). At this point the whole modification is visible to Redis
  • Clean up the HISTORY AOFs. This step is allowed to fail, because a failure here does not cause any data consistency problem

void backgroundRewriteDoneHandler(int exitcode, int bysignal) {
    snprintf(tmpfile, 256, "temp-rewriteaof-bg-%d.aof",
        (int)server.child_pid);
    /* 1. Dup a temporary aof_manifest for subsequent modifications. */
    temp_am = aofManifestDup(server.aof_manifest);
    /* 2. Get a new BASE file name and mark the previous (if we have)
     * as the HISTORY type. */
    new_base_filename = getNewBaseFileNameAndMarkPreAsHistory(temp_am);
    /* 3. Rename the temporary aof file to 'new_base_filename'. */
    if (rename(tmpfile, new_base_filename) == -1) {
        aofManifestFree(temp_am);
        goto cleanup;
    }
    /* 4. Change the AOF file type in 'incr_aof_list' from AOF_FILE_TYPE_INCR
     * to AOF_FILE_TYPE_HIST, and move them to the 'history_aof_list'. */
    markRewrittenIncrAofAsHistory(temp_am);
    /* 5. Persist our modifications. */
    if (persistAofManifest(temp_am) == C_ERR) {
        bg_unlink(new_base_filename);
        aofManifestFree(temp_am);
        goto cleanup;
    }
    /* 6. We can safely let `server.aof_manifest` point to 'temp_am' and free the previous one. */
    aofManifestFreeAndUpdate(temp_am);
    /* 7. We don't care about the return value of `aofDelHistoryFiles`, because the history
     * deletion failure will not cause any problems. */
    aofDelHistoryFiles();
}
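
The copy, modify, then publish pattern behind these steps can be sketched with a heavily simplified manifest. Everything in this block is hypothetical and exists only to show the rollback property: the manifest struct, the current pointer (standing in for server.aof_manifest), and the persist_ok flag that simulates whether writing the manifest to disk succeeded. The rename in step 3 is left out.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_FILES 16

/* Simplified stand-in for aofManifest; tracks sequence numbers only. */
typedef struct {
    long long base_seq;        /* 0 means "no BASE" */
    long long incr[MAX_FILES]; /* INCR sequence numbers, oldest first */
    int n_incr;
    int n_history;             /* number of files marked HISTORY */
} manifest;

static manifest *current;      /* plays the role of server.aof_manifest */

/* Apply a finished rewrite. Returns 0 on commit, -1 on rollback.
 * On rollback, *current is left exactly as it was. */
int finish_rewrite(int persist_ok) {
    assert(current != NULL && current->n_incr > 0);
    manifest *tmp = malloc(sizeof(*tmp));
    if (tmp == NULL) return -1;
    *tmp = *current;                           /* 1. work on a copy */

    if (tmp->base_seq != 0) tmp->n_history++;  /* 2. old BASE -> HISTORY */
    tmp->base_seq++;                           /*    new BASE takes next seq */

    /* 4. every INCR except the newest was covered by the rewrite. */
    tmp->n_history += tmp->n_incr - 1;
    tmp->incr[0] = tmp->incr[tmp->n_incr - 1];
    tmp->n_incr = 1;

    if (!persist_ok) {                         /* 5. failed: drop the copy */
        free(tmp);
        return -1;
    }
    free(current);                             /* 6. publish the copy */
    current = tmp;
    return 0;
}
```

Because nothing touches the published structure until the final pointer swap, a failure at any earlier step needs no undo logic beyond freeing the copy.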

AOF truncate support

When the process crashes, the AOF file may well be left incompletely written, for example if Redis crashes after writing MULTI but before writing EXEC. Redis cannot load such an incomplete AOF as it is, but it supports an AOF truncate feature (controlled by the aof-load-truncated configuration). The idea is to use server.aof_current_size to track the last correct offset in the AOF file, and then use the ftruncate function to delete everything in the file after that offset. Some data may be lost this way, but the integrity of the AOF is preserved.

In MP-AOF, server.aof_current_size no longer represents the size of a single AOF file but the total size of all AOF files. Because only the last INCR AOF can be incompletely written, we introduced a separate field, server.aof_last_incr_size, to track the size of the last INCR AOF file. When the last INCR AOF turns out to be incompletely written, we only need to delete the file contents after server.aof_last_incr_size.

if (ftruncate(server.aof_fd, server.aof_last_incr_size) == -1) {
      // Other details are omitted here ...
 }
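
As a standalone illustration of the truncation step, the sketch below cuts a file back to a known-good length with the POSIX ftruncate call and reports the resulting size. The function truncate_to_valid is a hypothetical helper, not part of Redis; in Redis the offset comes from server.aof_last_incr_size as described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Cut the file at path down to valid_size bytes, discarding a trailing
 * incomplete write. Returns the new size, or -1 on error.
 * Note that ftruncate would extend the file if valid_size were larger
 * than the current size, so callers must pass a trusted offset. */
long long truncate_to_valid(const char *path, off_t valid_size) {
    assert(path != NULL && valid_size >= 0);
    int fd = open(path, O_WRONLY);
    if (fd == -1) return -1;
    if (ftruncate(fd, valid_size) == -1) {
        close(fd);
        return -1;
    }
    struct stat st;
    int rc = fstat(fd, &st);   /* read back the size after truncation */
    close(fd);
    return rc == -1 ? -1 : (long long)st.st_size;
}
```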

AOFRW rate limiting

Redis can run AOFRW automatically when the AOF size exceeds a certain threshold. When a disk failure or a code bug causes AOFRW to fail, Redis keeps retrying AOFRW until it succeeds. Before MP-AOF this did not seem to be a big problem (at worst it cost some CPU time and fork overhead). In MP-AOF, however, every AOFRW opens a new INCR AOF, and the previous INCR and BASE files are turned into HISTORY and deleted only when AOFRW succeeds. Consecutive AOFRW failures therefore inevitably leave multiple INCR AOFs in place at the same time. In extreme cases, if AOFRW is retried at a very high frequency, we could end up with hundreds of INCR AOF files.

For this reason we introduced an AOFRW rate limiting mechanism. Once AOFRW has failed three times in a row, the next AOFRW is forcibly delayed by 1 minute. If that AOFRW fails as well, the following one is delayed by 2 minutes, then 4, 8, 16 and so on. The maximum delay is currently 1 hour.

While AOFRW is being rate limited, we can still use the bgrewriteaof command to run AOFRW immediately.

if (server.aof_state == AOF_ON &&
    !hasActiveChildProcess() &&
    server.aof_rewrite_perc &&
    server.aof_current_size > server.aof_rewrite_min_size &&
    !aofRewriteLimited())
{
    long long base = server.aof_rewrite_base_size ?
        server.aof_rewrite_base_size : 1;
    long long growth = (server.aof_current_size*100/base) - 100;
    if (growth >= server.aof_rewrite_perc) {
        rewriteAppendOnlyFileBackground();
    }
}
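
The delay schedule described above (three tolerated failures, then 1, 2, 4, ... minutes, capped at 1 hour) can be sketched as a small function. This is an illustration of the schedule only, not the body of aofRewriteLimited; the function name and both constants are hypothetical and simply encode the numbers given in the text.

```c
#include <assert.h>

#define LIMIT_THRESHOLD    3    /* consecutive failures before delaying */
#define LIMIT_MAX_MINUTES  60   /* cap on the delay: 1 hour */

/* Minutes to wait before the next automatic AOFRW, given how many
 * rewrites have failed in a row. Doubles per extra failure, up to the cap. */
int rewrite_delay_minutes(int consecutive_failures) {
    assert(consecutive_failures >= 0);
    if (consecutive_failures < LIMIT_THRESHOLD) return 0;
    int delay = 1;
    for (int i = LIMIT_THRESHOLD; i < consecutive_failures; i++) {
        delay *= 2;
        if (delay >= LIMIT_MAX_MINUTES) return LIMIT_MAX_MINUTES;
    }
    return delay;
}
```

Returning early once the cap is reached keeps the doubling from overflowing even for a very large failure count.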

The AOFRW rate limiting mechanism also effectively avoids the CPU and fork overhead caused by high-frequency AOFRW retries. Much of the RT jitter seen in Redis is related to fork.

Summary

The introduction of MP-AOF resolves the adverse impact that the memory and CPU overhead of the previous AOFRW had on Redis instances and even on business access. In the course of solving these problems we also ran into many unexpected challenges. They came mainly from the huge Redis user base and its diverse usage scenarios, so we had to consider the problems users might meet when using MP-AOF in all kinds of scenarios, such as compatibility, ease of use, and keeping the changes to the Redis code as non-intrusive as possible. These are the top priorities in the evolution of Redis community features.

MP-AOF also opens up more possibilities for Redis data persistence. For example, when aof-use-rdb-preamble is enabled, the BASE AOF is essentially an RDB file, so a full backup no longer requires a separate BGSAVE operation; backing up the BASE AOF directly is enough. MP-AOF supports turning off the automatic cleanup of HISTORY AOFs, so those historical AOFs can be retained, and Redis now supports adding timestamp annotations to the AOF. Building on these, we could even implement a simple PITR (point-in-time recovery) capability.

The design prototype of MP-AOF comes from the binlog implementation in Tair for Redis Enterprise Edition, a core feature that has been proven in the Alibaba Cloud Tair service. On top of this core feature, Alibaba Cloud Tair has built enterprise-level capabilities such as global multi-active deployment and PITR, meeting users' needs in more business scenarios. We are now contributing this core capability to the Redis community, in the hope that community users can also benefit from these enterprise-level features and use them to better optimize and build their own business code. For more details about MP-AOF, please refer to the related PR (#9788), which contains more of the original design and the complete code.

This article is original content from Alibaba Cloud and may not be reproduced without permission.