Redis Persistence: A Detailed, Step-by-Step Analysis of the RDB Source Code
2022-07-26 17:05:00 【InfoQ】
1. Background

2. The fork() function and copy-on-write
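After fork(), parent and child have logically separate address spaces, and on Linux the pages are only physically copied when one side writes to them (copy-on-write). The following is a minimal sketch of that behaviour, assuming a POSIX system. The helper name `value_seen_by_child` is hypothetical and is not part of Redis. The parent overwrites a variable after the fork, and the child reports the value it still sees.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of Redis: forks, lets the parent overwrite
 * `value`, and returns the value the child still observes (-1 on error).
 * Values must fit in 0..127 because they travel back in the exit status. */
int value_seen_by_child(int before, int after) {
    assert(before >= 0 && before <= 127);
    int sync_pipe[2];
    volatile int value = before;
    if (pipe(sync_pipe) == -1) return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(sync_pipe[0]);
        close(sync_pipe[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: block until the parent has modified its own copy. */
        char c;
        close(sync_pipe[1]);
        if (read(sync_pipe[0], &c, 1) != 1) _exit(255);
        _exit(value); /* report the child's view of the variable */
    }

    /* Parent: this write lands in the parent's address space only. */
    close(sync_pipe[0]);
    value = after;
    int wrote = (write(sync_pipe[1], "x", 1) == 1);
    close(sync_pipe[1]);

    int status;
    if (waitpid(pid, &status, 0) == -1) return -1;
    if (!wrote || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

Calling `value_seen_by_child(42, 7)` should yield 42: the child keeps the snapshot taken at fork time, which is the property a BGSAVE child relies on to write a consistent image of the dataset.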

3. Data used by the Redis RDB process
struct redisServer {
    ...
    // Memory consumed by copy-on-write (COW) during the RDB save
    size_t stat_rdb_cow_bytes; /* Copy on write bytes during RDB saving. */
    // Number of changes made to the dataset since the last save
    long long dirty; /* Changes to DB from the last save */
    // PID of the child process performing the BGSAVE
    pid_t rdb_child_pid; /* PID of RDB saving child */
    // RDB save points configured through the save parameter
    struct saveparam *saveparams; /* Save points array for RDB */
    // Number of configured save points
    int saveparamslen; /* Number of saving points */
    // Name of the RDB file
    char *rdb_filename; /* Name of RDB file */
    // Whether to compress the RDB file with the LZF algorithm
    int rdb_compression; /* Use compression in RDB? */
    // Whether to write and verify a checksum for the RDB file
    int rdb_checksum; /* Use RDB checksum? */
    // Time of the last successful save
    time_t lastsave; /* Unix time of last successful save */
    // Time of the last BGSAVE attempt
    time_t lastbgsave_try; /* Unix time of last attempted bgsave */
    // Duration of the last RDB save
    time_t rdb_save_time_last; /* Time used by last RDB save run. */
    // Start time of the RDB save currently in progress
    time_t rdb_save_time_start; /* Current RDB save start time. */
    // BGSAVE scheduling flag: when 1, a BGSAVE runs as soon as possible
    int rdb_bgsave_scheduled; /* BGSAVE when possible if true. */
    // Type of the RDB save: written to a local file or sent over a socket
    int rdb_child_type; /* Type of save by active child. */
    // Status of the last BGSAVE
    int lastbgsave_status; /* C_OK or C_ERR */
    // If set and the last BGSAVE failed, Redis stops accepting writes
    int stop_writes_on_bgsave_err; /* Don't allow writes if can't BGSAVE */
    ...
};

struct saveparam {
    time_t seconds; // Length of the time window, in seconds
    int changes;    // Number of changes required within that window
};

4. RDB code execution flow in Redis
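The overall flow of a background save can be sketched as follows: the parent forks and returns at once, while the child writes the snapshot to a temporary file, flushes and fsyncs it, and renames it over the target so the target is replaced atomically. This is a simplified illustration under stated assumptions, not the Redis implementation. The names `save_snapshot`, `background_save` and `wait_for_save` are hypothetical, and the temporary file name is derived from the target path here, whereas Redis uses temp-<pid>.rdb.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child side (hypothetical): write to a temp file, flush, fsync, rename. */
static int save_snapshot(const char *target, const char *data, size_t len) {
    char tmp[256];
    snprintf(tmp, sizeof(tmp), "%s.temp-%d", target, (int)getpid());
    FILE *fp = fopen(tmp, "w");
    if (!fp) return -1;
    if (fwrite(data, 1, len, fp) != len) goto werr;
    if (fflush(fp) == EOF) goto werr;          /* stdio buffer -> kernel */
    if (fsync(fileno(fp)) == -1) goto werr;    /* kernel -> disk */
    if (fclose(fp) == EOF) { unlink(tmp); return -1; }
    /* rename() replaces the target atomically on POSIX file systems. */
    if (rename(tmp, target) == -1) { unlink(tmp); return -1; }
    return 0;
werr:
    fclose(fp);
    unlink(tmp);
    return -1;
}

/* Parent side (hypothetical): fork and return the child PID, or -1. */
pid_t background_save(const char *target, const char *data, size_t len) {
    assert(target != NULL && data != NULL);
    pid_t pid = fork();
    if (pid == 0) _exit(save_snapshot(target, data, len) == 0 ? 0 : 1);
    return pid;
}

/* Reap the child; returns 0 if it reported a successful save. */
int wait_for_save(pid_t pid) {
    int status;
    if (waitpid(pid, &status, 0) == -1) return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Writing to a temporary file first means a crash in the middle of a save leaves the previous snapshot untouched. Redis does not block in waitpid() the way `wait_for_save` does; the sketch waits only to keep the example short.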

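5. RDB-related Redis parameters and their meaning
- save <seconds> <changes>: trigger a background save when at least <changes> changes have occurred within <seconds> seconds
- stop-writes-on-bgsave-error yes: stop accepting writes if the last background save failed
- rdbcompression yes: compress the RDB file with the LZF algorithm
- rdbchecksum yes: append a CRC64 checksum to the RDB file and verify it on load
- dbfilename dump.rdb: name of the RDB file
- dir ./: directory in which the RDB file is written
- bgsave_cpulist 1,10-11: CPUs that the BGSAVE child process is bound to
- latency-monitor-threshold 0: threshold in milliseconds above which slow operations (including fork) are recorded; 0 disables the latency monitor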
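How the save <seconds> <changes> points interact with the dirty counter and the lastsave timestamp can be sketched as below. This is a simplified illustration of the kind of check Redis performs periodically, under the assumption that a save point fires once enough changes have accumulated and enough time has passed since the last save. The helper name `save_point_reached` is hypothetical, and the extra retry-delay handling after a failed BGSAVE is left out.

```c
#include <assert.h>
#include <time.h>

/* Same shape as the struct shown earlier in the article. */
struct saveparam {
    time_t seconds;
    int changes;
};

/* Hypothetical helper: returns 1 if any configured save point is satisfied,
 * i.e. at least `changes` modifications are pending and more than `seconds`
 * seconds have elapsed since the last successful save. */
int save_point_reached(const struct saveparam *params, int count,
                       long long dirty, time_t lastsave, time_t now) {
    assert(params != NULL || count == 0);
    for (int j = 0; j < count; j++) {
        if (dirty >= params[j].changes && now - lastsave > params[j].seconds)
            return 1;
    }
    return 0;
}
```

With the save points 900 1, 300 10 and 60 10000, five pending changes trigger a save only after more than 900 seconds, while ten pending changes are enough after more than 300 seconds.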
6. Main RDB functions
// Return-value macros
#define C_OK 0
#define C_ERR -1

int rdbSaveBackground(char *filename, rdbSaveInfo *rsi) {
    pid_t childpid;  // PID of the child process
    long long start; // Timestamp in microseconds taken just before fork()

    // If an AOF or RDB child process already exists, abort this RDB save
    if (server.aof_child_pid != -1 || server.rdb_child_pid != -1) return C_ERR;

    // Remember the change counter as it stands when this save starts
    server.dirty_before_bgsave = server.dirty;
    server.lastbgsave_try = time(NULL); // Record the time of this BGSAVE attempt
    openChildInfoPipe(); // Open the pipe used for communication between parent and child
    start = ustime(); // Record when fork() starts

    // fork() creates the child process. From here on the two processes do
    // different work: the child writes the RDB file, while the parent updates
    // some statistics and then goes back to serving clients.
    // "Non-blocking" means the save itself does not block the main process;
    // the main process is still blocked for the duration of the fork() call.
    if ((childpid = fork()) == 0) {
        int retval;

        /* Child */
        closeClildUnusedResourceAfterFork(); // Close resources inherited from the parent that the child does not need
        redisSetProcTitle("redis-rdb-bgsave"); // Set the process title of the child
        retval = rdbSave(filename,rsi); // Write the Redis dataset to disk as an RDB file
        // Once the file is saved, report the resources consumed by the save
        if (retval == C_OK) {
            size_t private_dirty = zmalloc_get_private_dirty(-1);

            // Memory consumed by copy-on-write
            if (private_dirty) {
                serverLog(LL_NOTICE,
                    "RDB: %zu MB of memory used by copy-on-write",
                    private_dirty/(1024*1024));
            }

            server.child_info_data.cow_size = private_dirty;
            sendChildInfo(CHILD_INFO_TYPE_RDB); // Send the child's COW figure to the parent
        }
        exitFromChild((retval == C_OK) ? 0 : 1); // Exit the child; its work ends here
    } else { // The parent keeps running while the child performs the RDB save
        /* Parent */
        server.stat_fork_time = ustime()-start; // Microseconds spent in fork()
        // Fork rate in GB/s
        server.stat_fork_rate = (double) zmalloc_used_memory() * 1000000 / server.stat_fork_time / (1024*1024*1024); /* GB per second. */
        // Tied to latency-monitor-threshold in the configuration file: if the
        // fork took longer than the threshold, record it as a latency sample
        latencyAddSampleIfNeeded("fork",server.stat_fork_time/1000);
        if (childpid == -1) { // fork() failed: update the BGSAVE status and log the error
            closeChildInfoPipe();
            server.lastbgsave_status = C_ERR;
            serverLog(LL_WARNING,"Can't save in background: fork: %s",
                strerror(errno));
            return C_ERR;
        }
        serverLog(LL_NOTICE,"Background saving started by pid %d",childpid);
        server.rdb_save_time_start = time(NULL); // Record when the BGSAVE started
        server.rdb_child_pid = childpid; // PID of the child performing the RDB save
        server.rdb_child_type = RDB_CHILD_TYPE_DISK; // The RDB is being written to disk
        updateDictResizePolicy(); // Disable hash table resizing while the child is running
        return C_OK;
    }
    return C_OK; /* unreached */
}

/* Save the DB on disk. Return C_ERR on error, C_OK on success. */
int rdbSave(char *filename, rdbSaveInfo *rsi) {
    char tmpfile[256]; // Name of the temporary RDB file
    char cwd[MAXPATHLEN]; /* Current working dir path for error messages. */
    FILE *fp;
    rio rdb;
    int error = 0;

    // Build the temporary file name: "temp-" as the prefix, ".rdb" as the
    // suffix, and the PID of the current (child) process in between
    snprintf(tmpfile,256,"temp-%d.rdb", (int) getpid());
    fp = fopen(tmpfile,"w"); // Open the temporary file for writing
    if (!fp) { // Failed to open the temporary RDB file
        char *cwdp = getcwd(cwd,MAXPATHLEN); // Get the current working directory
        serverLog(LL_WARNING,
            "Failed opening the RDB file %s (in server root dir %s) "
            "for saving: %s",
            filename,
            cwdp ? cwdp : "unknown",
            strerror(errno));
        return C_ERR; // Returns -1
    }

    // Initialize a file-backed rio object. rio is the abstract I/O layer in
    // Redis, with buffer, file and socket implementations; all subsequent
    // reads and writes of the file go through it
    rioInitWithFile(&rdb,fp);

    if (server.rdb_save_incremental_fsync) // Is incremental fsync enabled for RDB saves?
        rioSetAutoSync(&rdb,REDIS_AUTOSYNC_BYTES); // fsync after every 32 MB written instead of all at once at the end

    // Write the Redis dataset to the RDB file
    if (rdbSaveRio(&rdb,&error,RDB_SAVE_NONE,rsi) == C_ERR) {
        errno = error;
        goto werr;
    }

    /* Make sure data will not remain on the OS's output buffers */
    // Flush the buffers and sync the file to disk
    if (fflush(fp) == EOF) goto werr;
    if (fsync(fileno(fp)) == -1) goto werr;
    if (fclose(fp) == EOF) goto werr;

    /* Use RENAME to make sure the DB file is changed atomically only
     * if the generate DB file is ok. */
    // Rename the temporary file to the RDB name set in the configuration file;
    // on failure, log the error and delete the temporary file
    if (rename(tmpfile,filename) == -1) {
        char *cwdp = getcwd(cwd,MAXPATHLEN);
        serverLog(LL_WARNING,
            "Error moving temp DB file %s on the final "
            "destination %s (in server root dir %s): %s",
            tmpfile,
            filename,
            cwdp ? cwdp : "unknown",
            strerror(errno));
        unlink(tmpfile);
        return C_ERR;
    }

    serverLog(LL_NOTICE,"DB saved on disk");
    server.dirty = 0; // Save complete: reset the change counter and start counting again
    server.lastsave = time(NULL); // Record when this save completed
    server.lastbgsave_status = C_OK; // Record the status of this save
    return C_OK;

werr:
    serverLog(LL_WARNING,"Write error saving DB on disk: %s", strerror(errno));
    fclose(fp);
    unlink(tmpfile);
    return C_ERR;
}

int rdbSaveRio(rio *rdb, int *error, int flags, rdbSaveInfo *rsi) {
    dictIterator *di = NULL; // Dictionary iterator, used below to read every key-value pair
    dictEntry *de; // Each key-value pair is stored in a dictEntry
    char magic[10]; // Fixed header at the start of the RDB file: "REDIS" plus the RDB version number
    int j;
    uint64_t cksum;
    size_t processed = 0;

    if (server.rdb_checksum)
        rdb->update_cksum = rioGenericUpdateChecksum; // Install the checksum update function
    snprintf(magic,sizeof(magic),"REDIS%04d",RDB_VERSION);
    if (rdbWriteRaw(rdb,magic,9) == -1) goto werr; // Write the 9-byte magic string to the rio
    /* Save a few default AUX fields with information about the RDB generated. */
    if (rdbSaveInfoAuxFields(rdb,flags,rsi) == -1) goto werr;
    /* Iterate over modules, and trigger rdb aux saving for the ones modules types
     * who asked for it. */
    if (rdbSaveModulesAux(rdb, REDISMODULE_AUX_BEFORE_RDB) == -1) goto werr;

    for (j = 0; j < server.dbnum; j++) { // Iterate over all databases (16 by default)
        redisDb *db = server.db+j;
        dict *d = db->dict; // Main dictionary of this database (backed by hash tables ht[0] and ht[1])
        if (dictSize(d) == 0) continue; // Skip databases with no key-value pairs
        di = dictGetSafeIterator(d); // Create a safe iterator for dictionary d

        /* Write the SELECT DB opcode */
        // Write the database selection opcode, 254; in an RDB file, 254 marks a SELECT DB operation
        if (rdbSaveType(rdb,RDB_OPCODE_SELECTDB) == -1) goto werr;
        // The database index is small, so it normally fits in one byte: the
        // high 2 bits give the encoding type and the low 6 bits the value
        if (rdbSaveLen(rdb,j) == -1) goto werr;

        /* Write the RESIZE DB opcode. We trim the size to UINT32_MAX, which
         * is currently the largest type we are able to represent in RDB sizes.
         * However this does not limit the actual size of the DB to load since
         * these sizes are just hints to resize the hash tables. */
        uint64_t db_size, expires_size;
        db_size = dictSize(db->dict); // Total number of key-value pairs in this database
        expires_size = dictSize(db->expires); // Number of keys in this database that have an expiration time
        if (rdbSaveType(rdb,RDB_OPCODE_RESIZEDB) == -1) goto werr;
        if (rdbSaveLen(rdb,db_size) == -1) goto werr;
        if (rdbSaveLen(rdb,expires_size) == -1) goto werr;

        /* Iterate this DB writing every entry */
        // dictNext walks every bucket; the entries in a bucket are chained in a singly linked list
        while((de = dictNext(di)) != NULL) {
            sds keystr = dictGetKey(de); // Get the key
            robj key, *o = dictGetVal(de); // Get the value
            long long expire;

            initStaticStringObject(key,keystr); // Wrap the key in a string object allocated on the stack
            expire = getExpire(db,&key); // Get the expiration time
            // Write the key, the value and the expiration time to the rio object
            if (rdbSaveKeyValuePair(rdb,&key,o,expire) == -1) goto werr;

            /* When this RDB is produced as part of an AOF rewrite, move
             * accumulated diff from parent to child while rewriting in
             * order to have a smaller final write. */
            // Only relevant when AOF mixed persistence (RDB preamble) is in use
            if (flags & RDB_SAVE_AOF_PREAMBLE &&
                rdb->processed_bytes > processed+AOF_READ_DIFF_INTERVAL_BYTES)
            {
                processed = rdb->processed_bytes;
                aofReadDiffFromParent();
            }
        }
        dictReleaseIterator(di);
        di = NULL; /* So that we don't release it again on error. */
    }

    /* If we are storing the replication information on disk, persist
     * the script cache as well: on successful PSYNC after a restart, we need
     * to be able to process any EVALSHA inside the replication backlog the
     * master will send us. */
    // Save the Lua scripts
    if (rsi && dictSize(server.lua_scripts)) {
        di = dictGetIterator(server.lua_scripts);
        while((de = dictNext(di)) != NULL) {
            robj *body = dictGetVal(de);
            if (rdbSaveAuxField(rdb,"lua",3,body->ptr,sdslen(body->ptr)) == -1)
                goto werr;
        }
        dictReleaseIterator(di); // Release the iterator
        di = NULL; /* So that we don't release it again on error. */
    }

    // All databases are written: let modules save their trailing aux data
    if (rdbSaveModulesAux(rdb, REDISMODULE_AUX_AFTER_RDB) == -1) goto werr;

    /* EOF opcode */
    if (rdbSaveType(rdb,RDB_OPCODE_EOF) == -1) goto werr; // Write the end-of-file marker

    /* CRC64 checksum. It will be zero if checksum computation is disabled, the
     * loading code skips the check in this case. */
    cksum = rdb->cksum; // CRC64 checksum accumulated while writing
    memrev64ifbe(&cksum);
    if (rioWrite(rdb,&cksum,8) == 0) goto werr;
    return C_OK;

    // An error occurred while generating the RDB file: clean up and return
werr:
    if (error) *error = errno;
    if (di) dictReleaseIterator(di);
    return C_ERR;
}

7. RDB file protocol
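Many fields in the dump in this section (database indexes, sizes, string lengths) use the variable-width length encoding written by rdbSaveLen. A decoder for it can be sketched as follows, assuming the layout used by RDB version 9: the top two bits of the first byte select a 6-bit length, a 14-bit length, a marker byte followed by a 32-bit or 64-bit big-endian length, or a specially encoded value. The function name `rdb_decode_len` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes one RDB length field starting at p (n bytes available).
 * Returns the number of bytes consumed, or 0 if the input is truncated or
 * the marker byte is unknown. *encoded is set to 1 when the top two bits are
 * 11; *len then holds the special-encoding type rather than a length. */
size_t rdb_decode_len(const unsigned char *p, size_t n,
                      uint64_t *len, int *encoded) {
    assert(p != NULL && len != NULL && encoded != NULL);
    if (n < 1) return 0;
    *encoded = 0;
    switch (p[0] >> 6) {
    case 0: /* 00xxxxxx: 6-bit length */
        *len = p[0] & 0x3F;
        return 1;
    case 1: /* 01xxxxxx xxxxxxxx: 14-bit length */
        if (n < 2) return 0;
        *len = ((uint64_t)(p[0] & 0x3F) << 8) | p[1];
        return 2;
    case 3: /* 11xxxxxx: a specially encoded value follows */
        *encoded = 1;
        *len = p[0] & 0x3F;
        return 1;
    default: { /* 10......: 0x80 = 32-bit, 0x81 = 64-bit, big-endian */
        size_t width = (p[0] == 0x80) ? 4 : (p[0] == 0x81) ? 8 : 0;
        if (width == 0 || n < 1 + width) return 0;
        uint64_t v = 0;
        for (size_t i = 0; i < width; i++) v = (v << 8) | p[1 + i];
        *len = v;
        return 1 + width;
    }
    }
}
```

Applied to the dump, the byte 05 in front of 68 65 6c 6c 6f decodes to length 5 (the key "hello"), the 00 after fe decodes to database index 0, and the c0 after redis-bits signals a specially encoded value (type 0, an 8-bit integer), so the following byte 40 is the value 64.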



00000000 52 45 44 49 53 30 30 30 39 fa 09 72 65 64 69 73 |REDIS0009..redis|
00000010 2d 76 65 72 06 35 2e 30 2e 31 34 fa 0a 72 65 64 |-ver.5.0.14..red|
00000020 69 73 2d 62 69 74 73 c0 40 fa 05 63 74 69 6d 65 |is-bits.@..ctime|
00000030 c2 de 67 5f 62 fa 08 75 73 65 64 2d 6d 65 6d c2 |..g_b..used-mem.|
00000040 40 15 0f 00 fa 0c 61 6f 66 2d 70 72 65 61 6d 62 |@.....aof-preamb|
00000050 6c 65 c0 00 fe 00 fb 01 00 00 05 68 65 6c 6c 6f |le.........hello|
00000060 03 61 61 69 fe 01 fb 01 00 00 05 68 65 6c 6c 6f |.aai.......hello|
00000070 05 61 74 6f 6d 65 fe 02 fb 01 00 00 05 68 65 6c |.atome.......hel|
00000080 6c 6f 05 67 69 6e 65 65 ff 67 09 f2 9d eb ea ec |lo.ginee.g......|
00000090 bc |.|

8. Possible RDB faults
- The fork() call itself blocks the Redis main process, so an RDB save can cause occasional stalls
- If key-value pairs change rapidly while an RDB save is running, copy-on-write can make system memory usage jump sharply, accompanied by a surge in operating-system page faults
- When several Redis instances are deployed on a single machine, RDB saves can overload disk I/O, and instances with AOF enabled may respond slowly
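The first point can be observed directly by timing fork(), in the same way rdbSaveBackground fills stat_fork_time. Below is a minimal sketch, assuming a POSIX system; the helper names `now_usec` and `measure_fork_usec` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Current wall-clock time in microseconds. */
static long long now_usec(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000000 + tv.tv_usec;
}

/* Hypothetical helper: returns the microseconds the calling process spent
 * inside fork(), or -1 if fork() failed. The child exits immediately. */
long long measure_fork_usec(void) {
    long long start = now_usec();
    pid_t pid = fork();
    if (pid == -1) return -1;
    if (pid == 0) _exit(0);
    long long elapsed = now_usec() - start;
    assert(pid > 0);
    waitpid(pid, NULL, 0); /* reap the child so it does not linger as a zombie */
    return elapsed;
}
```

The figure grows with the amount of memory the process has mapped, because the page tables have to be duplicated, so a large Redis instance sees a much longer pause than this small program would.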
9. Conclusion
About Advance Intelligence Group (领创集团)