当前位置:网站首页>Introduction to redis - Chapter 1 - data structures and objects - simple dynamic string (SDS)

Introduction to redis - Chapter 1 - data structures and objects - simple dynamic string (SDS)

2022-06-23 11:44:00 tiger’s bytes

Simple dynamic string (simple dynamic string)( Abbreviation SDS ) yes Redis The default string representation of , stay Redis Inside ,C The string will only be used as a string literal where there is no need to modify the string value .

struct sdshdr {
  int  len;   //  Number of bytes used by the byte array , At the end of the  '\0'  The terminator does not count , This null character is transparent to the user , Its space allocation and addition are made by  SDS  The function does that automatically , Follow tail add  '\0'  The benefits of the terminator are ,SDS  You can use some directly  C  Functions in string function library 
  int  free;  //  Number of bytes unused in the byte array 
  char buf[]; //  Byte array , Used to hold  SDS 
}

SDS Can be in O(1) Time complexity gets its own length , At the same time, we are modifying SDS Content time ,SDS API Will automatically check SDS Whether the byte array of is enough to store the modified contents , If not enough ,API It will be automatically expanded , Eliminate the possibility of buffer overflow .

because C A string does not record its own length , So for one that contains N A character C String , The underlying implementation of this string is always a N + 1 An array of characters ( An extra character space is used to save empty characters '\0' ). because C This correlation exists between the length of the string and the length of the underlying array , So every increase or decrease C character string , The program always has to save this C The underlying array of the string performs a memory reallocation operation , Otherwise, memory overflow or memory leak will occur .

In a normal program , If modifying the length of a string is less common , So it's acceptable to perform a memory reallocation with each modification , however Redis As a database , It's often used for speed requirements 、 Where data is frequently modified , If you have to perform a memory reallocation every time you modify the string , Then the time of memory reallocation alone will take up a large part of the time of modifying the string , If this modification happens frequently , There may also be a performance impact . for fear of C String this flaw ,SDS Uncorrelating string length with underlying array length through unused space , stay SDS in ,buf The length of an array is not necessarily the number of characters plus one , The array can contain unused bytes , And the number of these bytes is determined by SDS Of free Property record , Unused bytes through byte array (free) This member variable ,SDS Two optimization strategies of space preallocation and lazy space release are realized .

Space is pre-allocated for optimization SDS Byte array growth operation , When SDS Of API To a SDS  Make changes , And when capacity expansion is required , Not only will the program be SDS Allocate the necessary space , Also for SDS Allocate free space . among , The extra free space allocated is determined by the following policy :

  • If after expansion len Will be less than 1 MB , Then the allocated free space will be the same as len It's the same size , namely free be equal to len, Now buf The actual byte size of the array is free + len + 1.
  • If after expansion len Will be bigger than the 1 MB , Then the allocated free space is 1MB size .

Lazy space release for optimization SDS String shortening operation , When SDS Of API When you need to shorten the saved string , The program does not immediately try to reallocate memory to reclaim extra bytes , But use free Property records the number of these bytes , And wait for future use . meanwhile ,SDS The corresponding API , So that when we need to , The real release SDS Of free space , So don't worry about the memory waste caused by the idle space release strategy .

Through this pre allocation strategy ,SDS Will continue to grow N The number of memory reallocation times required by the secondary string must be N Down to the most N Time . Besides ,C The characters in the string must match some encoding , And besides the end of the string , The string cannot contain empty characters , Otherwise, the first null character read by the program will be mistaken for the end of the string , So these restrictions make C Strings can only hold text data , Instead of saving pictures 、 Audio 、 video 、 Compressed files and other binary data . Such as the following string , use  C String to save the following words ’Now‘ Will be ignored .

because SDS Is to rely on len Property to determine whether you have finished reading buf Of all characters in the array , So there won't be C String problem , So we say SDS It's binary safe , all SDS API Will be handled in a binary way SDS Store in buf The data in the array , The program doesn't make any restrictions on the data 、 Filter or assume , What data looks like when it is written , What is it like when you read it out . That's what we're going to do SDS Of buf Property is called a byte array , because Redis Instead of using this array to hold characters , Instead, it holds a series of binary data .

surface 1.1 C String and SDS The difference between

C character string SDS The time complexity of getting the string length is O(N) The time complexity of getting the string length is O(1) API It's not safe , It may cause buffer overflow or memory leak API Is safe , No buffer overflows or memory leaks Change string length N It is necessary to carry out N Secondary memory reallocation Change string length N At most times N Secondary memory reallocation Only text data can be saved Can save text or binary data You can use all <string.h> Functions in the library Available part <string.h> Functions in the library

surface 1.1 C String and SDS The difference between
C character string SDS
The time complexity of getting the string length is O(N) The time complexity of getting the string length is O(1)
API It's not safe , It may cause buffer overflow or memory leak API Is safe , No buffer overflows or memory leaks
Change string length N It is necessary to carry out N Secondary memory reallocation Change string length N At most times N Secondary memory reallocation
Only text data can be saved Can save text or binary data
You can use all <string.h> Functions in the library Available part <string.h> Functions in the library

原网站

版权声明
本文为[tiger’s bytes]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/174/202206231134145939.html