当前位置：网站首页>Redis underlying data structure

Redis underlying data structure

2022-06-26 05:54:00 【Juvenile deer】

Redis The underlying data structure consists of 6 Kind of , Simple dynamic character string 、 Double linked list 、 Compressed list 、 Hashtable 、 Jump tables and integer arrays . The corresponding relationship between them and data types is shown in the figure below ：

String Data structure type

 data structure 

redis 3.2  before 
struct sdshdr {
    int len;
    int free;
    char buf[];
};
redis 3.2  after 

typedef char *sds;

struct __attribute__ ((__packed__)) sdshdr5 {
    unsigned char flags; /* 3 lsb of type, and 5 msb of string length */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len; /* used */
    uint8_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr16 {
    uint16_t len; /* used */
    uint16_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr32 {
    uint32_t len; /* used */
    uint32_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr64 {
........

static inline char sdsReqType(size_t string_size) {
    if (string_size < 32)  
        return SDS_TYPE_5;
    if (string_size < 0xff) //2^8 -1  
        return SDS_TYPE_8;
    if (string_size < 0xffff) // 2^16 -1  
        return SDS_TYPE_16;
    if (string_size < 0xffffffff)  // 2^32 -1 
        return SDS_TYPE_32;
    return SDS_TYPE_64;
}

redis Is a modifiable string , The bottom byte is implemented by array , stay C In languages, strings end with NULL Express , But to get NULL The end string length uses strlen function , The algorithm complexity of this function is O(n), The array needs to be traversed and scanned ,redis You can't afford this overhead with a single thread . therefore redis Use bytes \0 As the end of the string , Easy to use glibc String handling functions for .
redis Specifies that the string length must not exceed 512M byte .
redis There are two ways to store emb and raw, When the length exceeds 44 When using raw.
redis Header object data structure , You can see the occupancy 128bit,16 Bytes .

SDS comparison C The advantages of strings ：

SDS Save the length of the string , and C String does not save length , You need to traverse the entire array （ find ’\0’ until ） To get the string length .
modify SDS when , Check the given SDS Is there enough space , If it is not enough, it will expand first SDS Space , Prevent buffer overflow .C The string does not check whether the string space is sufficient , It is easy to cause buffer overflow when calling some functions （ such as strcat String concatenation function ）.
SDS The mechanism of pre allocating space , You can reduce the number of times you reallocate space for Strings .

Whole Redis The main data structure is as follows ：

List data structure

List It's an order ( Sort by order of joining ) Data structure of ,Redis use quicklist（ Double ended linked list ） and ziplist As List The underlying implementation of . You can set each ziplist Maximum capacity of ,quicklist Data compression range of , Improve data access efficiency

list-max-ziplist-size  -2        //   Single ziplist The maximum storage capacity of a node is   8kb  , If it exceeds, it splits , Store data in a new ziplist In nodes    Corresponding redis.conf Configuration of  8KB  Setting the data too large is not recommended   The compression efficiency is low and takes up space 
list-compress-depth  1        //  0  For all nodes , No compression ,1,  Represents a step back from the head node , You don't have to compress a node forward , Everything else is compressed ,2,3,4 ...  And so on


robj *createZiplistObject(void) {
    unsigned char *zl = ziplistNew();
    robj *o = createObject(OBJ_LIST,zl);
    o->encoding = OBJ_ENCODING_ZIPLIST;
    return o;
}

 unsigned char *ziplistNew(void) {
    unsigned int bytes = ZIPLIST_HEADER_SIZE+ZIPLIST_END_SIZE;
    unsigned char *zl = zmalloc(bytes);
    ZIPLIST_BYTES(zl) = intrev32ifbe(bytes);
    ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(ZIPLIST_HEADER_SIZE);
    ZIPLIST_LENGTH(zl) = 0;
    zl[bytes-1] = ZIP_END;
    return zl;
}

robj *createObject(int type, void *ptr) {
    robj *o = zmalloc(sizeof(*o));
    o->type = type;
    o->encoding = OBJ_ENCODING_RAW;
    o->ptr = ptr;
    o->refcount = 1;
    if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
        o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;  
    } else {
        o->lru = LRU_CLOCK();   //  obtain  24bit  Current time seconds 
    }
    return o;
}

zlbytes:32bit, Express ziplist The total number of bytes occupied .
zltail:    32bit, Express ziplist The last item in the table （entry） stay ziplist The number of offset bytes in . adopt zltail We can easily find the last one , So that we can be in ziplist The tail end performs quickly push or pop operation 
zlen:     16bit,  Express ziplist Data item in （entry） The number of .
entry: Represents the data item that actually stores the data , The length is variable 
zlend: ziplist Last 1 Bytes , It's an end mark , The value is fixed equal to 255.
prerawlen:  Previous entry The data length of .
len: entry The length of the data in 
data:  Real data storage

quickList data structure 
robj *createQuicklistObject(void) {
    quicklist *l = quicklistCreate();
    robj *o = createObject(OBJ_LIST,l);
    o->encoding = OBJ_ENCODING_QUICKLIST;
    return o;
}

quicklist *quicklistCreate(void) {
    struct quicklist *quicklist;

    quicklist = zmalloc(sizeof(*quicklist));
    quicklist->head = quicklist->tail = NULL;
    quicklist->len = 0;
    quicklist->count = 0;
    quicklist->compress = 0;
    quicklist->fill = -2;
    quicklist->bookmark_count = 0;
    return quicklist;
}

typedef struct quicklist {
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count;       
    unsigned long len;           
    int fill : QL_FILL_BITS;                
    unsigned int compress : QL_COMP_BITS;  
    unsigned int bookmark_count: QL_BM_BITS;
    quicklistBookmark bookmarks[];
} quicklist;

typedef struct quicklistNode {
    struct quicklistNode *prev;
    struct quicklistNode *next;
    unsigned char *zl;
    unsigned int sz;            
    unsigned int count : 16;    
    unsigned int encoding : 2;    
    unsigned int container : 2;  
    unsigned int recompress : 1; 
    unsigned int attempted_compress : 1; 
    unsigned int extra : 10;  
} quicklistNode;

Hash

Hash The underlying data structure is implemented as a dictionary ( dict ), It's also RedisBb Used to store K-V Data structure of , When the amount of data is small , Or the single element is relatively small , The underlying use ziplist Storage , When the amount of data or the number of data is limited, use hashtable Storage , Data size and element number threshold can be set by the following parameters .

hash-max-ziplist-entries 512 // ziplist The number of elements exceeds 512 , Change to hashtable code
hash-max-ziplist-value 64 // The size of a single element exceeds 64 byte when , Change to hashtable code

First, through hset When writing data , The data structure into which data is written is a dictionary , The dictionary is out of order , When hset When the amount of data written in is relatively small hget The data is written in the same order , Because the bottom layer is used zipList De storage , When the written data is large ,haget Time data is not the style when it is written in

String  and hash type 
String set When key Of string type  
hashSet When   It's right key Of hashtable code 

hset ha k1 v1 k2 v2 k3 v3
 Use  type ha（hash Type data ）
 Output  hash
 Use  object encoding ha
 Output  ziplist
 If set The value is larger than 65 byte   Will be zipList Convert to hashTable code

Set

Set For disordered , Set data type for automatic de duplication ,Set The underlying data structure is implemented as a value by null Of Dictionaries ( dict ), When data can be represented by shaping ,Set The set will be encoded as intset data structure . When any two conditions are satisfied Set Will use hashtable Store the data .

1, The number of elements is greater than set-max-intset-entries , 2 , The element cannot be represented by an integer

 
set-max-intset-entries 512       // intset  The maximum number of elements that can be stored , If you exceed it, use hashtable code

typedef struct intset {
    uint32_t  encoding;
    uint32_t  length;
    int8_t      contents[];
} intset; 
 
#define INTSET_ENC_INT16 (sizeof(int16_t))
#define INTSET_ENC_INT32 (sizeof(int32_t))
#define INTSET_ENC_INT64 (sizeof(int64_t))


 The set of integers is an ordered , A structure for storing integer data . Integers are set in Redis Can be saved in int16_t,int32_t,int64_t Integer data of type , And can guarantee 
 There is no duplicate data in the collection .

encoding:  The encoding type 
length:  Element number 
contents[]:  Element store

zset

zset Each element in contains the data itself and a corresponding score (score).ZSet For an orderly , Set data type for automatic de duplication ,ZSet The underlying implementation of data structure is Dictionaries (dict) + Jump watch (skiplist) , When the data is small , use ziplist The encoding structure stores .

zset The data itself cannot be duplicated , however score Allow repetition .

zset Underlying implementation principle ：

When there is little data , Use ziplist：ziplist Occupy continuous memory , Each element is （ data +score） Continuous storage , according to score Sort from small to large .ziplist To save memory , The space occupied by each element can be different , For big data （long long）, Just use more bytes to store , And for small data （short）, Use fewer bytes to store . Therefore, when searching, you need to traverse in order .ziplist Save memory, but search efficiency is low .
Long time data , Using dictionaries + Jump watch ：
The jump list is based on an ordered single linked list , Improve search efficiency by building indexes , Space for time , The search method is to search from the top linked list layer by layer , Finally, find the corresponding node in the lowest linked list ：

see zset Help document

help @sorted_set

zset-max-ziplist-entries  128    //  The number of elements exceeds 128 , Will use skiplist code 
zset-max-ziplist-value     64     //   The size of a single element exceeds  64 byte,  Will use  skiplist code

Zset  data structure



//  establish zset  data structure :  Dictionaries  +  Jump watch 
robj *createZsetObject(void) {
    zset *zs = zmalloc(sizeof(*zs));
    robj *o;
    // dict It is used to query the correspondence between data and scores ,  Such as  zscore  It can be directly based on   Elements get points  
    zs->dict = dictCreate(&zsetDictType,NULL);
    
    // skiplist Used to query data based on scores （ It could be range lookup ）
    zs->zsl = zslCreate();
    //  Set the object type  
    o = createObject(OBJ_ZSET,zs);
     //  Set encoding type  
    o->encoding = OBJ_ENCODING_SKIPLIST;
    return o;
}


typedef struct zskiplistNode {
    sds ele;
    double score;
    struct zskiplistNode *backward;
    struct zskiplistLevel {
        struct zskiplistNode *forward;
        unsigned long span;
    } level[];
} zskiplistNode;
typedef struct zskiplist {
    struct zskiplistNode *header, *tail;
    unsigned long length;
    int level;
} zskiplist;
typedef struct zset {
    dict *dict;
    zskiplist *zsl;
} zset;

skipList

GeoHash Algorithm

GeoHash It's a geolocation coding method . from Gustavo Niemeyer and G.M. Morton On 2008 The invention of , It encodes geography into a short string of letters and numbers . It's a hierarchical spatial data structure , Subdivide space into mesh shaped buckets , It's called z One of the many applications of sequence curves , It's usually a space filling curve .

GeoHash Latitude and longitude coding

The longitude range is east longitude 180 To the west longitude 180, The latitude range is south latitude 90 To north latitude 90, We set west longitude as negative , South latitude is negative , So the longitude of the earth is [-180, 180], The latitude range is [-90,90]. If you take the prime meridian 、 The equator is the boundary , The earth can be divided into 4 Parts of . If the latitude range [-90°, 0°) In binary 0 representative ,（0°, 90°] In binary 1 representative , Longitude range [-180°, 0°) In binary 0 representative ,（0°, 180°] In binary 1 representative , So the earth can be divided into the following ( On the left )4 Parts of

 adopt GeoHash Algorithm , You can turn the two-dimensional coordinates of latitude and longitude into a sortable one 、 Comparable string encoding .  In coding 
 Each character represents an area , And the preceding character is the parent region of the following character . The process of the algorithm is as follows ：


 according to GeoHash  To calculate    Latitudinal   Binary code 

 The latitude range of the earth is [-90,90],  If a certain latitude is 39.92324, You can code dimensions by the following algorithm :

1） Section [-90,90] To divide into two parts [-90,0),[0,90], It's called the left and right intervals , Can be determined 39.92324 It belongs to the right range [0,90], Mark as 1

2） And then I'll take the interval [0,90] To divide into two parts  [0,45),[45,90], Can be determined 39.92324 It belongs to the left range  [0,45), Mark as 0

3） Recurse the above procedure 39.92324 Always in a certain range [a,b]. With each iteration interval [a,b] Always shrinking , And getting closer 39.928167

4） If given latitude （39.92324） It belongs to the left range , Record 0, If it belongs to the right interval, record 1, So as the algorithm goes on 
      To produce a sequence 1011 1000 1100 0111 1001, The length of a sequence is related to the number of times a given interval is divided .

 advantage ：

GeoHash utilize Z Order curve ,Z Second order curves can transform all points in two dimensions into first order curves . Geographic coordinates are encoded into one-dimensional values , utilize   Ordered data structures such as B Trees 、SkipList etc. , You can do a range search . So use GeoHash It's faster to find neighboring points 


 shortcoming ：

Z  There is a serious problem with the order curve , Although there is local order preservation , But it also has mutations . At every  Z  The corner of the letter , It's possible to have a sequence mutation .

原网站

版权声明
本文为[Juvenile deer]所创，转载请带上原文链接，感谢
https://yzsam.com/2022/177/202206260550370286.html