当前位置:网站首页>Redis underlying data structure - underlying principle of hash table

Redis underlying data structure - underlying principle of hash table

2022-06-13 07:36:00 A hard-working dog

Hashtable

Structure definition

typedef struct dictht{
	// Hash table array 
    dictEntry **table;
    // Hash table size 
    unsigned long size;
    // Hash table size mask , Used to calculate index values 
    unsigned long sizemask;
    // The number of existing nodes in the hash table 
   unsigned long used ;
}dictht 

table Property is an array , Each element in the array is a point to dict.h/dictEntry Pointer to structure , Every dictEntry Structures hold key value pairs .

dictEntry data structure

typedef struct dictEntry {
	// key 
    void * key;
    // value 
    union{
    void  * val;
    uint64_t u64;
    int64_t s64;
    }
    // Point to the next hash table node , Form a linked list , Used to resolve hash conflicts 
    struct dictEntry *next;
}dictEntry;

* The value in a key value pair can be a pointer to the actual value , Or an unsigned 64 Bit integer or signed 64 Bit integer or double Value of class , The advantage of this is that you can save memory space , Because when the value is an integer or floating point number , Data of values can be embedded in dictEntry Inside the structure , There is no need to point to the actual value with a pointer , This saves memory space .

The data structure of the dictionary

typedef struct dict{
	// Type specific function 
    dictType * type;
    // Private data 
    void * privdata;
    // Hashtable , Used to solve rehash
    dicth ht[2];
    //rehash Indexes , When rehash When not in progress , The value is -1
    int trehashind ;      
}

type Properties and privdata Property is for different types of key value pairs , To create a polymorphic dictionary :

  • type Attribute points to dictType Pointer to structure , Every dictType Structure holds a set of functions that operate on key value pairs of a specific type ,Redis Different type specific functions will be set for different dictionaries .
  • privdata Properties hold optional parameters that need to be passed to those type specific functions .
typedf struct dictType{
	// Function to calculate hash value 
    unsigned int (*hashFunction)(const void *key);
    // Copy function of key 
    void *(*keyDup)(void * privdata,const void *key);
    // Function to copy values 
    void *(*valDup)(void * privdata,const void *obj);
    // Functions of comparison keys 
    int (*keyCompare)(void * privdata,const void * key1,const void * key2);
    // Function to destroy key 
    void (*keyDestructor)(void *privdata,void *key);
    // Function to destroy value 
    void (*valDestructor)(void *privdata,void *obj);  
}dictType;
  • ht Property is an array of two items , Each item in the array is a dictht Hashtable , In general , The dictionary only uses ht[0] Hashtable ,ht[1] Hash table will only be on ht[0] Hash table rehash When using .
  • rehashidx Used to record rehash Current progress , If not at present rehash, Then the value is -1、

Resolve hash conflict

Reids Hash table uses chain address method to solve hash offset .

because dictEntry The linked list composed of nodes has no pointer to the end of the linked list , So for speed , The program always adds new nodes to the header of the linked list , In front of other existing nodes .

REHASH

In order to keep the load factor of hash table in a reasonable range , When the hash table holds too many or too few key value pairs , The program needs to expand or shrink the size of the hash table . The work of expanding and shrinking the hash table can be done by rehash( Re hash ) Operate to complete ,Redis Execute... On the hash table of the dictionary rehash Steps for

  1. For the dictionary ht[1] Hash table allocation space , The size of the hash table depends on the operation to be performed , as well as ht[0] The number of key value pairs currently contained ( Can pass ht[0].used The value of attributes is )
    1. If you are performing an extension operation , that ht[1] The size of the first is greater than or equal to ht[0].used* Of 2 Of n Power .
    2. If you are performing a shrink operation , that ht[1] The size of the first is greater than or equal to ht[0].used Of 2 Of n Power .
  1. Will be saved in ht[0] All key value pairs in rehash To ht[1] above :rehash It refers to recalculating the hash value and index value of the key , Then place the key value pair in ht[1] On the specified location of the hash table .
  2. When ht[0] The key value pairs contained are migrated to ht[1] after (ht[0] Becomes an empty table ), Release ht[0], take ht[1] Set to ht[0], And in ht[1] Create a new blank hash table , For the next time rehash To prepare for .

Expansion and contraction of hash table

When any of the following conditions is met , The program will automatically start extending the hash table :

  • The server is not currently executing BGSAVE Order or BGREWRITEAOF command , And the load factor of hash table is greater than or equal to 1.
  • The server is currently executing BGSAVE Order or BGREWRITEAOF command , And the load factor of the hash table is greater than 5.

according to BGSAVE Order or BGREWRITEAOF Whether the order is being executed , The load factor required by the server to perform the expansion operation is not the same , This is because of the implementation BGSAVE Order or BGREWRITEAOF In the course of the order ,Redis You need to create a child process of the current server , Most operating systems use write time replication technology to optimize the efficiency of sub processes , So during the existence of the child process , The server increases the load factor required to perform the expansion operation , So as to avoid the expansion of hash table during the existence of child process , This avoids unnecessary memory writes , Maximize memory savings .

Progressive type rehash

When there are too many key value pairs in the hash table , To set these key values to all at once rehash To ht[1] Words , The huge amount of computing may cause the server to stop serving for a period of time . So to avoid rehash Impact on server performance , The server is not going to ht【0】 All the key values in it are all rehash To ht【1】, It's a lot of times 、 Step by step ht【0】 The key value pairs inside slowly rehash To ht【1】.

  1. by ht【1】 Allocate space , Let the dictionary hold ht【0】 and ht【1】 Two hash tables .
  2. Maintain an index counter variable in the dictionary rehashidx, And set its value to 0, Express rehah Work officially begins .
  1. stay rehash During the process , Every time the dictionary is added 、 Delete 、 When searching or updating operations , In addition to performing the specified operations , By the way ht[0] The hash table is in rehashindx All key value pairs on the index rehahs To ht[1], When rehash When the work is done , Program will rehashidx The value of the property is increased by one .
  2. With the continuous operation of the dictionary , At some point in time ,ht[0] All key value pairs of will be rehash to ht[1], At this point the program will rehashidx Property is set to -1, Express rehash Operation is completed .

Progressive type rehash Hash table operations during execution

In a progressive way rehash During the process , The deletion of the dictionary 、 lookup 、 Update and other operations will be performed on two hash tables .( The program will take precedence over ht[0] Operation on top , If the goal doesn't exist ht[0] Words , It will continue until ht[1] Search inside ).

In a progressive way rehash During execution , All new key value pairs added to the dictionary will be saved to ht[1] Inside . and ht[0] Then no adding operation will be performed .

原网站

版权声明
本文为[A hard-working dog]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/02/202202270548049213.html