当前位置:网站首页>[redis]-[redis underlying data structure] - Dictionary

[redis]-[redis underlying data structure] - Dictionary

2022-06-21 07:51:00 Pacifica_

Preface

Dictionaries , Also known as the symbol table , Associate arrays or maps , Is a data structure used to store key value pairs

Dictionary implementation

Redis The dictionary uses Hashtable As an underlying implementation

Hashtable

typedef struct dictht{
    
	dictEntry** table;       // Hash table array 
	unsigned long size;      // Hash table size 
	unsigned long sizemask;  // Hash table size mask , Always equal to  size - 1
	unsigned long used;      // The number of existing nodes in the hash table , That is, the number of saved key value pairs 
}dictht;

table Property is an array , Each of these elements is a point to dictEntry Pointer to structure , every last dictEntry The structure holds a key value pair

Hash table node

typedef struct dictEntry{
    
	void *key;               // key 
	union{
    
		void *val;
		uint64_tu64;
		int64_ts64;
	}v;                      // value 
	struct dictEntry *next;  // Point to the next hash table node , Form a linked list 
}dictEntry;

It can be seen that ,Redis Hash table usage in Chain address Resolving hash conflicts . And because the pointer to the end of the linked list is not saved , For the sake of speed , The program uses Head insertion , The complexity is O(1)

Dictionaries

typedef struct dict{
    
	dictType *type;  // Type specific function 
	void *privdata;  // Private data 
	dictht ht[2];    // Hashtable 
	int rehashidx;   //rehash  Indexes , When rehash When the process is not in progress , The value is -1
}dict;

type Properties and privdata Property is for different types of key value pairs , To create a polymorphic dictionary .
among ,type Property is a point dictType Pointer to structure , Every dictType Structure holds a set of functions that operate on a particular type of key value pair ,Redis Different functions will be set for dictionaries with different purposes
ht Property is an array of two elements , Each item is a hash table , It's commonly used ht[0] Hashtable ,ht[1] Only in rehash Will use

The hash algorithm

When adding a new key value pair to the dictionary :

  1. Calculate the hash value according to the key of the key value pair , Use the hash function in the dictionary to calculate
    hash = dict->type->hashFunction(key)
  2. Mask according to the hash value and the size of the hash table , Perform phase and operation , Calculate the index value ,
    index = hash & dict->ht[x].sizemask
  3. Then put the key value pair on the specified index of the hash table array

When a dictionary is used as the underlying implementation of a database , Or the underlying implementation of hash key ,Redis Use MurmurHash2 Algorithm to calculate the hash value of the key

rehash

In order to keep the load factor of the hash table within a reasonable range , When the hash table holds too many or too few key value pairs , The program needs to expand or shrink the size of the hash table , Through execution rehash Operate to complete :

  1. For the dictionary ht[1] Hash table allocation space , The size of the hash table space depends on the operation to be performed and the current ht[0] Number of key value pairs contained
    >>> If the execution is Expand operation , that ht[1] The size is The first is greater than or equal to ht[0].used * 2 Of 2 Of n Power of power
    >>> If the execution is shrinkage operation , that ht[1] The size is The first is greater than or equal to ht[0].used Of 2 Of n Power of power
  2. Will be saved in ht[0] All key value pairs in rehash To ht[1] above .rehash That is to recalculate the hash value and index value of the key , Then, according to the new index value, it is placed in ht[1] At the corresponding position of the hash table
  3. When ht[0] All key value pairs contained are migrated to ht[1] after , Release ht[0], take ht[1] Set to ht[0], And in ht[1] Create a new blank hash table , For the next time rehash To prepare for

The timing of expansion or contraction

When The server is not currently executing BGSAVE Order or BGREWRITEAOF command , And the load factor of hash table is greater than or equal to 1, perhaps The server is executing BGSAVE Order or BGREWRITEAOF command , And the load factor of hash table is greater than or equal to 5, The program will automatically extend the hash table . The load factor is load_factor = ht[0].used / ht[0].size

Why according to BGSAVE Order or BGREWRITEAOF Whether the order is being executed , The selected extensions require different load factors ?
This is because of the implementation BGSAVE Order or BGREWRITEAOF In the course of the order ,Redis Need to create the current server process Subprocesses , And most operating systems will use... When creating and using child processes When writing copy Technology , At this point, if the hash table is expanded , Will trigger the memory copy operation of the child process to the parent process . So when a child process exists , The server increases the load factor required to perform the expansion operation , Thus, the extension operation during the existence of the child process is avoided as far as possible , Avoid unnecessary memory write operations , Maximize memory savings

When the load factor of the hash table is less than 0.1 when , The program automatically starts to shrink the hash table

Progressive type rehash

rehash The process is not one-off , Done centrally , It's a lot of times , Completed gradually

as a result of , If it's done in one go , about ht[0] It is acceptable to save fewer key value pairs in , Because one-time rehash It won't take much time ; And if the ht[0] Save many key value pairs in , Tens of millions , So you have to set all of these key values at once rehash To ht[1] In the words of , The huge amount of computation and workload may cause the server to stop serving for a period of time

Progressive type rehash The steps are :

  1. by ht[1] Allocate space , Let the dictionary hold ht[0] and ht[1]
  2. Index counter variables in the dictionary rehashidx Set to 0, Express rehash Work officially begins
  3. stay rehash During the process , Add to the dictionary every time , Delete , When searching or updating operations , The program will perform the specified operation , By the way ht[0] The hash table is in rehashidx All key value pairs on the index rehash To ht[1] , When rehash When it's done , Program will rehashidx The value of the attribute of is increased by one , Get ready rehash Key value pair at next index position . The search performed here ( Delete , The update will also involve finding ) The operation will be performed on two hash tables , First in ht[0] To find , If not , Continue until ht[1] To find ; Add operation is always correct ht[1] Conduct , Guarantee ht[0] The number of key value pairs contained will only decrease, not increase , It must eventually become an empty table
  4. With the continuous execution of dictionary operation , At some point in time ,ht[0] All key value pairs of will be rehash To ht[1] in , Then the program will rehashidx The value of the set -1, Represents the whole rehash Operation completed
原网站

版权声明
本文为[Pacifica_]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/172/202206210748361654.html