当前位置：网站首页>Code types and data structures corresponding to the five object types of redis

Code types and data structures corresponding to the five object types of redis

2022-06-12 10:11:00 【Wang Rushuang】

One . data structure

SDS Simple dynamic string （string）

Redis No direct use C The traditional string representation of the language ( An array of characters ending in empty characters , hereinafter referred to as C character string ), Instead, I built a simple dynamic string ( simple dynamic string,SDS) Abstract type of , And will SDS Used as a Redis The default string representation of , Is a string that can be modified , In memory, it's in the form of an array of bytes
Insert picture description here
1. Memory expansion strategy ： Space preallocation

After modification sds Of len Less than 1MB,free=len
After modification sds Of len Greater than 1MB,free=1MB

2. Memory shrink policy ： Inert release

sds Of len When shortened , The program does not immediately use memory reallocation to reclaim the extra characters after shortening , But use free Record , Really free up unused space when necessary , No memory waste

Space pre allocation and inert release are reduced sds Memory reallocation

3. sds and C String comparison

	SDS	C character string
Get string length complexity	O(1) : take len value	O(n): Traverse from the beginning until you encounter ’\0’ The null character ends , It does not record its own length
API Whether binary security	Security : use len End of string flag , Save text and pictures , Audio and other binary data in any format	unsafe ： Use the empty character group to mark the end of the string , Only text format data can be saved
Whether the buffer overflows	Can't : Automatically check before modifying sds Is there enough space , namely free length	Meeting : Not enough memory to modify string
Number of memory reallocations	Reduce the number of memory reallocation : modify N The secondary string can execute at most N Secondary memory reallocation	modify N The secondary string must be executed N Secondary memory reallocation

dict Dictionaries （set,zset,hash）

c There is no such data structure as dictionary , therefore redis I have realized the data structure of dictionary , Each key is unique , similar java Medium ,Hash Dictionaries are also one of the underlying implementations of hash keys , When a hash key contains more key value pairs or the elements in the key value pairs are long strings ,redis Will use the dictionary as the underlying implementation of the hash key
The dictionary uses a hash table as the underlying implementation , A hash table can have multiple hash table nodes , Each hash table node holds a key value pair in the dictionary , When there is a key conflict , The hash table node uses next The pointer forms a single Necklace Watch ,next That is, the nodes assigned to the same index
Insert picture description here
1. Expansion strategy
Load factor = Hash table node has saved the number of nodes / Hash table size ：load_factor=ht[0].used / ht[0].size

Expansion and contraction of hash table when any of the following conditions are met , The program will automatically start extending the hash table

The server is not currently executing BGSAVE Order or BGREWRITEAOF command , And the load factor of hash table is greater than or equal to 1
The server is currently executing BGSAVE Order or BGREWRITEAOF command , And the load factor of hash table is greater than or equal to 5

Depending on whether the program is currently in progress BGSAVE The relevant operation , The load factor conditions required for capacity expansion are different , This is because of the ongoing BGSAVE In operation , There are child processes , The operating system will use When writing copy (Copy On Write) To optimize the efficiency of child processes .Redis Try to avoid capacity expansion when there are child processes ( Because it's going on rehash Time is wasted h[0] Memory footprint ), Try to save memory

2. Volume reduction strategy
When the load factor of the hash table is less than 0.1 when , The program automatically starts to shrink the hash table

3. Progressive type rehash The process
rehash It refers to recalculating the hash value and index value of the key , Then place the key value pair in ht[1] On the specified location of the hash table ( Occurs at the time of capacity reduction and expansion )

by ht[1] Allocate space , Let the dictionary hold ht[0] and ht[1] Two hash tables
Maintain an index counter variable in the dictionary rehashidx, And set its value to 0, Express rehash Work officially begins
stay rehash During the process , Every time the dictionary is added 、 Delete 、 When searching or updating operations , In addition to performing the specified operations , By the way ht[0] The hash table is in rehashidx All key value pairs on the index rehash To ht[1], When rehash When the work is done , Program will rehashidx The value of the property is increased by one
With the continuous execution of dictionary operation , At some point in time ,ht[0] All key value pairs of will be rehash to ht[1] (ht[0] Becomes an empty table ), Release ht[0], At this point the program will rehashidx Property is set to -1, And in ht[1] Create a new blank hash table , For the next time rehash To prepare for , Express rehash Operation completed

Progressive type rehash The good thing about it is that it takes Divide and rule The way , take rehash The calculation of key value pairs is equally distributed to each addition to the dictionary 、 Delete 、 Search and update operations , Thus avoiding centralized rehash And the huge amount of computation

For the dictionary ht[1] Hash table allocating space , The size of the hash table depends on the operation to be performed , as well as ht[O] The number of key value pairs currently contained ( That is to say ht[0].used The value of the property )

If you are performing an extension operation , that ht[1] The size of the first is greater than or equal to ht[0].used*2 namely 2ⁿ(2 Of n Power of power )
If you are performing a shrink operation , that ht[1] The size of the first is greater than or equal to ht[o].used namely 2ⁿ

4.rehash Hash table operation during

The deletion of the dictionary ( delete)、 lookup (fnd)、 to update ( update) The operation will be performed on two hash tables . for example , To find a key in a dictionary , The program will start with ht[0] Search inside , If you don't find it , It will continue until ht[1] Search inside
All new key value pairs added to the dictionary will be saved to ht[1] Inside , and ht[0] No more add operations , This measure guarantees ht[0] The number of key value pairs contained will only decrease but not increase , And with rehash The execution of the operation turns into an empty table

5.Redis The dictionary structure of rehash Time and Java Of HashMap Of Rehash Different ：
Java Of HashMap The way to do it is ： Create a new hash table , All current nodes will be deleted at one time rehash complete , Then release the original hash surface ; and Redis It's progressive hash The way

Two end endless linked list （list）

c There is no data structure like linked list , therefore redis I realized the linked list , It can be understood as java Medium List( But the structure is different ), Linked list redis It is widely used in , such as , When a list key contains a large number of elements , Or when the elements in the list are long strings , Redis Will use the linked list as the underlying implementation of the list key
Insert picture description here
1. advantage ：

The acquisition length complexity is O(1)
Because every node holds perv and next The pointer , Therefore, the complexity of obtaining the front and back nodes is O(1)
because list The structure has head and tail The pointer , Therefore, the complexity of obtaining the head and tail nodes of the linked list is O(1)
list Structure through dup( Copy the value saved by the linked list node )、free( Release the value saved by the linked list node )、match( Compare whether the value saved in the linked list node is equal to another input value ) Property to set a type specific function for the node , So the linked list can store various types of values

2. shortcoming ：

Find the time complexity of an element O(N)
Pointers take up a bit more memory , Memory waste
Each node allocates memory separately , When there are too many nodes , Caused too many memory fragments . Affect the efficiency of memory management

intset Set of integers （set）

When a set contains only integer value elements , And when the number of elements in this collection is small , Redis We will use the integer set as the underlying implementation of the set key , Sets are ordered from small to large

upgrade

According to the type of new elements , Expand the space size of the array at the bottom of the integer set , And allocate space for new elements
Convert all existing elements of the underlying array to the same type as the new elements , And put the converted element in the correct bit , And in the process of placing elements , We need to keep the ordered properties of the underlying array unchanged
Add new elements to the underlying array
modify length Property value

The location where the upgrade element is stored

Because the elements causing the upgrade are either larger than all existing elements , Or it's smaller than all the existing elements , therefore

Less than ： The new element is placed in the index 0 It's about
Greater than ： The new element is placed in the index length-1 It's about

intset Downgrade operation not supported , That is, once the data is upgraded , The code will always remain in the upgraded state

advantage ：

Store in the smallest code that can hold numbers , Can effectively save memory
An integer set encapsulates a pair of int16t、int32t、int64t Conversion between three integers , We don't need to consider the wrong type when using , You can continuously add integers to the integer set . Improved operational flexibility （c Is statically compiled ）

shortcoming ：

Every time a new element is added to an integer set, it may cause an upgrade , Each upgrade requires type conversion and memory reallocation of all existing elements in the underlying array , So the time complexity of adding new elements to an integer set is O(n)

skiplist Skip list （zset）

When there are many elements in an ordered set , Or the elements in the ordered set are long strings ,redis You will use the jump table as the underlying implementation of an ordered set
Insert picture description here
Be careful

The floor height of each node is 1 To 32 Random number between
The objects saved in each node are unique , When multiple nodes save scores score Can be the same
When the scores are the same , Nodes are sorted by the size of the member object , The small object is placed in the front node , Large back nodes

ziplist Compressed list （list,hash,zset,hash）

When a list contains a small number of elements , And each element is either a small integer value , Or it's a shorter string ,redis It will be the underlying implementation of the compressed list as the list key .
The compressed list is redis A data structure developed to save memory , It is a data structure composed of a series of specially encoded continuous memory blocks . A compressed list can contain any number of nodes , Each node holds a byte array or an integer value ,redis To save memory

Insert picture description here
advantage

To save memory

shortcoming

When adding or deleting nodes , It may cause chain update errors , But the probability of occurrence is not high , So don't worry too much about , Even if , As long as the number of update nodes is small , It doesn't matter much

It says redis Commonly used data structures , however redis There is no direct use of these data structures to implement key value pair databases , Instead, an object system is created based on these data structures , This system contains string objects 、 List objects 、 Hash object 、 There are five types of objects: collection objects and ordered collection objects , Each object uses at least one of the data structures we described earlier . From this we can get ： Every time we are in Redis Create a new key value pair in the database , We will create at least two objects , An object is used as the key of a key value pair ( Key object ), Another object is used as the value of a key value pair ( The value object )

Two . object

Redis Each object in is represented by a redisobject Structural representation , The three attributes of this structure related to data preservation are type attribute 、 encoding Properties and ptr attribute ：

typedef struct redisobject
// object type , It can be （REDIS_STRING: String object ,REDIS_LIST: List objects ,REDIS_HASH: Hash object ,REDIS_SET: A collection of objects ,REDIS_ZSET: An orderly collection of objects ） In a 
unsigned type:4 ;
// The encoding used by the , That is, what data structure the object uses as its underlying implementation 
unsigned encoding: 4;
/ Point to the underlying implementation data structure of the object , from encoding decision 
 void  *ptr;
 //....

The following table lists the encoding formats and usage of each object ：
Insert picture description here

object type

1. String object

The encoding type	data structure	usage	transformation
int	character string	Save integer values	int->raw
embstr	character string	It's a string , And the length is less than 32 byte	embstr->raw
row	character string	It's a string , And the length is greater than 32 byte

2. List objects

The encoding type	data structure	usage	transformation
ziplist	Compressed list	All elements of the list are integers or the string length is less than 64 own , And the number of elements is less than 512 individual	ziplist->linkedList
linkedlist	Double ended list	dissatisfaction ziplist Any one of

3. Hash object

The encoding type	data structure	usage	transformation
ziplist	Compressed list	The key and value strings of all key value pairs are less than 64 own , And the number of key value pairs is less than 512（ Put the key at the end of the compressed list before adding , Then put the value at the end of the compressed list , That is, the node of the key comes first , The value node is later , The key value stored first is in front of the node , The key value stored after the node ）	ziplist->hashtable
hashtable	Dictionaries	dissatisfaction ziplist One of the conditions （ Every key in the dictionary is a string object , Save the key ; Each value is a string object , Save key value ）

4. A collection of objects

The encoding type	data structure	usage	transformation
intset	Set of integers	All elements are integer values , And the number of elements does not exceed 512	intset->hashtable
hashtable	Dictionaries	dissatisfaction intset One of the conditions （ Each key in the dictionary is a string object , Each string object contains a collection element , Value to null）

5. An orderly collection of objects

The encoding type	data structure	usage	transformation
ziplist	Compressed list	Element length is less than 64 own , And the number of elements is less than 128 individual （ Each element is saved with two compressed list nodes next to each other , The first node holds the value , The second node saves scores ）	ziplist->skiplist
skiplist	zset（ Skip list + Dictionaries ）	dissatisfaction ziplist One of the conditions （zsl Jump table node object Save the value ,score Save the score ; Dictionary keys hold values , Dictionary values save scores , Available through a dictionary O(1) Find scores for multiple given members .zscore The command uses this feature . Dictionaries and jump tables share values and scores of elements through pointers , So it will not cause data duplication ）

Some properties of objects

1. Memory recovery

c There is no memory reclamation function ,redis A method named " Reference counting technology " Realize memory recycling , Through this mechanism , Programs can count information by tracking references to objects , The reference count value becomes 0 when , The memory occupied by the object is recycled , This is also associated with the following shared objects

typedef struct redisobject{
    
    // Reference count 
	int refcount;
} robi ;

2. Shared objects

To save memory ,redis Share string objects of integer values , That is, when there are multiple keys with the same value , The database stores only one object value , Realize object sharing through object reference mechanism
redis When initializing the server , establish 0 To 9999 String object of , When the server needs to use the value of this interval , The server will use these shared objects , Instead of creating new objects

3. The idle time of the object lru

typedef struct redisobject{
    
    // Record the last time the object was accessed by the command program 
	unsigned lru : 22 ;
} robi ;

Server on maxmemory Options , And the memory recycling algorithm is volatile-lru or allkeys-lru when , The server memory exceeds the configured maxmemory, Some keys with long idle time are released by the server first , To reclaim memory .

Most of the above content comes from 《Redis Design and implementation 》 excerpts , It introduces redis Of 5 The encoding type corresponding to each object type And data structure

原网站

版权声明
本文为[Wang Rushuang]所创，转载请带上原文链接，感谢
https://yzsam.com/2022/03/202203010528030534.html