当前位置:网站首页>Code types and data structures corresponding to the five object types of redis

Code types and data structures corresponding to the five object types of redis

2022-06-12 10:11:00 Wang Rushuang

One . data structure

SDS Simple dynamic string (string)

      Redis No direct use C The traditional string representation of the language ( An array of characters ending in empty characters , hereinafter referred to as C character string ), Instead, I built a simple dynamic string ( simple dynamic string,SDS) Abstract type of , And will SDS Used as a Redis The default string representation of , Is a string that can be modified , In memory, it's in the form of an array of bytes
 Insert picture description here
1. Memory expansion strategy : Space preallocation

  1. After modification sds Of len Less than 1MB,free=len
  2. After modification sds Of len Greater than 1MB,free=1MB

2. Memory shrink policy : Inert release

  1. sds Of len When shortened , The program does not immediately use memory reallocation to reclaim the extra characters after shortening , But use free Record , Really free up unused space when necessary , No memory waste

Space pre allocation and inert release are reduced sds Memory reallocation

3. sds and C String comparison

SDSC character string
Get string length complexity O(1) : take len value O(n): Traverse from the beginning until you encounter ’\0’ The null character ends , It does not record its own length
API Whether binary security Security : use len End of string flag , Save text and pictures , Audio and other binary data in any format unsafe : Use the empty character group to mark the end of the string , Only text format data can be saved
Whether the buffer overflows Can't : Automatically check before modifying sds Is there enough space , namely free length Meeting : Not enough memory to modify string
Number of memory reallocations Reduce the number of memory reallocation : modify N The secondary string can execute at most N Secondary memory reallocation modify N The secondary string must be executed N Secondary memory reallocation

dict Dictionaries (set,zset,hash)

      c There is no such data structure as dictionary , therefore redis I have realized the data structure of dictionary , Each key is unique , similar java Medium ,Hash Dictionaries are also one of the underlying implementations of hash keys , When a hash key contains more key value pairs or the elements in the key value pairs are long strings ,redis Will use the dictionary as the underlying implementation of the hash key
The dictionary uses a hash table as the underlying implementation , A hash table can have multiple hash table nodes , Each hash table node holds a key value pair in the dictionary , When there is a key conflict , The hash table node uses next The pointer forms a single Necklace Watch ,next That is, the nodes assigned to the same index
 Insert picture description here
1. Expansion strategy
       Load factor = Hash table node has saved the number of nodes / Hash table size :load_factor=ht[0].used / ht[0].size

       Expansion and contraction of hash table when any of the following conditions are met , The program will automatically start extending the hash table

  1. The server is not currently executing BGSAVE Order or BGREWRITEAOF command , And the load factor of hash table is greater than or equal to 1
  2. The server is currently executing BGSAVE Order or BGREWRITEAOF command , And the load factor of hash table is greater than or equal to 5

       Depending on whether the program is currently in progress BGSAVE The relevant operation , The load factor conditions required for capacity expansion are different , This is because of the ongoing BGSAVE In operation , There are child processes , The operating system will use When writing copy (Copy On Write) To optimize the efficiency of child processes .Redis Try to avoid capacity expansion when there are child processes ( Because it's going on rehash Time is wasted h[0] Memory footprint ), Try to save memory

2. Volume reduction strategy
       When the load factor of the hash table is less than 0.1 when , The program automatically starts to shrink the hash table

3. Progressive type rehash The process
       rehash It refers to recalculating the hash value and index value of the key , Then place the key value pair in ht[1] On the specified location of the hash table ( Occurs at the time of capacity reduction and expansion )

  1. by ht[1] Allocate space , Let the dictionary hold ht[0] and ht[1] Two hash tables
  2. Maintain an index counter variable in the dictionary rehashidx, And set its value to 0, Express rehash Work officially begins
  3. stay rehash During the process , Every time the dictionary is added 、 Delete 、 When searching or updating operations , In addition to performing the specified operations , By the way ht[0] The hash table is in rehashidx All key value pairs on the index rehash To ht[1], When rehash When the work is done , Program will rehashidx The value of the property is increased by one
  4. With the continuous execution of dictionary operation , At some point in time ,ht[0] All key value pairs of will be rehash to ht[1] (ht[0] Becomes an empty table ), Release ht[0], At this point the program will rehashidx Property is set to -1, And in ht[1] Create a new blank hash table , For the next time rehash To prepare for , Express rehash Operation completed

       Progressive type rehash The good thing about it is that it takes Divide and rule The way , take rehash The calculation of key value pairs is equally distributed to each addition to the dictionary 、 Delete 、 Search and update operations , Thus avoiding centralized rehash And the huge amount of computation

       For the dictionary ht[1] Hash table allocating space , The size of the hash table depends on the operation to be performed , as well as ht[O] The number of key value pairs currently contained ( That is to say ht[0].used The value of the property )

  1. If you are performing an extension operation , that ht[1] The size of the first is greater than or equal to ht[0].used*2 namely 2n(2 Of n Power of power )
  2. If you are performing a shrink operation , that ht[1] The size of the first is greater than or equal to ht[o].used namely 2n

4.rehash Hash table operation during

  1. The deletion of the dictionary ( delete)、 lookup (fnd)、 to update ( update) The operation will be performed on two hash tables . for example , To find a key in a dictionary , The program will start with ht[0] Search inside , If you don't find it , It will continue until ht[1] Search inside
  2. All new key value pairs added to the dictionary will be saved to ht[1] Inside , and ht[0] No more add operations , This measure guarantees ht[0] The number of key value pairs contained will only decrease but not increase , And with rehash The execution of the operation turns into an empty table

5.Redis The dictionary structure of rehash Time and Java Of HashMap Of Rehash Different :
       Java Of HashMap The way to do it is : Create a new hash table , All current nodes will be deleted at one time rehash complete , Then release the original hash surface ; and Redis It's progressive hash The way

Two end endless linked list (list)

      c There is no data structure like linked list , therefore redis I realized the linked list , It can be understood as java Medium List( But the structure is different ), Linked list redis It is widely used in , such as , When a list key contains a large number of elements , Or when the elements in the list are long strings , Redis Will use the linked list as the underlying implementation of the list key
 Insert picture description here
1. advantage :

  1. The acquisition length complexity is O(1)
  2. Because every node holds perv and next The pointer , Therefore, the complexity of obtaining the front and back nodes is O(1)
  3. because list The structure has head and tail The pointer , Therefore, the complexity of obtaining the head and tail nodes of the linked list is O(1)
  4. list Structure through dup( Copy the value saved by the linked list node )、free( Release the value saved by the linked list node )、match( Compare whether the value saved in the linked list node is equal to another input value ) Property to set a type specific function for the node , So the linked list can store various types of values

2. shortcoming :

  1. Find the time complexity of an element O(N)
  2. Pointers take up a bit more memory , Memory waste
  3. Each node allocates memory separately , When there are too many nodes , Caused too many memory fragments . Affect the efficiency of memory management

intset Set of integers (set)

       When a set contains only integer value elements , And when the number of elements in this collection is small , Redis We will use the integer set as the underlying implementation of the set key , Sets are ordered from small to large

upgrade

  1. According to the type of new elements , Expand the space size of the array at the bottom of the integer set , And allocate space for new elements
  2. Convert all existing elements of the underlying array to the same type as the new elements , And put the converted element in the correct bit , And in the process of placing elements , We need to keep the ordered properties of the underlying array unchanged
  3. Add new elements to the underlying array
  4. modify length Property value

The location where the upgrade element is stored

       Because the elements causing the upgrade are either larger than all existing elements , Or it's smaller than all the existing elements , therefore

  1. Less than : The new element is placed in the index 0 It's about
  2. Greater than : The new element is placed in the index length-1 It's about

intset Downgrade operation not supported , That is, once the data is upgraded , The code will always remain in the upgraded state

advantage :

  1. Store in the smallest code that can hold numbers , Can effectively save memory
  2. An integer set encapsulates a pair of int16t、int32t、int64t Conversion between three integers , We don't need to consider the wrong type when using , You can continuously add integers to the integer set . Improved operational flexibility (c Is statically compiled )

shortcoming :

  1. Every time a new element is added to an integer set, it may cause an upgrade , Each upgrade requires type conversion and memory reallocation of all existing elements in the underlying array , So the time complexity of adding new elements to an integer set is O(n)

skiplist Skip list (zset)

       When there are many elements in an ordered set , Or the elements in the ordered set are long strings ,redis You will use the jump table as the underlying implementation of an ordered set
 Insert picture description here
Be careful

  1. The floor height of each node is 1 To 32 Random number between
  2. The objects saved in each node are unique , When multiple nodes save scores score Can be the same
  3. When the scores are the same , Nodes are sorted by the size of the member object , The small object is placed in the front node , Large back nodes

ziplist Compressed list (list,hash,zset,hash)

       When a list contains a small number of elements , And each element is either a small integer value , Or it's a shorter string ,redis It will be the underlying implementation of the compressed list as the list key .
The compressed list is redis A data structure developed to save memory , It is a data structure composed of a series of specially encoded continuous memory blocks . A compressed list can contain any number of nodes , Each node holds a byte array or an integer value ,redis To save memory

 Insert picture description here
advantage

  1. To save memory

shortcoming

  1. When adding or deleting nodes , It may cause chain update errors , But the probability of occurrence is not high , So don't worry too much about , Even if , As long as the number of update nodes is small , It doesn't matter much

       It says redis Commonly used data structures , however redis There is no direct use of these data structures to implement key value pair databases , Instead, an object system is created based on these data structures , This system contains string objects 、 List objects 、 Hash object 、 There are five types of objects: collection objects and ordered collection objects , Each object uses at least one of the data structures we described earlier . From this we can get : Every time we are in Redis Create a new key value pair in the database , We will create at least two objects , An object is used as the key of a key value pair ( Key object ), Another object is used as the value of a key value pair ( The value object )

Two . object

       Redis Each object in is represented by a redisobject Structural representation , The three attributes of this structure related to data preservation are type attribute 、 encoding Properties and ptr attribute :

typedef struct redisobject
// object type , It can be (REDIS_STRING: String object ,REDIS_LIST: List objects ,REDIS_HASH: Hash object ,REDIS_SET: A collection of objects ,REDIS_ZSET: An orderly collection of objects ) In a 
unsigned type:4 ;
// The encoding used by the , That is, what data structure the object uses as its underlying implementation 
unsigned encoding: 4;
/ Point to the underlying implementation data structure of the object , from encoding decision 
 void  *ptr;
 //....

The following table lists the encoding formats and usage of each object :
 Insert picture description here

object type

1. String object

The encoding type data structure usage transformation
int character string Save integer values int->raw
embstr character string It's a string , And the length is less than 32 byte embstr->raw
row character string It's a string , And the length is greater than 32 byte

2. List objects

The encoding type data structure usage transformation
ziplist Compressed list All elements of the list are integers or the string length is less than 64 own , And the number of elements is less than 512 individual ziplist->linkedList
linkedlist Double ended list dissatisfaction ziplist Any one of

3. Hash object

The encoding type data structure usage transformation
ziplist Compressed list The key and value strings of all key value pairs are less than 64 own , And the number of key value pairs is less than 512( Put the key at the end of the compressed list before adding , Then put the value at the end of the compressed list , That is, the node of the key comes first , The value node is later , The key value stored first is in front of the node , The key value stored after the node )ziplist->hashtable
hashtable Dictionaries dissatisfaction ziplist One of the conditions ( Every key in the dictionary is a string object , Save the key ; Each value is a string object , Save key value )

4. A collection of objects

The encoding type data structure usage transformation
intset Set of integers All elements are integer values , And the number of elements does not exceed 512intset->hashtable
hashtable Dictionaries dissatisfaction intset One of the conditions ( Each key in the dictionary is a string object , Each string object contains a collection element , Value to null)

5. An orderly collection of objects

The encoding type data structure usage transformation
ziplist Compressed list Element length is less than 64 own , And the number of elements is less than 128 individual ( Each element is saved with two compressed list nodes next to each other , The first node holds the value , The second node saves scores )ziplist->skiplist
skiplistzset( Skip list + Dictionaries ) dissatisfaction ziplist One of the conditions (zsl Jump table node object Save the value ,score Save the score ; Dictionary keys hold values , Dictionary values save scores , Available through a dictionary O(1) Find scores for multiple given members .zscore The command uses this feature . Dictionaries and jump tables share values and scores of elements through pointers , So it will not cause data duplication )

Some properties of objects

1. Memory recovery

      c There is no memory reclamation function ,redis A method named " Reference counting technology " Realize memory recycling , Through this mechanism , Programs can count information by tracking references to objects , The reference count value becomes 0 when , The memory occupied by the object is recycled , This is also associated with the following shared objects

typedef struct redisobject{
    
    // Reference count 
	int refcount;
} robi ;

2. Shared objects

       To save memory ,redis Share string objects of integer values , That is, when there are multiple keys with the same value , The database stores only one object value , Realize object sharing through object reference mechanism
redis When initializing the server , establish 0 To 9999 String object of , When the server needs to use the value of this interval , The server will use these shared objects , Instead of creating new objects

3. The idle time of the object lru

typedef struct redisobject{
    
    // Record the last time the object was accessed by the command program 
	unsigned lru : 22 ;
} robi ;

       Server on maxmemory Options , And the memory recycling algorithm is volatile-lru or allkeys-lru when , The server memory exceeds the configured maxmemory, Some keys with long idle time are released by the server first , To reclaim memory .


Most of the above content comes from 《Redis Design and implementation 》 excerpts , It introduces redis Of 5 The encoding type corresponding to each object type And data structure

原网站

版权声明
本文为[Wang Rushuang]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/03/202203010528030534.html