当前位置：网站首页>Redis core technology and practice - learning notes (IX): slicing cluster

Redis core technology and practice - learning notes (IX): slicing cluster

2022-07-03 17:51:00 【Tom Kong】

One . Cluster slicing

Cluster slicing , Also called fragment cluster , It means starting multiple Redis Instances form a cluster , Follow the rules , Divide the data received into multiple copies , Each copy is saved with an instance .

Use scenarios , take 25GB The data is saved with Two kinds of schemes ：

Cluster slicing , Also called fragment cluster , Namely Refers to starting multiple Redis Instances form a cluster , And then according to certain rules , Divide the received data into multiple copies , Each copy is saved with an instance . Back in the scene we just had , If you put 25GB The data are divided equally into 5 Share （ Of course , You can't even it ）, Use 5 An instance to save , Each instance only needs to be saved 5GB data . As shown in the figure below ：

Multi instance save data slice , It can be saved 25GB Data can avoid fork The child process blocks the main thread, causing the response to slow down .

Two . How to save more data ？

Purpose ： Save a lot of data

Method ： expand Redis Instance memory （ Vertical expansion scale up） And slice clusters ( Horizontal scaling scale out) The two methods .

Vertical expansion ： Upgrade single Redis Instance resource allocation , Include increase Memory Capacity , increase disk Capacity , Use a higher configuration of CUP.
Horizontal scaling ： Horizontal increase At present Redis Number of instances .

Vertical expansion

advantage ： Simple , direct
shortcoming ：
One . Use RDB When data is persisted , If the amount of data increases , It will also increase the memory , The main thread fork Child processes will block . If persistence is not required Redis data , Then vertical expansion is also a good choice .
Two . Hardware cost limitation .

Horizontal expansion is a more scalable solution , Just add Redis The number of instances , Don't worry about the hardware and cost limitations of a single instance , For millions , Tens of millions of users , Horizontal scaling Redis Slicing clusters would be a good choice .

The corresponding distribution relationship between data slice and instance

Slice clustering is a general mechanism for storing large amounts of data , This mechanism can have different implementation schemes .

Redis 3.0 Later, it will be officially provided Redis cluster programme , For slicing clusters .

Redis Cluster The scheme adopts Hash slot （Hash Slot） To deal with it Data and examples Mapping between . A slice cluster has 16384 Hash slot , These hash slots are similar to data partitions , Each key value pair will be based on its key Map to a hash slot .

First of all, according to the key value pair key, according to CRC16 Algorithm Calculate a 16bit Value ; Then use this 16bit It's worth it 16384 modulus , obtain 0~16384 Modulus in range , Each module represents a corresponding numbered Hashi trough .
How hash slots are mapped to specific Redis example ： If it's not Redis Cluster Scenario time , have access to cluster create Command create cluster , here Redis These slots are automatically evenly distributed on cluster instances . If there is N An example , However, the number of slots on each instance is 16384/N individual .

We can use cluster meet Command to manually establish a connection between instances , Forming clusters , In the use of cluster addslots command , Specify the number of hash slots on each instance .

Different in the cluster Redis The memory size of the instance is configured differently , If you divide the Hashi trough equally on each instance , When saving the same number of key value pairs , Compared with instances with large memory , The memory capacity of the instance will be smaller . At this time, you can configure resources according to different instances , Use cluster addslots Command to manually allocate hash slots .

The slice cluster has a total of 3 An example , Colleagues assume that 5 Hash slot , The Hashi slot can be manually assigned by the following command ：

example 1 Save Hashi slot 0 and 1, example 2 Save Hashi slot 2 and 3, example 3 Save Hashi slot 4.


redis-cli -h 127.0.0.1 –p 6379 cluster addslots 0,1
redis-cli -h 127.0.0.1 –p 6380 cluster addslots 2,3
redis-cli -h 127.0.0.1 –p 6381 cluster addslots 4

When manually allocating hash slots , Need to put 16384 All the slots are allocated , otherwise Redis The cluster is not working properly .

How the client locates the data ？

When locating key value pair data , The hash slot in which he is located can be calculated , This calculation is in When the client sends a request . however , To further navigate to the instance , You also need to know which instance the hash slots are distributed on .

After the connection between the client and the cluster instance is established , example The allocation information of the hash slot will be sent to the client .
When the cluster was first established , Each cluster only knows which hash slots are allocated to it , Do not know the hash slot information corresponding to other instances .
Redis The instance will send its own hash slot information to other instances connected to it , To complete the hash slot allocation information diffusion . When instances are connected to each other , All instances know the mapping relationship of all hash slots . When the client accesses any instance , Can get all the hash slot information .
After the client receives the hash slot information , Will be able to The hash slot information is cached locally , When a client requests a key value pair , The hash value of the key will be calculated first , In this way, you can send a request directly to the corresponding instance .

Redis Need to reallocate the Hashi trough

Clients cannot pass messages to each other , Get the latest hash slot allocation information , But the client cannot actively perceive .

What if the cached allocation information is inconsistent with the latest allocation information ？

Redirect

The client sends data read / write operations to an instance , There is no corresponding data on the instance , The client needs to send operation commands to a new instance .

When the client sends an operation request of a key value pair to an instance , If this instance does not have a hash slot mapped by this key value pair , Then this instance will return to the client move command In response to the results , This result includes the access address of the new instance .


GET hello:key
(error) MOVED 13320 172.16.19.5:6379

MOVED The command says , The hash slot where the key value pair requested by the client is located 13320, It's actually in 172.16.19.5 For instance .

MOVE How to use the redirect command ：
Because of load balancing ,slot2 The data in has been removed from the instance 2 Migrate to instance 3
The client cache still records slot2 In the instance 2 Information about , So I'll give you an example 2 dispatch orders .
example 2 Return a message to the client move command , hold slot2 The latest location of （ example 3）, Return to the client .
The client will send a message to the instance again 3 Send a request , It also updates the local cache , hold slot2 The correspondence with the instance is updated .

Slot 2 All the data in has been migrated to the instance 3 Only in this case move

ASK

If slot2 There is too much data in the , Customer to instance 2 When sending a request , here slot2 Only part of the data in is migrated to the instance 3, There is still some data not migrated . When this migration is partially completed , The client will receive a message ASK Error message .


GET hello:key
(error) ASK 13320 172.16.19.5:6379

In this result ASK An order means client The hash slot corresponding to the request key value pair 13320, In the instance 3 On , However, the hash slot is being migrated to the instance 3.
At this time, the client needs to give the instance 3 Send a ASKING command ： Let this instance allow the client to execute the following commands .
The client is sending... To this instance GET command , To read this data .

Example ：

slot2 From instance 2 To instance 3 transfer ,key1 and key2 It's been moved in the past ,key3 and key4 It's still an example 2 in .
Client to instance 2 request key2 after , You will receive an instance 2 Back to ASK command .
ASK The two meanings of the command ：
Express slot Data is still migrating .
ASK The command returns the latest instance address of the data requested by the client to the client
The client needs to give the instance 3 send out ASKING command , And then send the operation command .

ASK The command does not update the hash slot allocation information of the client cache .

If the client requests again Slot 2 Data in , It still gives examples 2 Send a request . That means ,ASK The purpose of the command is to enable the client to send a request to the new instance , Not like MOVED Order that , Changes the local cache , Let all subsequent commands be sent to the new instance .

After class questions ：

Redis Cluster The scheme allocates key value pairs to different instances through hash slot , This process requires key value pairs key Conduct CRC Calculation , Then map with the hash slot , What are the benefits of doing so ？ If you directly use a hash table to record the corresponding relationship between key value pairs and instances , When you use it like this, you can look it up directly , Why not ？

The storage capacity of the whole cluster key Too many , Record each key The mapping table with the instance will be very large , Both the storage server and the client occupy a very large memory space .
Redis cluster Adopt a decentralized model , The client accesses a key, If this key Not on this node , This node needs to have the ability to correct the client routing to the right node （move Respond to ）.
Routing tables need to be exchanged between nodes , Each stage has a complete routing relationship , If the storage is key Mapping table with instance , Then the exchange of information between nodes will become huge , Consume network resources , At the same time, memory takes up a lot of resources .
The cluster expansion , Shrinkage capacity , When data is balanced , Data migration occurs between nodes , During migration, you need to modify each key The mapping relation of , Maintenance costs are high .
Add a hash slot in the middle , The relationship between data and nodes can be decoupled ,key adopt hash The calculation only requires which Hashi slot the relationship is mapped to , Then find the node through the mapping table between the hash slot and the node , It's equivalent to consuming very little cpu resources , Let the data be evenly distributed , Make the mapping table smaller , It is helpful for the client and server to save , Exchanging information between nodes is lighter .
When the cluster is expanding 、 Shrinkage capacity 、 When data is balanced , Operations between nodes, such as data migration , All operations are based on Hash slots , Simplify node expansion 、 The difficulty of volume reduction , Easy to maintain and manage the cluster .

原网站

版权声明
本文为[Tom Kong]所创，转载请带上原文链接，感谢
https://yzsam.com/2022/02/202202150325488821.html