UPYUN's Redis Improvement Path
2022-06-29 16:30:00 【UPYUN】
As the first cloud service provider in China to offer programmable CDN services, UPYUN lets customers write custom rules, backed by the scale and performance of its CDN edge network, to cover common business scenarios. These rules, such as edge redirection, request rate limiting, custom error pages, hotlink-protection access control, and HTTP header management, are source data that must be synchronized quickly to the node servers at the edge. After comparing several options, UPYUN began using Redis 2.8 as its data synchronization solution in early 2014.
The initial architecture is as follows:

Before going further into the Redis improvements, we need to talk about technical debt. Technical debt arises when developers, in order to speed up software development, compromise where the best solution should have been adopted and switch to one that accelerates development in the short term. That choice brings extra development burden later. It pays off now but has to be repaid in the future, just like a debt, hence the name.
The scheme described above planted exactly this kind of technical debt. It served us well for several years, but the shortcomings of the architecture are obvious. As the number of edge servers and the volume of synchronized data grew, and as server hardware aged and failed, it caused many problems, for example:
For security reasons, the traffic between Redis instances must be encrypted, but the Redis versions in use do not support SSL encryption themselves. All edge servers therefore have to connect through stunnel sockets on transit servers. In practice, however, stunnel's performance became a bottleneck and overloaded the servers' CPUs.
Redis master-replica replication uses long-lived connections and tries to keep synchronizing from the same source, so the early edge servers obtained the source servers' IP addresses through DNS resolution. This is simple to implement and deploy, but DNS has no knowledge of the back-end servers' processing capacity, so long-lived connections were unevenly distributed across machines. DNS also cannot react automatically when a back-end service fails. Even when the DNS records are switched promptly, the gap before the TTL expires leaves different nodes with different data, so only stale data is available for emergency handling.
For historical reasons, most edge Redis instances run low 2.x versions, which can only perform full synchronization via SYNC. Any abnormality on a transit server or the master server therefore triggers a network-wide avalanche: synchronization blocks, and metadata cannot be synchronized to the edge quickly.
Because the early Redis deployment could only use master-replica mode, and the move to Sentinel or Cluster was never carried out, the master server is now a single point of risk that can easily cause a major failure at the source.
The problems and side effects of those earlier compromises mean we now have to spend extra time and effort refactoring to bring the architecture to its best form.
We divided the transformation into several steps:
Strengthen SSL security by upgrading to the latest stable OpenSSL release whenever possible
SSL is probably the Internet security protocol we encounter most often: any website address that starts with "https://" uses it. OpenSSL is an open source SSL implementation that provides strong encryption for network communication and is widely used in all kinds of network applications. Despite its importance, the project struggled for years with insufficient funding and manpower, and most of the work was done by a small number of hackers, enthusiasts, and volunteers. Fortunately, it is now funded by the Linux Foundation. New vulnerabilities are still being disclosed, however, and need timely attention and follow-up.
See the latest OpenSSL vulnerability severity report:

Because the RC4 algorithm has too many security vulnerabilities, we recommend disabling it at compile time.
Use the latest stunnel version for better performance, built on a secure OpenSSL library, with support for TLSv1.2 and above
As the red box in the figure below shows, stunnel performs best with certain algorithms, so we recommend listing those first in the configuration file:
[Figure: stunnel performance by cipher algorithm, https://upload-images.jianshu.io/upload_images/27822061-e7ec4e3c99217def.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240]
./configure --prefix=/opt/stunnel --with-ssl=/opt/openssl
The recommended configuration includes the following optimization options:
verify = 3
sslVersionMax = TLSv1.3
sslVersionMin = TLSv1.2
options = NO_SSLv2
options = NO_SSLv3
.......
ciphers = ECDHE-RSA-AES128-GCM-SHA256:ECDHE:ECDH:AES:HIGH:!NULL:!aNULL:!MD5:!ADH:!RC4:!DH:!DHE
You can check and verify the HTTPS trust rating on the TrustAsia website.
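To make the effect of these options more concrete, here is a minimal sketch that applies the same protocol range and cipher string to a Python `ssl` context and checks that no RC4, MD5, or NULL suites remain enabled. It is not part of the original setup: the function names are hypothetical, and Python's `ssl` module is only used as a convenient way to ask the local OpenSSL what a cipher string expands to.

```python
import ssl

# Cipher string copied from the stunnel `ciphers` option above.
CIPHERS = ("ECDHE-RSA-AES128-GCM-SHA256:ECDHE:ECDH:AES:HIGH:"
           "!NULL:!aNULL:!MD5:!ADH:!RC4:!DH:!DHE")


def build_client_context() -> ssl.SSLContext:
    """Build a client-side context that mirrors the stunnel policy (hypothetical helper)."""
    # PROTOCOL_TLS_CLIENT verifies the peer certificate and hostname by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # counterpart of sslVersionMin
    ctx.maximum_version = ssl.TLSVersion.TLSv1_3  # counterpart of sslVersionMax
    ctx.set_ciphers(CIPHERS)
    return ctx


def weak_cipher_names(ctx: ssl.SSLContext) -> list:
    """Return the enabled cipher names that contain RC4, MD5, or NULL."""
    banned = ("RC4", "MD5", "NULL")
    return [c["name"] for c in ctx.get_ciphers()
            if any(tag in c["name"] for tag in banned)]
```

Calling `weak_cipher_names(build_client_context())` on a machine with a current OpenSSL should return an empty list. The exact set of suites that remain enabled depends on the local OpenSSL build.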

Compile the latest stable Redis 6.2.x, which is powerful and feature-rich without requiring a recent GCC
Compared with Redis 6.2, Redis 7.0 is certainly somewhat more powerful. Redis 7.0 includes incremental improvements in almost every area, most notably Redis Functions, ACLv2, command introspection, and Sharded Pub/Sub. It adds nearly 50 new commands and options to support this evolution and extend Redis's existing features.
Even so, considering full compatibility with our existing Redis code and the stability of the production environment, we chose Redis 6.2. It is also powerful, has plenty of advantages, and better meets the requirements of our production environment, for example:
Multithreaded I/O (Threaded I/O)
Many new module APIs
A better expire cycle
SSL support
ACLs for access control
RESP3 protocol
Client-side caching
Diskless replication & PSYNC2
redis-benchmark supports Cluster
redis-cli improvements and rewritten systemd support
Redis Cluster proxy released together with Redis 6 (but in a different repo)
Faster RDB loading
SRANDMEMBER and similar commands have better distribution
STRALGO command
Redis commands with timeouts are easier to use
Let's focus on PSYNC2, one of the key features behind our architecture upgrade.
In real-world Redis Cluster operations, instance maintenance restarts and master failovers (such as cluster failover) are common, for example for instance upgrades, rename-command changes, and releasing instance memory fragmentation. Before Redis 4.0, this kind of maintenance triggered a full resynchronization, which caused some harm to performance-sensitive services. PSYNC2 lets Redis use partial resynchronization even when a replica restarts or the master fails over.
Download the source code and compile it directly:
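The decision a master makes when a replica asks for a partial resynchronization can be sketched as follows. This is a simplified model written for illustration, not Redis's actual code: the class is hypothetical, the field names loosely follow Redis's replication state, and the real server handles many more cases.

```python
from dataclasses import dataclass


@dataclass
class MasterState:
    """Simplified replication state of a master (illustrative model only)."""
    replid: str                # current replication ID
    replid2: str               # previous replication ID, kept after a failover
    second_replid_offset: int  # highest offset still valid under replid2
    backlog_off: int           # first offset still held in the backlog
    backlog_histlen: int       # number of bytes currently in the backlog

    def can_partial_resync(self, replica_replid: str, replica_offset: int) -> bool:
        """Decide whether a replica may continue from replica_offset."""
        # The replica must share our history: either the current ID, or the
        # previous ID up to the point where the histories diverged.
        same_history = replica_replid == self.replid or (
            replica_replid == self.replid2
            and replica_offset <= self.second_replid_offset
        )
        # The requested offset must still be covered by the backlog.
        in_backlog = (
            self.backlog_off
            <= replica_offset
            <= self.backlog_off + self.backlog_histlen
        )
        return same_history and in_backlog
```

The `replid2` branch is what distinguishes PSYNC2 from the older PSYNC in this model: after a failover, a replica that still carries the old master's ID can continue incrementally as long as its offset does not exceed `second_replid_offset`. When the check fails, the replica falls back to a full resynchronization.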
# make BUILD_TLS=no
In the recommended configuration, add the following options to improve performance:
io-threads-do-reads yes
io-threads 8
aof-use-rdb-preamble yes
During testing we found several problems with Redis + TLS:
With TLS enabled, Redis performance dropped by 30%.
Redis depends heavily on OpenSSL. Given that high-risk OpenSSL vulnerabilities keep appearing, every fix would require recompiling Redis, which makes operations and maintenance updates too costly.
Upgrading Redis requires resynchronizing data, which increases the probability of failures or production downtime.
We therefore decided to use a third-party program, stunnel, for security hardening. It is easy to upgrade and patch without affecting the back-end connections, which protects the continuity, stability, and reliability of Redis.
Host TLS on APISIX and replace DNS round-robin with TCP consistent-hash load balancing, with significant results
APISIX is used as a TCP proxy. This part works as soon as it is configured and has little to do with the Redis transformation, so we will skip the details and go straight to the connection statistics after the change. The APISIX connection counts in production show that the load is evenly distributed across the back ends, and restarted edge servers also benefit from PSYNC2's fast incremental synchronization.
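The idea behind consistent-hash balancing can be illustrated with a small hash ring. This is a generic sketch of the technique, not APISIX's implementation: the class name, the virtual-node count, the choice of SHA-256, and the addresses are all assumptions made for the example.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Minimal hash ring with virtual nodes (illustrative, not APISIX code)."""

    def __init__(self, nodes, vnodes=100):
        # Each back end is placed on the ring at `vnodes` pseudo-random points.
        ring = []
        for node in nodes:
            for i in range(vnodes):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._hashes = [h for h, _ in ring]

    @staticmethod
    def _hash(key: str) -> int:
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def lookup(self, key: str) -> str:
        """Return the back end owning the first ring point after hash(key)."""
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

With a ring like this, an edge server keyed by its address keeps landing on the same back end, which suits long-lived replication connections. When a back end is removed, only the clients that were mapped to it move elsewhere, unlike DNS round-robin where the distribution is outside the balancer's control.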

Use redis-shake for customized data synchronization
During the architecture improvement we also came across redis-shake, a Redis data synchronization tool open-sourced by the Alibaba Cloud Redis & MongoDB team. It supports four functions: decode, restore, dump, and sync. Here we mainly introduce sync:
restore: restores an RDB file to the destination Redis database.
dump: backs up the full data set of the source Redis to an RDB file.
decode: reads an RDB file and stores the parsed result in JSON format.
sync: synchronizes data between the source and destination Redis, supporting both full and incremental migration.
rump: synchronizes data between the source and destination Redis, supporting full migration only. It migrates with the scan and restore commands and supports migration between the Redis versions of different cloud vendors.
We used to run a Redis with modified source code that synchronized only the desired keyspace. It worked well, but it would have to be ported to the new code base and recompiled, and the original maintainer can no longer be found. This is a common problem for projects left unmaintained for a long time. With an open source tool like redis-shake, a little configuration is enough to get the functionality we want:
- filter.db.whitelist / blacklist
- filter.key.whitelist / blacklist
- filter.command.whitelist / blacklist
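The kind of key filtering these options provide can be sketched as below. This illustrates the whitelist/blacklist idea only, not redis-shake's code: the function name and the key names are hypothetical, and prefix matching with mutually exclusive lists is an assumption about how such a filter typically behaves.

```python
def make_key_filter(whitelist=(), blacklist=()):
    """Return a predicate that decides whether a key should be synchronized.

    Assumption: entries are key prefixes, and only one of the two lists
    may be set at a time.
    """
    if whitelist and blacklist:
        raise ValueError("set either whitelist or blacklist, not both")
    allow = tuple(whitelist)
    deny = tuple(blacklist)

    def allowed(key: str) -> bool:
        if allow:
            return key.startswith(allow)   # pass only listed prefixes
        if deny:
            return not key.startswith(deny)  # pass everything except listed prefixes
        return True                        # no filter configured

    return allowed


# Hypothetical usage: synchronize only rule and rate-limit keys to the edge.
sync_to_edge = make_key_filter(whitelist=["rule:", "limit:"])
```

Here `sync_to_edge("rule:42")` is true and `sync_to_edge("session:1")` is false, which is the behavior the modified Redis used to provide through source changes.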
Current architecture and future outlook

In the current architecture, we kept the original three-tier design and split and strengthened each tier:
DNS layer: resolves to a VIP. The VIP uses the BGP/OSPF dynamic routing protocols to map to a cluster of servers behind it.
Load balancing layer: uses APISIX with TLS 1.2+ and TCP consistent-hash connections to balance the Redis master-replica connections and handle failover.
Edge CDN nodes: benefit from the incremental PSYNC synchronization of the newer Redis version, plus stunnel with TLS 1.2 for encrypted transmission.
In the next stage, we will continue by moving the data center's Redis master to Redis Sentinel mode. (Because Sentinel mode requires compatibility changes in the application code, this was not started in the first stage; stability of the production environment comes first.)
References:
How to check a website's TLS version: https://wentao.org/post/2020-11-29-ssl-version-check/
Redis replication enhancement PSYNC2: https://www.modb.pro/db/79478
Redis architecture modes explained: https://www.cnblogs.com/mrhelloworld/p/redis-architecture.html