High-availability architecture design for handling carrier network failures
2022-06-24 04:04:00 【User 3971906】
Background
Some Tencent Cloud customers use Tencent Cloud products to quickly build a business-level, intra-city active-active architecture across different availability zones in the same region (see the figure below). This provides high availability for each single product and single link, and also provides disaster recovery when one availability zone in the city becomes abnormal.
The following is the architecture diagram of one customer. The parts with a green background depend heavily on communication with the public network. In the following two scenarios, all users on the affected carrier line cannot access the customer's app, and the customer's business processes cannot access public network interfaces (for example, a payment interface), which has a large impact on the business:
- The carrier's metropolitan area network (MAN) fails, while the carrier's interconnection with other carriers is still normal.
- A single line between a carrier and Tencent Cloud (for example, the China Telecom line) fails.
Problem analysis
As the architecture diagram above shows, communication among the CLB, the business layer, and the data layer all goes over the intranet, so this part is not affected by the public network.
Two links are affected by public network connectivity quality: inbound traffic from users to the access-layer public CLB, and outbound traffic from business services to public network resources through the NAT gateway.
Therefore, as long as inbound and outbound traffic can be restored quickly when a fault occurs, both failure scenarios can be effectively mitigated.
This article mainly discusses how to handle impaired outbound traffic.
Solution
The usual off-cloud solution is to transform the business to support remote disaster recovery or a "two sites, three centers" model. This approach has several problems:
- The cost of business transformation is huge.
- The operations and maintenance system changes significantly.
- Switching is costly and time-consuming.
On the cloud, Tencent Cloud's Cloud Connect Network (CCN), which provides intranet connectivity between VPCs in different regions, can solve this problem very quickly and at low cost.
Outbound traffic
Use an intranet CLB plus Nginx (deployed on CVMs that have public IPs) to build a forward proxy cluster in another region. To keep the change non-intrusive for business processes, we chose a layer-4 transparent proxy.
When a fault occurs, override intranet DNS resolution for the public domain names that the business needs to access, so that they resolve to the intranet address of the CLB in the other region. This provides remote disaster recovery for the outbound traffic of intranet services.
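The resolution override described above can be sketched in Python for the simplest case, a single host's `/etc/hosts` content. This is a minimal sketch under stated assumptions, not the author's tooling: the function name `apply_overrides` is hypothetical, and a production setup would more likely change records on an intranet DNS service than edit hosts files on each machine. The domain and CLB address are the ones used in this article's test.

```python
def apply_overrides(hosts_text: str, overrides: dict) -> str:
    """Return hosts-file text in which every domain in `overrides`
    (domain -> IP) resolves to the given IP. Hypothetical helper."""
    kept = []
    for line in hosts_text.splitlines():
        # Ignore trailing comments when looking for host names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and any(name in overrides for name in fields[1:]):
            # Drop overridden names from existing entries so the new mapping wins.
            remaining = [name for name in fields[1:] if name not in overrides]
            if remaining:
                kept.append(" ".join([fields[0], *remaining]))
            continue
        kept.append(line)
    # Append one line per overridden domain, pointing at the proxy CLB.
    kept.extend(f"{ip} {domain}" for domain, ip in overrides.items())
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    original = "127.0.0.1 localhost.localdomain localhost\n"
    print(apply_overrides(original, {"zhidao.baidu.com": "10.11.0.17"}), end="")
```

Applying the function twice with the same mapping yields the same text, so a failover script can run it repeatedly without piling up duplicate entries.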
The architecture of the final scheme is as follows :
For an introduction to Cloud Connect Network, see: https://cloud.tencent.com/document/product/877/18675
Switching actions when a failure occurs
- Use probing services such as Tingyun (听云) and cloud dial testing (云拨测) to monitor the connectivity quality of the ingress CLB and the NAT gateway. When abnormal quality is detected, trigger the following actions.
- For affected users in Shanghai, use DNSPod resolution to direct the affected domain names to a CLB in another region; that CLB serves requests by forwarding them back to the Shanghai business over Cloud Connect Network. See the green line in the figure above.
- For business services that access public network resources, change intranet DNS resolution so that the traffic goes to the proxy cluster. See the red line in the figure above.
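The switching logic in the list above can be sketched as a small decision function. This is only an illustration under assumptions: the `ProbeStats` type, the target names, the action names, and the 90% success-rate threshold are all hypothetical, and the actual DNS changes (DNSPod for inbound traffic, intranet DNS for outbound traffic) are left to whatever tooling the operator uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProbeStats:
    target: str   # "ingress_clb" or "nat_gateway" (hypothetical names)
    success: int  # number of successful probes in the window
    total: int    # number of probes sent in the window


def plan_actions(stats: List[ProbeStats], min_success_rate: float = 0.9) -> List[str]:
    """Map degraded probe targets to the switching actions from the list above."""
    actions = []
    for s in stats:
        if s.total == 0:
            continue  # no data for this target: do not switch blindly
        if s.success / s.total >= min_success_rate:
            continue  # link is healthy
        if s.target == "ingress_clb":
            # Inbound traffic: point the public domain at the CLB in the other region.
            actions.append("switch_public_dns_to_remote_clb")
        elif s.target == "nat_gateway":
            # Outbound traffic: point intranet DNS at the forward proxy cluster.
            actions.append("switch_intranet_dns_to_proxy_cluster")
    return actions


if __name__ == "__main__":
    window = [ProbeStats("ingress_clb", 10, 10), ProbeStats("nat_gateway", 3, 10)]
    print(plan_actions(window))
```

Keeping the decision separate from the DNS calls makes it easy to review the threshold and to dry-run the plan before any record is changed.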
Considerations
Ingress and egress convergence
This scheme has a prerequisite: the business must first be transformed so that its egress and ingress are converged (see the first architecture diagram). In addition, it is recommended to keep a registry of the core public domain names that business services access, and to prepare the intranet DNS resolution plan for the forward proxy side in advance.
IP selection
It is recommended that the EIP type of the forward proxy cluster differ from the EIP type of the NAT gateway in the primary region. That is, if the primary region's EIP is a BGP IP, choose a tri-carrier (三网) IP for disaster recovery, to achieve heterogeneous disaster recovery.
If both sides use the same EIP type, the scheme above fails when the IP link on the Tencent Cloud side fails.
Access latency
If a tri-carrier IP is selected, the quality of some access links will be affected. See the green background section in the figure below:
Latency notes:
- The "within 50 ms" figure in the diagram is not a committed value. It is a test value for the Guangzhou-to-Beijing scenario (the farthest distance within China). For specific values, rely on the feedback from your account manager.
- If the business is latency-sensitive, consider using Shanghai and Nanjing as the two regions, which reduces latency considerably.
Disaster recovery component management
In the two scenarios described above, the DNS server used by the nginx forward proxy may also be affected. It is therefore recommended to choose a DNS server of the same line type as nginx's public IP.
If nginx's public IP is a China Telecom IP rather than a Tencent Cloud BGP IP, the resolver directive in the nginx configuration file can be set to 114.114.114.114.
Because the proxy service configuration is simple and rarely changes, the management cost here is low.
Implementation example
Test resources
| Instance | Purpose | Region / availability zone | ID | IP |
|---|---|---|---|---|
| Cloud Connect Network | Multi-VPC interconnection | - | ccn-j622dwcj | - |
| VPC | Business availability zone | Shanghai | vpc-hbeixq6k | 10.100.0.0/16 |
| VPC | Disaster recovery availability zone | Guangzhou | vpc-0xoknyzt | 10.11.0.0/22 |
| CLB | Forward proxy cluster CLB | Guangzhou intranet CLB | lb-mpv2j1ym | 10.11.0.17 |
| CVM | Forward proxy | Guangzhou Zone 3 | ins-4ufdkjk6 | 10.11.0.4 |
Set up Cloud Connect Network
Note: for a CLB to bind CVMs across regions, the CLB and the CVMs must be in the same Cloud Connect Network.
1. Create the Cloud Connect Network instance
2. The view after creation
Build a forward proxy cluster
Note: to simplify the configuration of the various business services, nginx is used to build a layer-4 transparent proxy, which is compatible with HTTPS requests.
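A layer-4 proxy can route HTTPS without decrypting it because the client sends the target host name in clear text in the Server Name Indication (SNI) extension of the TLS ClientHello; nginx's `ssl_preread` reads that value without terminating TLS. The Python sketch below illustrates the idea only and is not nginx's implementation: `extract_sni` and the test helper `build_client_hello` are hypothetical names, and the parser assumes the whole ClientHello arrives in a single, unfragmented TLS record.

```python
import struct
from typing import Optional


def build_client_hello(hostname: str) -> bytes:
    """Build a minimal ClientHello carrying only an SNI extension (test helper)."""
    name = hostname.encode("ascii")
    entry = b"\x00" + struct.pack("!H", len(name)) + name      # name_type 0 = host_name
    sni_data = struct.pack("!H", len(entry)) + entry           # server_name_list
    ext = struct.pack("!HH", 0, len(sni_data)) + sni_data      # extension type 0 = server_name
    body = (
        b"\x03\x03" + bytes(32)                # client_version + 32-byte random
        + b"\x00"                              # empty session_id
        + struct.pack("!H", 2) + b"\x13\x01"   # one cipher suite
        + b"\x01\x00"                          # one compression method (null)
        + struct.pack("!H", len(ext)) + ext    # extensions block
    )
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body  # type 1 = ClientHello
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake


def extract_sni(data: bytes) -> Optional[str]:
    """Return the SNI host name from a TLS ClientHello, or None if absent or malformed."""
    try:
        if data[0] != 0x16 or data[5] != 0x01:   # handshake record, ClientHello
            return None
        pos = 5 + 4                               # record header + handshake header
        pos += 2 + 32                             # client_version + random
        pos += 1 + data[pos]                      # session_id
        pos += 2 + int.from_bytes(data[pos:pos + 2], "big")  # cipher_suites
        pos += 1 + data[pos]                      # compression_methods
        end = pos + 2 + int.from_bytes(data[pos:pos + 2], "big")
        pos += 2
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack("!HH", data[pos:pos + 4])
            pos += 4
            if ext_type == 0:
                # Skip list length (2 bytes) and name_type (1 byte), then read the name.
                name_len = int.from_bytes(data[pos + 3:pos + 5], "big")
                return data[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None
    return None


if __name__ == "__main__":
    print(extract_sni(build_client_hello("zhidao.baidu.com")))
```

Because only the unencrypted handshake prefix is inspected, the business client still performs the TLS handshake end to end with the real server, which is why no certificate or client configuration change is needed on the business side.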
1. Log in to the proxy CVM 10.11.0.4 in Guangzhou Zone 3, install and configure nginx, and then start nginx.
    # rpm -q nginx
    nginx-1.20.1-1.el7.ngx.x86_64
    # cat /etc/nginx/nginx.conf
    user nginx;
    worker_processes auto;
    .......
    .......
    stream {
        log_format proxy '$remote_addr [$time_local] '
                         '$protocol $status $bytes_sent $bytes_received '
                         '$session_time "$upstream_addr" '
                         '"$upstream_bytes_sent" "$upstream_bytes_received" "$upstream_connect_time"';
        access_log /var/log/nginx/443.log proxy;
        open_log_file_cache off;
        resolver 114.114.114.114;
        server {
            listen 443;
            ssl_preread on;
            proxy_connect_timeout 5s;
            proxy_pass $ssl_preread_server_name:$server_port;
        }
    }
    # netstat -tunlp | grep 443
    tcp        0      0 0.0.0.0:443      0.0.0.0:*      LISTEN      10751/nginx: master
2. Add the CVM above to the Guangzhou forward proxy cluster CLB 10.11.0.17 (note that this is an intranet CLB).
3. On the Shanghai CVM 10.100.4.17, access the address directly, without any hosts override, so that the request goes out through NAT. The result is as follows:
    # cat /etc/hosts
    127.0.0.1 VM-4-17-tlinux VM-4-17-tlinux
    127.0.0.1 localhost.localdomain localhost
    127.0.0.1 localhost4.localdomain4 localhost4
    ::1 VM-4-17-tlinux VM-4-17-tlinux
    ::1 localhost.localdomain localhost
    ::1 localhost6.localdomain6 localhost6
    # curl "https://zhidao.baidu.com/daily/view?id=239227" > /tmp/nat.html
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100 65969    0 65969    0     0   141k      0 --:--:-- --:--:-- --:--:--  141k
4. On the Shanghai CVM 10.100.4.17, configure the domain name zhidao.baidu.com to resolve to the forward proxy cluster CLB IP. The result is as follows:
    # cat /etc/hosts
    127.0.0.1 VM-4-17-tlinux VM-4-17-tlinux
    127.0.0.1 localhost.localdomain localhost
    127.0.0.1 localhost4.localdomain4 localhost4
    ::1 VM-4-17-tlinux VM-4-17-tlinux
    ::1 localhost.localdomain localhost
    ::1 localhost6.localdomain6 localhost6
    10.11.0.17 zhidao.baidu.com
    # curl "https://zhidao.baidu.com/daily/view?id=239227" > /tmp/proxy.html
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100 65912    0 65912    0     0  83569      0 --:--:-- --:--:-- --:--:-- 83538
The results of steps 3 and 4 show that the same URL is retrieved successfully both through NAT and through the forward proxy cluster, which verifies the forward proxy cluster's capability.