Microservice link risk analysis
2022-06-30 21:45:00 【51CTO】
Link risk analysis starts from the historical data of link communication, analyzes the risks currently present on each link, and reduces hidden dangers in link communication so as to improve the overall stability of the system. It can answer many questions, such as whether timeouts are set reasonably, whether retry counts are set reasonably, whether a service's SLA targets are set reasonably, and whether the strong and weak dependencies of a service match expectations. A large proportion of failures related to service communication are caused by link risks. Link risk analysis lets these be found and resolved in advance, before they turn into failures.
1. Timeout and SLA risk
A client-to-server timeout configuration that does not match actual access behavior is a very common link risk. If the timeout configured by the upstream service is too small, some requests that would have returned normally time out instead, which hurts the service SLA and the normal service experience. If the timeout is too large, the upstream service waits too long when the downstream service fails, and in serious cases this can bring down the whole system. The timeout setting is therefore directly tied to system stability: there should be a mechanism that guides how service timeouts are set and that promptly uncovers hidden timeout configuration problems in the online system.
There are two main kinds of timeout configuration risk. One is that the timeout does not match the actual response time. The other is that the upstream and downstream timeout settings do not match each other. Take three services A, B and C, where service A calls service B and service B calls service C: in real systems it is common to find that the timeout for A calling B is shorter than the timeout for B calling C.
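Both kinds of timeout risk can be checked mechanically once the configured timeouts and the observed latencies are available per link. The following is a minimal sketch, not a definitive implementation: the input shapes, the function names, the use of p99 latency as the "actual time", and the `low`/`high` multipliers are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (caller, callee); hypothetical representation of a link


def find_timeout_mismatches(
    timeouts_ms: Dict[Edge, float],
    p99_ms: Dict[Edge, float],
    low: float = 1.0,   # assumed: a timeout below the p99 latency is too small
    high: float = 5.0,  # assumed: a timeout above 5x the p99 latency is too large
) -> List[Tuple[Edge, str]]:
    """Compare each configured timeout with the observed p99 latency."""
    findings = []
    for edge, timeout in sorted(timeouts_ms.items()):
        p99 = p99_ms.get(edge)
        if p99 is None:
            continue  # no historical data for this link, nothing to compare
        if timeout < low * p99:
            findings.append((edge, "too_small"))
        elif timeout > high * p99:
            findings.append((edge, "too_large"))
    return findings


def find_timeout_inversions(timeouts_ms: Dict[Edge, float]) -> List[Tuple[Edge, Edge]]:
    """Find A->B, B->C pairs where the upstream timeout is shorter than the downstream one."""
    inversions = []
    for (a, b), upstream in sorted(timeouts_ms.items()):
        for (b2, c), downstream in sorted(timeouts_ms.items()):
            if b2 == b and upstream < downstream:
                inversions.append(((a, b), (b, c)))
    return inversions


timeouts = {("A", "B"): 200, ("B", "C"): 500}
latency = {("A", "B"): 300, ("B", "C"): 50}
mismatches = find_timeout_mismatches(timeouts, latency)
inversions = find_timeout_inversions(timeouts)
```

In this sample, A calling B is flagged as too small (200 ms against a 300 ms p99), B calling C as too large, and the pair is reported as an inversion because A gives up before B's own call to C can time out.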
2. Strong and weak dependency and retry risk
In communication between microservices, if a failure of the link causes the entire request to fail, the relationship between the two microservices is generally called a strong dependency; otherwise it is called a weak dependency. Based on the strength of a dependency, we can apply measures such as degradation and circuit breaking.
Strong and weak dependency risk means that the actual relationship on a link does not match the expected one. For example, the link from service A to service B is expected to be a weak dependency, but as requirements iterate and the business logic changes, it may inadvertently become a strong dependency in practice. If we still rely on prior knowledge and treat it as a weak dependency, this is a major risk point. In particular, when the link fails and operations such as degradation are carried out on the assumption of a weak dependency, the result can be disastrous and make the whole system unavailable. There therefore needs to be a mechanism that regularly checks the current link relationships for such risk points.
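One way to detect this drift regularly is to infer the observed dependency strength from historical traces and compare it with the declared one. The sketch below is illustrative only: the sample format (link, whether the call failed, whether the whole request failed), the `strong_ratio` threshold, and the function name are assumptions, not something prescribed by the article.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str]  # (caller, callee)
Sample = Tuple[Edge, bool, bool]  # (link, call_failed, request_failed); assumed trace format


def find_dependency_drift(
    declared: Dict[Edge, str],
    samples: Iterable[Sample],
    strong_ratio: float = 0.9,  # assumed threshold for calling a link "strong"
) -> Dict[Edge, Tuple[str, str]]:
    """Return links whose observed strength differs from the declared one."""
    failed_calls = defaultdict(int)
    failed_requests = defaultdict(int)
    for edge, call_failed, request_failed in samples:
        if call_failed:
            failed_calls[edge] += 1
            if request_failed:
                failed_requests[edge] += 1

    drift = {}
    for edge, count in failed_calls.items():
        # If the request fails nearly every time the call fails, the link behaves as strong.
        ratio = failed_requests[edge] / count
        observed = "strong" if ratio >= strong_ratio else "weak"
        expected = declared.get(edge)
        if expected is not None and expected != observed:
            drift[edge] = (expected, observed)
    return drift


declared = {("A", "B"): "weak", ("A", "C"): "weak"}
samples = (
    [(("A", "B"), True, True)] * 10      # every failed A->B call failed the request
    + [(("A", "C"), True, False)] * 10   # failed A->C calls were tolerated
    + [(("A", "B"), False, False)]       # successful calls carry no signal here
)
drift = find_dependency_drift(declared, samples)
```

Here the A to B link is reported as declared weak but observed strong, which is exactly the case where degrading B would take the request path down. A link that has never failed produces no evidence, so fault injection would be needed to classify it.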
3. Cluster and topology risk
Cluster and topology risks are a major source of findings in risk analysis. For example, some machines in an online cluster are temporarily taken offline for repair, but are not put back into service after the repair, so they sit idle for nothing. Or service A originally calls service B within the same machine room, the calls are temporarily switched to service B in another machine room because of a failure or a traffic-switching drill, and they are never switched back, so service A keeps calling service B across machine rooms, which affects user experience and system stability. Or service S is deployed online without taking physical location into account, so too many of its nodes sit behind the same switch: when that switch fails, multiple nodes of the service become unavailable at the same time, and the shortage of available nodes leads to a service avalanche.
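The switch-concentration case can be detected from deployment metadata alone. The following is a rough sketch under assumptions: placements are given as (service, switch) pairs, one per node, and the `max_share` limit of 50% is an arbitrary illustrative value.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def find_switch_concentration(
    placements: Iterable[Tuple[str, str]],  # (service, switch) per node; assumed input shape
    max_share: float = 0.5,                 # assumed limit on the share of nodes per switch
) -> Dict[str, Tuple[str, int, int]]:
    """Return services with too many nodes behind a single switch."""
    per_service = defaultdict(Counter)
    for service, switch in placements:
        per_service[service][switch] += 1

    risky = {}
    for service, counts in per_service.items():
        total = sum(counts.values())
        switch, nodes = counts.most_common(1)[0]
        # A single-node service is a different risk and is not reported here.
        if total > 1 and nodes / total > max_share:
            risky[service] = (switch, nodes, total)
    return risky


placements = [
    ("S", "sw1"), ("S", "sw1"), ("S", "sw1"), ("S", "sw2"),
    ("T", "sw1"), ("T", "sw2"),
]
risky = find_switch_concentration(placements)
```

Service S is flagged because three of its four nodes share `sw1`, while service T is evenly spread. The same grouping idea applies to the cross-machine-room case by comparing the caller's and callee's room on each link.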
4. Link call risk
Real-time link topology data is a treasure trove: many risks at the link call level can be uncovered from it over time. For example, a service that calls more than 20 downstream services has an excessive fan-out, which does not fit microservice design principles well, and it is worth considering whether the service should be split further.
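The fan-out check is a direct count over the topology. A minimal sketch, assuming the topology is available as a mapping from each service to the set of services it calls (the mapping shape and the function name are illustrative):

```python
from typing import Dict, Set


def find_high_fan_out(graph: Dict[str, Set[str]], limit: int = 20) -> Dict[str, int]:
    """Return services that call more than `limit` distinct downstream services."""
    return {
        service: len(downstream)
        for service, downstream in graph.items()
        if len(downstream) > limit
    }


graph = {
    "gateway": {f"svc{i}" for i in range(21)},  # 21 distinct downstream services
    "svc0": {"db"},
}
high_fan_out = find_high_fan_out(graph)
```

The limit of 20 comes from the example in the text; in practice it would be a tunable threshold.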
In a microservice architecture, a single request whose link is especially long is likely to have performance problems. It is therefore useful to list the top 10 longest links in the global link topology, or the links whose depth exceeds 6, and feed them back to the business teams to see whether architectural adjustments are needed.
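Long links can be listed by walking the topology from its entry services. The sketch below makes several assumptions: the topology is a mapping from service to its callees, entry services are those that nobody calls, and depth is counted as the number of calls (hops) in a chain. A topology that consists only of cycles has no entry service and yields no chains here.

```python
from typing import Dict, List, Sequence, Tuple

Graph = Dict[str, Sequence[str]]


def call_chains(graph: Graph) -> List[List[str]]:
    """Enumerate chains from entry services down to services with no further calls."""
    callees = {callee for targets in graph.values() for callee in targets}
    roots = sorted(service for service in graph if service not in callees)
    chains: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        # Skip services already on the path so that a loop cannot recurse forever.
        nexts = [n for n in sorted(graph.get(node, ())) if n not in path]
        if not nexts:
            chains.append(path)
            return
        for nxt in nexts:
            walk(nxt, path + [nxt])

    for root in roots:
        walk(root, [root])
    return chains


def longest_chains(
    graph: Graph, top_n: int = 10, max_depth: int = 6
) -> Tuple[List[List[str]], List[List[str]]]:
    """Return the top_n longest chains and all chains deeper than max_depth hops."""
    chains = sorted(call_chains(graph), key=len, reverse=True)
    too_deep = [chain for chain in chains if len(chain) - 1 > max_depth]
    return chains[:top_n], too_deep


graph = {
    "a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"],
    "e": ["f"], "f": ["g"], "g": ["h"],
    "a2": ["b2"],
}
top, too_deep = longest_chains(graph, top_n=1)
```

Enumerating every chain is exponential in the worst case, so on a large topology a longest-path computation over the acyclic part would be a better fit than this exhaustive walk.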
When splitting and designing microservices, it is not recommended for two microservices to depend on each other. The link topology can be used to find out whether any links currently form a loop. If a loop exists, the services involved depend on each other, and such risks can be fed back to the business teams for rectification.
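Loop detection on the link topology is a standard depth-first search with three node states. The following is a minimal sketch, assuming the topology is a mapping from service to its callees; it returns the first loop it finds rather than all of them.

```python
from typing import Dict, List, Optional, Sequence


def find_cycle(graph: Dict[str, Sequence[str]]) -> Optional[List[str]]:
    """Return one dependency loop as a list of services, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color: Dict[str, int] = {}
    stack: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        stack.append(node)
        for nxt in sorted(graph.get(node, ())):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path, so the path from nxt closes a loop.
                return stack[stack.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in sorted(graph):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


loop = find_cycle({"A": ["B"], "B": ["C"], "C": ["A"]})
no_loop = find_cycle({"A": ["B"], "B": ["C"]})
```

The recursive form is fine for topologies of moderate depth; a very deep topology would need an explicit stack to stay within Python's recursion limit.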
Link risk analysis is the process of discovering risks, abstracting them, and establishing automatic detection mechanisms. In essence, it is a systematic effort to manage stability risk at a fine-grained level, and it needs long-term, sustained investment.
Discovering risks is the first step of link risk analysis. To keep discovering new risks in the system, it is suggested to combine risk analysis with stability anti-patterns: ① based on the major faults that have occurred in the system, together with typical failure-prone problems accumulated earlier, sort out the stability anti-patterns, that is, patterns that are easy to fall into in stability practice and that should not appear; ② determine whether these anti-patterns can be detected in an automated way.
At the same time, to make it easier to detect new risks, a complete risk analysis framework can be established. It includes the current risk status, a closed loop for risk remediation, risk reports, an automatic risk notification mechanism, and so on. A new risk analysis is then developed directly on top of the framework, which amounts to adding a plug-in and can greatly improve the efficiency of risk analysis.
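The plug-in idea can be sketched as a small registry: each risk check is a function registered under a name, and the framework runs all of them, collects findings into a report, and notifies. Everything here is hypothetical, including the `Finding` type, the `risk_check` decorator, the snapshot layout, and the `notify` callback; it only illustrates the shape such a framework could take.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    check: str   # name of the check that produced the finding
    target: str  # service or link the finding refers to
    detail: str  # human-readable description for the report


# Registry of checks; adding a new risk analysis means adding one function.
_CHECKS: Dict[str, Callable[[dict], List[Finding]]] = {}


def risk_check(name: str):
    """Decorator that registers a function as a risk check plug-in."""
    def register(func: Callable[[dict], List[Finding]]):
        _CHECKS[name] = func
        return func
    return register


@risk_check("fan_out")
def fan_out_check(snapshot: dict) -> List[Finding]:
    limit = snapshot.get("fan_out_limit", 20)
    return [
        Finding("fan_out", service, f"{len(deps)} downstream services")
        for service, deps in sorted(snapshot["graph"].items())
        if len(deps) > limit
    ]


def run_all(snapshot: dict, notify: Callable[[str], None] = print) -> List[Finding]:
    """Run every registered check, send a notification per finding, return the report."""
    report: List[Finding] = []
    for name in sorted(_CHECKS):
        report.extend(_CHECKS[name](snapshot))
    for finding in report:
        notify(f"[{finding.check}] {finding.target}: {finding.detail}")
    return report


sent: List[str] = []
report = run_all({"graph": {"A": ["B", "C", "D"]}, "fan_out_limit": 2}, notify=sent.append)
```

Keeping the notification channel as a plain callback lets the same checks feed a report, a chat alert, or a ticketing system for the remediation loop without changing the checks themselves.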