Chaos engineering platform ChaosBlade-Box: a major new release
2022-07-02 13:45:00 【Alibaba cloud native】
Author: Mingshao
What is chaos engineering
System architecture has evolved from stand-alone to distributed, and now to cloud native. Its complexity keeps increasing, and locating problems becomes correspondingly harder. With faults liable to occur at any time, is there a way out of this dilemma?
Chaos engineering is the discipline of conducting experiments on distributed systems. By actively injecting faults, it identifies weaknesses in the system ahead of time, drives architectural improvement, and ultimately achieves business resilience, so that failures are avoided in the online production environment.
Take cloud native architecture as an example of why chaos engineering can solve problems in the system architecture. The principles of cloud native architecture can be mapped to the principles of chaos engineering. Consider the servitization principle: at its root it is about how to govern services, that is, how to determine the strong and weak dependencies between upstream and downstream services. With chaos engineering, you can narrow a request down to a specific machine, then to a specific application on that machine, continually minimizing the blast radius. By injecting faults between applications and checking whether the upstream and downstream services still behave normally, you can determine whether a dependency is strong or weak.
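The strong/weak dependency check described above can be sketched roughly as follows. This is only an illustrative model, not ChaosBlade code: `classify_dependencies`, `checkout`, and the service names are hypothetical, and the "fault injection" is simulated by passing the set of failed downstream services to the upstream.

```python
from typing import Callable, Dict, List, Set


def classify_dependencies(
    call_upstream: Callable[[Set[str]], bool],
    downstreams: List[str],
) -> Dict[str, str]:
    """Fail one downstream at a time and observe whether the upstream survives."""
    result = {}
    for name in downstreams:
        # Blast radius is kept to a single downstream service per experiment.
        upstream_ok = call_upstream({name})
        result[name] = "weak" if upstream_ok else "strong"
    return result


def checkout(failed: Set[str]) -> bool:
    """Hypothetical upstream service used only for this illustration."""
    if "inventory" in failed:
        return False  # no fallback: the request cannot be served
    return True  # losing "recommendation" is degraded gracefully


deps = classify_dependencies(checkout, ["inventory", "recommendation"])
print(deps)
```

In a real drill the `call_upstream` step would be replaced by an actual fault injection plus a health check of the upstream service.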
The goal of chaos engineering is a resilient architecture, which has two parts: resilient systems and resilient organizations. Resilient systems feature redundancy, scalability, immutable infrastructure, stateless applications, and avoidance of cascading failures. Resilient organizations feature efficient delivery, failure plans, and emergency response mechanisms. Even a highly resilient system can fail unexpectedly, so a resilient organization makes up for what the resilient system lacks; together, through chaos engineering, they build the ultimate resilient architecture.
Chaos engineering injects faults proactively to expose system weaknesses early, drive architectural improvement, and ultimately achieve business resilience. The business value of adopting it differs by role:
- Architects: helps them verify the fault tolerance of the architecture
- Development / operations: improves the efficiency of their emergency response to failures
- Testing: helps them expose online problems in advance and reduce the failure recurrence rate
- Product / design: improves the customer experience
How to implement chaos engineering
How can an enterprise or business implement chaos engineering? Is there a tool or platform that can help it adopt the practice quickly?
ChaosBlade is a chaos experiment execution tool that follows the chaos experiment model. It offers a rich set of scenarios and is simple to use. It supports multiple platforms and language environments, including Linux, Kubernetes, and Docker, and applications written in Java, NodeJS, C++, and Golang. It supports more than 200 scenarios and more than 3,000 parameters. However, it is a fault injection tool for the end side, so when it is rolled out across a business the following questions arise:
- How to visualize the fault injection process ?
- How to inject faults into multiple clusters or hosts at the same time ?
- How to get statistics on the drills overall?
- …
So on top of ChaosBlade a platform layer is also needed, to manage the chaos engineering execution tools and orchestrate drills.
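To make the multi-host and statistics questions concrete, here is a minimal sketch of what a platform layer adds on top of a per-host tool: fan the same injection out to several hosts at once and aggregate the outcome. The names `run_on_hosts` and `fake_inject` and the host names are assumptions made for illustration; they are not part of ChaosBlade or ChaosBlade-Box.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_on_hosts(hosts: List[str], inject: Callable[[str], bool]) -> Dict[str, int]:
    """Run the same injection on all hosts concurrently and summarize results."""
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        outcomes = list(pool.map(inject, hosts))
    succeeded = sum(outcomes)  # True counts as 1
    return {
        "total": len(hosts),
        "succeeded": succeeded,
        "failed": len(hosts) - succeeded,
    }


def fake_inject(host: str) -> bool:
    """Stand-in for a real per-host injection; fails for hosts ending in '-down'."""
    return not host.endswith("-down")


stats = run_on_hosts(["web-1", "web-2", "db-down"], fake_inject)
print(stats)
```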
ChaosBlade-Box is an open source, cloud native chaos engineering console for multiple clusters, multiple languages, and multiple environments.
The overall architecture of the open source platform and the injection tool consists mainly of the following modules:
- ChaosBlade-Box Console: the user interface for chaos experiments
- ChaosBlade-Box Server: the back-end service, mainly covering drill scenario orchestration and safety control, deployment of chaos engineering tools (ChaosBlade, LitmusChaos, …), probe management, and multi-dimensional experiments
- Agent: the probe; it mainly connects to the ChaosBlade-Box Server and maintains a heartbeat, reports K8s-related data, and serves as the channel for delivering drill commands
- ChaosBlade: deployed on the business host or inside the K8s cluster; the tool that performs drills on the end side
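The heartbeat between the Agent and the Server can be pictured with a small sketch like the one below. It is a generic heartbeat-tracking pattern under assumed names (`ProbeRegistry`, `offline_after`, `host-a`), not the actual ChaosBlade-Box implementation; timestamps are passed in explicitly so the example stays deterministic.

```python
from typing import Dict


class ProbeRegistry:
    """Server-side view of probes: a probe is online if it reported recently."""

    def __init__(self, offline_after: float) -> None:
        self.offline_after = offline_after  # seconds without heartbeat before "offline"
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, probe_id: str, now: float) -> None:
        # Called whenever a probe reports in.
        self._last_seen[probe_id] = now

    def is_online(self, probe_id: str, now: float) -> bool:
        last = self._last_seen.get(probe_id)
        return last is not None and now - last <= self.offline_after


registry = ProbeRegistry(offline_after=30.0)
registry.heartbeat("host-a", now=0.0)
print(registry.is_online("host-a", now=10.0))
print(registry.is_online("host-a", now=45.0))
```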
The new version of the ChaosBlade-Box platform is a cloud native chaos engineering platform for multiple clusters, multiple environments, and multiple languages. It supports switching between Chinese and English, and supports global namespaces, so that the same user can set up different global namespaces as needed, such as a test space, a sandbox space, and an online space. It provides automated tool deployment, which simplifies tool installation and improves execution efficiency. The platform supports probe installation and drills in different environments, such as hosts and Kubernetes; in a Kubernetes environment, drills are supported at the Node, Pod, and Container dimensions. In a Kubernetes environment the platform also automatically collects Pod-related data and manages it centrally under application management, which simplifies lookup: users no longer need to go to the cluster to find the Pod name or Container name of the application to be drilled. It also supports one-click migration to the enterprise edition, synchronizing drill data from the community edition to the enterprise edition on demand.
A drill on the new ChaosBlade-Box platform supports two kinds of process orchestration: sequential execution and stage execution. In sequential execution, multiple drill scenarios take effect one after another; in stage execution, multiple drill scenarios take effect at the same time. Several safety strategies ensure that a drill is recovered, such as manual stop and automatic stop. Automatic stop is configured by setting the timeout parameter when configuring the drill, so that even if the platform loses contact with the probe (Agent) and a manual stop is impossible, the fault is recovered automatically once the timeout is reached.
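The difference between the two orchestration modes, and the effect of the timeout, can be illustrated on a simulated timeline. This is a sketch under assumptions: `Scenario`, `plan_drill`, the scenario labels, and the idea of a single drill-wide timeout in seconds are all introduced here for illustration and may not match how the platform models drills internally.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Scenario:
    name: str
    duration: int  # seconds the fault is meant to stay active


def plan_drill(
    scenarios: List[Scenario], mode: str, timeout: int
) -> List[Tuple[str, int, int]]:
    """Return (name, start, end) per scenario; no fault outlives the timeout."""
    if mode not in ("sequential", "stage"):
        raise ValueError("mode must be 'sequential' or 'stage'")
    timeline = []
    clock = 0
    for s in scenarios:
        # Sequential: start when the previous scenario ends. Stage: all start at 0.
        start = clock if mode == "sequential" else 0
        if start >= timeout:
            break  # the drill was auto-stopped before this scenario could start
        end = min(start + s.duration, timeout)  # auto-stop recovers the fault
        timeline.append((s.name, start, end))
        clock = end
    return timeline


scenarios = [Scenario("cpu-fullload", 30), Scenario("network-delay", 60)]
print(plan_drill(scenarios, "sequential", timeout=50))
print(plan_drill(scenarios, "stage", timeout=50))
```

In both modes the second scenario is cut off at 50 seconds, which mirrors the point above: recovery does not depend on anyone being able to issue a manual stop.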
What are the advantages of the new version
Compared with the old version, the front-end interface is unified with the enterprise edition, which lowers the cost of switching between the two; Chinese/English switching is more complete; and global namespace switching is supported. The back end provides a smoother drill flow and more complete application management, strengthens control of the probe, and supports one-click migration to the enterprise edition. The probe itself is strengthened: it provides a more complete API, supports deployment in multiple environments so that it can serve as the drill channel in different environments, supports automatic installation and uninstallation, and collects and reports data to simplify drills.
Related links
Middleware Developer Conference (talk slides downloadable as PDF):
https://developer.aliyun.com/topic/middleware/developer/summit
MSE registration and configuration center Professional Edition: 10% off the first purchase. MSE cloud native gateway, prepaid, all specifications: 15% off.