Chaos engineering platform ChaosBlade-Box: major new release
2022-07-02 13:45:00 [Alibaba Cloud Native]
Author: Mingshao
What is chaos engineering
System architecture has evolved from single-machine deployments to distributed systems and now to cloud native architectures. Complexity keeps growing, and locating problems keeps getting harder. Faults can occur at any time; is there a way out of this dilemma?
Chaos engineering is the discipline of running experiments on distributed systems. By actively injecting faults, it identifies weaknesses in a system ahead of time, drives architectural improvements, and ultimately builds business resilience, so that failures are avoided in the production environment.

Take cloud native architecture as an example of why chaos engineering can address problems in a system architecture. The principles of cloud native architecture map onto the principles of chaos engineering. Consider the service-orientation principle: at its core it is about service governance, that is, determining whether the dependencies between upstream and downstream services are strong or weak. With chaos engineering, you can narrow a request down to a specific machine, then to a specific application on that machine, continually minimizing the blast radius. By injecting faults between applications and checking whether the upstream and downstream services still behave normally, you can determine whether a dependency is strong or weak.
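The strong/weak dependency check described above can be sketched as a toy simulation. Everything here is an assumption made for illustration: the `Service` class, the service names, and the rule that a call with a fallback survives a downstream fault. It is not how ChaosBlade itself decides anything; it only shows the reasoning of failing one dependency at a time and observing the upstream.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Service:
    """Toy upstream service; names and behavior are hypothetical."""
    name: str
    # downstream name -> True if the call has a fallback (degrades gracefully)
    downstreams: Dict[str, bool] = field(default_factory=dict)

    def handle_request(self, failed: Set[str]) -> bool:
        """Return True if a request still succeeds while `failed` downstreams are down."""
        for dep, has_fallback in self.downstreams.items():
            if dep in failed and not has_fallback:
                return False
        return True


def classify_dependencies(service: Service) -> Dict[str, str]:
    """Fail one downstream at a time (small blast radius) and observe the upstream."""
    result = {}
    for dep in service.downstreams:
        still_ok = service.handle_request(failed={dep})
        result[dep] = "weak" if still_ok else "strong"
    return result


order = Service("order", {"inventory": False, "recommendation": True})
print(classify_dependencies(order))
```

In this sketch the upstream keeps working when `recommendation` fails, so that dependency is weak, while a failure of `inventory` breaks the request, so that dependency is strong.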

The goal of chaos engineering is a resilient architecture, which has two parts: resilient systems and resilient organizations. Resilient systems feature redundancy, scalability, immutable infrastructure, stateless applications, and protection against cascading failures. Resilient organizations feature efficient delivery, failure contingency plans, and an emergency response mechanism. Even a highly resilient system can fail unexpectedly, so a resilient organization compensates for what the system lacks. Together, through chaos engineering, they form the final resilient architecture.

Because chaos engineering finds weaknesses by actively injecting faults, the business value it brings differs by role:
- Architects: helps verify the fault tolerance of the architecture
- Development / operations: improves the efficiency of emergency response to failures
- Testing: helps expose production problems early and reduces the rate of recurring failures
- Product / design: improves the customer experience

How to implement chaos engineering
How can an enterprise or a business team put chaos engineering into practice? Are there tools or platforms that help adopt it quickly?
ChaosBlade is a chaos experiment execution tool that follows the chaos experiment model. It offers a rich set of scenarios and is simple to use. It supports multiple platforms and language environments: the Linux, Kubernetes, and Docker platforms, and applications written in Java, NodeJS, C++, and Golang. It supports more than 200 scenarios and more than 3,000 parameters. ChaosBlade is a fault injection tool that runs on the target side, however, so rolling it out across a business raises the following questions:
- How to visualize the fault injection process?
- How to inject faults into multiple clusters or hosts at the same time?
- How to get statistics for the overall drill?
- …
So a platform layer is also needed on top of ChaosBlade to manage and orchestrate the chaos engineering execution tools.
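To make the multi-host question concrete, here is a minimal sketch that renders one experiment as a ChaosBlade-style command line and pairs it with several hosts. It only builds strings and executes nothing. The `Experiment` class, the host names, and the choice of flags are assumptions for illustration; the `blade create <target> <action>` shape follows the ChaosBlade CLI, but check the exact flag names against the tool's own help output before relying on them.

```python
import shlex
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Experiment:
    """Hypothetical description of one fault: a target, an action, and flags."""
    target: str
    action: str
    flags: Dict[str, str] = field(default_factory=dict)

    def to_argv(self) -> List[str]:
        # Assumed CLI shape: blade create <target> <action> --flag value ...
        argv = ["blade", "create", self.target, self.action]
        for name, value in self.flags.items():
            argv += [f"--{name}", str(value)]
        return argv


def plan_for_hosts(exp: Experiment, hosts: List[str]) -> Dict[str, str]:
    """Pair the same command with every host; dispatching it is left to the platform."""
    command = shlex.join(exp.to_argv())
    return {host: command for host in hosts}


exp = Experiment("cpu", "fullload", {"cpu-percent": "80", "timeout": "60"})
plan = plan_for_hosts(exp, ["host-a", "host-b"])  # host names are made up
for host, command in plan.items():
    print(f"{host}: {command}")
```

Running the tool by hand on every host does not scale, and it gives no shared view of progress or results. That gap is what a platform layer fills.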

ChaosBlade-Box is an open-source, cloud native chaos engineering console for multiple clusters, languages, and environments.
The overall architecture of the open-source platform and the injection tool consists of the following modules:
- ChaosBlade-Box Console: the user interface for chaos experiments
- ChaosBlade-Box Server: the back-end service, responsible for drill scenario orchestration and safety control, deployment of chaos engineering tools (ChaosBlade, LitmusChaos, …), probe management, and multi-dimensional experiments
- Agent: the probe, which connects to the ChaosBlade-Box Server and maintains a heartbeat, reports K8s-related data, and serves as the channel for delivering drill commands
- ChaosBlade: the tool deployed on business hosts or inside K8s clusters that performs the drill on the target side
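The heartbeat between Agent and Server can be illustrated with a small server-side registry. This is a sketch under assumptions, not the ChaosBlade-Box implementation: the `ProbeRegistry` class, the 30-second threshold, and the probe identifiers are all invented. The clock is injectable so the example runs deterministically.

```python
import time
from typing import Callable, Dict, List


class ProbeRegistry:
    """Hypothetical tracker that marks a probe offline when heartbeats stop."""

    def __init__(self, timeout_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, probe_id: str) -> None:
        """Record that a probe reported in just now."""
        self._last_seen[probe_id] = self._clock()

    def is_online(self, probe_id: str) -> bool:
        seen = self._last_seen.get(probe_id)
        return seen is not None and self._clock() - seen <= self._timeout_s

    def online_probes(self) -> List[str]:
        return sorted(p for p in self._last_seen if self.is_online(p))


now = [0.0]  # fake clock, advanced by hand
registry = ProbeRegistry(timeout_s=30.0, clock=lambda: now[0])
registry.heartbeat("probe-host-1")
registry.heartbeat("probe-k8s-1")
now[0] = 20.0
registry.heartbeat("probe-k8s-1")  # only this probe keeps reporting
now[0] = 45.0
print(registry.online_probes())
```

At the final check `probe-host-1` has been silent for 45 seconds and counts as offline, while `probe-k8s-1` reported 25 seconds ago and is still online. A server would only deliver drill commands through probes that are online.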

The new ChaosBlade-Box is a cloud native chaos engineering platform for multiple clusters, environments, and languages. It supports switching between Chinese and English, and it supports global namespaces, so the same user can set up different global namespaces as needed, such as a test space, a sandbox space, and a production space. It provides automated tool deployment, which simplifies installation and improves execution efficiency. The platform supports probe installation and drills in different environments, such as hosts and Kubernetes; in Kubernetes, drills are supported at the Node, Pod, and Container dimensions. In a Kubernetes environment the platform automatically collects Pod-related data and manages it centrally under application management. This simplifies lookups: users no longer need to go to the cluster to find the Pod or Container names of the application to be drilled. The platform also supports one-click migration to the enterprise edition, synchronizing drill data from the community edition to the enterprise edition on demand.




The full drill flow on the new ChaosBlade-Box platform supports two kinds of process orchestration: sequential execution and phased execution. Sequential execution means multiple drill scenarios take effect one after another; phased execution means multiple drill scenarios take effect at the same time. Several safety strategies ensure that a drill is recovered, such as manual stop and automatic stop. Automatic stop is configured by setting a timeout parameter in the drill configuration, so even if the platform loses contact with the probe (Agent) and a manual stop is impossible, the fault recovers automatically when the timeout expires.
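The difference between the two orchestration modes can be shown as a timeline. The sketch below makes two simplifying assumptions: every scenario runs until its timeout with no manual stop, and in sequential mode the next scenario starts only when the previous one has recovered. The `Scenario` class and the scenario names are hypothetical, and the platform's real scheduling may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Timeline = Dict[str, Tuple[int, int]]  # scenario name -> (start_s, end_s)


@dataclass
class Scenario:
    name: str
    timeout_s: int  # the fault recovers on its own after this many seconds


def sequential_timeline(scenarios: List[Scenario]) -> Timeline:
    """Each scenario starts when the previous one has recovered."""
    timeline: Timeline = {}
    start = 0
    for s in scenarios:
        timeline[s.name] = (start, start + s.timeout_s)
        start += s.timeout_s
    return timeline


def phased_timeline(scenarios: List[Scenario]) -> Timeline:
    """All scenarios in the phase take effect together at t=0."""
    return {s.name: (0, s.timeout_s) for s in scenarios}


def active_at(timeline: Timeline, t: int) -> List[str]:
    """Names of the faults still in effect at second t."""
    return sorted(n for n, (start, end) in timeline.items() if start <= t < end)


scenarios = [Scenario("cpu-fullload", 60), Scenario("network-delay", 30)]
seq = sequential_timeline(scenarios)
phased = phased_timeline(scenarios)
print(active_at(seq, 10), active_at(phased, 10), active_at(phased, 100))
```

Ten seconds in, the sequential plan has one fault active and the phased plan has both. The end of each interval models the timeout: once it passes, no fault remains in effect even though nothing stopped it manually.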


What are the advantages of the new version
Compared with the old version, the front-end interface is unified with the enterprise edition, which lowers the cost of switching between the two; Chinese/English switching is more complete, and global namespace switching is supported. The back end provides a smoother drill flow, more complete application management, stronger control over probes, and one-click migration to the enterprise edition. The probe has also been strengthened: it provides a more complete API, supports deployment in multiple environments where it serves as the drill channel, supports automatic installation and uninstallation, and collects and reports data to simplify drills.

Related links
Middleware Developer Conference (presentation PDFs available for download):
https://developer.aliyun.com/topic/middleware/developer/summit
MSE Registry and Configuration Center Professional Edition: 10% off the first purchase. MSE Cloud Native Gateway, prepaid, all specifications: 15% off.