Multi-dimensional monitoring: the data foundation of intelligent monitoring
2022-07-03 11:22:00 【Blue whale Zhiyun】
Preface
Using component monitoring as an example, this article introduces a roadmap for monitoring products.
The role of an operations monitoring system is self-evident: it runs through the 5 functions of operations (release, change, fault handling, experience optimization, and daily requests) and safeguards service availability across all of them.
Judged by the characteristics of big data (large volume, multiple dimensions, completeness) [1], building an operations monitoring system can be divided into 2 stages: multi-dimensional monitoring (accumulating data) and intelligent monitoring (using data). Multi-dimensional monitoring lets faults be seen, and seen correctly; intelligent monitoring detects risks in advance and finds the cause of a fault.
Component monitoring is the 3rd layer of a multi-dimensional monitoring system. It mainly monitors the performance indicators of common open-source components and middleware. For Nginx, the indicators include Active Connections (the current number of client connections) and Waiting (the number of connections waiting); for Oracle, they include the SQL hard parse rate and tablespace usage.
Collecting the key performance indicators of a component makes it possible to know its health in real time and to find problems ahead of time, instead of only checking whether a process or port is alive (a live process or port does not mean the service can actually be provided).
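As a rough illustration of what collecting component indicators means in practice, the sketch below parses the text page produced by Nginx's stub_status module into named metrics, including the Active Connections and Waiting values mentioned above. The sample page and the function names are assumptions made for this example, not part of the original article.

```python
import re

# Sample text in the format produced by nginx's stub_status module
# (the numbers are made up for illustration).
SAMPLE_STATUS = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text: str) -> dict:
    """Turn a stub_status page into a flat dict of integer metrics."""
    metrics = {}
    active = re.search(r"Active connections:\s*(\d+)", text)
    if active:
        metrics["active_connections"] = int(active.group(1))
    # The third line holds three counters: accepts, handled, requests.
    counters = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    if counters:
        metrics["accepts"], metrics["handled"], metrics["requests"] = (
            int(value) for value in counters.groups()
        )
    for name in ("Reading", "Writing", "Waiting"):
        match = re.search(rf"{name}:\s*(\d+)", text)
        if match:
            metrics[name.lower()] = int(match.group(1))
    return metrics


def dropped_connections(metrics: dict) -> int:
    """Connections accepted but not handled; normally zero."""
    return metrics["accepts"] - metrics["handled"]


if __name__ == "__main__":
    parsed = parse_stub_status(SAMPLE_STATUS)
    print(parsed["active_connections"], parsed["waiting"])
    print(dropped_connections(parsed))
```

A port check would report this Nginx instance as alive in every case; values such as Waiting or a non-zero dropped count are what reveal whether it is actually serving well.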
This article uses the building of component monitoring as an example and introduces the monitoring product design roadmap through five topics: the composition of multi-dimensional monitoring, the 3 problems a monitoring product must solve, technology selection for component monitoring, distributing collector configuration from the cloud, and the openness of the community.
1. The composition of multi-dimensional monitoring
Following the link a user's request travels, monitoring indicators are divided into five layers: the user layer, the application layer, the component layer, the host layer, and the network layer. At the user layer, service dial testing simulates user access so that problems are found without waiting for users to complain. At the application layer, call-chain tracing follows how the application's calls behave. The other three layers are easy to understand and are not described here.
These 5 layers, plus other key indicators (such as logs and business KPI curves), form the multi-dimensional monitoring capability of the monitoring system and provide the data for the second stage, intelligent monitoring.
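One way to picture the five layers as data is a single record type in which the layer is simply one more dimension. The sketch below is only an assumption about how such a record could look; the class names, field names, and sample values are hypothetical and do not come from any particular product.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    USER = "user"
    APPLICATION = "application"
    COMPONENT = "component"
    HOST = "host"
    NETWORK = "network"


@dataclass(frozen=True)
class MetricPoint:
    layer: Layer
    target: str     # hypothetical: the host or service being observed
    name: str
    value: float
    timestamp: int  # Unix seconds


def group_by_target(points):
    """Collect, per target, the points seen at each layer."""
    grouped = defaultdict(lambda: defaultdict(list))
    for point in points:
        grouped[point.target][point.layer].append(point)
    return grouped


if __name__ == "__main__":
    points = [
        MetricPoint(Layer.COMPONENT, "web-01", "nginx_waiting", 106, 1656817320),
        MetricPoint(Layer.HOST, "web-01", "cpu_usage_percent", 93.0, 1656817320),
        MetricPoint(Layer.HOST, "db-01", "cpu_usage_percent", 12.0, 1656817320),
    ]
    for layer, items in group_by_target(points)["web-01"].items():
        print(layer.value, [p.name for p in items])
```

Keeping every layer in one shape is what later allows a component-level symptom and a host-level symptom on the same target to be looked at side by side.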
2. The 3 problems a monitoring product must solve
Beyond obtaining key performance indicators, a monitoring product still has 3 problems to solve, so that fault correlation analysis can be carried out later and intelligent operations scenarios can be built.
2.1 Autonomous control over IT systems
Because they lack autonomous control over their IT systems, "replacing IT systems" and "on the way to replacing IT systems" describe the current state of some medium-sized and large enterprises as they actively embrace the Internet under the "Internet+" wave.
In view of this, some industries have made it clear [2][3] that more attention must be paid to autonomous control over IT systems.
Product design should therefore allow the users of a monitoring system to take part in developing it, in whole or in part.
2.2 Refuse to build another silo
A silo ("chimney") architecture is probably the state of the IT systems most enterprises have built: the systems are not connected to one another, each system purchased amounts to one more information island, and the added value is extremely low.
To make later fault correlation analysis and intelligent operations scenarios possible, the product can be built on a PaaS-based operations platform [4], using iPaaS to connect all of the enterprise's internal IT operations systems.
2.3 There are many components, and developing everything in-house is not realistic
The components used in the industry come in great variety, from databases, storage, and HTTP services to message queues, 100+ in total. Developing collectors for all of them in-house is certainly unrealistic.
A better approach is to develop in-house only for core components and for components with poor industry support, and to rely on the capability the industry has accumulated over many years for the rest: reinvent fewer wheels and save some electricity for society.
3. Technology selection of component monitoring
Section 2.3 mentioned the idea of in-house development plus third-party open-source collectors. Here the open-source collector Prometheus Exporter is used as an example. The Prometheus Exporter community is very active [5] and supports 100+ common open-source components. Some large vendors even write their own Prometheus Exporter, such as Oracle's Weblogic Exporter and IBM's IBM MQ exporter, and k8s and etcd even have built-in metrics that follow the Exporter specification.
With this approach, a single protocol conversion is enough to get the indicators into storage.
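The protocol conversion mentioned above could look roughly like the following: read the text that an exporter exposes and turn each sample into a record for the monitoring system's own storage. This is a simplified sketch, not a full parser (label values containing escaped quotes or closing braces are not handled), and the record layout, the `convert` function, and the sample metrics are assumptions made for illustration.

```python
import re

# Illustrative exporter output in the Prometheus text exposition format.
SAMPLE_EXPOSITION = """# HELP nginx_connections_active Active client connections
# TYPE nginx_connections_active gauge
nginx_connections_active{instance="web-01"} 291
nginx_connections_waiting{instance="web-01"} 106
process_cpu_seconds_total 12.5
"""

# name{labels} value [timestamp]; labels and timestamp are optional.
LINE_RE = re.compile(
    r"^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)"
    r"(?:\{(?P<labels>[^}]*)\})?"
    r"\s+(?P<value>\S+)"
    r"(?:\s+(?P<ts>-?\d+))?\s*$"
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def convert(text: str, source: str) -> list:
    """Convert exposition text into records in a hypothetical internal format."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified parser cannot read
        records.append({
            "source": source,
            "metric": match.group("name"),
            "dimensions": dict(LABEL_RE.findall(match.group("labels") or "")),
            "value": float(match.group("value")),
        })
    return records


if __name__ == "__main__":
    for record in convert(SAMPLE_EXPOSITION, source="nginx"):
        print(record)
```

Because every exporter speaks the same text format, one converter of this kind covers all of them, which is the point of relying on the community's collectors.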
4. Experience optimization: distributing collector configuration from the cloud
Once the basic requirements are met, the experience needs to be optimized right away.
Delivering a collector or its configuration to a monitored host generally requires manual deployment or a third-party tool (such as Ansible).
Switching between several systems to finish one task is a very poor experience.
One optimization is to use, through iPaaS, the file distribution and command execution capabilities of the control platform layer [4], so that users complete the whole configuration process on a single page and work more efficiently.
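The one-page flow described above amounts to two platform capabilities chained per host: distribute a file, then execute a command. The sketch below shows only that chaining. The real platform API is not described in this article, so `push_file` and `run_command` are hypothetical callables supplied by the caller, and the path, configuration text, and command are placeholders.

```python
def distribute(hosts, path, content, restart_command, push_file, run_command):
    """Push a collector config to each host, then restart the collector.

    push_file(host, path, content) and run_command(host, command) stand in
    for the platform's file distribution and command execution capabilities;
    both are assumed to return True on success.
    """
    results = {}
    for host in hosts:
        if not push_file(host, path, content):
            results[host] = "push_failed"  # do not restart with a stale config
        elif not run_command(host, restart_command):
            results[host] = "restart_failed"
        else:
            results[host] = "ok"
    return results


if __name__ == "__main__":
    # Stand-in implementations that only print what would happen.
    def demo_push(host, path, content):
        print(f"push {path} to {host}")
        return True

    def demo_run(host, command):
        print(f"run '{command}' on {host}")
        return True

    print(distribute(
        ["web-01", "web-02"],
        "/tmp/collector.yml",           # placeholder path
        "listen: 9113\n",               # placeholder config
        "restart-collector",            # placeholder command
        demo_push,
        demo_run,
    ))
```

Returning a per-host result is a deliberate choice here: a single page can then show which hosts succeeded and which need attention, without the user switching to another tool.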
5. The openness of the community
Once the basic functions are in place and the product experience has been optimized, the next thing to consider is product extensibility.
The first step is to make it convenient for users to import their self-developed components with one click; the next is to provide a platform where community users can communicate and share freely.
While benefiting from the community's open-source capability, the product also needs to give back to the community.
6. Conclusion
Multi-dimensional monitoring belongs to basic monitoring and, compared with intelligent monitoring, is not very eye-catching. But it is the data foundation of intelligent monitoring: without the data it provides, intelligent monitoring scenarios such as failure prediction and fault root cause analysis cannot be realized.
Whether traditional enterprises or Internet enterprises, those embracing the changes brought by the Internet need to think calmly and follow the roadmap step by step.
7. References
[1] Wu Jun. The Age of Intelligence: Big Data and the Intelligent Revolution Redefine the Future [M]. Beijing: CITIC Press Group, 2016-8.
[2] People's Bank of China. "13th Five-Year" Development Plan for Information Technology in China's Financial Industry [EB/OL]. 2017.06
[3] China Banking Regulatory Commission. Regulatory Guidance on the "13th Five-Year" Development Plan for Information Technology in China's Banking Industry (Draft for Comments) [EB/OL]. 2016.07.15
[4] China Communications Standards Association. Cloud Computing Operation and Maintenance Platform Reference Framework and Technical Requirements [EB/OL]. 2017.11.16
[5] Prometheus. Exporters and Integrations [EB/OL].
Blue Whale Zhiyun
This article was edited and published by Tencent Blue Whale Zhiyun. The Tencent Blue Whale Zhiyun software system (Blue Whale for short) is a PaaS-based technical solution dedicated to building an industry-leading, one-stop automated operations platform. A community edition and an enterprise edition are currently available, and you are welcome to try them.
- Official website: https://bk.tencent.com/
- Download link: https://bk.tencent.com/download/
- Community: https://bk.tencent.com/s-mart/community/question