Multi-dimensional monitoring: the data foundation of intelligent monitoring
2022-06-28 13:39:00 【Tencent blue whale assistant】
Preface
Using component monitoring as an example to introduce the roadmap for monitoring products
The role of an operations monitoring system is self-evident: it runs through the 5 functions of operations (release, change, fault handling, experience optimization, and routine requests) and safeguards service availability across all of them.
Seen through the characteristics of big data (large volume, multiple dimensions, completeness) [1], building an operations monitoring system can be divided into 2 stages: multi-dimensional monitoring (accumulating data) and intelligent monitoring (using data). Multi-dimensional monitoring makes faults visible; intelligent monitoring detects risks in advance and finds the cause of a fault.
Component monitoring is the 3rd layer of a multi-dimensional monitoring system. It mainly monitors the performance indicators of common open-source components and middleware. For Nginx, for example, the indicators include Active Connections (current number of client connections) and Waiting (number of connections waiting); for Oracle they include the SQL hard parse rate and tablespace usage.
Collecting the key performance indicators of a component lets you know its health in real time and find problems ahead of time, instead of only checking whether a process or port is alive (a live process or port does not mean the service can actually be provided).
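As a rough illustration of the difference between "the port is alive" and "the component is healthy", the sketch below parses the plain-text output of Nginx's stub_status module into indicators such as Active connections and Waiting. The sample text, the function names, and the `max_active` threshold are assumptions made for this example, not part of any product described here.

```python
import re

# Sample text in the layout produced by Nginx's stub_status module.
SAMPLE_STATUS = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Turn stub_status text into a dict of integer indicators."""
    metrics = {}
    active = re.search(r"Active connections:[ \t]*(\d+)", text)
    if active:
        metrics["active_connections"] = int(active.group(1))
    # The line holding exactly three integers carries accepts/handled/requests.
    counters = re.search(
        r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.MULTILINE
    )
    if counters:
        accepts, handled, requests = map(int, counters.groups())
        metrics.update(accepts=accepts, handled=handled, requests=requests)
    for name in ("Reading", "Writing", "Waiting"):
        found = re.search(rf"{name}:[ \t]*(\d+)", text)
        if found:
            metrics[name.lower()] = int(found.group(1))
    return metrics


def is_healthy(metrics, max_active=1000):
    """Hypothetical rule: status was readable and load is under a threshold."""
    return (
        "active_connections" in metrics
        and metrics["active_connections"] <= max_active
    )


if __name__ == "__main__":
    parsed = parse_stub_status(SAMPLE_STATUS)
    print(parsed, is_healthy(parsed))
```

A check like this fails when the status page cannot be read or the load crosses the threshold, even if the port still accepts connections.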
This article uses the construction of component monitoring as an example and introduces the design roadmap of a monitoring product through five topics: the composition of multi-dimensional monitoring, the 3 problems a monitoring product must solve, technology selection for component monitoring, cloud distribution of collector configuration, and the openness of the community.
1. The composition of multi-dimensional monitoring
Following the path of a user request, monitoring indicators are divided into the user layer, application layer, component layer, host layer, and network layer.
At the user layer, service dial testing simulates user access, so you do not have to wait for users to complain. At the application layer, the call chain is used to trace how the application's calls behave. The other three layers are self-explanatory and are not covered here.
These 5 layers, plus other key indicators (such as logs and business KPI curves), make up the multi-dimensional monitoring capability of the monitoring system and provide data support for the second stage, intelligent monitoring.
2. The 3 problems a monitoring product must solve
Besides obtaining key performance indicators, a monitoring product still has to solve 3 problems so that fault correlation analysis can be carried out later and intelligent operations scenarios can be built.
2.1 Autonomous control of IT systems
Because they lack autonomous control over their IT systems, some medium and large enterprises that are actively embracing the Internet under the "Internet+" wave are either "replacing IT systems" or "on the way to replacing IT systems".
In response, some industries have stated explicitly [2] that more attention must be paid to the ability to control IT systems autonomously.
Product design should therefore allow the users of a monitoring system to take part in developing all or part of it.
2.2 Refuse to build another silo
A siloed ("chimney") structure is probably the state of the IT systems at most enterprises: the systems are not connected to one another, each newly purchased system amounts to one more information island, and the added value is extremely low.
To enable later fault correlation analysis and intelligent operations scenarios, the operations platform can be built on PaaS [4], with iPaaS connecting all the IT operations systems inside the enterprise.
2.3 There are too many components for fully in-house development to be realistic
The industry uses a wide variety of components, more than 100 across databases, storage, HTTP services, message queues, and so on. Developing support for all of them in-house is clearly unrealistic.
A better approach is to develop in-house only for core components and for components the industry supports poorly, and to rely on the industry's years of accumulated capability for the rest: reinvent fewer wheels and save society some electricity.
3. Technology selection of component monitoring
Section 2.3 proposed combining in-house development with third-party open-source collectors. This section takes the open-source collector Prometheus Exporter as an example.
The Prometheus Exporter community is very active [5] and supports 100+ common open-source components. Some major vendors even write their own exporters, such as Oracle's WebLogic Exporter and IBM's IBM MQ exporter, while k8s and etcd have built-in metrics that follow the Exporter specification.
With this approach, a single protocol conversion is enough to store the indicators in the monitoring system.
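The protocol conversion mentioned above can be sketched as follows: read the Prometheus text exposition format that an exporter serves and map each sample to a record the monitoring system could store. The record layout (`component`, `metric`, `dimensions`, `value`), the `convert` function, and the sample metric names are assumptions for illustration; the parser is deliberately simplified (it skips comment lines, ignores timestamps, and does not unescape label values).

```python
import re

# One sample line: metric name, optional {labels}, value, optional timestamp.
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)'
    r'(?:\s+(?P<ts>-?\d+))?\s*$'
)
# One label pair inside the braces, e.g. host="web-1".
LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

# Illustrative exposition text, not the output of any specific exporter.
SAMPLE_EXPOSITION = """# HELP demo_connections_active Active client connections
# TYPE demo_connections_active gauge
demo_connections_active 291
demo_http_requests_total{host="web-1",code="200"} 1027
"""


def convert(exposition_text, component):
    """Map exposition samples to a hypothetical internal record format."""
    records = []
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        try:
            value = float(match.group("value"))
        except ValueError:
            continue
        records.append({
            "component": component,
            "metric": match.group("name"),
            "dimensions": dict(LABEL.findall(match.group("labels") or "")),
            "value": value,
        })
    return records


if __name__ == "__main__":
    for record in convert(SAMPLE_EXPOSITION, "nginx"):
        print(record)
```

Keeping the labels as dimensions preserves the multi-dimensional character of the data, which matters once the indicators are used for correlation analysis.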
4. Experience optimization: cloud distribution of collector configuration
Once the basic requirements are met, the experience needs to be optimized right away.
Delivering a collector or its configuration to a monitored host generally requires manual deployment or a third-party tool (such as Ansible).
Switching between several systems to accomplish one task is a poor experience.
One optimization is to use iPaaS to call the file distribution and command execution capabilities of the control platform layer [4], so that users complete the whole configuration process on a single page and work more efficiently.
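The one-page flow can be pictured as a single job that chains the two platform capabilities named above: file distribution, then command execution. The sketch below only builds such a job description; `Step`, `build_deploy_job`, the action names, the path, and the restart command are all hypothetical and do not correspond to any real platform API.

```python
from dataclasses import dataclass


@dataclass
class Step:
    action: str   # hypothetical action name: "distribute_file" or "execute_command"
    hosts: list   # target hosts for this step
    params: dict  # action-specific parameters


def build_deploy_job(hosts, config_text, target_path, restart_cmd):
    """Describe 'push the collector config, then reload the collector' as one job."""
    return [
        Step("distribute_file", list(hosts),
             {"path": target_path, "content": config_text}),
        Step("execute_command", list(hosts),
             {"command": restart_cmd}),
    ]


if __name__ == "__main__":
    job = build_deploy_job(
        hosts=["10.0.0.1", "10.0.0.2"],
        config_text="listen_port: 9113\n",
        target_path="/etc/collector/config.yml",
        restart_cmd="systemctl restart collector",
    )
    for step in job:
        print(step.action, step.hosts, step.params)
```

Describing both steps in one job is what removes the need to switch between a deployment tool and the monitoring system.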
5. The openness of the community
After meeting the basic functions and optimizing the product experience, the next thing to consider is product extensibility.
First, make it convenient for users to import their self-developed components with one click; next, provide an exchange platform where community users can share freely.
While benefiting from the community's open-source capabilities, the product also needs to give back to the community.
6. Conclusion
Multi-dimensional monitoring belongs to basic monitoring and is less eye-catching than intelligent monitoring, but it is the data foundation of intelligent monitoring: without the data it provides, intelligent scenarios such as fault prediction and fault root cause analysis cannot be realized.
When traditional enterprises or Internet enterprises embrace the changes brought by the Internet, they need to think calmly and follow the roadmap step by step.
7. References
[1] Wu Jun. The Age of Intelligence: Big Data and the Intelligent Revolution Redefine the Future [M]. Beijing: CITIC Publishing Group, 2016-8.
[2] People's Bank of China. "13th Five-Year" Development Plan for Information Technology in China's Financial Industry [EB/OL]. 2017.06
[3] China Banking Regulatory Commission. Regulatory Guidance on the "13th Five-Year" Development Plan for Information Technology in China's Banking Industry (Draft for Comment) [EB/OL]. 2016.07.15
[4] China Communications Standardization Association. Cloud computing operation and maintenance platform reference framework and technical requirements [EB/OL]. 2017.11.16
[5] Prometheus. Exporters and Integrations [EB/OL].
Blue Whale Zhiyun
This article was edited and published by Tencent Blue Whale Zhiyun. Tencent Blue Whale Zhiyun ("Blue Whale" for short) is a PaaS-based technical solution dedicated to building an industry-leading one-stop automated operations platform. Community and Enterprise editions are currently available; you are welcome to try them.
- Official website :https://bk.tencent.com/
- Download link :https://bk.tencent.com/download/
- Community :https://bk.tencent.com/s-mart/community/question