What is the point of the monitoring-system arms race?
2022-08-05 06:15:00 【LinkSLA】
Good monitoring is not built in a day
Monitoring is both the starting point and the hard part of operation and maintenance (O&M). A monitoring system needs to provide these functions:
Full-stack monitoring;
Correlation analysis;
Call-chain tracing across systems;
Real-time alerting and automated remediation;
System performance analysis.
O&M has two core scenarios: anomaly detection and early warning. In other words, good monitoring mainly serves two situations: day-to-day experience and emergencies. What does good monitoring look like?
1. Alarms are timely: the designated user is notified as soon as possible so the fault can be fixed before it spreads.
2. Alarms are accurate: no false alarms, missed alarms, or duplicates, and the information pushed to users is correct.
3. Coverage is comprehensive: full-stack monitoring runs from the machine room's power and environment systems through hardware, operating systems, and application components up to business systems.
4. O&M is closed-loop: an alarm trigger converges into an event or work order, and the work order is received, processed, and closed within the time specified in the SLA.
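Points 2 and 4 above (no repeated alarms, and a work order closed within the SLA) can be illustrated with a minimal sketch. The names `WorkOrder` and `AlarmConverger`, and the idea of identifying a fault by a (host, check) pair, are assumptions made for this example and are not taken from any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class WorkOrder:
    """One work order that repeated alarms for the same fault converge into."""
    key: tuple                      # (host, check): assumed fault identity
    opened_at: datetime
    alarm_count: int = 1
    closed_at: Optional[datetime] = None

    def breaches_sla(self, sla: timedelta, now: datetime) -> bool:
        # A closed order is judged on its closing time, an open one on "now".
        end = self.closed_at or now
        return end - self.opened_at > sla


class AlarmConverger:
    """Folds repeated alarms into a single open work order per fault."""

    def __init__(self) -> None:
        self.open_orders = {}

    def on_alarm(self, host: str, check: str, at: datetime) -> WorkOrder:
        key = (host, check)
        order = self.open_orders.get(key)
        if order is None:
            # First alarm for this fault: open a new work order.
            order = WorkOrder(key=key, opened_at=at)
            self.open_orders[key] = order
        else:
            # Repeat alarm: count it instead of notifying again.
            order.alarm_count += 1
        return order

    def close(self, host: str, check: str, at: datetime) -> WorkOrder:
        # Closing the order ends the loop; a later alarm opens a new one.
        order = self.open_orders.pop((host, check))
        order.closed_at = at
        return order
```

With this shape, a second `on_alarm("db01", "disk_full", ...)` returns the same open order with `alarm_count == 2` rather than creating a duplicate, and `breaches_sla` can be polled to escalate orders that stay open too long.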
Traditional O&M relies on people to watch system status, performance indicators, and service releases and changes.
As digitization advances, servers, software modules, and access traffic have proliferated, and IT systems have grown in both number and complexity. The number of items to monitor has become hard to manage, and the cause of an incident can no longer be located accurately.
On the offensive: data standards and value delivery
Rome was not built in a day, and O&M platforms have likewise evolved in stages: manual work turned into tools, tools into platforms, and platforms into intelligent platforms. The most prominent advantages of intelligent O&M are data standards and the value it delivers.
The goal is to fully mine the value of O&M data: find every problem, pinpoint it, and reduce how often problems occur. Multi-layer monitoring breaks down O&M silos. The full-stack monitoring objects include:
01 Hardware
Hardware is the foundation of O&M monitoring. It covers the machine room's power and environment systems, servers, network equipment, storage equipment, and so on.
02 Virtualization
For example: vSphere, PowerVM, Hyper-V, Docker, Kubernetes (K8S), and so on.
03 Operating System
Windows, Linux, AIX, and other operating systems are supported.
04 Application Components
Common commercial and open source components are supported, including databases and middleware.
05 Business Systems
Through BPV (Business Process View), the components of a business system are grouped and monitored as one logical monitoring object.
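The idea of treating a business system as one logical monitoring object can be sketched as a simple roll-up. The source does not describe how BPV computes the status of a business system, so the "worst component wins" rule below, the `Status` levels, and the component names are all assumptions for illustration.

```python
from enum import IntEnum


class Status(IntEnum):
    OK = 0
    WARNING = 1
    CRITICAL = 2


def business_status(component_status: dict) -> Status:
    """Roll component states up to the business system: worst state wins."""
    return max(component_status.values(), default=Status.OK)


def failing_components(component_status: dict) -> list:
    """Names of components that are not OK, so the alarm points at a cause."""
    return sorted(
        name for name, status in component_status.items()
        if status != Status.OK
    )


# Hypothetical business system made of three monitored components.
order_system = {
    "nginx": Status.OK,
    "order-api": Status.WARNING,
    "mysql": Status.CRITICAL,
}
print(business_status(order_system).name)
print(failing_components(order_system))
```

A roll-up like this lets one alarm be raised for the business system while still listing which components underneath are responsible.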
Full-stack monitoring improves O&M efficiency and addresses inaccurate alarms, hard-to-locate problems, and hard-to-find root causes, so that problems are identified and located proactively, quickly, and accurately.
1. Quality assurance: anomaly detection, fault diagnosis, fault prediction, and fault self-healing.
2. Cost management: metric monitoring, anomaly detection, resource optimization, capacity planning, and performance optimization.
3. Efficiency improvement: intelligent change management, machine learning algorithms, and security.
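Anomaly detection appears under both quality assurance and cost management. The source does not say which algorithms are used, so the following is only a generic sketch: a rolling z-score check that flags a metric sample when it deviates strongly from the preceding window. The function name, window size, and threshold are assumptions.

```python
from statistics import mean, pstdev


def zscore_anomalies(values, window=5, threshold=3.0):
    """Return indexes of samples far from the mean of the preceding window."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU utilisation samples (percent) with one spike.
cpu = [40, 42, 41, 39, 40, 41, 95, 40, 41]
print(zscore_anomalies(cpu))
```

A fixed threshold like this is easy to reason about but ignores seasonality and trend, which is one reason production systems tend to move on to more elaborate models.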
LinkSLA began researching algorithms and selecting concrete implementation scenarios in 2018. It has made breakthroughs in single-point applications in full-stack monitoring, anomaly detection, and log anomaly detection, bringing more data value to the business and providing a basis for enterprise decision-making.