What's the point of involution in monitoring systems?
2022-08-05 06:15:00 【LinkSLA】
Monitoring involution: not built in a day
Monitoring is both the starting point and the hard part of operations and maintenance (O&M). A monitoring system needs to provide these functions:
Full stack monitoring;
Association Analysis;
Linking calls across systems;
Real-time alarm and automatic disposal;
System performance analysis.
O&M has two main scenarios: anomaly detection and early warning. In other words, good monitoring mainly serves two scenarios: experience and emergency. What makes monitoring good?
1. Alarms are timely: the designated user is notified as soon as possible so the fault can be fixed before it spreads.
2. Alarms are accurate: no false alarms, no missed alarms, no repeated alarms, and the information pushed to users is correct.
3. Monitoring coverage is comprehensive: full-stack monitoring spans the machine room's power and environment systems, hardware, operating systems, application components, and business systems.
4. O&M is closed-loop: an alarm is triggered, converged into an event or work order, and the work order is received, processed, and closed within the time specified in the SLA.
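Points 2 and 4 above can be illustrated with a small sketch. It is only an illustration under stated assumptions, not how any particular product works: the `(host, check)` key used to recognize repeated alarms, the `AlarmConverger` and `Event` names, and the 30-minute SLA in the usage note are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional, Tuple


@dataclass
class Event:
    """A converged event (work order) built from one or more raw alarms."""
    key: Tuple[str, str]          # (host, check); hypothetical dedup key
    opened_at: datetime
    sla: timedelta                # time allowed to close the work order
    alarm_count: int = 1
    closed_at: Optional[datetime] = None

    def breached(self, now: datetime) -> bool:
        """True if the work order stayed open longer than its SLA allows."""
        end = self.closed_at or now
        return end - self.opened_at > self.sla


class AlarmConverger:
    """Converges repeated alarms for the same fault into a single open event."""

    def __init__(self, sla: timedelta):
        self.sla = sla
        self.open_events: Dict[Tuple[str, str], Event] = {}

    def on_alarm(self, host: str, check: str, at: datetime) -> Tuple[Event, bool]:
        """Return the event and whether a new notification should be sent."""
        key = (host, check)
        event = self.open_events.get(key)
        if event is not None:
            # Repeat of an already-open fault: count it, do not notify again.
            event.alarm_count += 1
            return event, False
        event = Event(key=key, opened_at=at, sla=self.sla)
        self.open_events[key] = event
        return event, True

    def close(self, host: str, check: str, at: datetime) -> Event:
        """Terminate the work order, closing the loop."""
        event = self.open_events.pop((host, check))
        event.closed_at = at
        return event
```

For example, with `AlarmConverger(sla=timedelta(minutes=30))`, two `disk_full` alarms from the same host one minute apart yield one event with `alarm_count == 2` and only one notification, and `breached()` reports whether the work order exceeded its 30 minutes.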
Traditional O&M relies on people to watch system operating status, performance indicators, service releases, and changes.
As digitization advances, servers, software modules, and access data have proliferated, and IT systems have grown in number and complexity. The number of monitored items has become hard to manage, and faults can no longer be located accurately.
Attack: Data Standards and Value Output
Rome was not built in a day, and O&M platforms have likewise evolved through stages: from manual work to tools, from tools to platforms, and from platforms to intelligence. The most prominent advantages of intelligent O&M are data standards and delivered value.
The goal is to fully mine the value of O&M data: find every problem, pinpoint it, and reduce how often problems occur. Multi-layer monitoring breaks down O&M silos. The full-stack monitoring objects include:
01 Hardware
Hardware is the foundation of O&M monitoring. Monitored objects include machine room power and environment systems, servers, network equipment, and storage equipment.
02 Virtualization
For example: vSphere, PowerVM, Hyper-V, Docker, and Kubernetes (K8s).
03 Operating System
Supports Windows, Linux, AIX, and other operating systems.
04 Application Components
Supports common commercial and open source components, including databases and middleware.
05 Business Systems
Supports monitoring each component of a business system as a logical monitoring object through BPV (Business Process View).
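The idea of treating a business system as a logical object built from its components can be sketched as follows. The article does not describe how BPV works internally, so this is a hypothetical model: the `Status` levels, the layer names, and the "worst status wins" rule are all assumptions made for illustration.

```python
from enum import IntEnum
from typing import Dict, List, Tuple


class Status(IntEnum):
    """Hypothetical severity levels; higher values are worse."""
    OK = 0
    WARNING = 1
    CRITICAL = 2


def business_status(
    components: Dict[str, Tuple[str, Status]]
) -> Tuple[Status, List[str]]:
    """Roll component states up into one business-level state.

    `components` maps a component name to (layer, status). The business
    system takes the worst status of any component, and the components
    at that level are returned as the likely places to look first.
    """
    worst = max((status for _, status in components.values()), default=Status.OK)
    if worst == Status.OK:
        return worst, []
    culprits = sorted(
        name for name, (_, status) in components.items() if status == worst
    )
    return worst, culprits


# Usage: an order system spanning several layers of the stack.
order_system = {
    "rack-pdu-3": ("hardware", Status.OK),
    "k8s-node-7": ("virtualization", Status.WARNING),
    "linux-app-02": ("os", Status.OK),
    "mysql-primary": ("component", Status.CRITICAL),
}
overall, culprits = business_status(order_system)
print(overall.name, culprits)
```

Keeping the layer alongside each component is a design choice: it lets one business-level alarm point at the layer where the fault originates instead of producing an alarm per layer.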
Monitoring full-stack objects improves O&M efficiency and addresses inaccurate alarms, hard-to-locate problems, and hard-to-find root causes. Problems can be identified and located proactively, quickly, and accurately.
1. Quality assurance
Anomaly detection, fault diagnosis, fault prediction, and fault self-healing.
2. Cost management
Metric monitoring, anomaly detection, resource optimization, capacity planning, and performance optimization.
3. Efficiency improvement
Smart changes, machine learning algorithms, and security.
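As a concrete picture of the anomaly detection mentioned in the list above, here is a minimal sketch using a rolling z-score over a metric series. This is a generic textbook technique chosen for illustration, not the algorithm any vendor uses; the window size and threshold are arbitrary assumptions.

```python
from collections import deque
from statistics import mean, pstdev
from typing import List, Sequence


def detect_anomalies(
    values: Sequence[float], window: int = 5, threshold: float = 3.0
) -> List[int]:
    """Return indices of points that deviate strongly from the recent window.

    A point is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` points.
    """
    history = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(values):
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            # A flat window has sigma == 0; skip it to avoid dividing by zero.
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append(i)
        history.append(value)
    return anomalies


# Usage: CPU utilisation samples (percent) with one spike.
cpu = [10, 11, 10, 12, 11, 10, 50, 11]
print(detect_anomalies(cpu))
```

Note that a flagged point still enters the window, so a spike temporarily widens the baseline. Real systems often exclude or down-weight flagged points, which this sketch does not do.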
LinkSLA began researching algorithms and selecting concrete implementation scenarios in 2018. It has made breakthroughs in single-point applications such as full-stack monitoring, anomaly detection, and log anomaly detection, with notable results, bringing more data value to the business and providing a decision-making basis for enterprise development.