Building a Prometheus automatic monitoring and alerting system from scratch

2022-07-26 20:36:00 【Brother Xing plays with the clouds】

What is Prometheus?

Prometheus is an open-source monitoring and alerting system with a built-in time series database (TSDB), originally developed at SoundCloud. It is written in Go and is an open-source counterpart of Google's BorgMon monitoring system. In 2016 the Cloud Native Computing Foundation (CNCF), which Google helped launch under the Linux Foundation, accepted Prometheus as its second hosted project. Prometheus is currently very active in the open-source community. Compared with Heapster (a Kubernetes subproject used to collect cluster performance data), Prometheus is more complete and more comprehensive in functionality, and its performance is sufficient to support clusters of tens of thousands of machines.

Features of Prometheus

A multidimensional data model.
A flexible query language.
No reliance on distributed storage; single server nodes are autonomous.
Time series collection over HTTP using a pull model.
Pushing time series is supported through an intermediary gateway.
Targets are discovered via service discovery or static configuration.
Support for multiple modes of graphing and dashboards, such as Grafana.

Official website: https://prometheus.io/

Architecture diagram

Basic principle

Prometheus works by periodically scraping the state of monitored components over HTTP. Any component can be monitored as long as it exposes a suitable HTTP endpoint; no SDK or other integration work is required. This makes Prometheus a good fit for virtualized environments such as VMs, Docker, and Kubernetes. The HTTP endpoint that exposes a monitored component's information is called an exporter. Exporters are already available for most components commonly used by internet companies, such as Varnish, HAProxy, Nginx, MySQL, and Linux system information (disk, memory, CPU, network, and so on).
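To make the "just expose an HTTP endpoint" idea concrete, here is a minimal sketch in Go that uses only the standard library. The metric name demo_requests_total, the path label, the value 42, and port 8000 are all made up for illustration; a real service would normally use the official client library instead of formatting the text by hand.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
	"sort"
	"strings"
)

// formatSample renders one sample in the Prometheus text exposition format,
// for example: name{key="value"} 42. Labels are sorted so the output is stable.
// %q matches Prometheus label escaping for ordinary printable values.
func formatSample(name string, labels map[string]string, value float64) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	if len(pairs) == 0 {
		return fmt.Sprintf("%s %g", name, value)
	}
	return fmt.Sprintf("%s{%s} %g", name, strings.Join(pairs, ","), value)
}

// metricsPage builds the whole body served at /metrics (hypothetical metric).
func metricsPage() string {
	return "# TYPE demo_requests_total counter\n" +
		formatSample("demo_requests_total", map[string]string{"path": "/"}, 42) + "\n"
}

func main() {
	// "go run main.go serve" starts the endpoint; without arguments the
	// page is only printed, which is handy for a quick look at the format.
	if len(os.Args) > 1 && os.Args[1] == "serve" {
		http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
			w.Header().Set("Content-Type", "text/plain; version=0.0.4")
			fmt.Fprint(w, metricsPage())
		})
		log.Fatal(http.ListenAndServe(":8000", nil))
	}
	fmt.Print(metricsPage())
}
```

Once started with the serve argument, the endpoint could be added to Prometheus as an ordinary scrape target on port 8000.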

Workflow

The Prometheus daemon periodically scrapes metrics from its targets. Each target must expose an HTTP endpoint for Prometheus to scrape. Targets can be specified through the configuration file, text files, Zookeeper, Consul, DNS SRV lookup, and other means. Prometheus monitors using a pull model: the server either pulls data directly from the targets or pulls data that targets have pushed to an intermediary gateway.
Prometheus stores all scraped data locally, processes it according to rules, and records the results as new time series.
Prometheus exposes the collected data for visualization through PromQL and other APIs. Many visualization options are supported, such as Grafana, the bundled Promdash, and Prometheus's own template engine. Prometheus also provides an HTTP API for queries, so you can build whatever output you need.
Pushgateway lets clients actively push metrics to it, and Prometheus simply scrapes the gateway on a regular schedule.
Alertmanager is a component independent of Prometheus. It handles alerts defined with Prometheus query expressions and provides very flexible alerting options.
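The HTTP API mentioned in the workflow can be illustrated with a small Go sketch. It builds the URL of an instant query against /api/v1/query and decodes the values from an instant-vector response. The server address is the one used in this tutorial, and the JSON body in main is a hand-written sample rather than captured output.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// instantQueryURL builds the URL of an instant query against the
// Prometheus HTTP API endpoint /api/v1/query.
func instantQueryURL(base, expr string) string {
	q := url.Values{}
	q.Set("query", expr)
	return base + "/api/v1/query?" + q.Encode()
}

// queryResponse covers only the response fields this sketch reads.
type queryResponse struct {
	Status string `json:"status"`
	Data   struct {
		ResultType string `json:"resultType"`
		Result     []struct {
			Metric map[string]string `json:"metric"`
			Value  [2]interface{}    `json:"value"` // [timestamp, "value"]
		} `json:"result"`
	} `json:"data"`
}

// sampleValues extracts the sample values from an instant-vector response.
// The API returns each value as a string, so strings are returned here too.
func sampleValues(body []byte) ([]string, error) {
	var resp queryResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	if resp.Status != "success" {
		return nil, fmt.Errorf("query failed with status %q", resp.Status)
	}
	values := make([]string, 0, len(resp.Data.Result))
	for _, r := range resp.Data.Result {
		if v, ok := r.Value[1].(string); ok {
			values = append(values, v)
		}
	}
	return values, nil
}

func main() {
	fmt.Println(instantQueryURL("http://10.211.55.25:9090", "up"))

	// Hand-written sample body, shaped like an instant-vector response.
	body := []byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"up"},"value":[1658838960,"1"]}]}}`)
	values, err := sampleValues(body)
	fmt.Println(values, err)
}
```

Fetching the URL with any HTTP client and passing the body to sampleValues gives the current value of each matching series.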

The three components

Server: mainly responsible for collecting and storing data, and provides PromQL query language support.
Alertmanager: the alert manager, used to send alerts.
Push Gateway: an intermediary gateway that lets short-lived jobs actively push metrics.

What this tutorial covers

1. Installing Prometheus Server
2. Exposing metrics endpoints with a Go client and node-exporter
3. Using Pushgateway
4. Using Grafana
5. Using Alertmanager

Preparation

My host's IP is 10.211.55.25. Log in and create the required directories

mkdir -p /home/chenqionghe/promethues
mkdir -p /home/chenqionghe/promethues/server
mkdir -p /home/chenqionghe/promethues/client
touch /home/chenqionghe/promethues/server/rules.yml
chmod 777 /home/chenqionghe/promethues/server/rules.yml

Let's start with the three components

1. Install Prometheus Server

We install it with Docker. First create the configuration file /home/chenqionghe/promethues/server/prometheus.yml (the path mounted by the docker run command that follows). Change the file's permissions to 777; otherwise changes made to the file on the host may not be synced into the container

global:
  scrape_interval:     15s # Default scrape interval: scrape targets every 15 seconds.
  external_labels:
    monitor: 'codelab-monitor'
# Scrape target configuration
scrape_configs:
  # Every time series scraped by this job automatically gets the label job="prometheus".
  - job_name: 'prometheus'
    scrape_interval: 5s # Overrides the global scrape interval: 5 seconds instead of 15
    static_configs:
      - targets: ['localhost:9090']

Run

docker rm -f prometheus
docker run --name=prometheus -d \
-p 9090:9090 \
-v /home/chenqionghe/promethues/server/prometheus.yml:/etc/prometheus/prometheus.yml \
-v /home/chenqionghe/promethues/server/rules.yml:/etc/prometheus/rules.yml \
prom/prometheus:v2.7.2 \
--config.file=/etc/prometheus/prometheus.yml \
--web.enable-lifecycle

Adding --web.enable-lifecycle at startup enables hot reloading of the configuration over HTTP. The command is curl -X POST http://localhost:9090/-/reload

Visit http://10.211.55.25:9090 and you will see the following interface

Visit http://10.211.55.25:9090/metrics

Because we configured port 9090 as a target, Prometheus scrapes its own /metrics endpoint by default. The collected data is already visible under the Graph tab

2. Install clients that provide a metrics endpoint

1. Metrics provided by a Go client

mkdir -p /home/chenqionghe/promethues/client/golang/src
cd !$
export GOPATH=/home/chenqionghe/promethues/client/golang/
# Clone the project
git clone https://github.com/prometheus/client_golang.git
# Install the third-party golang.org/x packages, which cannot be downloaded directly without a proxy
mkdir -p $GOPATH/src/golang.org/x/
cd !$
git clone https://github.com/golang/net.git
git clone https://github.com/golang/sys.git
git clone https://github.com/golang/tools.git
# Install the required packages
go get -u -v github.com/prometheus/client_golang/prometheus
# Build
cd $GOPATH/src/client_golang/examples/random
go build -o random main.go

Run three example metrics endpoints

./random -listen-address=:8080 &
./random -listen-address=:8081 &
./random -listen-address=:8082 &

2. Metrics provided by node exporter

docker run -d \
--name=node-exporter \
-p 9100:9100 \
prom/node-exporter

Then add these endpoints to prometheus.yml and reload the configuration with curl -X POST http://localhost:9090/-/reload

global:
  scrape_interval:     15s # Default scrape interval: scrape targets every 15 seconds.
  external_labels:
    monitor: 'codelab-monitor'
rule_files:
  #- 'prometheus.rules'
# Scrape target configuration
scrape_configs:
  # Every time series scraped by this job automatically gets the label job="prometheus".
  - job_name: 'prometheus'
    scrape_interval: 5s # Overrides the global scrape interval: 5 seconds instead of 15
    static_configs:
      - targets: ['localhost:9090']
      # Targets are host:port addresses without an http:// prefix.
      - targets: ['10.211.55.25:8080', '10.211.55.25:8081','10.211.55.25:8082']
        labels:
          group: 'client-golang'
      - targets: ['10.211.55.25:9100']
        labels:
          group: 'client-node-exporter'
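Entries under targets are plain host:port addresses; the scheme comes from the job's scheme setting, which defaults to http. As far as I know, Prometheus refuses to load a configuration whose target contains a slash, so a URL such as http://10.211.55.25:9100 is not accepted as a target. The helper checkTarget below is a hypothetical pre-flight check written for this article, not part of Prometheus.

```go
package main

import (
	"fmt"
	"strings"
)

// checkTarget reports an error for entries that look like URLs or paths
// rather than bare host:port addresses. It is a simplified stand-in for
// the validation Prometheus performs when it loads the configuration.
func checkTarget(target string) error {
	if target == "" {
		return fmt.Errorf("empty target")
	}
	if strings.Contains(target, "/") {
		return fmt.Errorf("%q is not a host:port address (drop the scheme and path)", target)
	}
	return nil
}

func main() {
	for _, t := range []string{"10.211.55.25:8080", "http://10.211.55.25:9100"} {
		if err := checkTarget(t); err != nil {
			fmt.Println("bad:", err)
		} else {
			fmt.Println("ok: ", t)
		}
	}
}
```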

You can see that all the endpoints are up and being scraped

Prometheus has exporters for many kinds of tools; they are worth exploring if you are interested

3. Install Pushgateway

Pushgateway lets ephemeral and batch jobs expose their metrics to Prometheus. Because such jobs may not live long enough to be scraped, they can instead push their metrics to the Pushgateway. Prometheus collects data with a pull model, as the 5-second scrape interval we just configured shows. Some data does not suit this model, and for that data the Pushgateway can be used. It acts like a cache: when a job finishes collecting its data, it uploads the data to the gateway, and Prometheus pulls it from there later. Let's try it. First, start the Pushgateway

mkdir -p /home/chenqionghe/promethues/pushgateway
cd !$
docker run -d -p 9091:9091 --name pushgateway prom/pushgateway

Visit http://10.211.55.25:9091 to confirm that Pushgateway is running

Next we can push data to Pushgateway. Prometheus provides SDKs for several languages, but the simplest way is from the shell

Push a single metric

echo "cqh_metric 3.14" | curl --data-binary @- http://Ubuntu-linux:9091/metrics/job/cqh

Push multiple metrics

cat <<EOF | curl --data-binary @- http://10.211.55.25:9091/metrics/job/cqh/instance/test
# Gym membership price
muscle_metric{label="gym"} 8800
# The big three lifts, in kg
bench_press 100
dead_lift 160
deep_squal 160
EOF
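The same push can be done from a program. The following Go sketch mirrors the curl commands above using only the standard library: it builds the /metrics/job/<job>/instance/<instance> URL, renders the metrics as text, and sends them with POST. The function names are my own, and the gateway address has to be passed on the command line; official client libraries offer a more complete push API.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"
	"sort"
	"strings"
)

// pushURL builds the Pushgateway URL /metrics/job/<job>[/instance/<instance>].
func pushURL(base, job, instance string) string {
	u := base + "/metrics/job/" + url.PathEscape(job)
	if instance != "" {
		u += "/instance/" + url.PathEscape(instance)
	}
	return u
}

// encodeMetrics renders unlabelled metrics in the text exposition format,
// one "name value" pair per line, sorted by name. Every line, including
// the last one, ends with a newline.
func encodeMetrics(metrics map[string]float64) string {
	names := make([]string, 0, len(metrics))
	for n := range metrics {
		names = append(names, n)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, n := range names {
		fmt.Fprintf(&b, "%s %g\n", n, metrics[n])
	}
	return b.String()
}

// push sends the metrics with POST, like "curl --data-binary @-".
func push(base, job, instance string, metrics map[string]float64) error {
	resp, err := http.Post(pushURL(base, job, instance), "text/plain",
		strings.NewReader(encodeMetrics(metrics)))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("pushgateway returned %s", resp.Status)
	}
	return nil
}

func main() {
	metrics := map[string]float64{"bench_press": 100, "dead_lift": 160}
	fmt.Print(encodeMetrics(metrics))
	// The gateway is contacted only when an address is given, for example:
	//   go run main.go http://10.211.55.25:9091
	if len(os.Args) > 1 {
		if err := push(os.Args[1], "cqh", "test", metrics); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	}
}
```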

Then we add Pushgateway to prometheus.yml and reload the configuration. After that, the metrics we just pushed can be found by searching

4. Install Grafana for visualization

Grafana is an open-source application for visualizing large-scale metrics data. It provides a powerful and elegant way to create, share, and browse data, and its dashboards display metric data from your different data sources. Grafana is most commonly used for internet infrastructure and application analytics, but it is also useful in other fields such as industrial sensors, home automation, and process control. Grafana supports hot-pluggable panels and extensible data sources; currently supported data sources include Graphite, InfluxDB, OpenTSDB, Elasticsearch, and Prometheus.

We install it with Docker

docker run -d -p 3000:3000 --name grafana grafana/grafana

The default username and password are both admin. After logging in you will see the following interface

Let's add a data source

Fill in the Prometheus address

Import the Prometheus dashboard template

Open the menu in the upper left corner and select the imported template; you will see that various graphs are already available

Let's add a chart of our own

Specify the metrics and keywords you want to see, then click Save in the upper right corner

You will see data like the following

So far we have automated data collection and display. Next, let's see how Prometheus can send alerts automatically

5. Install Alertmanager

Alerting in Prometheus consists of two separate parts. Alerting rules in the Prometheus server send alerts to Alertmanager. Alertmanager then manages those alerts, including silencing, inhibition, aggregation, and sending notifications through channels such as email, PagerDuty, and HipChat. The main steps for setting up alerts and notifications are:

Create and configure Alertmanager.
Start the Prometheus server and configure the Alertmanager address so that Prometheus can connect to Alertmanager. Prometheus 1.x used the -alertmanager.url flag for this; Prometheus 2.x, used in this tutorial, sets the address in the alerting section of prometheus.yml.

Create and configure Alertmanager

mkdir -p /home/chenqionghe/promethues/alertmanager
cd !$

Create the configuration file alertmanager.yml

global:
  resolve_timeout: 5m
route:
  group_by: ['cqh']
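  group_wait: 10s # How long to wait before sending the first notification for a new group
  group_interval: 10s # How long to wait before notifying about new alerts added to an existing group
  repeat_interval: 1m # How long to wait before re-sending a notification for alerts that are still firing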
  receiver: 'web.hook'
receivers:
  - name: 'web.hook'
    webhook_configs:
      - url: 'http://10.211.55.2:8888/open/test'
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']

This configures a webhook receiver named web.hook. When the Prometheus server notifies Alertmanager, Alertmanager automatically calls the webhook at http://10.211.55.2:8888/open/test

Now run Alertmanager

docker rm -f alertmanager
docker run -d -p 9093:9093 \
--name alertmanager \
-v /home/chenqionghe/promethues/alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml \
prom/alertmanager

Visit http://10.211.55.25:9093

Next, modify the server configuration to add alerting rules and the Alertmanager address. Edit the rules file /home/chenqionghe/promethues/server/rules.yml

groups:
  - name: cqh
    rules:
      - alert: cqh test 
        expr: dead_lift > 150
        for: 1m
        labels:
          status: warning
        annotations:
          summary: "{{$labels.instance}}: Deadlift over the limit! Lightweight baby!!!"
          description: "{{$labels.instance}}: Deadlift over the limit! Lightweight baby!!!"

This rule means that if the deadlift stays above 150 kg for one minute, an alert fires. Then modify the Prometheus configuration to add the Alertmanager settings

global:
  scrape_interval:     15s # Default scrape interval: scrape targets every 15 seconds.
  external_labels:
    monitor: 'codelab-monitor'
rule_files:
  - /etc/prometheus/rules.yml
# Scrape target configuration
scrape_configs:
  # Every time series scraped by this job automatically gets the label job="prometheus".
  - job_name: 'prometheus'
    scrape_interval: 5s # Overrides the global scrape interval: 5 seconds instead of 15
    static_configs:
      - targets: ['localhost:9090']
      - targets: ['10.211.55.25:8080', '10.211.55.25:8081','10.211.55.25:8082']
        labels:
          group: 'client-golang'
      - targets: ['10.211.55.25:9100']
        labels:
          group: 'client-node-exporter'
      - targets: ['10.211.55.25:9091']
        labels:
          group: 'pushgateway'
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["10.211.55.25:9093"]

Reload the Prometheus configuration and the rules take effect. Now let's look at how the data changes in Grafana

Then open the Alerts page in Prometheus. You will see the alert go from green to yellow to red, at which point the alert has been triggered
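The three colours correspond to the alert states inactive, pending, and firing. The Go sketch below is a simplified model of how a rule with a for clause moves between them: the alert becomes pending the first time the expression is true and fires once it has stayed true for the whole for duration. The 15-second evaluation spacing in main is an assumption made for the demo, and real Prometheus tracks this per label set.

```go
package main

import "fmt"

// sample is one evaluation of the rule expression.
type sample struct {
	atSeconds int     // seconds since the first evaluation
	value     float64 // value of the metric at that time
}

// alertState replays evaluations of "value > threshold" with a hold time
// (the rule's "for") and returns the state after the last evaluation:
// "inactive", "pending" or "firing".
func alertState(samples []sample, threshold float64, holdSeconds int) string {
	state := "inactive"
	activeSince := 0
	for _, s := range samples {
		if s.value <= threshold {
			state = "inactive" // condition no longer holds, so the timer resets
			continue
		}
		if state == "inactive" {
			state = "pending"
			activeSince = s.atSeconds
		}
		if s.atSeconds-activeSince >= holdSeconds {
			state = "firing"
		}
	}
	return state
}

func main() {
	// dead_lift stays at 160 and the rule "dead_lift > 150, for: 1m"
	// is evaluated every 15 seconds (assumed spacing).
	var seen []sample
	for t := 0; t <= 60; t += 15 {
		seen = append(seen, sample{atSeconds: t, value: 160})
		fmt.Printf("t=%2ds state=%s\n", t, alertState(seen, 150, 60))
	}
}
```

In this model the alert stays pending (yellow) for the first four evaluations and turns firing (red) at the 60-second mark; a single value at or below 150 in between would send it back to inactive (green).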

Next, let's look at the webhook endpoint that receives the alert. Mine is written in Go; after receiving the data, it forwards the body content as an alert to DingTalk
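The author's webhook code is not shown, so the following is only a sketch of what such a receiver might look like in Go with the standard library. It decodes the few fields of the Alertmanager webhook body that it needs and turns each alert into one line of text. The path /open/test and port 8888 come from alertmanager.yml above; the forwarding to DingTalk is left out and replaced by a log line.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
)

// webhookPayload holds only the fields of the Alertmanager webhook body
// that this sketch uses.
type webhookPayload struct {
	Status string `json:"status"`
	Alerts []struct {
		Status      string            `json:"status"`
		Labels      map[string]string `json:"labels"`
		Annotations map[string]string `json:"annotations"`
	} `json:"alerts"`
}

// summarize turns a webhook body into one text line per alert.
func summarize(body []byte) ([]string, error) {
	var p webhookPayload
	if err := json.Unmarshal(body, &p); err != nil {
		return nil, err
	}
	lines := make([]string, 0, len(p.Alerts))
	for _, a := range p.Alerts {
		lines = append(lines, fmt.Sprintf("[%s] %s: %s",
			a.Status, a.Labels["alertname"], a.Annotations["summary"]))
	}
	return lines, nil
}

// handle receives the POST from Alertmanager and logs each alert.
func handle(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	lines, err := summarize(body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	for _, line := range lines {
		log.Println(line) // a real receiver would forward this to a chat tool
	}
	w.WriteHeader(http.StatusOK)
}

func main() {
	// "go run main.go serve" starts the receiver; without arguments a
	// hand-written sample payload is summarized instead.
	if len(os.Args) > 1 && os.Args[1] == "serve" {
		http.HandleFunc("/open/test", handle)
		log.Fatal(http.ListenAndServe(":8888", nil))
	}
	sampleBody := []byte(`{"status":"firing","alerts":[{"status":"firing","labels":{"alertname":"cqh test"},"annotations":{"summary":"demo"}}]}`)
	lines, err := summarize(sampleBody)
	fmt.Println(lines, err)
}
```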

The alert content received in DingTalk is as follows

That concludes this introduction to building automatic monitoring and alerting with Prometheus from scratch: a one-stop setup with automatic scraping of endpoints, automatic alerting, and elegant charts. What are you waiting for? Give it a try!

Copyright notice
This article was written by [Brother Xing plays with the clouds]. Please include the original link when reposting. Thanks.
https://yzsam.com/2022/207/202207261948488643.html