A monitoring gap: monitoring Eureka instance status with Zabbix
2022-07-06 06:55:00 【Uncle MUNE loves operation and maintenance】
Background
Before introducing the monitoring itself, let's first look at two components used in our microservice architecture.
Eureka
Eureka is the registry of Spring Cloud. Its main job is to provide service registration and discovery.
Eureka uses a CS (Client/Server) architecture made up of the following two components:
Eureka Server
The Eureka service registry, which mainly provides the service registration function. When a microservice starts, it registers itself with Eureka Server. Eureka Server maintains a list of available services that stores information about every available service registered with it, and these services can be viewed directly in the Eureka Server management interface.
Eureka Client
The Eureka client, which usually refers to the individual microservices in the system and is mainly used to interact with Eureka Server. After a microservice application starts, Eureka Client sends heartbeats to Eureka Server (every 30 seconds by default). If Eureka Server receives no heartbeat from a Eureka Client over several heartbeat periods, it removes that client from the list of available services (after 90 seconds by default).
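The heartbeat and removal timing described above can be sketched as follows. This only illustrates the rule and is not Eureka's actual implementation; the helper name `is_evictable` and the two constants are assumptions chosen to match the default values quoted above.

```python
# Illustration only: this is not Eureka's real eviction code.
HEARTBEAT_INTERVAL_SECONDS = 30   # default heartbeat period quoted above
LEASE_DURATION_SECONDS = 90       # default silence allowed before removal


def is_evictable(last_heartbeat, now, lease_duration=LEASE_DURATION_SECONDS):
    """Return True if no heartbeat has arrived within the lease duration.

    Both timestamps are seconds on the same clock (hypothetical helper).
    """
    return (now - last_heartbeat) >= lease_duration


if __name__ == "__main__":
    # 30 seconds of silence: the instance stays in the list.
    print(is_evictable(last_heartbeat=0, now=HEARTBEAT_INTERVAL_SECONDS))
    # 90 seconds of silence: the instance becomes eligible for removal.
    print(is_evictable(last_heartbeat=0, now=LEASE_DURATION_SECONDS))
```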
Apollo
Apollo is a reliable distributed configuration management center that originated in Ctrip's framework R&D department. It centrally manages application configuration across different environments and clusters, pushes configuration changes to applications in real time, and provides well-defined permission and workflow governance, which makes it a good fit for microservice configuration management.
The Spring Cloud configuration center, by contrast, integrates with configuration sources such as Git and SVN repositories to fetch system configuration parameters in a uniform way, and the parameters are consumed by the Spring Cloud Config Client. Its configuration files must follow a strict syntax: even an extra space can leave the client unable to read the corresponding parameter, and such mistakes are easy to overlook.
Because of these shortcomings of the Spring Cloud configuration center, we have been replacing it step by step with Apollo, which makes it easier for both operations and development to manage configuration files across environments.
Problem
Apollo makes configuration changes take effect in real time (hot release): after a user modifies and publishes a configuration in Apollo, the client receives the latest configuration almost immediately (within about 1 second) and notifies the application.
Apollo's hot release meets our requirement that applications notice changes to configuration properties in real time. In practice, however, if Eureka Client performs a full update when it automatically refreshes its configuration, the health check status on Eureka Server during the heartbeat cycle looks like this:
- The client's service discovery status is "UNKNOWN"
- The client keeps running normally in the background, but cannot send a normal heartbeat to Eureka Server
Because the service discovery status on Eureka Server is abnormal, the client cannot serve external requests normally. If operations staff do not check the status of each client in the Eureka management interface in time, a production incident follows.
Note: each client corresponds to one instance, so from here on we call it an instance.
Requirement
In the situation above, we already have health checks on each instance, but because the instance itself is running normally, no alarm is raised. Our monitoring clearly still has a gap, so we need Zabbix to monitor the status of each Eureka instance in order to achieve full coverage of application monitoring.
Approach
Eureka Server groups registered services by application, and each application has multiple instances. We therefore use Zabbix auto discovery: the Eureka API returns all of the grouping information, so monitoring items do not have to be added by hand every time.
Because Zabbix monitoring items cannot be duplicated, we name each item "application name/instance IP address" to distinguish between instances. This requires that the same application is not deployed more than once on one server; otherwise the monitoring items would be duplicated.
Note: using InstanceId to distinguish instances would actually be more reasonable, but InstanceId is often not used in a standardized way. If it contains the IP, the host name and so on, the resulting string can be long enough to cause unnecessary trouble.
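The uniqueness constraint behind this naming rule can be checked with a short sketch. The helper names `item_name` and `duplicate_item_names` and the sample data are hypothetical; the sketch only shows how two instances of one application on the same IP address collide under the "application name/instance IP address" rule.

```python
from collections import Counter


def item_name(app, ip):
    """Build the 'application name/instance IP address' item name."""
    return "%s/%s" % (app, ip)


def duplicate_item_names(instances):
    """Return the sorted item names that occur more than once.

    `instances` is an iterable of (application name, IP address) pairs.
    """
    counts = Counter(item_name(app, ip) for app, ip in instances)
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    # Hypothetical data: TEST1 is deployed twice on 192.168.3.10.
    sample = [
        ("TEST1", "192.168.3.10"),
        ("TEST1", "192.168.3.10"),
        ("TEST1", "192.168.3.11"),
    ]
    print(duplicate_item_names(sample))
```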
Eureka API
# Get all applications registered in Eureka
http://192.168.3.123:1180/eureka/apps
# Get all instances of one application
http://192.168.3.123:1180/eureka/apps/<application name>/
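To make the shape of the response easier to picture, here is a minimal sketch that parses a hand-written sample. The payload is hypothetical and heavily trimmed: it contains only the fields that the script later in this article reads (`app`, `hostName`, `status`), whereas a real response carries many more fields per instance. The helper name `list_instances` is also an assumption.

```python
import json

# Hypothetical, trimmed stand-in for the JSON returned by /eureka/apps.
SAMPLE_RESPONSE = """
{
  "applications": {
    "application": [
      {
        "name": "TEST1",
        "instance": [
          {"app": "TEST1", "hostName": "192.168.3.10", "status": "UP"},
          {"app": "TEST1", "hostName": "192.168.3.11", "status": "UP"}
        ]
      }
    ]
  }
}
"""


def list_instances(payload):
    """Return (app, hostName, status) tuples for every instance in the payload."""
    return [
        (instance["app"], instance["hostName"], instance["status"])
        for app in payload["applications"]["application"]
        for instance in app["instance"]
    ]


if __name__ == "__main__":
    for row in list_instances(json.loads(SAMPLE_RESPONSE)):
        print(row)
```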
Implementation
Because the data returned by the Eureka API has to be parsed, we use Python to parse the JSON data.
Instance auto discovery
# Run the script to auto-discover application names and instance IP addresses
python eureka-instance.py discovery
{
    "data":[
        {
            "{#APP}":"TEST1",
            "{#HOSTNAME}":"192.168.3.10"
        },
        {
            "{#APP}":"TEST1",
            "{#HOSTNAME}":"192.168.3.11"
        }
    ]
}
Once we have {#APP} and {#HOSTNAME}, we can combine them into monitoring item names that follow the naming rule.
# Combined monitoring item names
TEST1/192.168.3.10
TEST1/192.168.3.11
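The combination step can be sketched as a plain string substitution over the discovery data. The helper `expand` is hypothetical, and the key template `instance_status[{#APP},{#HOSTNAME}]` is an assumption inferred from the item key that appears in the alarm messages in this article; Zabbix performs this kind of macro substitution itself, so the sketch is only meant to show what the result looks like.

```python
import json

# Discovery output in the same shape as the script prints above.
DISCOVERY_OUTPUT = """
{"data": [
    {"{#APP}": "TEST1", "{#HOSTNAME}": "192.168.3.10"},
    {"{#APP}": "TEST1", "{#HOSTNAME}": "192.168.3.11"}
]}
"""


def expand(template, entry):
    """Replace each discovery macro in the template with its value."""
    for macro, value in entry.items():
        template = template.replace(macro, value)
    return template


if __name__ == "__main__":
    for entry in json.loads(DISCOVERY_OUTPUT)["data"]:
        # Item name following the naming rule
        print(expand("{#APP}/{#HOSTNAME}", entry))
        # Assumed item key template
        print(expand("instance_status[{#APP},{#HOSTNAME}]", entry))
```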
Getting the monitoring item status
With the auto-discovered data, we can go on to query the status of each monitoring item.
# 1. Get the status of instance 192.168.3.10
python eureka-instance.py status TEST1 192.168.3.10
# Output
UP
# 2. Get the status of instance 192.168.3.11
python eureka-instance.py status TEST1 192.168.3.11
# Output
UP
Whatever state an instance is in, an alarm is raised as long as the result is not "UP".
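The alarm rule can be written down as a one-line check. This is a Python sketch of the logic only, not a Zabbix trigger expression, and the helper name `needs_alarm` is hypothetical. It treats empty output and error text the same as any other non-"UP" value; how Zabbix itself handles empty output is not modelled here.

```python
def needs_alarm(script_output):
    """Return True unless the script output is exactly 'UP'.

    Surrounding whitespace (such as the trailing newline from print) is ignored.
    """
    return script_output.strip() != "UP"


if __name__ == "__main__":
    for output in ("UP\n", "DOWN\n", "UNKNOWN\n", ""):
        print(repr(output), needs_alarm(output))
```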
The final script
#!/usr/local/miniconda/bin/python
# -*- coding: utf-8 -*-
# Purpose:
# 1. Zabbix auto discovery of Eureka instances
# 2. Monitor and alarm on the status of each instance
import json
import sys

import requests

EUREKA_APPS_URL = "http://192.168.3.123:1180/eureka/apps/"
USAGE = "Usage: python eureka-instance.py [discovery]|[status app hostName]"
# Ask for JSON data; otherwise XML data is returned
headers = {'Accept': 'text/html, application/xhtml+xml, application/json;q=0.9, */*;q=0.8'}


def instance_discovery():
    """Print every instance as Zabbix auto discovery JSON."""
    app_list = []
    try:
        response = requests.get(EUREKA_APPS_URL, headers=headers)
        if response.status_code == 200:
            for app in response.json()["applications"]["application"]:
                for instance in app["instance"]:
                    # A new dict is built per instance, so no deepcopy is needed
                    app_list.append({
                        "{#APP}": instance["app"],
                        "{#HOSTNAME}": instance["hostName"],
                    })
            # Serialize to JSON
            print(json.dumps({"data": app_list}, sort_keys=True, indent=4, separators=(',', ':')))
    except Exception as e:
        print(e)


def instance_status():
    """Print the status of one instance: status app hostName."""
    if len(sys.argv) != 4:
        print("Usage: python eureka-instance.py status app hostName")
        return
    try:
        url = "%s%s/" % (EUREKA_APPS_URL, sys.argv[2])
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            for instance in response.json()["application"]["instance"]:
                if sys.argv[3] == instance["hostName"]:
                    print(instance["status"])
    except Exception as e:
        print(e)


if __name__ == '__main__':
    # Guard against a missing argument instead of raising IndexError
    action = sys.argv[1] if len(sys.argv) > 1 else ''
    if action == 'discovery':
        instance_discovery()
    elif action == 'status':
        instance_status()
    else:
        print(USAGE)
Integrating with Zabbix
1. Configuration file
vim eureka.conf
UserParameter=instance_discovery,/usr/local/miniconda/bin/python /etc/zabbix/monitor_scripts/eureka-instance.py discovery
UserParameter=instance_status[*],/usr/local/miniconda/bin/python /etc/zabbix/monitor_scripts/eureka-instance.py status "$1" "$2"
2. Auto discovery

3. Monitoring item configuration

4. Alarm messages
# 1. Status is DOWN: alarm raised
Alarm host: middleware_eureka_192.168.3.123
Host IP: 192.168.3.123
Host group: middleware_eureka
Alarm time: 2022.06.01 14:58:23
Recovery time: 2022.06.01 15:13:24
Alarm level: High
Alarm message: Eureka/TEST1/192.168.3.10: status is DOWN
Alarm item: instance_status[TEST1,192.168.3.10]
Problem details:
TEST1/192.168.3.10: DOWN
Current state:
Alarm raised
# 2. Status is UP: alarm recovered
Alarm host: middleware_eureka_192.168.3.123
Host IP: 192.168.3.123
Host group: middleware_eureka
Alarm time: 2022.06.01 14:58:23
Recovery time: 2022.06.01 15:13:24
Alarm level: High
Alarm message: Eureka/TEST1/192.168.3.10: status is DOWN
Alarm item: instance_status[TEST1,192.168.3.10]
Problem details:
TEST1/192.168.3.10: UP
Current state:
Alarm recovered: UP