Filling a monitoring gap: monitoring Eureka instance status with Zabbix
2022-07-06 06:55:00 【Uncle MUNE loves operation and maintenance】
Background
Before introducing the monitoring itself, let's look at two components used in our microservice architecture.
Eureka
Eureka is the registry for Spring Cloud. Its main job is service registration and discovery.
Eureka uses a client/server (CS) architecture with the following two components:
Eureka Server
The service registry, which provides the service registration function. When a microservice starts, it registers itself with Eureka Server. Eureka Server maintains a list of available services and stores information about every service registered with it. These services can be viewed directly in the Eureka Server management interface.
Eureka Client
The Eureka client, which usually means the microservices in the system. It is mainly used to interact with Eureka Server. After a microservice application starts, Eureka Client sends heartbeats to Eureka Server (every 30 seconds by default). If Eureka Server receives no heartbeat from a Eureka Client over several heartbeat cycles, it removes that client from the list of available services (after 90 seconds by default).
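To make the heartbeat and eviction timing concrete, here is a minimal Python sketch of the rule described above. It is a simplified model for illustration only, not Eureka's actual implementation: the function names and the in-memory registry dict are our own assumptions, and only the 30-second and 90-second defaults come from the text.

```python
HEARTBEAT_INTERVAL = 30  # seconds between heartbeats (default from the text)
LEASE_DURATION = 90      # seconds without a heartbeat before removal (default from the text)


def missed_heartbeats(last_heartbeat, now, interval=HEARTBEAT_INTERVAL):
    """Count how many full heartbeat cycles have passed since the last heartbeat."""
    return int((now - last_heartbeat) // interval)


def is_expired(last_heartbeat, now, lease_duration=LEASE_DURATION):
    """Return True when no heartbeat arrived within the lease duration."""
    return (now - last_heartbeat) > lease_duration


def evict_expired(registry, now):
    """Return a new registry holding only instances whose lease is still valid.

    registry maps a hypothetical instance id to the timestamp of its last heartbeat.
    """
    return {iid: ts for iid, ts in registry.items() if not is_expired(ts, now)}


if __name__ == "__main__":
    registry = {"TEST1/192.168.3.10": 100.0, "TEST1/192.168.3.11": 0.0}
    print(evict_expired(registry, now=100.0))
```

In this model an instance that last sent a heartbeat 100 seconds ago is dropped, while one that sent a heartbeat just now is kept.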
Apollo
Apollo is a reliable distributed configuration management center developed by Ctrip's framework R&D department. It centrally manages configuration for different environments and different clusters, pushes configuration changes to applications in real time, and provides standardized permission and process governance features. It is suited to microservice configuration management.
The Spring Cloud configuration center, by comparison, obtains system configuration parameters from sources such as a git or svn repository, and Spring Cloud Config Client consumes them. Its configuration files must follow a strict syntax. Even an extra space can leave the client unable to read the corresponding parameters, and when that happens the error is easy to overlook.
Because of these shortcomings in the Spring Cloud configuration center, we have been replacing it with Apollo step by step, which makes managing configuration files across environments easier for both operations and development.
Problem
Apollo supports configuration changes that take effect in real time (hot release). After a user modifies and publishes a configuration in Apollo, the client receives the latest configuration in near real time (about 1 second) and notifies the application.
Hot release meets our requirement that applications notice configuration changes immediately. In practice, however, if a Eureka Client performs a full update when it automatically refreshes its configuration, the health check status on Eureka Server during the heartbeat cycle ends up as follows:
- The client's service discovery status is "UNKNOWN"
- The client process is still running normally in the background, but it cannot send a normal heartbeat to Eureka Server
Because the service discovery status on Eureka Server is abnormal, the service cannot serve external requests properly. If operations staff do not check the status of each client in the Eureka management interface in time, a production incident follows.
Note: each client corresponds to one instance. From here on we call it an instance.
Requirement
We already run health checks against each instance, but in the situation above the instance itself keeps running normally, so no alarm fires. That leaves a gap in our monitoring. We therefore need Zabbix to monitor Eureka instance status so that application monitoring reaches full coverage.
Approach
Eureka Server groups registered services by application, and each application has multiple instances. We therefore use Zabbix auto discovery (low-level discovery): all the grouping information can be fetched through the Eureka API, so monitoring items do not have to be added by hand each time.
Because Zabbix monitoring items cannot be duplicated, we name each item "application name/instance IP address" to tell instances apart. This requires that the same application is not deployed more than once on a single server. Otherwise the monitoring items would collide.
Note: using InstanceId to tell instances apart would actually be more reasonable, but InstanceId values are often not standardized. If they contain the IP, host name and so on, the string can become long enough to cause unnecessary trouble.
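The naming rule and its one-instance-per-server constraint can be checked with a few lines of Python. This is a hedged sketch: item_name and find_duplicates are hypothetical helper names introduced here for illustration and are not part of the monitoring script.

```python
from collections import Counter


def item_name(app, host):
    """Build the item name used in this article: application name/instance IP."""
    return "%s/%s" % (app, host)


def find_duplicates(instances):
    """Return the item names that occur more than once in a list of (app, host) pairs."""
    counts = Counter(item_name(app, host) for app, host in instances)
    return sorted(name for name, n in counts.items() if n > 1)


if __name__ == "__main__":
    # The third entry simulates a second copy of TEST1 on the same server.
    instances = [
        ("TEST1", "192.168.3.10"),
        ("TEST1", "192.168.3.11"),
        ("TEST1", "192.168.3.10"),
    ]
    print(find_duplicates(instances))
```

Any name reported by find_duplicates would collide in Zabbix, which is exactly the case the deployment rule above is meant to prevent.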
Eureka API
# Get all applications registered in Eureka
http://192.168.3.123:1180/eureka/apps
# Get all instances of one application
http://192.168.3.123:1180/eureka/apps/<application name>/
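As a rough illustration of what the first endpoint returns, the sketch below flattens a JSON payload into (app, hostName, status) tuples. The sample payload is hand-written to match the fields our script reads (applications, application, instance, app, hostName, status) and is not real server output. The function name list_instances is hypothetical, and accepting a single instance given as an object rather than a list is a defensive assumption.

```python
def list_instances(payload):
    """Flatten an /eureka/apps JSON payload into (app, hostName, status) tuples."""
    result = []
    for app in payload["applications"]["application"]:
        instances = app["instance"]
        # Defensive assumption: tolerate a single instance given as an object.
        if isinstance(instances, dict):
            instances = [instances]
        for inst in instances:
            result.append((inst["app"], inst["hostName"], inst["status"]))
    return result


# Hand-written sample in the shape the script reads; not real output.
sample = {
    "applications": {
        "application": [
            {
                "name": "TEST1",
                "instance": [
                    {"app": "TEST1", "hostName": "192.168.3.10", "status": "UP"},
                    {"app": "TEST1", "hostName": "192.168.3.11", "status": "DOWN"},
                ],
            }
        ]
    }
}

if __name__ == "__main__":
    for row in list_instances(sample):
        print(row)
```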
Implementation
Because the data returned by the Eureka API has to be parsed, we use Python to parse the JSON.
Instance auto discovery
# Run the script to auto-discover application names and instance IP addresses
python eureka-instance.py discovery
{
    "data":[
        {
            "{#APP}":"TEST1",
            "{#HOSTNAME}":"192.168.3.10"
        },
        {
            "{#APP}":"TEST1",
            "{#HOSTNAME}":"192.168.3.11"
        }
    ]
}
With {#APP} and {#HOSTNAME} in hand, we can combine them into monitoring item names that follow the naming rule.
# Combined monitoring item names
TEST1/192.168.3.10
TEST1/192.168.3.11
Getting the monitoring item status
With the discovered data, we can go on to fetch the status of each monitored item.
# 1. Get the status of instance 192.168.3.10
python eureka-instance.py status TEST1 192.168.3.10
# Output
UP
# 2. Get the status of instance 192.168.3.11
python eureka-instance.py status TEST1 192.168.3.11
# Output
UP
An alarm is raised whenever an instance's status is anything other than "UP".
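The alarm rule itself is a single comparison. The sketch below expresses it in Python purely to illustrate the logic. In our setup the comparison is done by a Zabbix trigger, and should_alarm is a hypothetical name. Treating empty output as an alarm is our own assumption about how a missing instance should be handled.

```python
def should_alarm(status_output):
    """Return True unless the script output is exactly UP, ignoring surrounding whitespace."""
    return status_output.strip() != "UP"


if __name__ == "__main__":
    for value in ("UP\n", "DOWN", "UNKNOWN", ""):
        print(repr(value), should_alarm(value))
```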
The final script
#!/usr/local/miniconda/bin/python
# -*- coding: utf-8 -*-
# Purpose:
# 1. Zabbix auto discovery of Eureka instances
# 2. Monitor instance status and alarm on it
import requests
import json
import sys
from copy import deepcopy

# Ask for JSON; otherwise Eureka returns XML
headers = {'Accept':'text/html, application/xhtml+xml, application/json;q=0.9, */*;q=0.8'}

def instance_discovery():
    app_list = []
    url = "http://192.168.3.123:1180/eureka/apps/"
    try:
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            instance_dic = {}
            for app in response.json()["applications"]["application"]:
                for instance in app['instance']:
                    instance_dic['{#APP}'] = instance['app']
                    instance_dic['{#HOSTNAME}'] = instance['hostName']
                    # Deep copy so every list entry is an independent dict
                    app_list.append(deepcopy(instance_dic))
            # Serialize to the JSON structure used for Zabbix discovery
            discovery_app_info = {"data": app_list}
            print(json.dumps(discovery_app_info, sort_keys=True, indent=4, separators=(',', ':')))
    except Exception as e:
        print(e)

def instance_status():
    if len(sys.argv) == 4:
        try:
            url = "http://192.168.3.123:1180/eureka/apps/%s/" % (sys.argv[2])
            response = requests.get(url, headers=headers)
            if response.status_code == 200:
                for instance in response.json()["application"]["instance"]:
                    # Print the status of the instance whose hostName matches
                    if sys.argv[3] == instance["hostName"]:
                        print(instance["status"])
        except Exception as e:
            print(e)
    else:
        print("Usage: python eureka-instance.py status app hostName")

if __name__ == '__main__':
    # Check the argument count first so a bare invocation prints the usage text
    if len(sys.argv) > 1 and sys.argv[1] == 'discovery':
        instance_discovery()
    elif len(sys.argv) > 1 and sys.argv[1] == 'status':
        instance_status()
    else:
        print("Usage: python eureka-instance.py [discovery]|[status app hostName]")
Integrating with Zabbix
1. Configuration file
vim eureka.conf
UserParameter=instance_discovery,/usr/local/miniconda/bin/python /etc/zabbix/monitor_scripts/eureka-instance.py discovery
UserParameter=instance_status[*],/usr/local/miniconda/bin/python /etc/zabbix/monitor_scripts/eureka-instance.py status "$1" "$2"
2. Auto discovery
3. Monitor item configuration
4. Alarm messages
# 1. Status is DOWN, alarm raised
Alarm host: middleware_eureka_192.168.3.123
Host IP: 192.168.3.123
Host group: middleware_eureka
Alarm time: 2022.06.01 14:58:23
Recovery time: 2022.06.01 15:13:24
Alarm level: High
Alarm message: Eureka/TEST1/192.168.3.10: status is DOWN
Alarm item: instance_status[TEST1,192.168.3.10]
Problem details:
TEST1/192.168.3.10: DOWN
Current status:
Alarm raised
# 2. Status is UP, alarm recovered
Alarm host: middleware_eureka_192.168.3.123
Host IP: 192.168.3.123
Host group: middleware_eureka
Alarm time: 2022.06.01 14:58:23
Recovery time: 2022.06.01 15:13:24
Alarm level: High
Alarm message: Eureka/TEST1/192.168.3.10: status is DOWN
Alarm item: instance_status[TEST1,192.168.3.10]
Problem details:
TEST1/192.168.3.10: UP
Current status:
Alarm recovered: UP