How to monitor micro web services
2022-07-29 04:52:00 【nginx】
When I started out, I had no real idea how to monitor these sites, so I want to quickly write down how I did it.
I'm not going to talk about how to monitor large, serious, mission-critical websites, only small and unimportant ones.

The goal: spend very little time on operations
I want the sites to work most of the time, but I also don't want to spend time on ongoing operations.
I was initially very wary of running servers, because at my last job I was on a 24/7 on-call rotation for some critical services, and in my mind "being responsible for servers" meant "getting woken up at 2am to fix the servers" and "having lots of complicated dashboards".
So for a while I only made static websites, so that I wouldn't have to think about servers.
But eventually I realized that any server I was going to write would be very low stakes: if it occasionally goes down for 2 hours, that's no big deal, and I just need to set up some very simple monitoring to help keep it running.

Not having monitoring is bad
At first I didn't set up any monitoring for my servers at all. The result was very predictable: sometimes a site went down and I didn't find out until somebody told me!

Step 1: an uptime checker
The first step was to set up an uptime checker. There are many of these out there; the ones I'm using are updown.io and Uptime Robot. I prefer updown's user interface and pricing structure (it is billed per request rather than monthly), but Uptime Robot has a more generous free tier.
These services:
- check whether the site is up
- email me if it goes down

Step 2: an end-to-end health check
Next, let's talk about what "check whether the site is up" actually means.
At first, I just made one of my health check endpoints a function that returned 200 OK no matter what.
That was somewhat useful: it told me that the server was up!
But, as you might expect, I ran into problems because it didn't check whether the API was actually working. Sometimes the health check succeeded even though the rest of the service had gotten into a bad state.
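
The gap described above is easy to reproduce. The following is a small sketch, not code from any of these sites: `newBrokenService`, `statusOf`, and the `/health` and `/api` paths are hypothetical names. It shows a shallow health endpoint still answering 200 while the API endpoint is failing.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newBrokenService builds a hypothetical service whose API is stuck in a
// bad state while its shallow health endpoint still answers 200.
func newBrokenService() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK) // never looks at the API
	})
	mux.HandleFunc("/api", func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, "backend unavailable", http.StatusInternalServerError)
	})
	return mux
}

// statusOf sends one in-process GET to the handler and returns the status code.
func statusOf(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	svc := newBrokenService()
	fmt.Println("health:", statusOf(svc, "/health"))
	fmt.Println("api:", statusOf(svc, "/api"))
}
```

An uptime checker pointed at `/health` here would report the site as up, which is exactly the false reassurance a shallow check gives.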
So I updated it to actually make an API request and make sure that the request succeeds.
All of my services do very little (the nginx playground has just one endpoint), so it's easy to set up a health check that actually runs through most of what the service is supposed to do.
Here is what the end-to-end health check handler for the nginx playground looks like. It's very basic: it just makes a POST request (to itself) and checks whether that request succeeds or fails.

    import (
        "log"
        "net/http"
        "strings"
    )

    func healthHandler(w http.ResponseWriter, r *http.Request) {
        // make a request to localhost:8080 with `healthcheckJSON` as the body
        // if it works, return 200
        // if it doesn't, return 500
        client := http.Client{}
        // Assumption: the URL (host taken from the comment above), the content type,
        // and the strings.NewReader wrapper are illustrative; healthcheckJSON is
        // assumed to be a string defined elsewhere in the program.
        resp, err := client.Post("http://localhost:8080", "application/json", strings.NewReader(healthcheckJSON))
        if err != nil {
            log.Println(err)
            w.WriteHeader(http.StatusInternalServerError)
            return
        }
        defer resp.Body.Close()
        if resp.StatusCode != http.StatusOK {
            log.Println(resp.StatusCode)
            w.WriteHeader(http.StatusInternalServerError)
            return
        }
        w.WriteHeader(http.StatusOK)
    }

Right now, most of my health checks run every hour, and some run every 30 minutes.
I run them hourly because updown.io's pricing is per health check, I'm monitoring 18 different URLs, and I want to keep my health check budget at a minimal $5/year.
Taking an hour to find out that one of these sites is down is fine with me. If there is a problem, there's no guarantee I'd be able to fix it quickly anyway.
If I could run them more often, I would probably run them every 5-10 minutes.
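
To see why the check interval matters under per-check pricing, here is a small sketch that counts how many requests 18 URLs generate per year at different intervals. `checksPerYear` is a hypothetical helper, a 365-day year is assumed, and no actual prices are modeled.

```go
package main

import "fmt"

// checksPerYear returns how many requests a monitor sends in a 365-day year
// when it checks `urls` URLs once every `intervalMinutes` minutes.
// It assumes the interval divides the number of minutes in a year evenly.
func checksPerYear(urls, intervalMinutes int) int {
	const minutesPerYear = 365 * 24 * 60
	return urls * (minutesPerYear / intervalMinutes)
}

func main() {
	for _, interval := range []int{60, 30, 10, 5} {
		fmt.Printf("every %2d min: %d checks/year\n", interval, checksPerYear(18, interval))
	}
}
```

Hourly checks on 18 URLs come to 157,680 requests a year; checking every 5 minutes multiplies that by 12, to 1,892,160.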

Step 3: automatically restart if the health check fails
Some of my sites run on fly.io, and fly has a fairly standard feature where I can configure an HTTP health check for a service and have the service restarted if the health check fails.
"Restart frequently" is a very useful strategy for papering over bugs I haven't fixed yet. For a while the nginx playground had a process leak where nginx processes weren't being terminated, so the server kept running out of memory.
With the health check in place, the result was that this happened every other day or so:
- the server ran out of memory
- the health check started failing
- the service got restarted
- everything was fine again
- a few hours later, the whole saga repeated
The health checks that decide whether to restart a service run more frequently: about every 5 minutes.
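
The restart cycle above can be sketched as a small piece of decision logic. This is not fly.io's implementation: the `watchdog` type, its `observe` method, and the threshold of 2 consecutive failures are assumptions chosen for illustration.

```go
package main

import "fmt"

// watchdog tracks consecutive failed health checks and decides when to restart.
type watchdog struct {
	threshold int // failures in a row before restarting (assumed value)
	failures  int
}

// observe records one check result and reports whether a restart is due.
// The counter resets after a success and after a restart is requested.
func (w *watchdog) observe(healthy bool) bool {
	if healthy {
		w.failures = 0
		return false
	}
	w.failures++
	if w.failures >= w.threshold {
		w.failures = 0
		return true
	}
	return false
}

func main() {
	w := &watchdog{threshold: 2}
	// Simulated results: healthy at first, then the server runs out of memory.
	for i, healthy := range []bool{true, true, false, false, true} {
		if w.observe(healthy) {
			fmt.Printf("check %d: restarting service\n", i+1)
		}
	}
}
```

Requiring more than one failure in a row avoids restarting on a single slow response, at the cost of reacting one check interval later.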

This is not the best way to monitor a large service
This is probably obvious, and I said it at the start, but "write one HTTP health check" is not the best approach for monitoring a large, complex service. I won't go into that here, though, because it isn't the subject of this post.

It has been working well so far!
I originally wrote this post 3 months ago, in April, but I waited until now to publish it to make sure the whole setup was working.
It has made a big difference: before, I had some very silly downtime problems, and over the past few months the sites have been up 99.95% of the time!
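
For a sense of scale, here is a short sketch converting an uptime percentage into minutes of downtime. The 30-day and 90-day windows are assumptions, since the exact measurement period is not stated, and `downtimeMinutes` is a hypothetical helper.

```go
package main

import "fmt"

// downtimeMinutes converts an uptime percentage over a window of `days`
// into the total number of minutes the site was down in that window.
func downtimeMinutes(uptimePercent float64, days int) float64 {
	totalMinutes := float64(days * 24 * 60)
	return (100 - uptimePercent) / 100 * totalMinutes
}

func main() {
	for _, days := range []int{30, 90} {
		fmt.Printf("99.95%% over %d days: %.1f minutes down\n", days, downtimeMinutes(99.95, days))
	}
}
```

At 99.95%, that is about 21.6 minutes of downtime per 30 days, or roughly 65 minutes over 90 days.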