当前位置:网站首页>I've been rejected by the product manager. Why don't you know

I've been rejected by the product manager. Why don't you know

2020-11-06 01:18:00 Yin Jihuan

Preface

A few days ago, I was chatting with readers , He said he was rejected by the product manager . The reason is online Bug 了 , Finally, it's the feedback from customers that we know .

I asked him : Are you not monitoring ?

readers : We are a newly established entrepreneurial team , The most important thing at the moment is the heap function , A lot of infrastructure doesn't have time to do .

How big a bowl is, how much rice do you eat , Don't blindly pursue large scale , That's a great plan , Just the right one . So is monitoring , Small plans as long as they're enough , Can solve problems , It's also a very good choice .

Let's introduce some common abnormal monitoring methods :

Cost minimization

If it's a newly established entrepreneurial team , We can use the minimum implementation cost to monitor the system exception in real time . The so-called minimum implementation cost , It can be implemented without relying on any three-party framework .

It can be used to manually bury the point to alarm the abnormality , In this way, it is better to alarm at the place of global exception handling , Can be managed in a unified way .

As the code shows :

  
  1. @ExceptionHandler(value = Exception.class)
  2. @ResponseBody
  3. public ResponseData<Object> defaultErrorHandler(HttpServletRequest req, Exception e) {
  4. // Record exceptions
  5. // Pin or SMS alert
  6. }

 picture

When we have global exception handling in our project , When the bottom reports an error , All exceptions will enter ExceptionHandler To deal with , stay ExceptionHandler We can pass HttpServletRequest To get the response request information and exception information , And then give an alarm .

Abnormal alarm information

Abnormal alarm information must be detailed , When something goes wrong on the line , The first time to fix this problem . If you don't have detailed information, you can't reproduce the problem at all , It's not easy to locate and solve .

Alarm information needs to have the following content :

  
  1. Alarm service :mobile-gateway
  2. person in charge :yinjihuan
  3. Request address :http://xxx.com/xxx/xxx?id=xxx
  4. Request body :{ "name": "xxx" }
  5. Request header :key=value
  6. Abnormal code :500
  7. Exception types :RuntimeException
  8. Exception stack :java.lang.RuntimeException: com.xxx.exception.ApplicationException: obtain XXX Information failed !

The most important thing is the request parameters , Errors can only be reproduced with parameters . What needs to be noticed is through HttpServletRequest An error will be reported when getting the request body , Because the stream can only be read once .

By the time the global exception handling class has been read , So we need to do something special , Write a filter to cache the value of the request body , Sure org.springframework.web.util.ContentCachingRequestWrapper Yes HttpServletRequest Decorate , And then through ContentCachingRequestWrapper Get request body .

Cost minimization + Give consideration to performance

The method of manual burying point can give real-time alarm to the abnormality , And then directly send SMS and other warning information , This process is synchronous , More or less increases the response time , However, if the request enters the exception handling area, it proves that the request has failed , The impact is not big .

Although the impact is not big , But you can still optimize it a little bit . The most common optimization method is to convert synchronous operation to asynchronous operation , For example, it is dropped into a separate thread pool for alarm , Put it in the memory queue , Use a single thread to get the alarm .

Local asynchrony may be lost , There are a few problems with this type of monitoring information loss , If you don't want to lose , An external message queue can be used to store alarm information , There are individual consumers who consume , Alarm operation .

 picture

Unified log monitoring

Ways to minimize costs , Just a few dozen lines of code can do it . The bad thing is that every project has to have this code , The alarm logic is also coupled in the code ..

what EFK,ELK I believe everyone has heard of , Collect the logs in a unified way , centralized management . Each system needs to write exception information to the local log when there is an error , There is no need to separately alert the exception , The alarm action can be done by a separate alarm system , The alarm system judges according to the collected logs , Whether an alarm is needed , Alarm frequency, etc .

 picture

Unified log monitoring needs to build a log platform , The cost is relatively high . Of course, you can also use open source solutions , There are also commercial solutions .

Business can use cloud services , Easy to use , Fast access , Support alarm rules of various dimensions , It's just a little expensive .

If you just want to monitor exceptions , I recommend an open source error tracking system ,Sentry Is an open source real-time error tracking system , It can help developers monitor and fix abnormal problems in real time , Of course Sentry There's also a commercial version .

APM monitor

apm(Application Performance Management) In addition to the call chain to the service , Performance is monitored in detail , At the same time, it also has better monitoring of abnormal information .

common apm Yes skywalking,pinpoint,cat etc. , With cat For example ,problem The report shows the error information of the application , And in cat The home page of the market will show the error of each application by minute , If there are a lot of mistakes , The color of the big plate is red , When you see a piece of red , There are too many exceptions .

 picture

Of course cat It also has alarm function , It is unrealistic to see the market by artificial timing , When there is a mistake , It's the timely warning that makes sense . Want to know more about cat You can take a look at my article :https://mp.weixin.qq.com/s/3mqmySr2nv4Xpd6nZlfsVg

summary

Do a minimum cost anomaly monitoring , It is estimated that it will be finished in one day . If you don't do , Then we can only wait for being rejected . You can't control it bug Next to impossible , If it's a program, there's bound to be bug. All we need to do is go out bug The first time I found this bug, And destroy it .

It's not easy to code words , If you can, let's have a triple shot , thank !

About author : Yin Jihuan , Simple technology enthusiasts ,《Spring Cloud Microservices - Full stack technology and case analysis 》, 《Spring Cloud Microservices introduction Actual combat and advanced 》 author , official account Ape world Originator .

I have compiled a complete set of learning materials , Those who are interested can search through wechat 「 Ape world 」, Reply key 「 Learning materials 」 Get what I've sorted out Spring Cloud,Spring Cloud Alibaba,Sharding-JDBC Sub database and sub table , Task scheduling framework XXL-JOB,MongoDB, Reptiles and other related information .

版权声明
本文为[Yin Jihuan]所创,转载请带上原文链接,感谢