On December 8, 2020, the memory usage of the marketing mrc application suddenly climbed, resulting in a system OOM
2022-07-07 08:57:00 【bboyzqh】
Background
Around noon on December 8, the memory usage of the mrc application suddenly started to climb steadily, from 67% to about 85% (see the monitoring below). Fortunately the rise was relatively slow, and a decisive restart solved the problem. The troubleshooting and analysis process is described below.
Problem solving process

mrc is the underlying application of the marketing system and mainly handles rule calculation. It runs on 6 machines in total, spread over 2 clusters whose traffic is isolated from each other (for example, upper-layer traffic in the hipc cluster is never routed to machines in the k8s cluster). Memory on all 6 machines kept rising at the same time; see sketch 1.
A big promotion was running at noon that day. With only 3 machines available to share the traffic, I was afraid that while one was restarting the other two could not withstand the promotion load, so at first I did not dare to restart machines one at a time. After a short while the decision was made anyway: CPU usage was only about 5%, and the biggest worry was that memory would suddenly run out, with frequent GC possibly affecting normal traffic. So I prepared for the worst and restarted decisively (removing traffic before the restart and dumping memory for later analysis). In the end there was no problem; see sketch 2. The whole process was as follows:
- Remove traffic from the machine to be restarted by setting its dubbo weight to 0. Dumping memory is itself a memory-consuming operation and the server may appear to hang ("feign death"), which would affect normal calls, so the traffic has to be removed first.
- Force a full GC on the target machine, so that memory held by ordinary reclaimable objects is freed and does not get mixed up with the truly leaked objects and distort the analysis. Any of the following commands can be used (13 is the PID of the Java process):
jmap -histo:live 13 (triggers a full GC)
or
jmap -dump:live,file=dump_001.bin 13 (triggers a full GC; delete dump_001.bin afterwards)
or
jcmd 13 GC.run (calls System.gc(), which normally triggers a full GC rather than only a young GC)
- Dump the memory of the target machine with the following command:
jmap -dump:format=b,file=dumpFile 13
- Analyze the dump file with IBMAnalyzer (or the jvisualvm tool that ships with the JDK, or MAT).
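The dump step can also be performed from inside the JVM instead of through jmap. The following is only a minimal sketch, assuming a HotSpot JVM: the class name HeapDumper and the file name dump_001.hprof are made up for illustration, and this is an alternative to the commands above, not what was run during the incident.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of mrc): dumps the heap of the current JVM.
public class HeapDumper {

    // Writes a heap dump containing only reachable objects to the given path,
    // similar in spirit to the "live" option of jmap -dump.
    public static Path dumpLiveHeap(Path target) throws IOException {
        // dumpHeap refuses to overwrite an existing file, so remove stale dumps first.
        Files.deleteIfExists(target);
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // Second argument true = dump only live (reachable) objects.
        bean.dumpHeap(target.toString(), true);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Assumed file name; recent JDKs require the .hprof extension.
        Path out = dumpLiveHeap(Paths.get("dump_001.hprof"));
        System.out.println("heap dump written to " + out.toAbsolutePath()
                + " (" + Files.size(out) + " bytes)");
    }
}
```

As with jmap, dumping from inside the process is expensive, so the traffic should still be removed from the machine first.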
In hindsight, the best plan would have been to have the operations team add one more mrc machine and then restart the machines one by one; see sketch 3.
Post-incident analysis
After the incident I analyzed the dump file. Because it involves specific business logic I will not go into detail and will only state the conclusion: the problem was caused by the shadow library that was configured for mrc that day. The root cause is that the druid thread that monitors the shadow library configuration does not exit when the stress test ends. After the stress test, with mrc not restarted, thread creation kept being triggered, so the memory of the mrc application kept rising.
Follow the WeChat official account: Fang Chen's blog
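The general pattern behind this kind of leak can be sketched as follows. This is a hypothetical illustration and not druid's actual code: ConfigMonitor, triggerWithoutCleanup and the thread names are invented. Each trigger starts a new background monitor thread, and because nothing stops the old ones, the number of live threads (and the memory they hold) only grows until the process is restarted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of the leak pattern; this is NOT druid's code.
public class MonitorThreadLeakDemo {

    // A background "config monitor" that keeps looping until stop() is called.
    static final class ConfigMonitor {
        private final AtomicBoolean running = new AtomicBoolean(true);
        private final Thread thread;

        ConfigMonitor(String name) {
            thread = new Thread(() -> {
                while (running.get()) {
                    try {
                        Thread.sleep(50); // stand-in for polling the configuration
                    } catch (InterruptedException e) {
                        return; // interrupted by stop(): leave the loop
                    }
                }
            }, name);
            thread.setDaemon(true);
            thread.start();
        }

        // The cleanup step that is missing in the leaking scenario.
        void stop() throws InterruptedException {
            running.set(false);
            thread.interrupt();
            thread.join();
        }

        boolean isAlive() {
            return thread.isAlive();
        }
    }

    // Simulates repeated triggers that each create a monitor and never stop it.
    static List<ConfigMonitor> triggerWithoutCleanup(int times) {
        List<ConfigMonitor> leaked = new ArrayList<>();
        for (int i = 0; i < times; i++) {
            leaked.add(new ConfigMonitor("shadow-config-monitor-" + i));
        }
        return leaked;
    }

    // Counts the monitors whose thread is still running.
    static long countAlive(List<ConfigMonitor> monitors) {
        return monitors.stream().filter(ConfigMonitor::isAlive).count();
    }

    public static void main(String[] args) throws InterruptedException {
        List<ConfigMonitor> monitors = triggerWithoutCleanup(5);
        System.out.println("alive after triggers: " + countAlive(monitors));
        for (ConfigMonitor m : monitors) {
            m.stop();
        }
        System.out.println("alive after stop: " + countAlive(monitors));
    }
}
```

In a thread dump or heap dump, a leak of this shape typically shows up as a growing number of threads with similar names, which is one thing worth checking when memory rises slowly with low CPU.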