当前位置:网站首页>Full GC (Ergonomics)排查分析
Full GC (Ergonomics)排查分析
2022-07-31 11:53:00 【南风知我意丿】
Full GC [Ergonomics]排查分析
问题描述
有一个线上Spark 任务频繁GC最后导致挂掉,下面是对它的排查分析
GC日志
{
Heap before GC invocations=4312 (full 121):
PSYoungGen total 673792K, used 648704K [0x00000000d5580000, 0x0000000100000000, 0x0000000100000000)
eden space 648704K, 100% used [0x00000000d5580000,0x00000000fcf00000,0x00000000fcf00000)
from space 25088K, 0% used [0x00000000fcf00000,0x00000000fcf00000,0x00000000fe780000)
to space 25088K, 0% used [0x00000000fe780000,0x00000000fe780000,0x0000000100000000)
ParOldGen total 1398272K, used 1398167K [0x0000000080000000, 0x00000000d5580000, 0x00000000d5580000)
object space 1398272K, 99% used [0x0000000080000000,0x00000000d5565cf0,0x00000000d5580000)
Metaspace used 133520K, capacity 143098K, committed 143508K, reserved 1175552K
class space used 16519K, capacity 17415K, committed 17596K, reserved 1048576K
2022-07-27T00:10:42.519+0800: 18668.196: [Full GC (Ergonomics) [PSYoungGen: 648704K->57867K(673792K)] [ParOldGen: 1398167K->1397781K(1398272K)] 2046871K->1455648K(2072064K), [Metaspace: 133520K->133520K(1175552K)], 0.6053255 secs] [Times: user=14.25 sys=0.02, real=0.61 secs]
Heap after GC invocations=4312 (full 121):
PSYoungGen total 673792K, used 57867K [0x00000000d5580000, 0x0000000100000000, 0x0000000100000000)
eden space 648704K, 8% used [0x00000000d5580000,0x00000000d8e02da0,0x00000000fcf00000)
from space 25088K, 0% used [0x00000000fcf00000,0x00000000fcf00000,0x00000000fe780000)
to space 25088K, 0% used [0x00000000fe780000,0x00000000fe780000,0x0000000100000000)
ParOldGen total 1398272K, used 1397781K [0x0000000080000000, 0x00000000d5580000, 0x00000000d5580000)
object space 1398272K, 99% used [0x0000000080000000,0x00000000d55054d0,0x00000000d5580000)
Metaspace used 133520K, capacity 143098K, committed 143508K, reserved 1175552K
class space used 16519K, capacity 17415K, committed 17596K, reserved 1048576K
}
{
Heap before GC invocations=4313 (full 122):
PSYoungGen total 673792K, used 648704K [0x00000000d5580000, 0x0000000100000000, 0x0000000100000000)
eden space 648704K, 100% used [0x00000000d5580000,0x00000000fcf00000,0x00000000fcf00000)
from space 25088K, 0% used [0x00000000fcf00000,0x00000000fcf00000,0x00000000fe780000)
to space 25088K, 0% used [0x00000000fe780000,0x00000000fe780000,0x0000000100000000)
ParOldGen total 1398272K, used 1397781K [0x0000000080000000, 0x00000000d5580000, 0x00000000d5580000)
object space 1398272K, 99% used [0x0000000080000000,0x00000000d55054d0,0x00000000d5580000)
Metaspace used 133520K, capacity 143098K, committed 143508K, reserved 1175552K
class space used 16519K, capacity 17415K, committed 17596K, reserved 1048576K
2022-07-27T00:10:48.605+0800: 18674.282: [Full GC (Ergonomics) [PSYoungGen: 648704K->56957K(673792K)] [ParOldGen: 1397781K->1398167K(1398272K)] 2046485K->1455125K(2072064K), [Metaspace: 133520K->133520K(1175552K)], 0.7382079 secs] [Times: user=18.29 sys=0.05, real=0.74 secs]
Heap after GC invocations=4313 (full 122):
PSYoungGen total 673792K, used 56957K [0x00000000d5580000, 0x0000000100000000, 0x0000000100000000)
eden space 648704K, 8% used [0x00000000d5580000,0x00000000d8d1f710,0x00000000fcf00000)
from space 25088K, 0% used [0x00000000fcf00000,0x00000000fcf00000,0x00000000fe780000)
to space 25088K, 0% used [0x00000000fe780000,0x00000000fe780000,0x0000000100000000)
ParOldGen total 1398272K, used 1398167K [0x0000000080000000, 0x00000000d5580000, 0x00000000d5580000)
object space 1398272K, 99% used [0x0000000080000000,0x00000000d5565df0,0x00000000d5580000)
Metaspace used 133520K, capacity 143098K, committed 143508K, reserved 1175552K
class space used 16519K, capacity 17415K, committed 17596K, reserved 1048576K
}
{
Heap before GC invocations=4314 (full 123):
PSYoungGen total 673792K, used 648704K [0x00000000d5580000, 0x0000000100000000, 0x0000000100000000)
eden space 648704K, 100% used [0x00000000d5580000,0x00000000fcf00000,0x00000000fcf00000)
from space 25088K, 0% used [0x00000000fcf00000,0x00000000fcf00000,0x00000000fe780000)
to space 25088K, 0% used [0x00000000fe780000,0x00000000fe780000,0x0000000100000000)
ParOldGen total 1398272K, used 1398167K [0x0000000080000000, 0x00000000d5580000, 0x00000000d5580000)
object space 1398272K, 99% used [0x0000000080000000,0x00000000d5565df0,0x00000000d5580000)
Metaspace used 133520K, capacity 143098K, committed 143508K, reserved 1175552K
class space used 16519K, capacity 17415K, committed 17596K, reserved 1048576K
2022-07-27T00:10:54.991+0800: 18680.667: [Full GC (Ergonomics) [PSYoungGen: 648704K->57325K(673792K)] [ParOldGen: 1398167K->1398090K(1398272K)] 2046871K->1455415K(2072064K), [Metaspace: 133520K->133520K(1175552K)], 0.6338603 secs] [Times: user=14.77 sys=0.02, real=0.64 secs]
Heap after GC invocations=4314 (full 123):
PSYoungGen total 673792K, used 57325K [0x00000000d5580000, 0x0000000100000000, 0x0000000100000000)
eden space 648704K, 8% used [0x00000000d5580000,0x00000000d8d7b600,0x00000000fcf00000)
from space 25088K, 0% used [0x00000000fcf00000,0x00000000fcf00000,0x00000000fe780000)
to space 25088K, 0% used [0x00000000fe780000,0x00000000fe780000,0x0000000100000000)
ParOldGen total 1398272K, used 1398090K [0x0000000080000000, 0x00000000d5580000, 0x00000000d5580000)
object space 1398272K, 99% used [0x0000000080000000,0x00000000d55528a0,0x00000000d5580000)
Metaspace used 133520K, capacity 143098K, committed 143508K, reserved 1175552K
class space used 16519K, capacity 17415K, committed 17596K, reserved 1048576K
}
在页面上搜Full GC 可以看到非常的频繁
原因分析:
jstat -gcutil < pid >
可以看到 youngGC 基本不变,Full GC一致飙升
GC日志
可以看到Full GC原因基本是 (Ergonomics)而且GC对老年代已经起不了作用了
2022-07-27T00:10:48.605+0800: 18674.282: [Full GC (Ergonomics) [PSYoungGen: 648704K->56957K(673792K)] [ParOldGen: 1397781K->1398167K(1398272K)] 2046485K->1455125K(2072064K), [Metaspace: 133520K->133520K(1175552K)], 0.7382079 secs] [Times: user=18.29 sys=0.05, real=0.74 secs]
查找资料,发现当使用 Server 模式下的ParallelGC 收集器组合(Parallel Scavenge + Serial Old)时,会在 Minor GC前进行一次判断,也就是 内存空间分配担保机制:
Eden 空间不足发生 Minor GC 之前,虚拟机先检查老年代最大可用的连续空间是否大于新生代所有对象总空间,如果条件成立的话,那么 Minor GC 可以确认是安全的。如果不成立的话虚拟机查看 HandlePromotionFailure 的值是否允许担保失败,如果HandlePromotionFailure 的值不允许冒险,那么就要进行一次 Full GC。如果允许则检查老年代最大可用的连续空间是否大于历次晋升到老年代对象的平均大小,如果大于将继续尝试进行 Minor GC,小于的话就要进行 Full GC以保证 Minor GC 的空间担保成功
这个 Java 服务没有通过启动参数配置垃圾收集器,查看堆配置发现是使用的默认 Parallel GC 。显然,合理的推测是空间分配担保失败导致了Full GC
jmap -heap < pid>
看到发现老年代的使用已经达到了 99.97697796737938%,只剩下了 0.03M,是可以和之前的推测相佐证的
PS Old Generation
capacity = 179306496 (171.0MB)
used = 179265216 (170.96063232421875MB)
free = 41280 (0.03936767578125MB)
99.97697796737938% used
jmap -histo < pid> | head -n 30
查看堆内存中占用内存最多的前 30 个类实例,发现可疑点
ExectionListenerBus一直增加,导致堆内存一直被占用得不到释放
解决方案:
查了spark 的github看到有人提了PR,得知是spark框架的bug,本次bug在spark3.0.3已经修复
边栏推荐
- ApiPost is really fragrant and powerful, it's time to throw away Postman and Swagger
- The most complete phpmyadmin vulnerability summary
- 无法将“node.exe”项识别为 cmdlet、函数、脚本文件或可运行程序的名称。
- MySQL日志中“binlog”的三种格式玩起来真爽
- 瑞吉外卖项目:新增菜品与菜品分页查询
- pycharm汉化教程(碧蓝幻想汉化插件安装)
- MySQL index usage and optimization
- 502 bad gateway原因、解决方法
- Experience innovation and iteration through the development of lucky draw mini-programs
- St. Regis Takeaway Project: New dishes and dishes paged query
猜你喜欢

无法将“node.exe”项识别为 cmdlet、函数、脚本文件或可运行程序的名称。

The item 'node.exe' was not recognized as the name of a cmdlet, function, script file, or runnable program.

WebGL给Unity传递参数问题1: Cannot read properties of undefined (reading ‘SendMessage‘)
502 bad gateway causes and solutions

【软件工程之美 - 专栏笔记】33 | 测试工具:为什么不应该通过QQ/微信/邮件报Bug?

Power BI----几个常用的分析方法和相适应的视觉对象

cesium-Web网页优化进阶

最全phpmyadmin漏洞汇总

chroot命令

Service discovery of kubernetes
随机推荐
am335x 看门狗驱动&看门狗应用例程序
Android studio连接MySQL并完成简单的登录注册功能
Cloudera Manager —— 端到端的企业数据中心管理工具
订song餐系统
【软件工程之美 - 专栏笔记】33 | 测试工具:为什么不应该通过QQ/微信/邮件报Bug?
VBA实现双击单元格自动输出对号再次双击取消对号
R语言做面板panelvar例子
台达PLC出现通信错误或通信超时或下载时提示机种不符的解决办法总结
IDEA 配置方法注释自动参数
线程池 ThreadPoolExecutor 详解
IDEA configure method annotation automatic parameters
After class, watching the documentation and walking back to the lab, I picked up the forgotten SQL operators again
5 open source Rust web development frameworks, which one do you choose?
JVS应用中心
SAP Commerce Cloud Product Review 的添加逻辑
kernel syscore
Is the working process of the belt you know the story - actionreducerstore
mpu9150(driverack pa简明教程)
How MySQL's allowMultiQueries flag relates to JDBC and jOOQ
科学论文和学术论文写作
