Meituan first-round interview: why doesn't a thread crash bring down the JVM?
2022-07-03 16:41:00 【InfoQ】

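- Does a thread crash always crash the process?
- How a process crashes: a brief introduction to signals
- Why a thread crash in the JVM does not crash the JVM process
- OpenJDK source code analysis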
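Does a thread crash always crash the process?

In general, yes, when the thread crashes because of an illegal memory access: all threads share the process's address space, so the operating system terminates the whole process. Typical illegal accesses are:

- Writing to read-only memory:
    #include <stdio.h>
    #include <stdlib.h>
    int main() {
        char *s = "hello world";
        // Writing to read-only memory crashes the process
        s[1] = 'H';
    }
- Accessing an address the process has no permission to access (such as kernel space):
    #include <stdio.h>
    #include <stdlib.h>
    int main() {
        int *p = (int *)0xC0000fff;
        // Writing to the process's kernel space crashes the process
        *p = 10;
    }
- In a 32-bit virtual address space, p points into kernel space, which the process has no permission to write, so the assignment above crashes the process.
- Accessing memory that does not exist, for example:
    #include <stdio.h>
    #include <stdlib.h>
    int main() {
        int *a = NULL;
        // Address 0 is not mapped, so this write crashes the process
        *a = 1;
    }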
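The examples above crash from the main thread. To see that a crash in any one thread takes down the whole process, here is a minimal sketch, assuming a POSIX system with pthreads. The names `crash_worker` and `crash_in_thread_status` are hypothetical helpers introduced for illustration. A child process starts a thread that writes through a null pointer, and the parent inspects how the child died.

```c
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Thread body (hypothetical helper): write through the pointer passed in.
   The caller passes NULL, so the write raises SIGSEGV. */
static void *crash_worker(void *arg) {
    volatile int *p = (volatile int *)arg;
    *p = 1; /* illegal write */
    return NULL;
}

/* Fork a child whose worker thread crashes.
   Returns the child's wait status, or -1 if fork/waitpid failed. */
int crash_in_thread_status(void) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        /* Child: make sure SIGSEGV has its default action. */
        signal(SIGSEGV, SIG_DFL);
        pthread_t tid;
        if (pthread_create(&tid, NULL, crash_worker, NULL) != 0) _exit(2);
        pthread_join(tid, NULL); /* the main thread only waits */
        _exit(0);                /* reached only if the thread did not crash */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return status;
}
```

If `WIFSIGNALED(status)` is true and `WTERMSIG(status)` equals `SIGSEGV`, the whole child process was killed by the signal, even though its main thread was only waiting in `pthread_join`.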
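How a process crashes: a brief introduction to signals

- The CPU executes the process's instructions as normal.
- A signal is generated for the process, for example through the kill system call or, for an illegal memory access, by the kernel in response to a CPU fault.
- The CPU suspends the running program and hands control to the operating system.
- The operating system delivers the signal to the process (assume signal 11, SIGSEGV, which is what an illegal memory access generally produces).
- The corresponding signal handler (function) runs. In most cases the process exits once the handler logic has finished; for SIGSEGV the default action terminates the process.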
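// Example: a custom signal handler
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
// Custom signal handler: run custom logic, then call exit.
// (printf and exit are not async-signal-safe; acceptable only in a small demo like this.)
void sigHandler(int sig) {
    printf("Signal %d caught!\n", sig);
    exit(sig);
}
int main(void) {
    signal(SIGSEGV, sigHandler);
    int *p = (int *)0xC0000fff;
    *p = 10; // Writing to kernel space, which does not belong to the process, raises SIGSEGV
}
// Output of the program above: Signal 11 caught!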
// Example: ignoring the signal
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
int main(void) {
    // Ignore SIGSEGV
    signal(SIGSEGV, SIG_IGN);
    // Generate a SIGSEGV signal
    raise(SIGSEGV);
    printf("Finished normally\n");
}
Why a thread crash does not crash the JVM process


OpenJDK source code analysis

JVM_handle_linux_signal(int sig,
                        siginfo_t* info,
                        void* ucVoid,
                        int abort_if_unrecognized) {
  // Must do this before SignalHandlerMark, if crash protection installed we will longjmp away
  // This call may invoke siglongjmp, which is mainly used to recover the thread
  os::ThreadCrashProtection::check_crash_protection(sig, t);
  if (info != NULL && uc != NULL && thread != NULL) {
    pc = (address) os::Linux::ucontext_get_pc(uc);
    // Handle ALL stack overflow variations here
    if (sig == SIGSEGV) {
      // Si_addr may not be valid due to a bug in the linux-ppc64 kernel (see
      // comment below). Use get_stack_bang_address instead of si_addr.
      address addr = ((NativeInstruction*)pc)->get_stack_bang_address(uc);
      // Check whether this is a stack overflow
      if (addr < thread->stack_base() &&
          addr >= thread->stack_base() - thread->stack_size()) {
        if (thread->thread_state() == _thread_in_Java) {
          // JVM-internal handling of the stack overflow
          stub = SharedRuntime::continuation_for_implicit_exception(thread, pc, SharedRuntime::STACK_OVERFLOW);
        }
      }
    }
  }
  if (sig == SIGSEGV &&
      !MacroAssembler::needs_explicit_null_check((intptr_t)info->si_addr)) {
    // The null pointer check happens here
    stub = SharedRuntime::continuation_for_implicit_exception(thread, pc, SharedRuntime::IMPLICIT_NULL);
  }
  // For a stack overflow or a null pointer, the handler returns true below and
  // never reaches the final report_and_die, so the JVM does not exit
  if (stub != NULL) {
    // save all thread context in case we need to restore it
    if (thread != NULL) thread->set_saved_exception_pc(pc);
    uc->uc_mcontext.gregs[REG_PC] = (greg_t)stub;
    // Returning true means the JVM process does not exit
    return true;
  }
  VMError err(t, sig, pc, info, ucVoid);
  // Generate the hs_err_pid_xxx.log file and exit
  err.report_and_die();
  ShouldNotReachHere();
  return true; // Mute compiler
}
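- A stack overflow and a null pointer access do both raise SIGSEGV. The virtual machine simply chooses not to exit: it handles the signal internally, resumes the thread's execution, and throws StackOverflowError or NullPointerException. That is why the JVM does not crash and why we can catch these two errors/exceptions.
- If the JVM does no extra handling for SIGSEGV or another signal in the function above, execution eventually reaches report_and_die. This method mainly generates the hs_err_pid_xxx.log crash file (which records stack information and error details) and then exits.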
Summary