Troubleshooting process for online problems
2022-08-05 06:13:00 【sick caterpillar】
Troubleshooting
- This post organizes troubleshooting approaches for the common kinds of online (production) problems.
Business problems
- Most online problems are caused by business logic. When most requests in the online environment succeed but some requests, or a single user's requests, fail, how do we troubleshoot?
- A typical microservice system has a distributed tracing system and an ELK log system, so we can locate the point of failure through the monitoring platform:
- Collecting the exception logs

- From there, log tracing gives us the affected user's request information:

- Use the watch command of Arthas to monitor the failing interface, take the request parameters from the logs, and replay the online user's request with the invoke command of Dubbo, so that the problem can be reproduced and then fixed.
Non-business problems
- Arthas is a good tool for locating problems online, and it is easy to install.
- When troubleshooting non-business problems, first check the core machine resources: CPU, memory, threads, and so on.
- The dashboard command shows this information for the current service and refreshes it every few seconds.
- Thread area: thread id, name, state, CPU usage, whether it is a daemon thread, etc.
- Memory area: heap memory, Eden space, Survivor space, old generation, method area
- Machine information
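To make the dashboard fields above more concrete, here is a minimal sketch that reads similar data from inside a JVM through the standard java.lang.management beans. It is only an illustration and assumes Java 9 or later (for ThreadInfo.isDaemon()); the class name DashboardSketch and its methods are hypothetical, and it reports accumulated CPU time per thread rather than the usage figure the dashboard displays.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: prints thread and memory-pool data similar to a dashboard view.
public class DashboardSketch {

    // One row per live thread: id, name, state, accumulated CPU time, daemon flag.
    public static List<String> threadRows() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean cpuOk = mx.isThreadCpuTimeSupported() && mx.isThreadCpuTimeEnabled();
        List<String> rows = new ArrayList<>();
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            // Accumulated CPU time in milliseconds, or -1 when the JVM cannot measure it.
            long cpuMs = cpuOk ? mx.getThreadCpuTime(info.getThreadId()) / 1_000_000 : -1;
            rows.add(String.format("%-5d %-30s %-14s cpu=%dms daemon=%b",
                    info.getThreadId(), info.getThreadName(), info.getThreadState(),
                    cpuMs, info.isDaemon()));
        }
        return rows;
    }

    // One row per memory pool (Eden, Survivor, old generation, Metaspace, ...).
    // Pool names depend on the garbage collector in use.
    public static List<String> memoryRows() {
        List<String> rows = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long maxK = usage.getMax() < 0 ? -1 : usage.getMax() / 1024; // -1 means undefined
            rows.add(String.format("%-35s %-16s used=%dK max=%dK",
                    pool.getName(), pool.getType(), usage.getUsed() / 1024, maxK));
        }
        return rows;
    }

    public static void main(String[] args) {
        threadRows().forEach(System.out::println);
        memoryRows().forEach(System.out::println);
    }
}
```

Running main once gives a single snapshot; the dashboard command refreshes its view periodically, which this sketch does not attempt.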

From the thread information above we can get the thread id of the key threads.
Then thread <thread_id> prints the execution stack of that thread, without even taking a thread dump.
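As an illustration of what looking up a stack by thread id involves, the sketch below does the same thing from inside a JVM with the standard ThreadMXBean. It is an assumption-laden example, not a description of how Arthas works; the class ThreadStackSketch and the method stackOf are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;

// Hypothetical helper: formats the stack of one thread, selected by id.
public class ThreadStackSketch {

    // threadId must be positive; ThreadMXBean rejects other values.
    public static String stackOf(long threadId) {
        ThreadInfo info = ManagementFactory.getThreadMXBean()
                .getThreadInfo(threadId, Integer.MAX_VALUE); // full stack depth
        if (info == null) {
            return "thread " + threadId + " not found"; // not alive or never existed
        }
        StringBuilder sb = new StringBuilder();
        sb.append('"').append(info.getThreadName()).append('"')
          .append(" id=").append(info.getThreadId())
          .append(" state=").append(info.getThreadState()).append('\n');
        for (StackTraceElement frame : info.getStackTrace()) {
            sb.append("    at ").append(frame).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Print the stack of the current thread as a quick demonstration.
        System.out.println(stackOf(Thread.currentThread().getId()));
    }
}
```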
There is also jad, which decompiles a loaded class so that its source can be inspected online, which is convenient for troubleshooting.
However, most online incidents leave no time for this kind of investigation: a production system has to be restored quickly, so there is little time for locating the problem live.
I proceed as follows:
- Restart the problematic machines one at a time to see whether that fixes the problem.
- Meanwhile, before restarting the last machine, run jmap -dump on it to save the state of the Java heap.
- If service is not restored after the restarts, roll back to the previous version to keep the online business running normally.
- Copy the saved dump file to a local machine.
- Open the dump file in VisualVM, which ships with the JDK (through JDK 8; it is a separate download since JDK 9).
- VisualVM shows the classes recorded in the dump, the instances of each class, and their contents, so the cause can be analyzed offline and then fixed.
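The heap-saving step above uses jmap from the command line (the usual form is jmap -dump:format=b,file=<path> <pid>). As a hedged alternative for cases where running jmap against the process is not convenient, the sketch below writes the same kind of .hprof file from inside the JVM through the HotSpot diagnostic bean. It assumes a HotSpot-based JVM; the class HeapDumpSketch and the file name incident.hprof are hypothetical, and this is not the procedure described in the post.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: writes a heap dump of the current JVM to a file.
public class HeapDumpSketch {

    // target must not exist yet; recent JDKs also require the name to end with ".hprof".
    // liveOnly = true dumps only reachable objects.
    public static Path dumpHeap(Path target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean diag =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        diag.dumpHeap(target.toAbsolutePath().toString(), liveOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heapdump");
        Path file = dumpHeap(dir.resolve("incident.hprof"), true);
        System.out.println("wrote " + file + " (" + Files.size(file) + " bytes)");
    }
}
```

The resulting file can be opened in VisualVM in the same way as a dump produced by jmap.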