Summary of common Thrift errors and how to locate their causes
2022-07-03 13:49:00 【yolo_ yyh】
Contents
The most common Apache Thrift error messages
Problem location: No more data to read.
Problem location: Connection refused.
Problem location: No route to host.
Problem location: Called write on non-open socket.
Problem location: THRIFT_EAGAIN (timed out).
Problem location: THRIFT_EAGAIN (unavailable resources).
Problem location: socket open() error: No route to host
The most common Apache Thrift error messages:
No more data to read
Called write on non-open socket
Connection refused
THRIFT_EAGAIN (timed out)
Interrupted system call
When a connection error occurs, tools such as ping, netstat, ss, nc, and telnet let you quickly check the network state of a node. Note that when an RPC error shows up in the log or the UI, the node whose state you need to check is the destination node. For example, netstat -anp | grep <port number> shows whether the process is successfully listening on the specified port and also lists the current connections on that port.
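The same first check can be scripted. The following is a minimal sketch, assuming plain TCP reachability is what you want to test; the function name probe and the returned labels are illustrative and are not part of Thrift.

```python
import errno
import socket


def probe(host, port, timeout=3.0):
    """Classify the result of a TCP connect attempt to host:port.

    Hypothetical helper: the labels roughly mirror the error
    messages discussed in this article.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        return "timed out"
    except socket.gaierror:
        # The hostname could not be resolved at all.
        return "hostname did not resolve"
    except OSError as exc:
        if exc.errno == errno.ECONNREFUSED:
            return "connection refused"
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "no route to host"
        return "error: %s" % exc
```

Calling probe with the destination node's host and Thrift port from the node that reported the RPC error gives a quick hint about which of the sections below applies.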
Problem location: No more data to read.
"No more data to read." is the most common Apache Thrift error. Its root cause is that the connection was closed by the peer. The message is specific to Thrift, so whenever it appears the problem involves Thrift. Possible causes:
1. On a long-lived connection, if the connection stays idle longer than the server's receive timeout, the server closes it. The next attempt to send data over that connection fails with "No more data to read.".
2. If recv on the server side of the connection is interrupted by the system (an interrupted system call), the server also closes the connection. Any further client operation on that connection then fails with "No more data to read.".
3. Under high concurrency, the client's connect can succeed while the overloaded server has not actually accepted the connection. When the client then communicates over this connection, it also gets "No more data to read.". Tuning TCP kernel parameters can mitigate this, but the change must be applied on all nodes of the cluster at the same time and requires root privileges, so proceed with caution.

During the TCP three-way handshake the kernel maintains two queues: the half-open connection queue (SYN queue) and the fully established connection queue (accept queue).
In the first step of the handshake, the server receives the client's SYN; the kernel places the connection in the SYN queue and replies to the client with SYN+ACK (the second step). In the third step, the server receives the client's ACK. If the accept queue is not full at that point, the kernel removes the connection from the SYN queue and adds it to the accept queue, where it waits for the application process to take it out by calling accept. If the accept queue is full, the kernel's behavior depends on the tcp_abort_on_overflow kernel parameter:
tcp_abort_on_overflow=0: the server discards the client's ACK.
tcp_abort_on_overflow=1: the server sends a reset (RST) to the client.
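To see which of the two behaviors a node is configured for, the parameter can be read from /proc/sys on Linux. This is a minimal sketch under that assumption; the helper names sysctl_path, read_sysctl, and overflow_behavior are made up for illustration.

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Translate a dotted sysctl name into its file path under /proc/sys."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the integer value of a sysctl (Linux only)."""
    return int(sysctl_path(name, root).read_text().strip())


def overflow_behavior(value):
    """Describe what the kernel does when the accept queue is full."""
    if value == 0:
        return "discard the client's ACK"
    return "send RST to the client"


if __name__ == "__main__":
    value = read_sysctl("net.ipv4.tcp_abort_on_overflow")
    print("tcp_abort_on_overflow =", value, "->", overflow_behavior(value))
```

Reading the value needs no special privileges; changing it does, which is why the adjustment should be coordinated across the cluster.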
Problem location: Connection refused.
"Connection refused." usually means the server process has crashed. Check the server's log and confirm whether the server process was running during the period in which the error was reported. If Connection refused is reported over a long period while the process on the target node looks normal, confirm the following points:
1. The hostname-to-IP mapping is correct. Check the /etc/hosts file and fix any wrong entry promptly.
2. The process on the target node is listening on the port as expected.
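The hostname mapping check in point 1 can be automated. The sketch below assumes the usual /etc/hosts layout (an IP address followed by one or more names, with # starting a comment); parse_hosts and check_mapping are hypothetical helper names, and the node names and addresses in the usage example are invented.

```python
def parse_hosts(text):
    """Map each hostname in /etc/hosts-style text to the list of IPs bound to it."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, []).append(ip)
    return mapping


def check_mapping(text, hostname, expected_ip):
    """Return 'ok', 'missing', or a 'mismatch: ...' description for one hostname."""
    ips = parse_hosts(text).get(hostname, [])
    if not ips:
        return "missing"
    if set(ips) == {expected_ip}:
        return "ok"
    return "mismatch: " + ", ".join(ips)
```

For example, check_mapping(open("/etc/hosts").read(), "node1", "10.0.0.11") could be run on every node of the cluster to spot a missing or conflicting entry.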
Problem location: No route to host.
A No route to host error usually means that the target node's server has been restarted.
Another cause of No route to host is a wrong hostname mapping in /etc/hosts. In that case, carefully check the hostname mappings on all nodes. Pay particular attention when adding nodes to the cluster: it is easy to forget to add the new node's hostname mapping on the existing nodes.
Problem location: Called write on non-open socket.
This error occurs when opening the socket connection failed and a send is then attempted on the invalid connection. It can generally be avoided by retrying. If the error keeps occurring, check the target node's process status, port listening, and hostname mapping, following the same steps as for "Connection refused.".
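A retry of this kind might look like the sketch below. It is an illustration rather than Thrift's own mechanism: it only assumes a transport object with isOpen(), open(), and close() methods (as Thrift's Python transports provide), and call_with_retry, retries, and delay are hypothetical names.

```python
import time


def call_with_retry(transport, request, retries=3, delay=0.5):
    """Run request(), reopening the transport before each attempt if needed."""
    attempts = max(1, retries)
    last_error = None
    for attempt in range(attempts):
        try:
            if not transport.isOpen():
                transport.open()
            return request()
        except Exception as exc:
            # With Thrift you would likely narrow this to TTransportException.
            last_error = exc
            transport.close()
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_error
```

Here request would be a zero-argument callable that performs the RPC through a client bound to the transport. If every attempt fails, the last error is re-raised so that persistent failures still surface and can be investigated.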
Problem location: THRIFT_EAGAIN (timed out).
This error means the client timed out while receiving. It may be caused by too many connection tasks.
Another cause of timeouts is a saturated CPU on the target node, so pay close attention to the target node's CPU usage.
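The underlying mechanism is an ordinary socket receive timeout. The sketch below reproduces it with plain Python sockets rather than Thrift; recv_with_timeout is a hypothetical helper, and the timeout value is whatever your client is configured with.

```python
import socket


def recv_with_timeout(sock, nbytes, timeout):
    """Return received bytes, or None if the peer sent nothing within timeout seconds."""
    sock.settimeout(timeout)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        # The peer is connected but did not answer in time,
        # for example because it is overloaded.
        return None
```

If the server is merely slow rather than dead, raising the client's receive timeout is one possible mitigation, but checking the load on the target node should come first.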
Problem location: THRIFT_EAGAIN (unavailable resources).
This error means the client timed out while receiving and also exceeded Thrift's retry limit for recv().
Problem location: socket open() error: No route to host
Check whether a firewall is running on the target node and on the local node, and make sure it is turned off.
On CentOS 7, view the firewall status with: firewall-cmd --state (root privileges may be required)
Turn off the firewall with: systemctl stop firewalld.service