Summary of common Thrift errors and how to locate their causes
2022-07-03 13:49:00 【yolo_ yyh】
Catalog
The most common Apache Thrift error messages
Problem location: No more data to read.
Problem location: Connection refused.
Problem location: No route to host.
Problem location: Called write on non-open socket.
Problem location: THRIFT_EAGAIN (timed out).
Problem location: THRIFT_EAGAIN (unavailable resources).
Problem location: socket open() error: There is no route to the host
The most common Apache Thrift error messages:
No more data to read
Called write on non-open socket
Connection refused
THRIFT_EAGAIN (timed out)
Interrupted system call
When a connection error occurs, tools and commands such as ping, netstat, ss, nc and telnet can be used to quickly check a node's network status. Note that when an RPC error appears in the logs or the UI, it is the status of the destination node that needs to be checked. For example, netstat -anp | grep "<port number>" shows whether the process is listening on the specified port, and also lists the current connections on that port.
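The same first check can be scripted. The following is a minimal sketch, not part of any Thrift tooling: the function name probe_port and the result strings are made up for illustration. It attempts a plain TCP connection to the destination node and reports the outcome in the terms used in this article.

```python
import errno
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connect to host:port and describe what happened.

    probe_port and its return strings are hypothetical helpers for this
    article; they only wrap the standard socket module.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on this port.
        return "connection refused"
    except socket.timeout:
        # No answer within the timeout (packet drop, overload, firewall).
        return "timed out"
    except socket.gaierror:
        # The hostname could not be resolved (DNS or /etc/hosts problem).
        return "hostname did not resolve"
    except OSError as exc:
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "no route to host"
        return f"error: {exc}"
```

Run it from the node that logged the RPC error, against the destination node's host and Thrift port, so that the result reflects the same network path.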
Problem location: No more data to read.
"No more data to read." is the most common Apache Thrift error. Its root cause is that the connection was closed by the peer. The message is specific to Thrift, so whenever you see it the problem is necessarily Thrift-related. Possible causes:
1. With a long-lived connection, if the connection stays idle longer than the server's receive timeout, the server closes it. Sending data on that connection afterwards produces "No more data to read.";
2. If the server's recv on the connection is interrupted by the system, the server also closes the connection. When the client then uses the connection, it gets "No more data to read." as well;
3. Under high concurrency, the client's connect succeeds, but the server is too heavily loaded to actually accept the connection. When the client then communicates over this connection, it also gets "No more data to read.". This can be avoided or mitigated by tuning TCP kernel parameters, but the change has to be made on all nodes of the cluster at the same time and requires root privileges, so caution is recommended.
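All three causes in the list above end the same way at the socket level: a read on the client side returns zero bytes because the peer has closed the connection. The sketch below reproduces that condition with plain Python sockets. It is an analogy rather than Thrift's own code, and read_exact is a hypothetical helper name.

```python
import socket


def read_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from sock.

    Raises EOFError if the peer closes the connection before n bytes
    arrive, which is the situation reported as "No more data to read.".
    """
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            # recv() returning b"" means the peer closed the connection.
            raise EOFError("No more data to read (peer closed the connection)")
        buf.extend(chunk)
    return bytes(buf)


if __name__ == "__main__":
    client, server = socket.socketpair()
    server.close()  # stands in for the server dropping an idle connection
    try:
        read_exact(client, 4)
    except EOFError as exc:
        print(exc)
    finally:
        client.close()
```

Because the client only learns about the close when it next uses the connection, an error of this kind says little about when the server actually closed it, which is why the server-side logs and timeouts need to be checked.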

During the TCP three-way handshake, the kernel maintains two queues: the half-open connection queue (the SYN queue) and the fully established connection queue (the accept queue).
When the server receives the client's SYN (first step of the handshake), the kernel stores the connection in the SYN queue and replies to the client with SYN+ACK (second step). When the server receives the client's ACK (third step), and the accept queue is not full, the kernel removes the connection from the SYN queue and adds it to the accept queue, where it waits for the application process to take it out by calling accept(). If the accept queue is full, the kernel's behavior depends on a kernel parameter:
tcp_abort_on_overflow=0: the server discards the client's ACK.
tcp_abort_on_overflow=1: the server sends a reset (RST) to the client.
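To see which behavior a node is currently configured for, the parameter can be read from /proc/sys on Linux without root privileges. This is a small sketch; read_sysctl is a hypothetical helper, and it only reads values, it does not change them.

```python
from pathlib import Path


def read_sysctl(name: str, root: str = "/proc/sys"):
    """Return the integer value of a sysctl, or None if it cannot be read.

    name uses dotted form, e.g. "net.ipv4.tcp_abort_on_overflow", which
    maps to the file /proc/sys/net/ipv4/tcp_abort_on_overflow on Linux.
    The root argument exists only so the function can be tested.
    """
    path = Path(root, *name.split("."))
    try:
        return int(path.read_text().split()[0])
    except (OSError, ValueError, IndexError):
        # Not Linux, file missing, or an unexpected (non-integer) value.
        return None
```

For example, read_sysctl("net.ipv4.tcp_abort_on_overflow") returns 0 or 1 on a Linux node. Two related Linux parameters that the article does not mention, net.core.somaxconn (upper bound on the accept queue length) and net.ipv4.tcp_max_syn_backlog (SYN queue length), can be read the same way when investigating queue overflow.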
Problem location: Connection refused.
"Connection refused." is usually caused by a crash of the server process. Check the server's log and confirm whether the server process was running during the period in which the error was reported. If Connection refused is reported over a long period while the process on the target node looks normal, confirm the following:
1. Check that the hostname-to-IP mapping is correct by looking at the /etc/hosts file. If the configuration is wrong, fix it promptly;
2. Check that the process on the target node is listening on the port as expected;
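The hostname check in point 1 can be automated across nodes. The sketch below parses /etc/hosts-style text and compares one hostname against the IP you expect. The function names and the example hostnames and addresses are hypothetical.

```python
def parse_hosts(text: str) -> dict:
    """Map every hostname or alias in /etc/hosts-style text to its IPs."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, []).append(ip)
    return mapping


def check_mapping(text: str, hostname: str, expected_ip: str) -> str:
    """Return "ok", "missing", or a "mismatch: ..." description."""
    ips = parse_hosts(text).get(hostname, [])
    if not ips:
        return "missing"
    if set(ips) == {expected_ip}:
        return "ok"
    return f"mismatch: {hostname} maps to {ips}, expected {expected_ip}"


if __name__ == "__main__":
    with open("/etc/hosts") as f:
        # "node1" and "10.0.0.1" are placeholders for a real cluster node.
        print(check_mapping(f.read(), "node1", "10.0.0.1"))
```

Running the same check on every node, for every node's hostname, catches the case where one node's /etc/hosts was not updated.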
Problem location: No route to host.
A No route to host error usually means that the target node's server has been restarted.
Another cause of No route to host is an incorrect hostname mapping in /etc/hosts. In that case, carefully check the hostname mappings on all nodes. Pay special attention when adding nodes to the cluster: it is easy to forget to add the new node's hostname mapping on the existing nodes.
Problem location: Called write on non-open socket.
This error occurs when the socket connection failed to open and a send is then attempted on the invalid connection. It can generally be avoided by retrying. If the error keeps occurring, check the target node's process status, port listening and hostname mapping, following exactly the same steps as for "Connection refused.".
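A retry of the kind mentioned above might look like the following sketch. It is generic Python rather than a Thrift API: call_with_retry is a hypothetical helper, and the operation passed in is assumed to open a fresh connection on each attempt, since retrying on the same unopened socket would fail again.

```python
import time


def call_with_retry(operation, retries=3, delay=0.5, retry_on=(OSError,)):
    """Call operation() up to `retries` times.

    On an exception listed in retry_on, wait (delay * attempt number)
    seconds and try again; after the last attempt the exception is
    re-raised so persistent failures remain visible.
    """
    for attempt in range(1, retries + 1):
        try:
            return operation()
        except retry_on:
            if attempt == retries:
                raise
            time.sleep(delay * attempt)
```

With the Thrift Python library, the exception class to pass in retry_on would typically be its transport exception instead of OSError; that choice is an assumption to verify against the client in use. Keep the retry count small, so that a node that is really down is reported quickly instead of being masked.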
Problem location: THRIFT_EAGAIN (timed out).
This error means the client's receive timed out. It may be caused by too many connection tasks.
Another possible cause of the timeout is that the CPU of the target node is saturated, so in this case pay particular attention to the target node's CPU usage.
Problem location: THRIFT_EAGAIN (unavailable resources).
This error means the client's receive timed out and the number of attempts also exceeded Thrift's recv() retry count.
Problem location: socket open() error: There is no route to the host
Check whether the firewall is running on the target node and on the local node, and make sure it is turned off.
On CentOS 7, check the firewall status with: firewall-cmd --state (may require root privileges)
Stop the firewall: systemctl stop firewalld.service