Too many open files
2022-06-09 06:18:00 【Desert Effect】
Preface
One day, after roughly 2,500 scrape targets had been added to Prometheus, its web UI became slow to open, the configuration file could no longer be reloaded, and the service occasionally hung. This post records how the problem was diagnosed and fixed.
Error messages
Check the service log and the system log:
systemctl status prometheus
journalctl -u prometheus
tail -f /var/log/messages
msg="Scrape commit failed" err="write to WAL: log samples: create new segment file: open /data/prometheus/wal/00007563: too many open files"
All of these logs contain "too many open files".
Cause and fix
This error is usually caused by the limit settings:
at some point the system or a process has opened more files and network connections than the limit allows.
<1> Check the limit settings
ulimit -n
(the open files limit usually defaults to 1024)

or
cat /etc/security/limits.conf
* soft nofile 102400 # maximum number of file descriptors any user can open (default 1024)
* hard nofile 102400
* soft nproc 102400 # maximum number of processes any user can create
* hard nproc 102400
The values above are already large, so the system-wide limits have been tuned.
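limits.conf only shows what is configured. The limits actually in force for the current shell can be read with ulimit. A minimal sketch, where show_nofile_limits is a hypothetical helper name used only for illustration:

```shell
# Print the soft and hard open-file limits of the current shell.
# show_nofile_limits is a hypothetical helper, not a standard command.
show_nofile_limits() {
    # -S selects the soft limit, -H the hard limit, -n the open-files resource
    echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
}

show_nofile_limits
```

These values describe only the shell that runs the command, not a service that is already running.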
<2> Check the limits of the running process
Find the PID of the prometheus process:
ps aux |grep prometheus
cat /proc/<pid>/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             256486               256486               processes
Max open files            1024                 1024                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       256486               256486               signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
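As shown above, the running process is limited to only 1024 open files, even though limits.conf sets 102400. The settings in limits.conf are applied by pam_limits to login sessions and do not affect services started by systemd, so the limit has to be raised for the service itself.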
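To pull just the relevant row out of that output, the limits file can be filtered with awk. This is a sketch under the assumption that the file follows the layout shown above; nofile_from_limits is a hypothetical helper name.

```shell
# nofile_from_limits FILE: print the soft and hard "Max open files" values
# from a file in /proc/<pid>/limits format.
# nofile_from_limits is a hypothetical helper, not a standard command.
nofile_from_limits() {
    # Fields: "Max" "open" "files" <soft> <hard> "files"
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "$1"
}

# Example with the current shell; replace $$ with the Prometheus PID.
if [ -r "/proc/$$/limits" ]; then
    nofile_from_limits "/proc/$$/limits"
fi
```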
<3> List processes sorted by number of open files (or view the count for one process)
(for CentOS 6)
lsof -n |awk '{print $2}'|sort|uniq -c |sort -nr|more
or
lsof -n |awk '{print $2}'|sort|uniq -c |sort -nr|grep -w <pid>
On CentOS 7 and later, the number of file handles counted with lsof is many times higher than on earlier systems.
The cause is a difference in lsof versions: the newer lsof lists per-thread (TID) entries by default (the -K option), so the commands above are recommended only on CentOS 6.
lsof -p <pid> |wc -l counts the file handles opened by one process.
lsof -n|grep <pid> |wc -l counts the file handles opened by the process and all of its threads.
<4> Count the files opened by one process
(for CentOS 7)
lsof -p <pid> |wc -l
or
ls /proc/<pid>/fdinfo/ | grep '^[0-9]'| wc -l
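lsof output also contains a header line and rows that are not file descriptors (such as cwd, txt and mem), so counting the entries under /proc/<pid>/fd gives a number that can be compared directly with Max open files. A minimal sketch, where count_fds is a hypothetical helper name:

```shell
# count_fds PID: print the number of file descriptors PID currently has open.
# count_fds is a hypothetical helper; inspecting another user's process needs root.
count_fds() {
    # /proc/<pid>/fd holds one symlink per open descriptor
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Example with the current shell; replace $$ with the Prometheus PID.
echo "open fds: $(count_fds $$)"
```

If this number sits close to the process's soft limit, the limit is the bottleneck.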
<5> Raise the limit for the service
Because the service is managed by systemctl, modify the service file and reload it.
The file is usually located in /etc/systemd/system/ or /usr/lib/systemd/system/
cat /usr/lib/systemd/system/prometheus.service
[Unit]
Description=Prometheus Node Exporter
After=network.target
[Service]
ExecStart=/data/prometheus/prometheus --config.file=/data/prometheus/prometheus.yml --storage.tsdb.path=/data/prometheus --web.read-timeout=5m --web.max-connections=100 --storage.tsdb.retention=10d --query.max-concurrency=20 --query.timeout=2m --web.enable-lifecycle
User=root
LimitNOFILE=10240
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl restart prometheus
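Instead of editing the packaged unit file directly, the same limit can be set in a systemd drop-in file. The sketch below writes the drop-in into a scratch directory so that it can be run safely; the file name limits.conf and the helper write_nofile_dropin are assumptions for illustration, and 10240 is the value used above.

```shell
# write_nofile_dropin DIR LIMIT: write a systemd drop-in that sets LimitNOFILE.
# write_nofile_dropin and the file name limits.conf are hypothetical choices.
write_nofile_dropin() {
    mkdir -p "$1" &&
    printf '[Service]\nLimitNOFILE=%s\n' "$2" > "$1/limits.conf"
}

# Demonstrated against a temporary directory, not the live systemd tree.
demo_dir=$(mktemp -d)
write_nofile_dropin "$demo_dir/prometheus.service.d" 10240
cat "$demo_dir/prometheus.service.d/limits.conf"
```

On a real host the directory would be /etc/systemd/system/prometheus.service.d, followed by systemctl daemon-reload and systemctl restart prometheus. The result can then be confirmed with cat /proc/<pid>/limits or systemctl show prometheus -p LimitNOFILE.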
Supplement: system limit tuning
<1> Resource limits configuration file
cat /etc/security/limits.conf
# Maximum number of files a user can open in the current shell session
* soft nofile 102400
* hard nofile 102400
# Maximum number of processes a user can create in the current shell session
* soft nproc 102400
* hard nproc 102400
The upper bound for nofile is set by /proc/sys/fs/nr_open, which defaults to 1048576.
If nofile in limits.conf is set above this value, SSH logins will fail.
---------------------------------------------------------------
<2> Number of file descriptors currently in use system-wide
cat /proc/sys/fs/file-nr
3072 0 794326
The first number is the number of file handles the system has allocated, the second is the number that are allocated but no longer in use, and the third is the maximum number the kernel can allocate (file-max).
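The three fields can be labelled with a small awk wrapper, which makes the output easier to read in scripts or alerts. This is a sketch; describe_file_nr is a hypothetical helper name.

```shell
# describe_file_nr FILE: label the three fields of a file in
# /proc/sys/fs/file-nr format.
# describe_file_nr is a hypothetical helper, not a standard command.
describe_file_nr() {
    awk '{ printf "allocated=%s unused=%s max=%s\n", $1, $2, $3 }' "$1"
}

# Example against the live kernel counters, when available.
if [ -r /proc/sys/fs/file-nr ]; then
    describe_file_nr /proc/sys/fs/file-nr
fi
```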
---------------------------------------------------------------
<3> Maximum number of file descriptors the kernel can allocate
cat /proc/sys/fs/file-max
6815744
Temporary change:
sysctl -w "fs.file-max=6553560"
Permanent change to file-max:
vi /etc/sysctl.conf (add the line below, then run sysctl -p to apply it)
fs.file-max = 6553560
---------------------------------------------------------------
<4> Maximum number of files a single process can open
cat /proc/sys/fs/nr_open
1048576
Temporary change:
sysctl -w "fs.nr_open=1048576"
Permanent change to nr_open:
vi /etc/sysctl.conf (add the line below, then run sysctl -p to apply it)
fs.nr_open=1048576
---------------------------------------------------------------
<5> Summary
file-max is the system-wide ceiling and nr_open is the per-process ceiling, so nr_open should not be set larger than file-max.
The total number of file descriptors opened by all processes cannot exceed /proc/sys/fs/file-max.
The number of file descriptors opened by a single process cannot exceed the soft limit of nofile in the user limits.
The soft limit of nofile cannot exceed the hard limit.
The hard limit of nofile cannot exceed /proc/sys/fs/nr_open.
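The ordering in this summary can be checked mechanically. The sketch below takes the four values as arguments and reports the first rule that is violated; check_nofile_chain is a hypothetical helper name, and the sample values are the ones quoted earlier in this post.

```shell
# check_nofile_chain SOFT HARD NR_OPEN FILE_MAX
# Verify soft <= hard <= nr_open <= file-max and report the first violation.
# check_nofile_chain is a hypothetical helper, not a standard command.
check_nofile_chain() {
    [ "$1" -le "$2" ] || { echo "soft limit exceeds hard limit"; return 1; }
    [ "$2" -le "$3" ] || { echo "hard limit exceeds nr_open"; return 1; }
    [ "$3" -le "$4" ] || { echo "nr_open exceeds file-max"; return 1; }
    echo "ok"
}

# Values from this post: process limits 1024/1024, nr_open 1048576, file-max 6815744
check_nofile_chain 1024 1024 1048576 6815744   # prints: ok
```

On a live system the arguments could come from ulimit -Sn, ulimit -Hn, /proc/sys/fs/nr_open and /proc/sys/fs/file-max.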