WAS

Open files in Linux

We are seeing latency issues whenever the error below (Too many open files) appears in the Tomcat logs. When I check ulimit -a, it shows open files (-n) 1048576, and lsof reports only 246408 entries, which is far less than the ulimit value of 1048576. Can you please help me troubleshoot this issue?

# lsof | wc -l
246408

2020-07-28 10:33:14,956 [THREAD ID=pool-3-thread-509] ERROR org.apache.camel.util.CamelLogger:156 - Failed delivery for (MessageId: ID-l33a-1595638997685-1-3459767 on ExchangeId: ID-l33a-1595638997685-1-3459767). On delivery attempt: 1 caught: java.lang.IllegalStateException: java.io.FileNotFoundException: /opt/app/tomcat/webapps/app-1/WEB-INF/lib/geronimo-javamail_1.4_mail-1.8.4.jar (Too many open files)


$ ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1048576
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) unlimited
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
$


Dr. Klahn

IMO, 246,000 files or connections open all at once is excessive.  (On an IPv4 system they are unlikely to all be connections; there are only 65,535 ports per address.)

Rather than increase the limit, find out why all these files or connections are all open.  See a commentary on this problem at that other site, with regard to the networking side of the issue:

https://stackoverflow.com/questions/5656458/java-net-socketexception-too-many-open-files/37605213#37605213
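
A possible first step is to look at the Tomcat process itself rather than the whole machine, since the open-files limit applies per process and the value shown by `ulimit -a` in a login shell may differ from the one the running JVM inherited. The following is a minimal sketch, assuming a Linux /proc filesystem; the function names `fd_count` and `fd_soft_limit` are illustrative, not part of any tool.

```shell
#!/bin/sh
# Count the file descriptors a process currently holds (Linux /proc assumed).
fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Print the soft "Max open files" limit that applies to the process.
fd_soft_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Example: report both numbers for the current shell.
pid=$$
echo "pid=$pid open=$(fd_count "$pid") soft_limit=$(fd_soft_limit "$pid")"
```

To point it at Tomcat, replace `$$` with the JVM's PID and run it as root or as the user that owns the process, because another user's /proc/PID/fd directory is not readable otherwise.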

<opinion>
In a primarily web server environment you may want to cut some linux TCP defaults down.  This can be done in /etc/sysctl.conf -

# Decrease the KEEPALIVE time to 300 seconds
net.ipv4.tcp_keepalive_time = 300
# And set the KEEPALIVE interval to 30 seconds
net.ipv4.tcp_keepalive_intvl = 30
# Probes fail 4 times before declaring the connection dead
net.ipv4.tcp_keepalive_probes = 4
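# Shorten the FIN-WAIT-2 timeout for orphaned sockets (this does not change the TIME_WAIT length)
net.ipv4.tcp_fin_timeout = 30
# Reuse TIME_WAIT sockets for new outgoing connections
net.ipv4.tcp_tw_reuse = 1
# Caution: tcp_tw_recycle breaks clients behind NAT and was removed in Linux 4.12
# net.ipv4.tcp_tw_recycle = 1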


</opinion>
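
Before changing any of these settings, it may help to check whether sockets are a large part of the problem at all. The sketch below counts TCP sockets per state from `ss -tan` output; it assumes the iproute2 `ss` tool is installed, and the function name `count_tcp_states` is made up for illustration.

```shell
#!/bin/sh
# Read `ss -tan` output on stdin and count sockets per TCP state,
# largest count first. The first input line is the column header.
count_tcp_states() {
    awk 'NR > 1 { n[$1]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# Usage (assumes iproute2's ss is available):
#   ss -tan | count_tcp_states
```

If TIME-WAIT or CLOSE-WAIT sockets dominate the output, the network side deserves a closer look; if the totals are small, the open descriptors are more likely regular files.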

If it develops that the problem can't be solved, consider spreading the application across multiple servers, possibly using a reverse proxy.

Wow... That is a lot of open files...

It would be useful to mention whether you're running many LXD containers.

Also, please attach your lsof output (as a .txt file).

Something seems amiss.

On a random machine I'm logged in on right now I see...

# 28 LXD containers running complex LAMP Stacks
net17 # lsof 2>/dev/null | wc -l
886058

# Dividing this number by 29 Ubuntu instances shows the per-instance number of files...
net17 # echo "886058/29" | bc
30553



This machine has fairly heavy I/O churn, with many files, directories, and connections opening and closing.

I'm with Dr. Klahn... 246408 seems... like there's likely some file descriptor leak somewhere...

If there is a file descriptor leak, increasing the file descriptor number likely won't help, as any additional file descriptors will get consumed, then leaked.
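
One way to check for a leak is to sample the descriptor count of the suspect process over time and see whether it keeps climbing under steady load. This is only a sketch, assuming a Linux /proc filesystem; `sample_fds` and its arguments are hypothetical names chosen for the example.

```shell
#!/bin/sh
# Sample a process's open descriptor count several times.
# Usage: sample_fds PID [SAMPLES] [INTERVAL_SECONDS]
sample_fds() {
    pid=$1
    samples=${2:-5}
    interval=${3:-1}
    i=0
    while [ "$i" -lt "$samples" ]; do
        # Each entry in /proc/PID/fd is one open descriptor.
        count=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' ')
        echo "$(date +%H:%M:%S) $count"
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}

# Example: five samples of the current shell, one second apart.
sample_fds $$ 5 1
```

A count that rises steadily and never drops back, even when traffic is flat, is consistent with a leak; a count that rises and falls with load is not.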

Likely your lsof output will give clues to any problems.
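
Note that a plain `lsof | wc -l` is not directly comparable to the ulimit value: it covers every process on the machine, includes entries that are not file descriptors (such as cwd, txt, and mem rows), and on some lsof versions may repeat rows per thread. A narrower view is a per-type summary for one process. The sketch below parses lsof's `-F` field output; the function name `summarize_types` and the `TOMCAT_PID` variable are assumptions for the example.

```shell
#!/bin/sh
# Read `lsof -Ft` field output on stdin and print a count per file type,
# largest first. Lines starting with "t" carry the type (REG, IPv4, sock, ...).
summarize_types() {
    awk '/^t/ { n[substr($0, 2)]++ }
         END  { for (k in n) print n[k], k }' | sort -rn
}

# Hypothetical usage against a Tomcat PID (needs lsof and suitable privileges):
#   lsof -p "$TOMCAT_PID" -Ft | summarize_types
```

A type that dominates the summary (for example REG for jar or log files, or IPv4/IPv6 for sockets) suggests where to look next in the full lsof listing.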

WAS (ASKER)

Sorry for the late reply. I see no /dev/null entries in the lsof output. Below is the content of /etc/sysctl.conf; I don't see the TCP settings there.

$ cat /etc/sysctl.conf
# System default settings live in /usr/lib/sysctl.d/00-system.conf.
# To override those settings, enter new settings here, or in an /etc/sysctl.d/<name>.conf file
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
# Settings made specifically to pass the ESM test
# Stack protection
kernel.exec-shield = 1
kernel.randomize_va_space = 1
-------------------------------------------------------------------------------------
$ cat /usr/lib/sysctl.d/00-system.conf
# Kernel sysctl configuration file
#
# For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
# sysctl.conf(5) for more details.

# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
$




ASKER CERTIFIED SOLUTION
David Favor