Solved

Solaris command or utility to monitor number of open file descriptors and sockets

Posted on 2006-06-12
Last Modified: 2013-12-27

Can somebody tell me what would be the best way to monitor the open sockets and file descriptors on a machine? I am running into a situation where I think my machine is getting max'ed out on its file descriptors, the limit for which is set to 256, but I want to monitor this before I start tuning the kernel. Can somebody give me some advice on how to do this? I know that I can use netstat to look at open sockets, but that doesn't give me the total number of combined sockets and file descriptors.

Also, I noticed that netstat sometimes repeats information for a socket. Does anybody know how to prevent that from happening?

Thanks.
Question by:mromeo
13 Comments
 
LVL 38

Expert Comment

by:yuzh
ID: 16890907
You can use "lsof" to do the job. Download the lsof binary package from:
http://sunfreeware.com/

man lsof (after installation)
to learn more details.

For tuning:
http://www.princeton.edu/~psg/unix/Solaris/troubleshoot/kerntune.html
0
 

Author Comment

by:mromeo
ID: 16895622
This is supposed to list the open files on a system. If I type ulimit -a, it reports that my max file descriptors is 256. Yet if I type lsof | wc -l there are over 4000 entries. So what is the real max on the number of file descriptors? I assume that I am interpreting something incorrectly. Can someone please explain the max file descriptors to me and how I can know if I'm hitting that limit? Thanks.

0
 
LVL 38

Expert Comment

by:yuzh
ID: 16899222
lsof lists all open files, including entries that do not use a file descriptor, such as current working directories and memory-mapped library files.

Please have a look at the following docs to learn more details:

http://technopark02.blogspot.com/2005/05/solaris-32-bits-fopen-and-max-number.html
http://www.netadmintools.com/art295.html

and
http://sial.org/howto/debug/unix/lsof/
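
To make the distinction above concrete, here is a minimal sketch that counts only the lsof rows that hold a real descriptor. It assumes the usual lsof column layout (COMMAND PID USER FD ...), where the FD column starts with a digit for real descriptors and reads cwd, txt, mem and so on for the rest. The function name fds_per_pid and the sample rows are made up for illustration, and command names containing spaces would shift the columns.

```shell
#!/bin/sh
# Count real file descriptors per PID from lsof-style text on stdin.
# Rows whose FD column is cwd, txt, mem, etc. are skipped because they
# do not occupy a descriptor slot. (fds_per_pid is a hypothetical helper.)
fds_per_pid() {
    awk '$4 ~ /^[0-9]/ { n[$2]++ } END { for (p in n) print p, n[p] }' | sort -n
}

# Made-up sample rows in lsof layout, for illustration only.
sample_lsof='COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
myapp 101 app cwd VDIR 85,0 512 2 /
myapp 101 app txt VREG 85,0 40960 1234 /opt/myapp/bin/myapp
myapp 101 app 0u VCHR 24,1 0t0 5678 /dev/pts/1
myapp 101 app 4u IPv4 0x3000 0t0 TCP *:8080 (LISTEN)
other 202 app 3r VREG 85,0 1024 4321 /etc/hosts'

printf '%s\n' "$sample_lsof" | fds_per_pid
```

On a live system the input would come from lsof itself, e.g. lsof | fds_per_pid, and the per-PID counts are the numbers to compare against the 256 limit.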
0
 
LVL 10

Expert Comment

by:Nukfror
ID: 16914948
Assuming you run lsof as root or lsof is SUID root, then lsof shows all open files for the entire system for ALL running processes. ulimit -a is showing the PER PROCESS limit for the user running the command. So if you personally have 1000 running applications and your file descriptor rlimit is set to 256, you theoretically could have 256 open file descriptors for each of your 1000 applications.

So you are trying to compare a micro view with a macro view - doesn't work.
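
A rough sketch of the per-process view, assuming a /proc filesystem that exposes one fd directory per process (Solaris and Linux both do). The helper name count_fds is made up for illustration, and reading another user's process needs the right privileges.

```shell
#!/bin/sh
# Count the open descriptors of one process by listing /proc/<pid>/fd.
# Prints 0 if the directory is missing or unreadable.
# (count_fds is a hypothetical helper, not a system command.)
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Per-process soft limit of the current shell, i.e. the figure ulimit -a reports.
limit=$(ulimit -n)
used=$(count_fds $$)
echo "pid $$: $used of $limit descriptors in use"
```

Running count_fds against the PID of each suspect application gives the micro view that can be compared with the 256 limit. On Solaris, pfiles <pid> gives a more detailed listing of the same descriptors.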
0
 

Author Comment

by:mromeo
ID: 16915014
I thought this would be easy.  This system is losing socket connections w/o errors and I don't really know what else to look at.  I am following the theory that it is running low on system resources, as it is a very busy machine, but finding the right tools to monitor the machine is not so easy.  

Any suggestions are appreciated.
Thanks.
0
 
LVL 10

Expert Comment

by:Nukfror
ID: 16915616
You need to be more specific. What's losing socket connections, e.g. what application is losing them? Solaris the OS can handle tens of thousands of sockets.

If the machine is busy it could be something else.  Is the machine page thrashing or swapping ?  

Run the vmstat command and look at the W column (swapped) and the SR column (scan rate). A positive W column means applications have been physically swapped out of memory at some point in the past. A positive SR column means you are under memory pressure.

The R column (jobs that are runnable but can't run cuz the CPUs are too busy at the moment) may also be another interesting column. The old rule of thumb still pretty much holds: once R gets to about 4x the CPU core count, you need to start looking at your environment. Higher than this and you have a machine that's probably not sized correctly for the workload being thrown at it.
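
The three checks above could be scripted roughly as follows. This is only a sketch: it assumes Solaris-style vmstat output whose header row names the r, w and sr columns, and it looks the columns up by name instead of hard-coding positions. The function name check_vmstat and the sample figures are made up for illustration.

```shell
#!/bin/sh
# Flag vmstat samples that show the warning signs described above.
# Usage: vmstat 5 | check_vmstat <cpu-core-count>
# (check_vmstat is a hypothetical helper.)
check_vmstat() {
    ncpu=$1
    awk -v ncpu="$ncpu" '
        # Header row: record the field number of each column name (first wins).
        $1 == "r" {
            for (i = 1; i <= NF; i++) if (!($i in col)) col[$i] = i
            next
        }
        # Data rows: only checked once r, w and sr have been located.
        $1 ~ /^[0-9]+$/ && ("r" in col) && ("w" in col) && ("sr" in col) {
            r = $(col["r"]); w = $(col["w"]); sr = $(col["sr"])
            if (w + 0 > 0)        print "line " NR ": w=" w " (threads swapped out)"
            if (sr + 0 > 0)       print "line " NR ": sr=" sr " (page scanner active)"
            if (r + 0 > 4 * ncpu) print "line " NR ": r=" r " (run queue above 4x " ncpu " CPUs)"
        }
    '
}

# Made-up sample in Solaris vmstat layout, for illustration only.
sample_vmstat=' kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 0 0 0 1000000 500000 1  2  0  0  0  0  0  0  0  0  0  400  300  200  1  1 98
 9 0 2 1000000  20000 1  2  0  0  0  0 150 0  0  0  0  400  300  200 50 40 10'

printf '%s\n' "$sample_vmstat" | check_vmstat 2
```

Keep in mind that the first data line vmstat prints is an average since boot, so it is the later samples that matter when chasing a nightly problem.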

Check your network statistics and see if you are getting lots of network errors:  netstat -in.  

A remote possibility is running out of swap space (but if this is true then you would be having more than just lost socket issues). Run swap -l - this will show physical swap and whether you're running into a consumption issue. This relates both to a swap/page thrash situation and to applications that are consuming all your tmpfs space. You can put limits on how big you let your /tmp or even /var/run directories get. See the man page for mount_tmpfs.
0
 

Author Comment

by:mromeo
ID: 16915766
Some proprietary and 3rd party applications are losing their socket connections at about the same time every night. The only way to recover is to restart these programs. It is very hard to pinpoint the exact time and sequence of events, but I am trying to write a script that will help gather some statistics. I have added your suggestions above. I am going to run lsof, swap, netstat, and vmstat. I'm also going to use lsof and netstat to try to get the number of socket connections and their state.
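
One way the socket-state part of such a script might look, as a sketch only. It assumes that TCP lines in netstat -an output end with the connection state in upper case (ESTABLISHED, TIME_WAIT, ...), which also filters out header and UDP lines. The function name count_states and the sample lines are made up for illustration.

```shell
#!/bin/sh
# Tally TCP connections by state from netstat-style text on stdin.
# A line is counted only if it has at least five fields and its last
# field looks like a state name (upper-case letters, digits, underscores).
# (count_states is a hypothetical helper.)
count_states() {
    awk 'NF >= 5 && $NF ~ /^[A-Z][A-Z0-9_]+$/ { n[$NF]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Made-up sample in Solaris netstat layout, for illustration only.
sample_netstat='TCP: IPv4
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q  State
-------------------- -------------------- ----- ------ ----- ------ -------
192.168.1.10.22      192.168.1.20.51000   64240      0 49640      0 ESTABLISHED
192.168.1.10.80      192.168.1.30.40000   64240      0 49640      0 TIME_WAIT
192.168.1.10.80      192.168.1.31.40001   64240      0 49640      0 TIME_WAIT'

printf '%s\n' "$sample_netstat" | count_states
```

In a monitoring script this could run from cron as netstat -an | count_states, with a timestamp added to each run, so that the counts just before the nightly drop can be compared with normal periods.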
0
 
LVL 10

Expert Comment

by:Nukfror
ID: 16916406
This could be network related as well.  You might want to use a packet sniffer close to the time the sockets drop to see if something is coming in from the remote side closing down the connection(s).  
0
 

Author Comment

by:mromeo
ID: 17592222
I was able to do this monitoring using sar -r 10 100.  The last value in the list was what I was looking for.
0
 
LVL 20

Expert Comment

by:Venabili
ID: 17592310
Changed recommendation: PAQ - refund
0
 

Accepted Solution

by:
CetusMOD earned 0 total points
ID: 17631255
PAQed with points refunded (200)

CetusMOD
Community Support Moderator
0
