Solved

Solaris command or utility to monitor number of open file descriptors and sockets

Posted on 2006-06-12
Last Modified: 2013-12-27

Can somebody tell me the best way to monitor the open sockets and file descriptors on a machine?  I think my machine is maxing out its file descriptors, the limit for which is set to 256, but I want to monitor this before I start tuning the kernel.  Can somebody give me some advice on how to do this?  I know that I can use netstat to look at open sockets, but that doesn't give me the combined total of sockets and file descriptors.

Also, I noticed that netstat sometimes repeats information for a socket.  Does anybody know how to prevent that from happening?

Thanks.
Question by:mromeo
13 Comments
 
LVL 38

Expert Comment

by:yuzh
ID: 16890907
You can use "lsof" to do the job. Download the lsof binary package from:
http://sunfreeware.com/

man lsof (after installation)
to learn more details.

For tuning:
http://www.princeton.edu/~psg/unix/Solaris/troubleshoot/kerntune.html
 

Author Comment

by:mromeo
ID: 16895622
lsof is supposed to list the open files on a system.  If I type ulimit -a, it reports that my max file descriptors is 256. Yet if I type lsof | wc -l, there are over 4000 entries.  So what is the real maximum number of file descriptors?  I assume that I am interpreting something incorrectly.  Can someone please explain the max file descriptors to me and how I can know if I'm hitting that limit?  Thanks.

 
LVL 38

Expert Comment

by:yuzh
ID: 16899222
lsof lists all open files, including files that do not use file descriptors, such as current working directories and memory-mapped library files.

Please have a look at the following docs to learn more details:

http://technopark02.blogspot.com/2005/05/solaris-32-bits-fopen-and-max-number.html
http://www.netadmintools.com/art295.html

and
http://sial.org/howto/debug/unix/lsof/
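
To illustrate the point above about lsof listing entries that are not file descriptors, here is a minimal sketch that counts only the rows whose FD column holds a real descriptor number. The function name count_fd_rows is made up for this example, and it assumes the default lsof column order (COMMAND PID USER FD ...), so treat it as a starting point rather than a definitive tool.

```shell
# Count only the lsof rows whose FD column is a real descriptor:
# a number, optionally followed by a mode letter such as r, w, or u.
# Rows such as cwd, txt, and mem (and the header row) are skipped.
# Assumption: default lsof column order, so the FD column is field 4.
count_fd_rows() {
    awk '$4 ~ /^[0-9]+/ { n++ } END { print n + 0 }'
}
```

For a single process this could be used as `lsof -p <pid> | count_fd_rows`, and the result compared with the per-process limit from `ulimit -n`.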
 
LVL 10

Expert Comment

by:Nukfror
ID: 16914948
Assuming you run lsof as root or lsof is SUID root, lsof shows all open files for the entire system, for ALL running processes.  ulimit -a shows the PER PROCESS limit for the user running the command.  So if you personally have 1000 running applications and your file descriptor rlimit is set to 256, you could theoretically have 256 open file descriptors in each of your 1000 applications.

So you are trying to compare a micro view with a macro view, which doesn't work.
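
Since the limit is per process, the comparison that makes sense is one process's descriptor count against that limit. The sketch below does this by counting the entries in /proc/<pid>/fd. The function names count_fds and fd_usage_self are hypothetical, and it assumes a /proc filesystem that exposes an fd directory per process (as Solaris and Linux do); reading another user's process generally needs root.

```shell
# Print the number of descriptors currently open in one process by
# counting the entries in /proc/<pid>/fd.
# Assumption: /proc exposes an fd directory for each process.
count_fds() {
    pid=$1
    if [ ! -r "/proc/$pid/fd" ]; then
        echo "cannot read /proc/$pid/fd (process gone or no permission)" >&2
        return 1
    fi
    ls "/proc/$pid/fd" | wc -l | tr -d ' '
}

# Print "<open> of <limit>" for the current shell, where <limit> is
# the soft per-process limit reported by ulimit -n.
fd_usage_self() {
    echo "$(count_fds $$) of $(ulimit -n)"
}
```

Running `count_fds <pid>` against the application that is misbehaving, at intervals, would show whether it is actually approaching 256.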
 

Author Comment

by:mromeo
ID: 16915014
I thought this would be easy.  This system is losing socket connections w/o errors and I don't really know what else to look at.  I am following the theory that it is running low on system resources, as it is a very busy machine, but finding the right tools to monitor the machine is not so easy.  

Any suggestions are appreciated.
Thanks.
 
LVL 10

Expert Comment

by:Nukfror
ID: 16915616
You need to be more specific.  What's losing socket connections, i.e. which application is losing them?  Solaris the OS can handle tens of thousands of sockets.

If the machine is busy it could be something else.  Is the machine page thrashing or swapping?

Run the vmstat command and look at the w column (swapped) and the sr column (scan rate).  A positive w column means applications have been physically swapped out of memory at some point in the past.  A positive sr column means you are under memory pressure.

The r column (jobs that are runnable but are waiting because the CPUs are too busy at the moment) may be another interesting column.  The old rule of thumb still largely holds: at about 4x the CPU core count you need to start looking at your environment.  Higher than this and you have a machine that's probably not sized correctly for the workload being thrown at it.
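
As a small worked version of the 4x rule of thumb, the sketch below takes a run-queue value and a core count and reports whether the value is above four per core. The function name runq_too_high is invented for this example, and both numbers are passed in by hand rather than parsed from vmstat output.

```shell
# Return success (exit status 0) when the run queue exceeds the
# rule-of-thumb threshold of 4 runnable jobs per CPU core.
runq_too_high() {
    r=$1       # value read from the r column of vmstat
    cores=$2   # number of CPU cores in the machine
    [ "$r" -gt $((cores * 4)) ]
}
```

For example, on a 2-core machine `runq_too_high 9 2` succeeds (9 is above 8), while `runq_too_high 8 2` does not.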

Check your network statistics and see if you are getting lots of network errors:  netstat -in.  

A remote possibility is running out of swap space (but if this were true, you would be having more than just lost socket issues).  Run swap -l; this will show physical swap and whether you're running into a consumption issue.  This relates both to a swap/page thrash situation and to applications that are consuming all your tmpfs space.  You can put limits on how big you let your /tmp or even /var/run directories get.  See the man page for mount_tmpfs.
 

Author Comment

by:mromeo
ID: 16915766
Some proprietary and 3rd party applications are losing their socket connections at about the same time every night.   The only way to recover is to restart these programs. It is very hard to pinpoint the exact time and sequence of events, but I am trying to write a script that will help gather some statistics.  I have added your suggestions above.  I am going to run lsof, swap, netstat, and vmstat.  I'm also going to use lsof and netstat to try to get the number of socket connections and their state.
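
One way the "number of socket connections and their state" part of such a script might look is sketched below. It is not the script actually used here: the function name socket_states is hypothetical, and it assumes that netstat -an prints the TCP state as the last field of each connection line. On Solaris, substituting nawk for awk may be safer.

```shell
# Read `netstat -an` output on stdin and print one line per TCP state
# with the number of sockets in that state, sorted by state name.
# Assumption: the state is the last field on each connection line.
# Header lines and other sections are ignored because their last
# field is not a known state name.
socket_states() {
    awk '
        $NF ~ /^(ESTABLISHED|SYN_SENT|SYN_RCVD|SYN_RECV|FIN_WAIT_1|FIN_WAIT_2|FIN_WAIT1|FIN_WAIT2|TIME_WAIT|CLOSE_WAIT|CLOSED|LAST_ACK|LISTEN|CLOSING|IDLE|BOUND)$/ { count[$NF]++ }
        END { for (s in count) print s, count[s] }
    ' | sort
}
```

Calling `netstat -an | socket_states` from cron every few minutes, with a timestamp from `date` written before each sample, would give a record to compare against the time the connections drop.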
 
LVL 10

Expert Comment

by:Nukfror
ID: 16916406
This could be network related as well.  You might want to use a packet sniffer close to the time the sockets drop to see if something is coming in from the remote side closing down the connection(s).  
 

Author Comment

by:mromeo
ID: 17592222
I was able to do this monitoring using sar -r 10 100.  The last value in the list was what I was looking for.
 
LVL 20

Expert Comment

by:Venabili
ID: 17592310
Changed recommendation: PAQ - refund
 

Accepted Solution

by:
CetusMOD earned 0 total points
ID: 17631255
PAQed with points refunded (200)

CetusMOD
Community Support Moderator