Solaris command or utility to monitor number of open file descriptors and sockets


Can somebody tell me what would be the best way to monitor the open sockets and file descriptors on a machine?  I am running into a situation where I think my machine is maxing out its file descriptors, a limit which is set to 256, but I want to monitor this before I start tuning the kernel.  Can somebody give me some advice on how to do this?  I know that I can use netstat to look at open sockets, but that doesn't give me the combined total of sockets and file descriptors.

Also, I noticed that netstat sometimes repeats information for a socket.  Does anybody know how to prevent that from happening?

Thanks.
mromeoAsked:

yuzhCommented:
You can use "lsof" to do the job. Download the lsof binary package from:
http://sunfreeware.com/

man lsof (after installation)
to learn more details.

For tuning:
http://www.princeton.edu/~psg/unix/Solaris/troubleshoot/kerntune.html
0
mromeoAuthor Commented:
This is supposed to list the open files on a system.  If I type ulimit -a, it reports that my max file descriptors is 256. Yet if I type lsof | wc -l there are over 4000 entries.  So what is the real max on the number of file descriptors?  I assume that I am interpreting something incorrectly.  Can someone please explain the max file descriptors to me and how I can know if I'm hitting that limit?  Thanks.

0
yuzhCommented:
lsof lists all open files, including files which do not use file descriptors, such as current working directories and memory-mapped library files.

Please have a look at the following docs to learn more details:
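To count only real descriptors rather than every lsof entry, one option is to filter on lsof's FD field. This is a minimal sketch, assuming the text comes from something like `lsof -F pf` (field output: one `p<pid>` line per process followed by `f<fd>` lines) and that only numeric FD values count against the descriptor limit. The function name and the sample data are made up for illustration.

```python
from collections import Counter


def count_fds_per_pid(lsof_field_output: str) -> dict:
    """Count numeric file descriptors per PID in `lsof -F pf` style output.

    Entries such as cwd, txt and mem are open files, but they do not occupy
    a slot in the per-process descriptor table, so they are skipped.
    """
    counts = Counter()
    pid = None
    for line in lsof_field_output.splitlines():
        if line.startswith("p") and line[1:].isdigit():
            # A new process section starts here.
            pid = int(line[1:])
        elif line.startswith("f") and pid is not None and line[1:].isdigit():
            # Numeric FD value: a real descriptor owned by the current PID.
            counts[pid] += 1
    return dict(counts)


# Hypothetical sample: PID 123 holds descriptors 0-2, PID 456 holds 3.
sample = "p123\nfcwd\nftxt\nf0\nf1\nf2\np456\nfcwd\nf3\n"
print(count_fds_per_pid(sample))
```

The per-PID counts are the numbers to compare with the 256 limit, not the total line count from lsof | wc -l.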

http://technopark02.blogspot.com/2005/05/solaris-32-bits-fopen-and-max-number.html
http://www.netadmintools.com/art295.html

and
http://sial.org/howto/debug/unix/lsof/
0
NukfrorCommented:
Assuming you run lsof as root or lsof is SUID root, then lsof shows all open files for the entire system, for ALL running processes.  ulimit -a shows the PER PROCESS limit for the user running the command.  So if you personally have 1000 running applications and your file descriptor rlimit is set to 256, you theoretically could have 256 open file descriptors for each of your 1000 applications.

So you are trying to compare a micro view with a macro view, which doesn't work.
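To make the per-process point concrete, here is a small sketch that reads the limit the same way ulimit reports it, using Python's standard `resource` module. The `headroom` helper and the 250/256 figures are hypothetical, only there to show the comparison against a single process.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) RLIMIT_NOFILE values for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def headroom(open_fds: int, soft_limit: int) -> int:
    """Descriptors a process can still open before hitting its soft limit."""
    return max(soft_limit - open_fds, 0)


soft, hard = nofile_limits()
print("per-process soft limit:", soft, "hard limit:", hard)

# Hypothetical numbers: one process holding 250 descriptors under a 256 limit.
print("headroom:", headroom(250, 256))
```

The limit applies to each process separately, so the count to feed in is the descriptor count of one process, never the system-wide total.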
0
mromeoAuthor Commented:
I thought this would be easy.  This system is losing socket connections w/o errors and I don't really know what else to look at.  I am following the theory that it is running low on system resources, as it is a very busy machine, but finding the right tools to monitor the machine is not so easy.  

Any suggestions are appreciated.
Thanks.
0
NukfrorCommented:
You need to be more specific.  What's losing socket connections, i.e. what application is losing them?  Solaris the OS can handle tens of thousands of sockets.

If the machine is busy it could be something else.  Is the machine page thrashing or swapping ?  

Run the vmstat command and look at the w column (swapped) and the sr column (scan rate).  A positive w column means applications have been physically swapped out of memory at some point in the past.  A positive sr column means you are under memory pressure.

The r column (runnable jobs that are waiting because the CPUs are too busy at the moment) may be another interesting column.  The old rule of thumb is still pretty much 4x the CPU core count: once you reach that, you need to start looking at your environment.  Higher than this and you have a machine that's probably not sized correctly for the workload being thrown at it.
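The vmstat checks above could be scripted roughly as follows. This is only a sketch: it assumes Solaris-style vmstat output whose column header line starts with `r b w`, and it applies the 4x rule as a configurable factor. The function name and the sample numbers are invented for illustration.

```python
def check_vmstat(output: str, ncpu: int, run_queue_factor: int = 4) -> list:
    """Return warnings for the last sample in Solaris-style vmstat output."""
    rows = [ln.split() for ln in output.splitlines() if ln.strip()]
    # The column header is the row that starts with the r, b, w columns.
    header = next(f for f in rows if f[:3] == ["r", "b", "w"])
    last = rows[-1]

    def value(name):
        # r, w and sr each appear once in the header, so index() is safe.
        return int(last[header.index(name)])

    warnings = []
    if value("w") > 0:
        warnings.append("w > 0: something has been swapped out")
    if value("sr") > 0:
        warnings.append("sr > 0: page scanner is running (memory pressure)")
    if value("r") > run_queue_factor * ncpu:
        warnings.append("r exceeds %d x CPU count" % run_queue_factor)
    return warnings


# Hypothetical sample for a 2-CPU machine.
sample = """\
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 9 0 2 11456 4120   1  41 19  1  3  0 35  0  4  0  0   48  112  130  4 14 82
"""
for warning in check_vmstat(sample, ncpu=2):
    print(warning)
```

Keep in mind that the first line vmstat prints is an average since boot, so feed the function a later sample (for example from vmstat 5 with a few iterations).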

Check your network statistics and see if you are getting lots of network errors:  netstat -in.  

A remote possibility is running out of swap space (but if this is true then you would be having more than just lost socket issues).  Run swap -l; this will show physical swap and whether you're running into a consumption issue.  This relates both to a swap/page thrash situation and to applications that are consuming all your tmpfs space.  You can put limits on how big you let your /tmp or even /var/run directories get.  See the man page for mount_tmpfs.
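For the swap check, a rough sketch of turning swap -l output into a percentage. It assumes the usual header with `blocks` and `free` columns; the device name and block counts in the sample are made up.

```python
def swap_usage(swap_l_output: str) -> list:
    """Return (swapfile, percent_used) tuples from `swap -l` style output."""
    rows = [ln.split() for ln in swap_l_output.splitlines() if ln.strip()]
    header, data = rows[0], rows[1:]
    blocks_col = header.index("blocks")
    free_col = header.index("free")
    usage = []
    for fields in data:
        blocks = int(fields[blocks_col])
        free = int(fields[free_col])
        used_pct = 100.0 * (blocks - free) / blocks if blocks else 0.0
        usage.append((fields[0], round(used_pct, 1)))
    return usage


# Hypothetical sample: one swap device with a quarter of its blocks free.
sample = """\
swapfile             dev  swaplo blocks   free
/dev/dsk/c0t0d0s1   136,1     16 1000000 250000
"""
print(swap_usage(sample))
```

Logging this value every few minutes through the night would show whether swap consumption climbs before the sockets drop.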
0
mromeoAuthor Commented:
Some proprietary and 3rd party applications are losing their socket connections at about the same time every night.   The only way to recover is to restart these programs. It is very hard to pinpoint the exact time and sequence of events, but I am trying to write a script that will help gather some statistics.  I have added your suggestions above.  I am going to run lsof, swap, netstat, and vmstat.  I'm also going to use lsof and netstat to try to get the number of socket connections and their state.
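One way the socket-state part of such a script might look, as a sketch only: it assumes netstat -an style output in which the TCP state is the last field on each connection line, and it simply ignores header and non-TCP lines. The state list and the sample lines are assumptions for illustration.

```python
from collections import Counter

# State names as they commonly appear in netstat output (assumed list).
TCP_STATES = {
    "LISTEN", "ESTABLISHED", "TIME_WAIT", "CLOSE_WAIT", "FIN_WAIT_1",
    "FIN_WAIT_2", "SYN_SENT", "SYN_RCVD", "SYN_RECV", "LAST_ACK",
    "CLOSING", "CLOSED", "IDLE", "BOUND",
}


def count_tcp_states(netstat_output: str) -> dict:
    """Count connections per TCP state in `netstat -an` style output."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Header, separator and UDP lines do not end in a TCP state name.
        if fields and fields[-1] in TCP_STATES:
            counts[fields[-1]] += 1
    return dict(counts)


# Hypothetical sample output.
sample = """\
TCP: IPv4
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q    State
-------------------- -------------------- ----- ------ ----- ------ -----------
      *.22                 *.*                0      0 49152      0 LISTEN
192.168.1.10.22      192.168.1.5.51234    64240      0 49640      0 ESTABLISHED
192.168.1.10.8080    192.168.1.7.40112    64240      0 49640      0 ESTABLISHED
192.168.1.10.8080    192.168.1.9.40950    64240      0 49640      0 CLOSE_WAIT
"""
print(count_tcp_states(sample))
```

Running this at intervals and logging the result with a timestamp would show whether states such as CLOSE_WAIT or TIME_WAIT pile up before the nightly failure.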
0
NukfrorCommented:
This could be network related as well.  You might want to use a packet sniffer close to the time the sockets drop to see if something is coming in from the remote side closing down the connection(s).  
0
mromeoAuthor Commented:
I was able to do this monitoring using sar -r 10 100.  The last value in the list was what I was looking for.
0
