  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 7606

How to read vmstat on Sun Solaris and what are the key items to test

On Sun Solaris I am running the vmstat command shown below and need to know which key items to look for in the output. I know I have serious problems and the system is about to crash when 'id' falls below 10. What other key values should I be testing?

----- vmstat output
 procs     memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr m0 m1 m4 m1   in   sy   cs us sy id
 0 0 0 22292328 18225184 51 2005 51 5 4 0 0 7  0  1  6 1680  623 1372 32 37 31
Asked by: rayskelton
 
SciGuy commented:
Here's a short description of each field:

r     threads in the run queue
b     threads blocked waiting for resources (I/O, paging)
w     swapped out

swap  amount of swap space currently available (KB)
free  size of the free list (KB)

re    page reclaims
mf    minor faults
pi    KB paged in
po    KB paged out
fr    KB freed
de    anticipated short-term memory shortfall (KB)
sr    pages scanned by the clock algorithm

m*    disk operations per second (one column per disk)

in    interrupts/sec
sy    system calls/sec
cs    context switches/sec

us    % user time
sy    % system/kernel time
id    % idle time

Idle time isn't very useful for diagnosing a crash. However, if you're running something that eats up CPU and causes a crash, it might be using up lots of memory as well (even then, I don't know how a user application can cause the entire system to crash).
Whatever the cause, it is not likely to be "id" dropping below 10%.

You'll need to give more information about what you're running when the crash happens.
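
For anyone scripting against this output, here is a minimal Python sketch of how the field list above could be mapped onto one data row. It assumes the Solaris layout shown in the question (12 fixed columns, then the disk columns, then 6 fixed columns). The key names "syscalls" and "sys" are my own invention to tell the two "sy" columns apart, and the function name is hypothetical.

```python
# Column names before and after the disk columns in Solaris vmstat output.
# "syscalls" and "sys" are made-up keys that separate the two "sy" columns.
FRONT = ["r", "b", "w", "swap", "free", "re", "mf", "pi", "po", "fr", "de", "sr"]
BACK = ["in", "syscalls", "cs", "us", "sys", "id"]


def parse_vmstat_row(row):
    """Turn one vmstat data row into a dict keyed by field name."""
    values = [int(v) for v in row.split()]
    if len(values) < len(FRONT) + len(BACK):
        raise ValueError("too few columns for a Solaris vmstat row")
    sample = dict(zip(FRONT, values[:len(FRONT)]))
    # Whatever sits between the fixed columns is per-disk operations/sec.
    sample["disks"] = values[len(FRONT):len(values) - len(BACK)]
    sample.update(zip(BACK, values[-len(BACK):]))
    return sample


# The data row from the question.
row = " 0 0 0 22292328 18225184 51 2005 51 5 4 0 0 7  0  1  6 1680  623 1372 32 37 31"
sample = parse_vmstat_row(row)
print(sample["id"], sample["sr"], sample["disks"])  # 31 0 [7, 0, 1, 6]
```

Splitting from both ends avoids relying on the header names, which repeat ("sy" twice, and disk names can repeat too).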
 
wesly_chen commented:
/var/adm/messages is the first place to look when you encounter a crash.

Wesly
 
rayskelton (author) commented:
I am looking at this as the developer of the only application on numerous large Solaris systems, which eats a lot of memory and CPU during peak production periods. This is actually a good problem to have, since it means business is good. I can always count on serious outages occurring when id drops below 10, and I have added this check to my monitoring software. I wanted to look at other crucial items within vmstat. I am a developer and not a sysadmin, so identifying the exact problem at the system level is not my concern. My concern is to give production support an early warning before a problem actually occurs. This gives them time to shut down batch servers and potentially prevent a problem.
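
A check like the one described here might look roughly like the sketch below. The threshold of 10 comes from the post; requiring three consecutive low samples is purely an assumption of this example, added so that one brief spike does not page anyone. Note that the first line vmstat prints is an average since boot, so a monitor would normally skip it.

```python
def low_idle_streak(idle_values, threshold=10, needed=3):
    """Return True once `needed` consecutive samples have idle % below threshold.

    `needed` is a hypothetical tuning knob, not something from the thread.
    """
    streak = 0
    for idle in idle_values:
        # Extend the streak on a low sample, reset it on a healthy one.
        streak = streak + 1 if idle < threshold else 0
        if streak >= needed:
            return True
    return False


# Made-up idle readings, one per vmstat interval (since-boot line already dropped).
print(low_idle_streak([31, 9, 8, 7]))   # True
print(low_idle_streak([9, 8, 31, 9]))   # False
```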
 
wesly_chen commented:
Okay, then free swap is another item you might want to watch.
Usually, when free swap goes below a certain percentage (around 3%), the system starts to become unstable.

Wesly
 
cpc2004 commented:
Solaris crashes only when it runs out of storage or hits a system software problem. Even if the CPU is running at 95%, the running processes just get slower; that will not crash Solaris. I suspect the crashes on your Solaris systems are caused by running out of virtual storage. Analyse the crash dump and you will find the answer.

My installation has over 30 production Solaris systems and I have never had Solaris crash because of high CPU utilization. We set up monitoring tools to alert Technical Support whenever the CPU utilization of a Solaris system is over 90%. I usually run the top command to find out which process is using the most CPU, and kill the job if I suspect the process is using so much CPU that it slows down the system.

vmstat only shows overall performance and cannot pinpoint the cause of a system hang. Our installation also has over 50 production AIX systems and they have never crashed because of high CPU. You need to install monitoring tools such as CA-NSM, BMC Patrol, Candle CCC or EcoTools to automate the monitoring.

Proposed system health checks
1. Running out of virtual storage
   Check the usage of the swap file and alert if it is over 80%.
2. Filesystem corruption
   Monitor /var/adm/ras/messages.
3. Unrecoverable hardware errors, such as CPU and memory
   Monitor /var/adm/ras/messages and alert on hardware messages. You can get a list of hardware messages from Solaris.
4. Filesystem running out of space
   Monitor /var/adm/ras/messages and alert if the usage of the root, /tmp or /usr filesystem is over 90%.
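
Checks 2 to 4 could be approximated with something like the Python sketch below. It is only an illustration: the 90% limit comes from the list above, but the keyword list, the function names and the sample log lines are all assumptions of this example, and the percentage from `shutil.disk_usage` may differ slightly from what df reports.

```python
import shutil

# Hypothetical settings, chosen for illustration only.
FS_LIMIT_PERCENT = 90
LOG_KEYWORDS = ("panic", "WARNING", "file system full")


def fs_usage_percent(path):
    """Approximate percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def full_filesystems(paths, limit=FS_LIMIT_PERCENT, probe=fs_usage_percent):
    """Return the paths whose usage is above `limit` percent."""
    return [p for p in paths if probe(p) > limit]


def suspicious_lines(lines, keywords=LOG_KEYWORDS):
    """Return the log lines that contain any of the keywords."""
    return [ln.rstrip("\n") for ln in lines if any(k in ln for k in keywords)]


# Made-up log lines standing in for the contents of the messages file.
log = ["host sshd: session opened\n", "host unix: WARNING: made-up example\n"]
print(suspicious_lines(log))
print(full_filesystems(["/"]))
```

The `probe` parameter is there so the threshold logic can be exercised without touching a real filesystem.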


 
wesly_chen commented:
Hi,

   In my personal experience, the Sun Ultra 80 and Enterprise E420R do crash under high CPU load.
It turned out that the hardware design of the clock bus between CPU and memory on this motherboard has a bug.
No OS patch can really fix this issue (Solaris 7, 8 and 9 all have the same problem).

   Anyway, monitoring swap usage and the /var partition is important for avoiding crashes and hangs.

Wesly