How to read vmstat on Sun Solaris and what are the key items to test

Posted on 2004-10-26
Last Modified: 2013-12-05
On Sun Solaris I am running the below vmstat and need to know key items to look for in the output. I know I have serious problems and the system is about to crash when 'id' falls below 10. What other key values do I need to be testing?

----- vmstat output
 procs      memory                 page              disk         faults        cpu
 r b w     swap     free re   mf pi po fr de sr m0 m1 m4 m1   in   sy   cs us sy id
 0 0 0 22292328 18225184 51 2005 51  5  4  0  0  7  0  1  6 1680  623 1372 32 37 31
Question by:rayskelton
 
LVL 1

Expert Comment

by:SciGuy
ID: 12416373
Here's a short description of each field:

r     in the run queue
b     blocked waiting for resources (I/O, paging)
w     swapped out

swap  swap space currently available (KB)
free  size of the free list (KB)

re    page reclaims
mf    minor faults
pi    KB paged in
po    KB paged out
fr    KB freed
de    anticipated short-term memory shortfall (KB)
sr    pages scanned by the clock algorithm

m*    disk operations per second, one column per disk

in    interrupts/sec
sy    system calls/sec
cs    context switches/sec

us    % user time
sy    % system (kernel) time
id    % idle time

Idle time isn't very useful for diagnosing a crash. However, if you're running something that eats up CPU and causes a crash, it may be using up a lot of memory as well (even then, I don't know how a user application could crash the entire system).
Whatever the cause, it is unlikely to be "id" dropping below 10%.

You'll need to give more info about what you're running when the crash occurs.
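
For a monitoring script, the field list above maps onto a small parser. The following is a minimal Python sketch and not something posted in the thread: the function name parse_vmstat, the cpu_ prefix on the last three columns, and the _2 suffix for repeated disk names are all assumptions made here so that duplicate column names (the two sy columns, the two m1 disks) do not overwrite each other. It is fed the header and sample line from the question.

```python
def parse_vmstat(header, sample):
    """Pair each vmstat column name with the value printed beneath it."""
    names = header.split()
    values = [int(v) for v in sample.split()]
    if len(names) != len(values):
        raise ValueError("header and sample have different column counts")

    # Assumption: the last three columns are always the CPU percentages
    # (us, sy, id). Prefix them so the CPU 'sy' is distinct from syscalls.
    names = names[:-3] + ["cpu_" + n for n in names[-3:]]

    fields = {}
    for name, value in zip(names, values):
        key, n = name, 2
        while key in fields:          # e.g. a second disk abbreviated 'm1'
            key = f"{name}_{n}"
            n += 1
        fields[key] = value
    return fields


HEADER = "r b w swap free re mf pi po fr de sr m0 m1 m4 m1 in sy cs us sy id"
SAMPLE = "0 0 0 22292328 18225184 51 2005 51 5 4 0 0 7 0 1 6 1680 623 1372 32 37 31"

fields = parse_vmstat(HEADER, SAMPLE)
print(fields["cpu_id"])   # 31 (% idle)
print(fields["sy"])       # 623 (system calls/sec)
```

Keeping the parser separate from any alerting logic means the same dictionary can feed whatever checks the monitoring software already performs.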
 
LVL 38

Expert Comment

by:wesly_chen
ID: 12417004
/var/adm/messages is the first place to look when you encounter a crash.

Wesly
 

Author Comment

by:rayskelton
ID: 12420355
I am looking at this as the developer of the only application on numerous large Solaris systems, which consumes a lot of memory and CPU during peak production periods. This is actually a good problem to have, since it means business is good. I can always count on serious outages occurring when id drops below 10, and I have added this check to my monitoring software. I wanted to look at other crucial items within vmstat. I am a developer and not a sysadmin, so identifying the exact problem at the system level is not my concern. My concern is to give production support a warning before a problem actually occurs. This gives them time to shut down batch servers and potentially prevent a problem.
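
One way such a pre-warning could be structured is to alert only when idle time stays low for several samples in a row, so a single busy interval does not page anyone. This is a rough Python sketch under stated assumptions: the class name IdleWatch, the warning level of 20, and the window of 3 samples are illustrative choices made here, not values from the thread (which only reports trouble once id is below 10).

```python
from collections import deque


class IdleWatch:
    """Flag a warning when %idle stays below a level for N samples in a row."""

    def __init__(self, warn_below=20, samples=3):
        # Both defaults are hypothetical; tune them against real outages.
        self.warn_below = warn_below
        self.recent = deque(maxlen=samples)

    def update(self, idle_pct):
        """Record one vmstat 'id' reading; return True if a warning is due."""
        self.recent.append(idle_pct)
        window_full = len(self.recent) == self.recent.maxlen
        return window_full and all(v < self.warn_below for v in self.recent)


watch = IdleWatch(warn_below=20, samples=3)
for idle in [31, 18, 15, 12, 25]:
    print(idle, watch.update(idle))
```

With these readings only the fourth sample triggers the warning, because it is the first point at which three consecutive values are all below 20.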

 
LVL 38

Expert Comment

by:wesly_chen
ID: 12422463
Okay, then "swap free" is another item you might want to watch.
Usually, when free swap goes below a certain percentage (about 3%), the system starts to become unstable.

Wesly
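
vmstat reports available swap in KB but not the total, so a percentage needs a second source. The Python sketch below assumes the one-line summary printed by `swap -s` has the form shown in the sample string ("... = Nk used, Mk available"); the function names and the regular expression are illustrative, and the 3% default simply mirrors the figure mentioned above.

```python
import re

# Assumed shape of the `swap -s` summary line:
#   total: 10492k bytes allocated + 7840k reserved = 18332k used, 21568k available
SWAP_SUMMARY = re.compile(r"=\s*(\d+)k used,\s*(\d+)k available")


def swap_free_percent(summary_line):
    """Return available swap as a percentage of used + available."""
    match = SWAP_SUMMARY.search(summary_line)
    if match is None:
        raise ValueError("unrecognised swap summary: %r" % summary_line)
    used_kb, available_kb = int(match.group(1)), int(match.group(2))
    return 100.0 * available_kb / (used_kb + available_kb)


def swap_is_low(summary_line, min_free_pct=3.0):
    """True when free swap has dropped below the given percentage."""
    return swap_free_percent(summary_line) < min_free_pct


line = "total: 10492k bytes allocated + 7840k reserved = 18332k used, 21568k available"
print(round(swap_free_percent(line), 1))
print(swap_is_low(line))
```

In practice the line would come from running `swap -s` on the monitored host; an earlier warning level than 3% is probably wanted if the goal is to act before the system becomes unstable.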
 
LVL 20

Accepted Solution

by:
cpc2004 earned 500 total points
ID: 12436577
Solaris crashes only if it runs out of storage or hits a system software problem. Even if the CPU is running at 95%, the running processes only run slower; it will not crash Solaris. I suspect the crashes on your Solaris systems are due to running out of virtual storage. Analyse the crash dump and you will find the answer.

My installation has over 30 production Solaris systems, and I have never had Solaris crash due to high CPU utilization. We set up the monitoring tools to alert Technical Support whenever the CPU utilization of a Solaris system is over 90%. Usually I issue the top command to find out which process is using the most CPU, and kill the job if I suspect that the process is using extremely high CPU and slowing down the system.

vmstat only shows overall performance and cannot diagnose a system hang. Our installation also has over 50 production AIX systems, and they never crash because the CPU is high. You need to install monitoring tools such as CA-NSM, BMC Patrol, Candle CCC or EcoTools to automate the monitoring.

Proposed System Health Checking
1. Running out of virtual storage
   Check the usage of the swap file and alert if it is over 80%
2. Filesystem corruption
   Monitor /var/adm/messages
3. Non-recoverable hardware errors, such as CPU and memory
   Monitor /var/adm/messages and alert on hardware messages. You can get a list of hardware messages from Solaris
4. Filesystem running out of space
   Monitor /var/adm/messages, and alert if the usage of the root, /tmp or /usr filesystem is over 90%
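
The filesystem and log checks in the list above could be sketched roughly as follows in Python. Everything named here is an assumption for illustration: the function names, the keyword list used to pick out suspicious log lines, and the use of shutil.disk_usage, whose percentage can differ slightly from the capacity column that df prints.

```python
import shutil


def full_filesystems(mount_points, limit_pct=90.0, usage=shutil.disk_usage):
    """Return (mount_point, percent_used) for filesystems above limit_pct."""
    over = []
    for mount_point in mount_points:
        total, used, _free = usage(mount_point)
        if total == 0:
            continue  # skip pseudo-filesystems that report no size
        pct = 100.0 * used / total
        if pct > limit_pct:
            over.append((mount_point, round(pct, 1)))
    return over


def matching_log_lines(lines, keywords=("panic", "warning", "error")):
    """Return log lines containing any keyword, compared case-insensitively.

    The keyword list is a hypothetical starting point, not a list of
    actual Solaris hardware messages.
    """
    hits = []
    for line in lines:
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append(line.rstrip("\n"))
    return hits
```

A caller might run full_filesystems(["/", "/tmp", "/usr"]) on each polling cycle and pass an open handle on the system log (the thread mentions /var/adm/messages) to matching_log_lines. The usage parameter exists only so the check can be exercised with fake numbers.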


 
LVL 38

Assisted Solution

by:wesly_chen
wesly_chen earned 500 total points
ID: 12436768
Hi,

   In my personal experience, Sun Ultra 80 / Enterprise E420R machines do crash under high CPU load.
It turned out that the clock bus between CPU and memory has a hardware design bug on this motherboard.
No OS patch can really fix this issue (Solaris 7, 8 and 9 all have the same problem).

   Anyway, monitoring "swap usage" and the "/var partition" is important to avoid crashes or hangs.

Wesly