Solved

vmstat procs blocked - how to dig deeper?

Posted on 2008-10-07
Last Modified: 2013-12-06
Hello,

System details:
HP-UX 11.23 on ia64

There appears to be a resource bottleneck on a server. When I run vmstat, I get the following output:
vmstat 5 5                                                                                          
                                                                         
         procs           memory                   page                              faults       cpu    
    r     b     w      avm    free   re   at    pi   po    fr   de    sr     in     sy    cs  us sy id  
    5    19     0  5503761  33861046  308   80     2    0     0    0     2  38750 416246 17021  16  7 77
    6    19     0  4093880  33860506  229   55     0    0     0    0     0  27147 228747 11679  17  4 79
    6    19     0  4093880  33859994  255   73     0    0     0    0     0  23451 218478 10536  17  4 79
    5    20     0  4137557  33859938  137   35     0    0     0    0     0  22023 202479  9512  18  3 80
    5    20     0  4137557  33860349  168   60     0    0     0    0     0  22964 563017  9528  16  6 78

From what I understand, b = blocked, which means the process is waiting on a resource. As the output suggests, this is not memory related, so it must be I/O (disk operations or network, right?)

Database response times have degraded. How can I dig deeper into this? I've taken a look at iostat, but the values don't really tell me much.

Thanks in advance.
Question by:SAP11-11
7 Comments
 
LVL 6

Expert Comment

by:peter991
Here are some notes I made/found when looking into vmstat.

Problem symptoms:
1.) If the number of processes in the run queue (procs r) is consistently greater than the number of CPUs on the system, the system will slow down because there are more runnable processes than available CPUs.
2.) If this number is more than four times the number of available CPUs, the system is facing a shortage of CPU power, which will greatly slow down the processes on the system.
3.) If the idle time (cpu id) is consistently 0 and the system time (cpu sy) is double the user time (cpu us), the system is facing a shortage of CPU resources.
     
Resolution:
Resolving these kinds of issues involves tuning application procedures to make efficient use of the CPU and, as a last resort, increasing CPU power or adding more CPUs to the system.
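
The rules of thumb above can be sketched as a small check. This is only an illustration, not an HP-UX tool: the function name `cpu_pressure` is made up, and the CPU count of 8 is an assumption because the thread never states how many CPUs the server has. It also looks at a single sample, whereas the rules talk about values that are "consistently" high.

```python
def cpu_pressure(run_queue, ncpu, us, sy, idle):
    """Apply the run-queue and CPU-time rules of thumb to one vmstat sample."""
    findings = []
    if run_queue > 4 * ncpu:
        findings.append("run queue exceeds 4x CPU count: severe CPU shortage")
    elif run_queue > ncpu:
        findings.append("run queue exceeds CPU count: processes wait for a CPU")
    if idle == 0 and sy >= 2 * us:
        findings.append("no idle time and system time at least double user time")
    return findings


# r, us, sy and id are taken from the first vmstat line in the question.
# ncpu=8 is an assumed value, not something reported in the thread.
print(cpu_pressure(run_queue=5, ncpu=8, us=16, sy=7, idle=77))
```

With roughly 77% idle time none of these rules fire for that sample, whatever the real CPU count is for rule 3, which fits the view that the blocked column rather than CPU is the thing to chase.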
 

Author Comment

by:SAP11-11
Thanks for the reply but I understand the 'r' column. I don't think the server is suffering a CPU shortage.
It's the blocked processes that concern me. I've read that this value should rarely exceed 1, and that it indicates processes waiting on another resource before they can complete.
 
LVL 6

Expert Comment

by:peter991
Hi!
It's hard to tell, but your pi, po (paging) and sr (scan rate) values are zero.
(I saw the single 2 on the first line)

My guess is to focus on the application you are running on your machine.

 

Author Comment

by:SAP11-11
The application is Oracle and the 'log file sync' times are higher than expected (not a great deal, however.)

My question is: is it possible to drill down at the OS level to ascertain what could be causing the processes to be blocked to such an extent? My thinking is disk I/O, especially considering the log file sync times being up. However, the stats from vmstat look extraordinarily high. I've never seen this many blocked processes before.

This is the most recent output:
         procs           memory                   page                              faults       cpu    
    r     b     w      avm    free   re   at    pi   po    fr   de    sr     in     sy    cs  us sy id  
    3    23     0  5410152  33450719  308   80     2    0     0    0     2  38796 416588 17049  16  7 77
    7    16     0  5386809  33450385  109   28     0    0     0    0     0  38351 858951 15928  12  7 81
    7    16     0  5386809  33449929  287   77     0    0     0    0     0  38375 765130 15618  13  7 80
    8    19     0  5035875  33450725  331   73     0    0     0    0     0  40205 473336 17055  15  7 78
    8    19     0  5035875  33450569  207   51     0    0     0    0     0  36935 345422 15414  16  5 79
    6    22     0  6256459  33450568  120   29     0    0     0    0     0  36713 292311 14967  17  4 79
    6    22     0  6256459  33450438   52   12     0    0     0    0     0  34592 259139 13439  18  4 78
   12    14     0  6079667  33450421  245  199     0    0     0    0     0  32498 237380 12549  23  5 73
   12    14     0  6079667  33450421   80   64     0    0     0    0     0  34561 231179 13518  23  3 74
    6    19     0  5188123  33450588  103   47     0    0     0    0     0  36349 256779 14371  28  4 67
    6    19     0  5188123  33450469   39   15     0    0     0    0     0  35150 267327 14136  28  4 68
   16    11     0  5326562  33450452   78   30     0    0     0    0     0  40089 386057 19325  20  5 75

The values are consistently high.
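
To back up "consistently high" with numbers, the samples can be summarised per column. The following is a minimal sketch, assuming the 18-column layout shown in the header above; the helper names and the `cpu_sy` label (used to tell the CPU sy column apart from the syscall sy column) are my own. The rows are the first four samples copied from the output above.

```python
# First four samples copied verbatim from the vmstat output above.
VMSTAT_ROWS = """\
    3    23     0  5410152  33450719  308   80     2    0     0    0     2  38796 416588 17049  16  7 77
    7    16     0  5386809  33450385  109   28     0    0     0    0     0  38351 858951 15928  12  7 81
    7    16     0  5386809  33449929  287   77     0    0     0    0     0  38375 765130 15618  13  7 80
    8    19     0  5035875  33450725  331   73     0    0     0    0     0  40205 473336 17055  15  7 78
"""

# Column names follow the header; the second "sy" is renamed to avoid a clash.
COLUMNS = ["r", "b", "w", "avm", "free", "re", "at", "pi", "po",
           "fr", "de", "sr", "in", "sy", "cs", "us", "cpu_sy", "id"]


def parse_vmstat(text):
    """Return one dict per numeric data row, skipping headers and blanks."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS) or not all(f.isdigit() for f in fields):
            continue
        samples.append(dict(zip(COLUMNS, map(int, fields))))
    return samples


def summarize(samples, column):
    """Return (min, max, mean) of one column across all samples."""
    values = [s[column] for s in samples]
    return min(values), max(values), sum(values) / len(values)


samples = parse_vmstat(VMSTAT_ROWS)
for col in ("r", "b", "pi", "po", "sr"):
    lo, hi, mean = summarize(samples, col)
    print(f"{col:>2}: min={lo} max={hi} mean={mean:.1f}")
```

Comparing the b summary with pi, po and sr side by side makes the point in the thread concrete: many blocked processes with essentially no paging activity.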
 
LVL 6

Expert Comment

by:peter991
Perhaps this is an Oracle question.
Have you looked over your database?
Does it switch a lot?
Depending on your Oracle version, do the values from AWR or Statspack look good?
 

Accepted Solution

by:SAP11-11
We checked with the DBAs and they say that Oracle isn't the problem. According to them, the slightly elevated log file sync times indicate an I/O bottleneck, which is outside of Oracle's control (assuming all files are spread across the devices optimally).

sar -d gave more info and I was able to use this command to get a better view of what the devices were doing. This gave me the info I needed to push the problem back to the storage experts.
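
For anyone repeating this, here is a rough sketch of ranking devices from `sar -d` output by wait plus service time. It assumes the column order device, %busy, avque, r+w/s, blks/s, avwait, avserv, so check the header on your own system before trusting it. The sample rows and device names are invented for illustration and are not from the server in this thread; `parse_sar_d` and `slowest` are hypothetical helper names.

```python
SAR_FIELDS = ["device", "busy", "avque", "rw_s", "blks_s", "avwait", "avserv"]

# Invented sample rows (NOT real measurements from this thread).
SAR_SAMPLE = """\
12:00:00   device   %busy   avque   r+w/s  blks/s  avwait  avserv
12:00:05   c0t0d0    3.10    0.50       4      60    0.00    6.20
           c4t0d1   92.40    6.80     310    9800   18.50   24.10
           c4t0d2   15.00    0.50      45     900    0.10    5.40
"""


def parse_sar_d(text):
    """Parse device rows; the leading timestamp, if present, is ignored."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < len(SAR_FIELDS):
            continue
        fields = fields[-len(SAR_FIELDS):]  # drop the timestamp column
        try:
            numbers = [float(x) for x in fields[1:]]
        except ValueError:
            continue  # header or other non-data line
        rows.append(dict(zip(SAR_FIELDS, [fields[0]] + numbers)))
    return rows


def slowest(rows, top=3):
    """Sort devices by avwait + avserv, largest first."""
    return sorted(rows, key=lambda r: r["avwait"] + r["avserv"], reverse=True)[:top]


for row in slowest(parse_sar_d(SAR_SAMPLE)):
    print(row["device"], row["busy"], row["avwait"] + row["avserv"])
```

Sorting on avwait + avserv is just one reasonable choice; a device with high %busy and a long queue (avque) is equally worth showing to the storage team.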

Thanks for all your help, anyway.
 
LVL 6

Expert Comment

by:peter991
I'm glad to be of help.

Good luck!