• Status: Solved
  • Priority: Medium
  • Security: Public

vmstat procs blocked - how to dig deeper?

Hello,

System details:
HP-UX 11.23 on ia64

There appears to be a resource bottleneck on a server. When I run vmstat, I get the following output:
vmstat 5 5                                                                                          
                                                                         
         procs           memory                   page                              faults       cpu    
    r     b     w      avm    free   re   at    pi   po    fr   de    sr     in     sy    cs  us sy id  
    5    19     0  5503761  33861046  308   80     2    0     0    0     2  38750 416246 17021  16  7 77
    6    19     0  4093880  33860506  229   55     0    0     0    0     0  27147 228747 11679  17  4 79
    6    19     0  4093880  33859994  255   73     0    0     0    0     0  23451 218478 10536  17  4 79
    5    20     0  4137557  33859938  137   35     0    0     0    0     0  22023 202479  9512  18  3 80
    5    20     0  4137557  33860349  168   60     0    0     0    0     0  22964 563017  9528  16  6 78

From what I understand, b = blocked, which means the process is waiting on a resource. As the output suggests, this is not memory related, so it must be I/O (disk or network operations), right?
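
For anyone who wants to put numbers on that reading, here is a minimal Python sketch that averages a few columns of the output above. The sample text is copied from the question; the function name `column_means` and the choice of columns are illustrative assumptions, not part of any HP-UX tooling.

```python
# Header and samples copied from the vmstat output posted in the question.
VMSTAT_OUTPUT = """\
    r     b     w      avm    free   re   at    pi   po    fr   de    sr     in     sy    cs  us sy id
    5    19     0  5503761  33861046  308   80     2    0     0    0     2  38750 416246 17021  16  7 77
    6    19     0  4093880  33860506  229   55     0    0     0    0     0  27147 228747 11679  17  4 79
    6    19     0  4093880  33859994  255   73     0    0     0    0     0  23451 218478 10536  17  4 79
    5    20     0  4137557  33859938  137   35     0    0     0    0     0  22023 202479  9512  18  3 80
    5    20     0  4137557  33860349  168   60     0    0     0    0     0  22964 563017  9528  16  6 78
"""


def column_means(text, wanted=("r", "b", "pi", "po", "sr", "id")):
    """Average selected vmstat columns, looked up by header name."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header, samples = rows[0], rows[1:]
    means = {}
    for name in wanted:
        # "sy" appears twice in the header (faults and cpu), so only ask
        # for names that are unique; index() returns the first match.
        idx = header.index(name)
        means[name] = sum(int(s[idx]) for s in samples) / len(samples)
    return means


print(column_means(VMSTAT_OUTPUT))
```

On this sample the blocked count averages about 19 while page-outs stay at zero and the CPU is close to 80% idle, which is consistent with processes waiting on I/O rather than on memory or CPU.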

Database response times have degraded. How can I dig deeper into this? I've taken a look at iostat, but the values don't really tell me much.

Thanks in advance.
1 Solution
 
peter991 Commented:
Here are some notes I made/found when looking into vmstat.

 Problem symptoms:
1.) If the number of processes in the run queue (procs r) is consistently greater than the number of CPUs, the system will slow down because there are more runnable processes than available CPUs.
2.) If this number is more than four times the number of available CPUs, the system is short of CPU power and processes will slow down considerably.
3.) If the idle time (cpu id) is consistently 0 and the system time (cpu sy) is double the user time (cpu us), the system is short of CPU resources.
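
The three rules of thumb above can be sketched as a small Python check. This is only an illustration: the function name `cpu_pressure` is made up, the CPU count is never stated in this thread (8 is an assumed value), and "double" is read here as "at least double". Since the rules speak of "consistently", it is best fed averages rather than a single sample.

```python
def cpu_pressure(run_queue, ncpu, us, sy, idle):
    """Classify CPU pressure using the three rules of thumb above."""
    if run_queue > 4 * ncpu:
        return "severe: run queue exceeds 4x CPU count"
    if idle == 0 and sy >= 2 * us:
        return "severe: no idle time and system time dominates"
    if run_queue > ncpu:
        return "moderate: more runnable processes than CPUs"
    return "ok"


# First sample from the question (r=5, us=16, sy=7, id=77);
# ncpu=8 is an assumption, the thread does not give the real count.
print(cpu_pressure(5, 8, 16, 7, 77))  # prints "ok"
```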
     
Resolution:
Resolving this kind of issue involves tuning the application to use the CPU more efficiently and, as a last resort, increasing CPU power or adding more CPUs to the system.
 
SAP11-11 Author Commented:
Thanks for the reply, but I understand the 'r' column. I don't think the server is suffering a CPU shortage.
It's the blocked processes that concern me. I've read that this value should rarely exceed 1, and that it indicates processes waiting on another resource before they can continue.
 
peter991 Commented:
Hi!
It's hard to tell, but your pi, po (paging) and sr (scan rate) values are zero.
(I saw the single 2 on the first line.)

My suggestion is to focus on the application you are running on the machine.
 
SAP11-11 Author Commented:
The application is Oracle, and the 'log file sync' times are higher than expected (though not by a great deal).

My question is: is it possible to drill down at the OS level to ascertain what could be causing so many processes to be blocked? My thinking is disk I/O, especially considering that the log file sync times are up. However, the blocked counts from vmstat look extraordinarily high. I've never seen this many blocked processes before.

This is the most recent output:
         procs           memory                   page                              faults       cpu    
    r     b     w      avm    free   re   at    pi   po    fr   de    sr     in     sy    cs  us sy id  
    3    23     0  5410152  33450719  308   80     2    0     0    0     2  38796 416588 17049  16  7 77
    7    16     0  5386809  33450385  109   28     0    0     0    0     0  38351 858951 15928  12  7 81
    7    16     0  5386809  33449929  287   77     0    0     0    0     0  38375 765130 15618  13  7 80
    8    19     0  5035875  33450725  331   73     0    0     0    0     0  40205 473336 17055  15  7 78
    8    19     0  5035875  33450569  207   51     0    0     0    0     0  36935 345422 15414  16  5 79
    6    22     0  6256459  33450568  120   29     0    0     0    0     0  36713 292311 14967  17  4 79
    6    22     0  6256459  33450438   52   12     0    0     0    0     0  34592 259139 13439  18  4 78
   12    14     0  6079667  33450421  245  199     0    0     0    0     0  32498 237380 12549  23  5 73
   12    14     0  6079667  33450421   80   64     0    0     0    0     0  34561 231179 13518  23  3 74
    6    19     0  5188123  33450588  103   47     0    0     0    0     0  36349 256779 14371  28  4 67
    6    19     0  5188123  33450469   39   15     0    0     0    0     0  35150 267327 14136  28  4 68
   16    11     0  5326562  33450452   78   30     0    0     0    0     0  40089 386057 19325  20  5 75

The values are consistently high.
 
peter991 Commented:
Perhaps this is an Oracle question.
Have you looked over your database?
Does it switch a lot?
Depending on your Oracle version, do the values from AWR or Statspack look good?
 
SAP11-11 Author Commented:
We checked with the DBAs, and they say that Oracle isn't the problem: the slightly increased log file sync times indicate an I/O bottleneck, which is outside Oracle's control (assuming all files are spread across the devices optimally).

sar -d gave more info and I was able to use this command to get a better view of what the devices were doing. This gave me the info I needed to push the problem back to the storage experts.
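
As a rough illustration of how per-device figures from sar -d might be screened, here is a minimal Python sketch. Everything in it is an assumption rather than what was actually done here: the dict keys (`device`, `busy`, `avwait`, `avserv`) are hypothetical stand-ins for the sar -d columns, the 50% busy threshold is arbitrary, and treating "average wait longer than average service time" as a sign of queueing is a common rule of thumb, not a documented limit.

```python
def flag_devices(samples, busy_pct=50.0):
    """Return names of devices that look saturated, busiest first.

    samples: iterable of dicts with the hypothetical keys 'device',
    'busy' (percent), 'avwait' and 'avserv' (milliseconds), already
    parsed from sar -d output by the caller.
    """
    suspects = [
        s for s in samples
        # Flag a device if it is busy most of the time, or if requests
        # wait in the queue longer than they take to be serviced.
        if s["busy"] >= busy_pct or s["avwait"] > s["avserv"]
    ]
    suspects.sort(key=lambda s: s["busy"], reverse=True)
    return [s["device"] for s in suspects]
```

A short list of flagged devices, mapped back to the volumes holding the redo logs, is the kind of evidence that can be handed to a storage team.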

Thanks for all your help, anyway.
 
peter991 Commented:
Glad to be of help.

Good luck!