Solved

Oracle server experiencing heavy load suddenly.

Posted on 2001-08-14
Last Modified: 2008-02-01
Hi,
  We are experiencing problems with our Oracle database. The load average shoots up to 100+, the server stops responding, and the web site becomes extremely slow. I am including the monitoring tool output below. The Oracle box has 4 CPUs and 8 GB RAM, and no other application or process runs on it. The load on the web server and application servers was minimal during the period when the load on the Oracle server started building up fast. It went to 150+, and after that the site became too slow. The load stayed high for quite some time, until we redirected requests to our backup site. Can anyone on the Oracle side explain this behaviour? I have doubts about the Oracle instance that is running. In the top output, each oracle process appears to gobble more than 1 GB of RAM, which is inexplicable to me. Can anyone suggest a solution?

Thanks
muthu

Performance Monitor Output:

oracle01w:(11)>vmstat 5
  procs         memory                        page        disk          faults      cpu
  r b w    swap   free re  mf  pi  po  fr   de sr s2 s6 s6 s6   in    sy   cs us sy id
  1 0 0   86848  22656  0 163 231 131 139 3504  1  0  2  3  2 1481  6474 2841 29  7 64
104 0 0 5955704 126584  1   0   0  16  16 2088  0  0  2  0  0 3369 11906 4208 88 12  0
115 0 0 5954416 126136  1 120   0 201 294 1208 13  0  1 24  1 3140 11277 3941 86 14  0
117 0 0 5953944 126104  1 171   0  11  60  408  6  0  3  1  1 3350 12272 4169 84 16  0
107 0 0 5953192 126064  0 122   0   9 120    0 14  0  2  0  1 3049 10758 3763 84 16  0
116 0 0 5951408 125976  3 187   6 233 545    0 43  0  3 24  0 3388 11862 4227 84 16  0
102 0 0 5951008 126288  1 180   0  11  99    0 11  0  1  0  0 3232 11928 4018 82 18  0
108 0 0 5950560 126600  1 129   0 184 184    0  0  0  0 22  0 3253 11827 4074 84 16  0
^Coracle01w:(12)>

oracle01w:(12)>iostat -x
                               extended device statistics
device    r/s  w/s   kr/s   kw/s wait actv  svc_t  %w  %b
sd21      0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0
sd65      0.2  1.4    4.1   10.7  0.0  0.0    3.8   0   0
sd66      1.8  1.2  159.1  133.8  0.0  0.0   11.8   0   2
sd67      1.2  0.6   55.9   50.5  0.0  0.0   10.7   0   1
ssd0      0.3  0.6   11.2    4.6  0.0  0.0   59.1   0   1
nfs1      0.0  0.0    0.0    0.0  0.0  0.0    0.5   0   0
oracle01w:(13)>

oracle01w:(16)>sar -q 5 5

SunOS oracle01w 5.7 Generic_106541-11 sun4u    07/30/01

13:22:47 runq-sz %runocc swpq-sz %swpocc
13:22:52   111.2     100                
13:22:57   103.4      99                
13:23:02    94.8     100                
13:23:07    83.8     100                
13:23:13    77.4      98                

Average     94.1      99                
oracle01w:(17)>

oracle01w:(19)>sar -w 6 6

SunOS oracle01w 5.7 Generic_106541-11 sun4u    07/30/01

13:23:56 swpin/s bswin/s swpot/s bswot/s pswch/s
13:24:03    0.00     0.0    0.00     0.0    4866
13:24:10    0.00     0.0    0.00     0.0    5155
13:24:17    0.00     0.0    0.00     0.0    4677
13:24:23    0.00     0.0    0.00     0.0    4888
13:24:29    0.00     0.0    0.00     0.0    4872
13:24:35    0.00     0.0    0.00     0.0    4489

Average     0.00     0.0    0.00     0.0    4828
oracle01w:(20)>

oracle01w:(20)>sar -g 5 5

SunOS oracle01w 5.7 Generic_106541-11 sun4u    07/30/01

13:25:38  pgout/s ppgout/s pgfree/s pgscan/s %ufs_ipf
13:25:45     0.00     0.00     0.00     0.00     0.00
13:25:51     0.20     0.20     0.20     0.00     0.00
13:25:57     0.00     0.00     0.00     0.00     0.00
13:26:02     0.00     0.00     0.00     0.00     0.00
13:26:07     0.20     0.20     0.20     0.00     0.00

Average      0.07     0.07     0.07     0.00     0.00
oracle01w:(21)>

load averages: 84.39, 101.71, 97.51                                                                   13:26:50
446 processes: 364 sleeping, 75 running, 2 zombie, 5 on cpu
CPU states:  0.0% idle, 85.8% user, 14.2% kernel,  0.0% iowait,  0.0% swap
Memory: 8192M real, 178M free, 2859M swap in use, 5833M swap free

  PID USERNAME THR PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
17612 oracle     1  55    0 2480K 1944K sleep   1:51  2.21% top
14930 oracle     1  54    0 1688M 1670M run     0:36  1.62% oracle
20407 netadm     1  59    0 2504K 1944K cpu11   0:01  1.30% top
14680 oracle     1  58    0 1688M 1671M run     0:52  1.27% oracle
11418 oracle     1  58    0 1688M 1671M cpu6    1:37  1.26% oracle
15382 oracle     1  54    0 1687M 1670M run     0:27  1.16% oracle
15217 oracle     1  58    0 1688M 1671M run     0:32  1.13% oracle
14850 oracle     1  54    0 1687M 1669M run     0:38  1.12% oracle
19318 oracle     1  55    0 1687M 1669M run     0:11  1.12% oracle
13307 oracle     1  54    0 1688M 1671M run     7:43  1.11% oracle
15138 oracle     1  55    0 1688M 1670M run     0:30  1.11% oracle
24387 oracle     1  58    0 1688M 1672M run     6:00  1.10% oracle
 3075 oracle     1  58    0 1688M 1671M run     2:29  1.10% oracle
14692 oracle     1  54    0 1687M 1670M run     0:45  1.10% oracle
11172 oracle     1  55    0 1688M 1670M run     1:30  1.10% oracle

Question by:muthusamy
4 Comments
 
LVL 3

Accepted Solution

by:
ramkb earned 60 total points
ID: 6387143

Is this production, or are you doing a stress test? :)

OK, start by looking at each of the top processes. For example, run
ps -ef | grep 14930
and see what that top process is. If it is an Oracle background process, it has a specific task, and there may be a bottleneck in the resources associated with that background process.
For example, if it comes up as ora_lgwr_sid, your log writer process is busy and you should focus your attention on tuning the related parameters.

On the other hand, if it is a server process, look for the associated query or SQL statement and see what it is doing. You might have to tune your SQL in that case.
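
With this many busy PIDs, sorting them by hand gets tedious, so here is a rough sketch of the background-versus-server check in Python. It assumes the usual Unix naming convention, where background processes show up as ora_<name>_<SID> and dedicated server processes as oracle<SID>. The function name classify_oracle_process and the SID "PROD" are made up for illustration, so adjust the patterns to whatever your ps output actually shows.

```python
import re

# Assumption: background processes are named ora_<name>_<SID> (e.g. ora_lgwr_sid)
# and dedicated server processes are named oracle<SID>. Verify against your own
# `ps -ef` output before relying on this.
BACKGROUND_RE = re.compile(r"^ora_([a-z0-9]+)_(\w+)$")
SERVER_RE = re.compile(r"^oracle(\w+)$")


def classify_oracle_process(command: str) -> str:
    """Classify the command column of one `ps -ef` line.

    Returns "background", "server", or "other".
    """
    parts = command.split()
    if not parts:
        return "other"
    name = parts[0]  # ignore trailing arguments such as (LOCAL=NO)
    if BACKGROUND_RE.match(name):
        return "background"
    if SERVER_RE.match(name):
        return "server"
    return "other"


if __name__ == "__main__":
    # Hypothetical command strings; "PROD" is an invented SID.
    for cmd in ("ora_lgwr_sid", "oraclePROD (LOCAL=NO)", "top"):
        print(cmd, "->", classify_oracle_process(cmd))
```

Anything that comes back as "background" points you at instance tuning for that particular process; anything that comes back as "server" points you at the SQL that session is running.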

Cheers,
Ramesh
LVL 6

Expert Comment

by:Mindphaser
ID: 7042708
Please update and finalize this old, open question. Please:

1) Award points ... if you need Moderator assistance to split points, comment here with details please or advise us in Community Support with a zero point question and this question link.
2) Ask us to delete it if it has no value to you or others
3) Ask for a refund so that we can move it to our PAQ at zero points if it did not help you but may help others.

EXPERT INPUT WITH CLOSING RECOMMENDATIONS IS APPRECIATED IF ASKER DOES NOT RESPOND.

Thanks,

** Mindphaser - Community Support Moderator **

P.S.  Click your Member Profile, choose View Question History to go through all your open and locked questions to update them.
LVL 49

Expert Comment

by:DanRollins
ID: 7057890
Suggested disposition:

       Accept ramkb's comment(s) as an answer.

DanRollins -- EE database cleanup volunteer
LVL 1

Expert Comment

by:Moondancer
ID: 7058089
Thanks, Dan.
I finalized this today and will monitor it in the event an adjustment is needed.
Moondancer - EE Moderator