willie0-360

asked on

Disk Utilization on Oracle Server

Hello Experts:

I have received alerts like this one for two days in a row from Oracle 11gR2 on Windows Server 2008 R2:


Target Name=oracleserver
Target Type=Host
Host=oracleserver
Metric=Disk Device Busy (%)
Metric Value=98.74
Disk Device=2 R:
Timestamp=Mar 11, 2015 10:38:09 PM EDT
Severity=Critical
Message=Disk Utilization for 2 R: is 98.74%, crossed warning (80) or critical (95) threshold.
Notification Rule Name=Host Availability and Critical States
Notification Rule Owner=SYSMAN
Notification Count=1


What can I do to resolve this?  How dangerous is this kind of critical error?

Thanks.
Willie
SOLUTION
Geert G
SOLUTION
willie0-360

ASKER

@Geert Gruwez:

Yes, backups and exports are running around the same time as the alert.  Backups start at 9:01 PM and exports start at 10:01 PM.  Also, an automated RMAN script deletes old backups at 10:01 PM.  I changed the schedule so that deletes run at 1:01 AM and full exports run at 5:01 AM.

This kind of alert started a couple of days after automation of full exports was put in place.  

All of the tasks above are automated scripts that I put in place via Task Scheduler.

@slightwv:

If you can, please tell me how to find out which process is using a lot of disk I/O.  Is there a Windows utility for this?  This is running on Windows Server 2008 R2.

To anyone:

Which disk is affected by this?  The server has a dedicated disk for backups and another one where the exports go (the Data Pump directory).  Is either of those two disks the one experiencing high I/O?

Thanks.
Windows has several utilities that can do this.  Task Manager for one.  Sysinternals also has several.  I think Process Monitor can do it.

I'm just not sure which ones can be run in more of a batch mode.  Check with your System Administrator.  They should have some idea.
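
For something that can run unattended, one option is a small script rather than a GUI tool. The following is only a rough sketch: it assumes the third-party `psutil` package is installed on the server, and the names `take_snapshot` and `top_io` are made up for illustration. It takes two snapshots of per-process I/O counters and ranks processes by how many bytes they moved in between. Note that these counters cover all I/O a process performs, not just I/O against the R: drive, so treat the result as a hint rather than proof.

```python
from typing import Dict, List, Tuple

# pid -> (process name, cumulative read + write bytes)
IoSnapshot = Dict[int, Tuple[str, int]]


def take_snapshot() -> IoSnapshot:
    """Collect cumulative I/O bytes per process (assumes psutil is installed)."""
    import psutil  # third-party; imported here so top_io() works without it

    snapshot: IoSnapshot = {}
    for proc in psutil.process_iter(["pid", "name"]):
        try:
            io = proc.io_counters()
        except (psutil.AccessDenied, psutil.NoSuchProcess):
            continue  # skip processes we cannot inspect or that already exited
        name = proc.info["name"] or "?"
        snapshot[proc.info["pid"]] = (name, io.read_bytes + io.write_bytes)
    return snapshot


def top_io(before: IoSnapshot, after: IoSnapshot, count: int = 5) -> List[Tuple[str, int, int]]:
    """Return (name, pid, bytes moved between the snapshots), busiest first."""
    deltas = []
    for pid, (name, total) in after.items():
        if pid in before:  # ignore processes that started after the first snapshot
            deltas.append((name, pid, total - before[pid][1]))
    deltas.sort(key=lambda row: row[2], reverse=True)
    return deltas[:count]
```

To use it, call `take_snapshot()`, wait a fixed interval (say 10 seconds), call it again, and pass both results to `top_io()`. Scheduled during the backup window, this should show whether the RMAN or export processes dominate.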

>> Is either of those two disks the one experiencing high I/O?

It is in the original message:  "R:"
slightwv:

>>Is either of those two disks the one experiencing high I/O?

I would say yes to that.  The R drive is where the backups and deletion of backups happen.  
I remember that that disk, the R one, was increased in size at some point.  I wonder if that is the reason why Oracle shows it as "2 R" in the alert.

I will try some of those utilities you suggested.

Thanks.
"2 R" is actually how Oracle labels the disk.  Disks are numbered starting with 0, and in that order the R drive is disk 2.  This is why it shows as "2 R" in the alert.

Also, I think we have found what is causing the problem.  It is as the two of you stated.  I have automated full backups via an RMAN script using Task Scheduler.  It is scheduled to start at 9:01 PM every night.  It is now 11:40 PM, and it is still running.  I believe this is what is causing the alert.
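
The timing argument above can be checked with a few lines of Python. This is only a sketch: the helper name `in_window` is made up, and it assumes the backup runs for a similar length of time every night, since the 10:38:09 PM alert and the 11:40 PM observation come from different nights.

```python
from datetime import time


def in_window(moment: time, start: time, end: time) -> bool:
    """True if moment lies within [start, end]; handles windows that cross midnight."""
    if start <= end:
        return start <= moment <= end
    return moment >= start or moment <= end


backup_start = time(21, 1)      # scheduled RMAN full backup start
backup_seen_at = time(23, 40)   # backup observed still running at this time
alert_time = time(22, 38, 9)    # time of day from the alert's Timestamp field

alert_during_backup = in_window(alert_time, backup_start, backup_seen_at)
```

Here `alert_during_backup` is `True`, which is consistent with the full backup being the source of the load on R:.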

As soon as it ends, I will disable it.  I wonder why it takes so long to complete.  I guess I will have to start thinking about incremental backups to soften the load.
ASKER CERTIFIED SOLUTION
I did a manual copy of a 13.4MB file from the disk where the data files are to the backup disk. It took about 00:01 of a second to copy.  I did a manual copy, however.  I did not use Task Scheduler.

I also checked, and this has been happening since the 10th of this month.  Before that, backing up the database took no more than 19 minutes.

I disabled the process, and I am expecting no alert tonight.

I will look into your suggestion on Block Change Tracking and incremental backups.

Thanks.
>>It took about 00:01 of a second to copy.  I did a manual copy, however

I guess that file was too small.  What I wanted you to take from that exercise was a rough estimate of the raw disk time needed to perform the backup.

For example:
If the disks can copy 1 GB in 1 minute, then a 100 GB database would take at least 100 minutes.
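
As a rough sketch of that arithmetic (the function names are made up for illustration, and it assumes throughput stays constant for the whole copy, which real backups rarely manage):

```python
def throughput_gb_per_minute(copied_gb: float, elapsed_seconds: float) -> float:
    """Measured raw copy rate from a timed test copy."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return copied_gb / (elapsed_seconds / 60.0)


def estimate_backup_minutes(database_gb: float, gb_per_minute: float) -> float:
    """Lower bound on backup time: database size divided by raw disk throughput."""
    if gb_per_minute <= 0:
        raise ValueError("gb_per_minute must be positive")
    return database_gb / gb_per_minute
```

For example, `estimate_backup_minutes(100, 1)` gives `100.0`. To get a usable rate, time the copy of a file of a few GB rather than a few MB, so the measurement is not dominated by caching and startup overhead.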

>>this has been happening since the 10th of this month

Was the new 'export' process the only thing that changed?
Full backups were causing this problem.

Thanks to you both.
>>causing this problem.

I don't see 98% disk utilization during backups as a 'problem'.

It is only a problem if backups are done during a busy time and the busy disks are actually causing application issues.
Virus scanner?
Exclude the locations of the Oracle data files, redo logs, etc.