Disk Utilization on Oracle Server

Hello Experts:

I have received this kind of alert two days in a row from Oracle 11gR2 on Windows Server 2008 R2:


Target Name=oracleserver
Target Type=Host
Host=oracleserver
Metric=Disk Device Busy (%)
Metric Value=98.74
Disk Device=2 R:
Timestamp=Mar 11, 2015 10:38:09 PM EDT
Severity=Critical
Message=Disk Utilization for 2 R: is 98.74%, crossed warning (80) or critical (95) threshold.
Notification Rule Name=Host Availability and Critical States
Notification Rule Owner=SYSMAN
Notification Count=1


What can I do to resolve this?  How dangerous is this kind of critical error?

Thanks.
Willie
willie0-360 Asked:

Geert G (Oracle dba) Commented:
It's indicating that the disk is being accessed at a high rate.
This typically happens during backups.
Were backups running at that time?
slightwv (Netminder) Commented:
>>What can I do to resolve this?  

Find out what process is using a lot of disk IO.  Then figure out what it is doing.

>>How dangerous is this kind of critical error?

Once in a while, it is not critical.  If the disks are almost constantly at 98% utilization, there might be issues:  faster hardware failure, overall system performance problems, etc.
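One way to tell an occasional spike from a sustained load is to look at how often the busy percentage crosses the critical threshold over a window of readings. Below is a minimal Python sketch of that idea. The sample readings are made up for illustration, and the 95% default simply mirrors the critical threshold in the alert above.

```python
def busy_fraction(samples, threshold=95.0):
    """Return the fraction of busy-% samples at or above the threshold."""
    if not samples:
        return 0.0
    hits = sum(1 for s in samples if s >= threshold)
    return hits / len(samples)


# Hypothetical "Disk Device Busy (%)" readings taken at regular intervals.
spike = [12.0, 9.5, 98.7, 15.2, 11.0, 8.4]        # one brief peak
sustained = [97.1, 98.7, 96.4, 99.0, 98.2, 40.3]  # busy almost all the time

print(f"spike:     {busy_fraction(spike):.0%} of samples critical")
print(f"sustained: {busy_fraction(sustained):.0%} of samples critical")
```

A low fraction suggests a short burst such as a backup window. A high fraction over many hours is the case worth investigating.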
willie0-360 (Author) Commented:
@Geert Gruwez:

Yes, backups and exports are running around the same time as the alert.  Backups start at 9:01 PM and exports start at 10:01 PM.  Also, an automated RMAN script deletes backups at 10:01 PM.  I changed the schedule so that deletes run at 1:01 AM and full exports run at 5:01 AM.

This kind of alert started a couple of days after automation of full exports was put in place.  

All of the tasks above are automated scripts that I put in place via Task Scheduler.
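To sanity-check a schedule like this, one could lay out each job's time window and look for overlaps. The Python sketch below does that for the old and new start times mentioned above. The durations are assumptions for illustration only (160 minutes for the backup, 60 for the export, 15 for the delete). Real run times would have to come from the job logs.

```python
from datetime import datetime, timedelta


def window(day, hhmm, minutes):
    """Build a (start, end) pair from a date, a start time and a duration."""
    start = datetime.strptime(f"{day} {hhmm}", "%Y-%m-%d %H:%M")
    return start, start + timedelta(minutes=minutes)


def overlaps(jobs):
    """Return pairs of job names whose [start, end) windows intersect."""
    names = sorted(jobs)
    clashes = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            a_start, a_end = jobs[a]
            b_start, b_end = jobs[b]
            if a_start < b_end and b_start < a_end:
                clashes.append((a, b))
    return clashes


# Durations below are assumed, not measured.
before = {
    "backup": window("2015-03-11", "21:01", 160),
    "export": window("2015-03-11", "22:01", 60),
    "delete": window("2015-03-11", "22:01", 15),
}
after = {
    "backup": window("2015-03-11", "21:01", 160),
    "delete": window("2015-03-12", "01:01", 15),
    "export": window("2015-03-12", "05:01", 60),
}

print("before:", overlaps(before))
print("after: ", overlaps(after))
```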

@slightwv:

If you can, please tell me how I can go about finding out what process is using a lot of disk I/O?  Is there a Windows utility for this?  This is running on Windows Server 2008 R2.

To anyone:

What disk is affected by this?  The server has a dedicated disk for backups and another one, the Data Pump directory, where the exports go.  Is either one of those two disks the one experiencing high I/O?

Thanks.
slightwv (Netminder) Commented:
Windows has several utilities that can do this.  Task Manager for one.  Sysinternals also has several.  I think Process Monitor can do it.

I'm just not sure which ones can be run in more of a batch mode.  Check with your System Administrator. They should have some idea.
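For a batch-style check, one option on Windows is the built-in `typeperf` command, which can dump performance counters as CSV. The Python sketch below takes one sample of the per-process `IO Data Bytes/sec` counter and ranks processes by it. Treat it as a rough sketch: that counter covers all I/O a process performs (file, network and device), not only the R: drive, and the parsing assumes typeperf's usual CSV layout of one header row followed by one data row.

```python
import csv
import io
import subprocess

COUNTER = r"\Process(*)\IO Data Bytes/sec"


def sample_typeperf():
    """Take one counter sample on Windows and return typeperf's CSV text."""
    result = subprocess.run(
        ["typeperf", COUNTER, "-sc", "1"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def top_io(csv_text, n=5):
    """Rank processes by I/O rate from typeperf CSV text, highest first."""
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if r]
    for i, row in enumerate(rows[:-1]):
        if row[0].startswith("(PDH-CSV"):
            header, values = row, rows[i + 1]
            break
    else:
        return []  # no header row followed by a data row was found
    usage = []
    for name, value in zip(header[1:], values[1:]):
        try:
            rate = float(value)
        except ValueError:
            continue  # typeperf leaves some fields blank
        # Counter paths look like \\HOST\Process(name)\IO Data Bytes/sec
        proc = name[name.find("(") + 1:name.rfind(")")]
        if proc not in ("_Total", "Idle"):
            usage.append((proc, rate))
    return sorted(usage, key=lambda item: item[1], reverse=True)[:n]


# On the Windows server itself:
# print(top_io(sample_typeperf()))
```

Running it from Task Scheduler every few minutes during the backup window and appending the result to a log would show which process is busiest when the alert fires.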

>> Is either one of those two disks the one experiencing high I/O?

It is in the original message:  "R:"
willie0-360 (Author) Commented:
slightwv:

>>Is either one of those two disks the one experiencing high I/O?

I would say yes to that.  The R drive is where the backups and deletion of backups happen.  
I remember that that disk, the R one, was increased in size at some point.  I wonder if that is the reason why Oracle shows it as "2 R" in the alert.

I will try some of those utilities you suggested.

Thanks.
willie0-360 (Author) Commented:
"2 R" is actually the way Oracle identifies the disk.  Disks are numbered starting with 0, and in that order the R drive is disk 2.  This is why it shows as "2 R" in the alert.

Also, I think we have found what is causing the problem.  It is like the two of you stated.  I have automated full backups via an RMAN script using Task Scheduler.  This is scheduled to run at 9:01 PM every night.  It is 11:40 PM, and it is still running.  I believe this is what is causing the alert.  

As soon as it ends, I will disable it.  I wonder why it takes so long to complete.  I guess I will have to start thinking about incremental backups to soften the load.
slightwv (Netminder) Commented:
>> I wonder why it takes so long to complete

Many things.  Take a look at how much data is being written.  Take a 10 or 50 meg file and time a copy to the drives in question.  Then it is rough math to get an idea of the time it will take to move X gig to the disks.
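The timed-copy test could be scripted along these lines. This is only a sketch in Python. The paths in the usage comment are hypothetical, and with small files the operating system's write cache can make the copy look far faster than the disk really is, so a test file of several hundred megabytes or more gives a more honest number.

```python
import os
import shutil
import time


def copy_throughput_mb_s(src, dst_dir):
    """Copy src into dst_dir, time it, and return the observed rate in MB/s."""
    dst = os.path.join(dst_dir, os.path.basename(src) + ".timing")
    size_mb = os.path.getsize(src) / (1024 * 1024)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    os.remove(dst)  # clean up the temporary copy
    return size_mb / elapsed if elapsed > 0 else float("inf")


# Hypothetical paths; substitute a real large file and the backup drive:
# rate = copy_throughput_mb_s(r"D:\oradata\sample.dbf", "R:\\backup")
# print(f"{rate:.1f} MB/s")
```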

>> I guess I will have to start thinking about incremental backups to soften the load.

Also take a look at Block Change Tracking.  It will speed up incrementals.

I also do what is called Incrementally Updating backups:
http://docs.oracle.com/cd/E11882_01/backup.112/e10642/rcmbckba.htm#BRADV8186

It takes the incremental and 'merges' it into the last lvl0 (full) so I don't have to worry about a LOT of incrementals.
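As a rough illustration, the Python sketch below just builds the two pieces of text involved: the SQL statement that enables block change tracking, and an RMAN command file following the incrementally updated backup pattern from the Oracle documentation linked above. The tag name and the tracking file path are placeholders, not values from this system. On the first run no image copy exists yet, so the `BACKUP ... FOR RECOVER OF COPY` step creates the level 0 copy. Later runs roll the previous level 1 into that copy.

```python
def enable_bct_sql(tracking_file):
    """SQL to turn on block change tracking (run once, as SYSDBA)."""
    return (
        "ALTER DATABASE ENABLE BLOCK CHANGE TRACKING "
        f"USING FILE '{tracking_file}';"
    )


def incremental_merge_script(tag="incr_update"):
    """RMAN command file text for an incrementally updated backup."""
    return (
        "RUN {\n"
        f"  RECOVER COPY OF DATABASE WITH TAG '{tag}';\n"
        "  BACKUP INCREMENTAL LEVEL 1\n"
        f"    FOR RECOVER OF COPY WITH TAG '{tag}'\n"
        "    DATABASE;\n"
        "}\n"
    )


# Placeholder path and tag, for illustration only.
print(enable_bct_sql(r"D:\oradata\bct.chg"))
print(incremental_merge_script())
```

The generated RMAN text could be saved to a file and passed to RMAN from the existing Task Scheduler job, for example with `rman target / cmdfile=<file>`.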
willie0-360 (Author) Commented:
I did a manual copy of a 13.4MB file from the disk where the data files are to the backup disk. It took about 00:01 of a second to copy.  I did a manual copy, however.  I did not use Task Scheduler.

I was also checking that this has been happening since the 10th of this month.  Before that, backing up the database took no more than 19 minutes.

I disabled the process, and I am expecting no alert tonight.

I will look into your suggestion on Block Change Tracking and incremental backups.

Thanks.
slightwv (Netminder) Commented:
>>It took about 00:01 of a second to copy.  I did a manual copy, however

Guess that was too small of a file.  What I was wanting you to take from that exercise was a rough estimate on raw disk times to perform the backup.

For example:
If the disks can copy 1 gig in 1 minute, then at a minimum a 100 Gig database would take 100 minutes.
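The same rough math as a small Python helper, in case the throughput was measured in MB/s rather than gigs per minute. It only gives a lower bound based on raw disk speed, and it assumes 1 GB = 1024 MB.

```python
def gb_per_minute(mb_per_second):
    """Convert a measured rate in MB/s to GB/min (1 GB = 1024 MB)."""
    return mb_per_second * 60 / 1024


def min_backup_minutes(db_size_gb, rate_gb_per_min):
    """Lower bound on backup time from raw disk throughput alone."""
    if rate_gb_per_min <= 0:
        raise ValueError("throughput must be positive")
    return db_size_gb / rate_gb_per_min


# The example above: 100 GB at 1 GB per minute.
print(min_backup_minutes(100, 1))  # 100.0
```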

>>this has been happening since the 10th of this month

Was the new 'export' process the only thing that changed?
willie0-360 (Author) Commented:
Full backups were causing this problem.

Thanks to you both.
slightwv (Netminder) Commented:
>>causing this problem.

I don't see 98% disk utilization during backups as a 'problem'.

It is only a problem if backups are done during a busy time and the busy disks are actually causing application issues.
Geert G (Oracle dba) Commented:
Virus scanner?
Exclude the locations of the Oracle data files, redo logs, etc.