Solved

DriveReady SeekComplete Error

Posted on 2002-06-24
Last Modified: 2013-12-16
I have a server that gives me the following errors every day:

Jun 24 04:16:14 mail kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 24 04:16:14 mail kernel: hde: dma_intr: error=0x40 { UncorrectableError }, LBAsect=140597862, sector=140517472
Jun 24 04:16:14 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517472
Jun 24 04:16:19 mail kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 24 04:16:19 mail kernel: hde: dma_intr: error=0x40 { UncorrectableError }, LBAsect=140597869, sector=140517480
Jun 24 04:16:19 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517480
Jun 24 04:16:25 mail kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 24 04:16:25 mail kernel: hde: dma_intr: error=0x40 { UncorrectableError }, LBAsect=140597862, sector=140517472
Jun 24 04:16:25 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517472
Jun 24 04:16:30 mail kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 24 04:16:30 mail kernel: hde: dma_intr: error=0x40 { UncorrectableError }, LBAsect=140597862, sector=140517472
Jun 24 04:16:30 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517472
Jun 24 04:16:36 mail kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 24 04:16:36 mail kernel: hde: dma_intr: error=0x40 { UncorrectableError }, LBAsect=140597862, sector=140517472
Jun 24 04:16:36 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517472

This looks like a bad sector, but I can't have the server down long enough for an e2fsck -cc.
Can I use badblocks with a list or something to mark those sectors bad and be done with it?
And would I use the "sector" number or the "LBAsect" number from the log?

Question by:Scott Silva
8 Comments
 
LVL 1

Expert Comment

by:bryanjones
ID: 7106496
The reason for this is that your drive is going bad. You can repair the drive for a while with fsck -y /dev/hde
 
LVL 10

Author Comment

by:Scott Silva
ID: 7107814
As you can see, it is only 2 sectors. I just want to lock out these 2 sectors, but can't have the server down for the several hours that a fsck would take.
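
One way to confirm how many distinct sectors are involved is to count the sector numbers in the kernel's end_request lines. The following is a minimal Python sketch, not part of the original thread; the function name and the regular expression are illustrative assumptions, and the sample lines are copied from the log in the question.

```python
import re
from collections import Counter

# Matches lines such as:
#   end_request: I/O error, dev 21:05 (hde), sector 140517472
END_REQUEST = re.compile(r"end_request: I/O error, dev \S+ \(\w+\), sector (\d+)")


def count_failing_sectors(log_lines):
    """Return a Counter mapping each failing sector number to its error count."""
    counts = Counter()
    for line in log_lines:
        match = END_REQUEST.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# The five end_request lines from the log posted in the question.
sample_log = [
    "Jun 24 04:16:14 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517472",
    "Jun 24 04:16:19 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517480",
    "Jun 24 04:16:25 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517472",
    "Jun 24 04:16:30 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517472",
    "Jun 24 04:16:36 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517472",
]

sector_counts = count_failing_sectors(sample_log)
for sector, hits in sorted(sector_counts.items()):
    print(f"sector {sector}: {hits} error(s)")
```

Running the same function over several days of logs would show whether new sectors start appearing, which would suggest the damage is spreading.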
 
LVL 51

Expert Comment

by:ahoffmann
ID: 7108585
There is no reliable method without downtime (unless you're on RAID, or you can hot-plug a new disk).
 
LVL 1

Expert Comment

by:bryanjones
ID: 7109354
Even though it is only two sectors, the drive is still bad. I had the same issue myself.
 
LVL 40

Accepted Solution

by:
jlevie earned 100 total points
ID: 7110576
For what it's worth... It's been my experience that the appearance of one or more bad sectors on an IDE drive will shortly be followed by complete failure of the drive. I'd highly recommend a full backup of the file systems on that drive and replacement of the drive while it is still mostly working. And if this is a heavily used server, you'll need to do your backup in single user mode (if it is a system disk) or with users and user applications locked out (if it is a data drive) to get a sane and usable backup.

It might be inconvenient to have the system down while this occurs, but it'll be a lot more inconvenient to have the drive fail and not have a backup that you can restore from.

If this is a mission-critical server, then you should really look at a RAID configuration to protect yourself from a failed drive. At the least you could use a pair of drives and mirror with soft RAID, and preferably use a RAID controller in RAID 5 mode with a hot spare (4 drives total).
 
LVL 10

Author Comment

by:Scott Silva
ID: 7111115
I will have a new drive on order by close of business today.

Thank you.
 
LVL 40

Expert Comment

by:jlevie
ID: 7111221
Good move... If you don't have a means of backing up to other media or another system, you can connect the new drive to this box and do a disk-to-disk transfer. The preferred method would be to use dump/restore, but there could be a problem when dump encounters the bad blocks.

What I've done in the past when faced with something like this is to do a read check on every file (e.g. cp /path-to/file /dev/null), deleting any file that has bad blocks. As long as none of the bad blocks on the disk are part of the directory structure, dump should not then have a problem.
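
The per-file read check described above could be automated along these lines. This is a minimal Python sketch under stated assumptions, not the method actually used in the thread: the function names are hypothetical, and it only reports unreadable files rather than deleting them. Note that a file already held in the page cache may be served from memory without touching the disk, so a clean result is not a guarantee.

```python
import os


def read_check(path, chunk_size=1024 * 1024):
    """Read the file at `path` to the end.

    Returns None if every byte could be read, or the error text otherwise.
    This mirrors `cp /path-to/file /dev/null`: the data is simply discarded.
    """
    try:
        with open(path, "rb") as handle:
            while handle.read(chunk_size):
                pass
    except OSError as exc:
        return str(exc)
    return None


def find_unreadable_files(root):
    """Walk `root` and return a list of (path, error) for files that fail to read."""
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only check regular files; skip symlinks, FIFOs, sockets and devices.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            error = read_check(path)
            if error is not None:
                failures.append((path, error))
    return failures


# Example use (the mount point is a placeholder):
#   for path, error in find_unreadable_files("/path-to"):
#       print(path, error)
```

Reporting instead of deleting leaves the decision about each damaged file to the administrator, which seems the safer default on a disk that is already failing.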

As a last resort, if the bad blocks interfere with dump, you could try using e2fsck to map out those blocks. Since I'm rather paranoid when futzing with a disk in that condition, I'd use other means (tar, cp, cpio) to replicate everything that I can onto the new disk before running e2fsck. That minimizes any potential loss if e2fsck can't fix the drive and it winds up being unreadable.
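
If it comes to mapping out blocks by hand, e2fsck's -l option takes a file of block numbers expressed in the filesystem's block size, not raw sector numbers. The sketch below shows the arithmetic only, under assumptions the thread does not confirm: that the "sector" value in the log is counted in 512-byte units from the start of the partition (dev 21:05 appears to be partition 5 of hde), that "LBAsect" is the absolute position on the whole disk, and that the filesystem uses 4096-byte blocks. The real block size should be read from the filesystem (for example with dumpe2fs -h) before using any of these numbers.

```python
SECTOR_SIZE = 512  # bytes per logged sector (assumption)


def sector_to_fs_block(sector, block_size):
    """Convert a partition-relative 512-byte sector number to a filesystem block number."""
    if block_size % SECTOR_SIZE != 0:
        raise ValueError("block size must be a multiple of the sector size")
    return sector * SECTOR_SIZE // block_size


# The two distinct "sector" values from the log in the question.
logged_sectors = [140517472, 140517480]

# Assumed 4096-byte blocks: eight sectors per block.
blocks_4k = sorted({sector_to_fs_block(s, 4096) for s in logged_sectors})
print(blocks_4k)  # [17564684, 17564685]

# With 1024-byte blocks the same sectors map to different block numbers.
blocks_1k = sorted({sector_to_fs_block(s, 1024) for s in logged_sectors})
print(blocks_1k)  # [70258736, 70258740]
```

Because the result changes with the block size, a list built with the wrong value would mark healthy blocks as bad and leave the damaged ones in use.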
 
LVL 10

Author Comment

by:Scott Silva
ID: 7111245
I'm just wondering why this error shows up at the same time every day.
It must be triggered by some cron job. More digging to do.
The server is out of town, and I can't get to it until Friday. Maybe I can make a link to a directory on another disk and move the data. I can do that remotely.


"The more I work with computers, the more I realize where the term 'Boot' came from"
