Solved

DriveReady SeekComplete Error

Posted on 2002-06-24
Last Modified: 2013-12-16
I have a server that gives me the following errors every day:

Jun 24 04:16:14 mail kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 24 04:16:14 mail kernel: hde: dma_intr: error=0x40 { UncorrectableError }, LBAsect=140597862, sector=140517472
Jun 24 04:16:14 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517472
Jun 24 04:16:19 mail kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 24 04:16:19 mail kernel: hde: dma_intr: error=0x40 { UncorrectableError }, LBAsect=140597869, sector=140517480
Jun 24 04:16:19 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517480
Jun 24 04:16:25 mail kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 24 04:16:25 mail kernel: hde: dma_intr: error=0x40 { UncorrectableError }, LBAsect=140597862, sector=140517472
Jun 24 04:16:25 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517472
Jun 24 04:16:30 mail kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 24 04:16:30 mail kernel: hde: dma_intr: error=0x40 { UncorrectableError }, LBAsect=140597862, sector=140517472
Jun 24 04:16:30 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517472
Jun 24 04:16:36 mail kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 24 04:16:36 mail kernel: hde: dma_intr: error=0x40 { UncorrectableError }, LBAsect=140597862, sector=140517472
Jun 24 04:16:36 mail kernel: end_request: I/O error, dev 21:05 (hde), sector 140517472

This looks like a bad sector, but I can't have the server down long enough for an e2fsck -cc.
Can I use badblocks with a list or something to mark those sectors bad and be done with it?
And would I use the actual sector number or the LBAsect number?
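
On the sector-number question, a rough sketch of the arithmetic may help. The bad-block list that e2fsck -l reads is expressed in file system blocks, not 512-byte sectors, so either number would need converting first. The function below assumes that the sector= value in the end_request line is counted from the start of the partition (dev 21:05), which is an assumption about this kernel's log format, and that the file system uses 4096-byte blocks, which is also an assumption (dumpe2fs -h reports the real block size). sector_to_fs_block is a hypothetical helper name.

```shell
# Convert a 512-byte sector offset (assumed relative to the start of the
# partition) into a file system block number.
sector_to_fs_block() {
    sector=$1
    block_size=$2                          # bytes per file system block
    sectors_per_block=$((block_size / 512))
    echo $((sector / sectors_per_block))
}

# Example with a value from the log and an ASSUMED 4096-byte block size:
#   sector_to_fs_block 140517472 4096     # prints 17564684
```

Under those assumptions the two logged sector values fall in two adjacent blocks. The result is only as good as the assumptions, so it is worth checking the block size and partition layout before putting any number into a bad-block list.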

Question by:Scott Silva
8 Comments
 
LVL 1

Expert Comment

by:bryanjones
ID: 7106496
The reason for this is that your drive is going bad. You can repair the drive for a while with fsck -y /dev/hde.
 
LVL 10

Author Comment

by:Scott Silva
ID: 7107814
As you can see, it is only 2 sectors. I just want to lock out these 2 sectors, but can't have the server down for the several hours that a fsck would take.
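
A quick way to confirm how many distinct sectors are involved is to pull the numbers out of the kernel messages. This is a minimal sketch: distinct_field is a hypothetical helper, and /var/log/messages in the usage comment is an assumed log location.

```shell
# Print each distinct numeric value of a key=number field (for example
# LBAsect or sector) found in kernel log lines read from stdin.
distinct_field() {
    sed -n "s/.*[ ,]$1=\([0-9][0-9]*\).*/\1/p" | sort -un
}

# Usage (log path is an assumption):
#   grep 'hde: dma_intr: error' /var/log/messages | distinct_field LBAsect
```

Run against the lines quoted in the question, this yields two LBAsect values and two sector values.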
 
LVL 51

Expert Comment

by:ahoffmann
ID: 7108585
There is no reliable method without downtime (unless you're on RAID, or you can hot-plug a new disk).
 
LVL 1

Expert Comment

by:bryanjones
ID: 7109354
Even though it is only two sectors, the drive is still bad. I had the same issue.
 
LVL 40

Accepted Solution

by:
jlevie earned 100 total points
ID: 7110576
For what it's worth... It's been my experience that the appearance of one or more bad sectors on an IDE drive will shortly be followed by complete failure of the drive. I'd highly recommend a full backup of the file systems on that drive and replacement of the drive while it is still mostly working. And if this is a heavily used server, you'll need to do your backup in single-user mode (if it is a system disk) or with users and user applications locked out (if it is a data drive) to get a sane and usable backup.

It might be inconvenient to have the system down while this occurs, but it'll be a lot more inconvenient to have the drive fail and not have a backup that you can restore from.

If this is a mission-critical server, then you should really look at a RAID configuration to protect yourself from a failed drive. At the least you could use a pair of drives and mirror with soft RAID, and preferably use a RAID controller in RAID 5 mode with a hot spare (4 drives total).
 
LVL 10

Author Comment

by:Scott Silva
ID: 7111115
I will have a new drive on order by close of business today.

Thank you.
 
LVL 40

Expert Comment

by:jlevie
ID: 7111221
Good move... If you don't have a means of backing up to other media or another system, you can connect the new drive to this box and do a disk-to-disk transfer. The preferred method would be to use dump/restore, but there could be a problem when dump encounters the bad blocks.

What I've done in the past when faced with something like this is to do a read check on every file (e.g. cp /path-to/file /dev/null), deleting any file that has bad blocks. As long as any bad blocks on the disk aren't a part of the directory structure, dump should not then have a problem.
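
The read check can be run over a whole tree in one pass. The sketch below is one way to do it, not necessarily how it was done originally: read_check is a hypothetical name, and it uses cat rather than cp to force a full read of each file. It only lists the files that fail; it deletes nothing.

```shell
# List regular files under a directory that cannot be read end to end.
# Read-only: nothing is modified or deleted.
read_check() {
    dir=$1
    # -xdev keeps find on the file system that holds "$dir"
    find "$dir" -xdev -type f -exec sh -c '
        for f in "$@"; do
            cat -- "$f" > /dev/null 2>&1 || printf "%s\n" "$f"
        done
    ' sh {} +
}

# Usage (the mount point is a placeholder):
#   read_check /path-to > /tmp/unreadable-files.txt
```

Reviewing the resulting list before deleting anything leaves room to restore those files from an older backup instead.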

As a last resort, if the bad blocks interfere with dump, you could try using e2fsck to map out those blocks. Since I'm rather paranoid when futzing with a disk in that condition, I'd use other means (tar, cp, cpio) to replicate everything that I can onto the new disk before running e2fsck. That minimizes any potential loss if e2fsck can't fix the drive and it winds up being unreadable.
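
For the tar route, the usual pattern is a pair of tar processes joined by a pipe. This is a minimal sketch under the assumption that the new disk is already partitioned, formatted, and mounted; copy_tree and the paths in the usage comment are hypothetical.

```shell
# Copy everything under $1 into the existing directory $2,
# preserving permissions (tar x with -p).
copy_tree() {
    src=$1
    dst=$2
    (cd "$src" && tar cf - .) | (cd "$dst" && tar xpf -)
}

# Usage (both paths are placeholders):
#   copy_tree /path-to /mnt/newdisk
```

Files that sit on the bad blocks should show up as read errors from the first tar, so its messages are worth keeping.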
 
LVL 10

Author Comment

by:Scott Silva
ID: 7111245
I'm just wondering why this error shows up at the same time every day.
It must be triggered by some cron job. More digging to do.
The server is out of town, and I can't get to it until Friday. Maybe I can make a link to another directory on another disk, and move the data. I can do that remotely.
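
Since the log entries are stamped 04:16, one place to start digging is any crontab entry scheduled for the 4 o'clock hour. The filter below is a rough sketch: cron_jobs_at_hour is a hypothetical helper, /etc/crontab in the usage comment is an assumed location, and it only understands a plain number or * in the hour field (lists and ranges such as 2-5 are ignored).

```shell
# Print crontab lines (read from stdin) whose hour field is the given
# hour or "*". Comments, blank lines and VAR=value lines are skipped.
cron_jobs_at_hour() {
    awk -v h="$1" '
        /^[ \t]*(#|$)/      { next }   # comments and blank lines
        $1 ~ /^[A-Za-z_]+=/ { next }   # environment settings
        NF >= 6 && ($2 == "*" || ($2 ~ /^[0-9]+$/ && $2 + 0 == h + 0)) { print }
    '
}

# Usage (file location is an assumption):
#   cron_jobs_at_hour 4 < /etc/crontab
```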


"The more I work with computers, the more I realize where the term 'Boot' came from"