Solved

Failed redundancy, drive or controller?

Posted on 2013-01-13
Last Modified: 2013-12-29
Sorry for the long post; I figure too much info is better than too little.

Windows Server 2003, software mirroring, 2 SATA drives, Sun Fire X2100

History
Drive 0 gave rare, intermittent errors for over a year and finally failed with a BSOD.
Its boot sector was still good, so we could still boot from it and run from the mirrored drive.
------------
Position 0                      Position 1
Old HDD, BSOD                   Old Good drive, failed mirror
Good boot sector                No boot sector



Bought a new drive, moved the old mirror drive to the 1st position, and made the new drive the mirror
(still booting off floppy for 2 weeks)
-------------
Position 0                      Position 1
Old Good Drive                  New Drive, good mirror
No boot sector                  No boot sector


Fixed boot sector on new drive and made it Drive 0 and ran like that for a week, without a working mirror
---------------------
Position 0                      Position 1
New Drive                       Old Good Drive, no mirror
Good boot sector                No boot sector


Finally fixed boot sector on old working drive, made it Drive 1 and reestablished the mirror.
------------------
Position 0                      Position 1
New Drive                       Old Good drive, good mirror for one day
Good boot sector                Good boot sector

Within 24 hours of syncing to 100%, the new drive started reporting errors until it detached itself

Now when I boot the system, I have to boot from the old working drive on Drive 1, and Disk Manager shows failed redundancy on both drives and an exclamation mark on Drive 0. When I right-click the dynamic volume, it says that the drive status is active and working.
----------------
Position 0                      Position 1
New drive, BSOD                 Old good drive, failed mirror
Good boot sector                Good boot sector


Not sure what is failing here. Originally, when the first drive failed, we were booting from the boot sector of the failed drive but running from the mirror. That worked for a couple of months until we got the new drive, so it sounds like the 2nd controller and the 2nd drive are working.
When the old 2nd drive ran in the 1st position for two weeks with the new drive in the 2nd position, there were no errors, so it sounds like 2 good controllers and 2 good HDDs.
So now that I have a new drive that won't boot, which is suspect: the controller or the new drive?
Question by:shadowz85
LVL 47

Accepted Solution

by:
David earned 2000 total points
ID: 38773161
No way to eliminate anything, because it is entirely possible that you have data corruption that crept in when you weren't running in a mirrored state after first HDD failed.

Or you could have had corruption on the surviving disk before the first drive failed.  

But if it takes 24 hours to sync, it is clear that the source disk is having a lot of recoverable and potentially unrecoverable errors. So odds are that the surviving mirrored disk has encountered unrecovered errors and already had filesystem damage before the sync.

By any chance are these cheap desktop consumer drives? If so, they are unacceptable. You need enterprise-class drives because they have more ECC bits and so provide roughly 10X more reliable data.
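
As a rough illustration of what a 10X difference in reliability means, the sketch below estimates the chance of hitting at least one unrecoverable read error (URE) while reading a whole 250 GB drive once, as a mirror resync must do. The URE rates (1 per 10^14 bits for a consumer drive, 1 per 10^15 bits for an enterprise drive) are assumed typical datasheet figures, not values taken from the drives in this thread, and the model assumes independent bit errors.

```python
import math

def p_at_least_one_ure(capacity_bytes: float, ure_per_bit: float) -> float:
    """Chance of >= 1 unrecoverable read error in one full-drive read.

    Uses a Poisson approximation and assumes independent bit errors.
    """
    bits_read = capacity_bytes * 8
    expected_errors = bits_read * ure_per_bit
    return 1.0 - math.exp(-expected_errors)

CAPACITY_BYTES = 250e9   # 250 GB, the drive size discussed in this thread
CONSUMER_URE = 1e-14     # ASSUMPTION: typical consumer datasheet rating
ENTERPRISE_URE = 1e-15   # ASSUMPTION: typical enterprise datasheet rating

p_consumer = p_at_least_one_ure(CAPACITY_BYTES, CONSUMER_URE)
p_enterprise = p_at_least_one_ure(CAPACITY_BYTES, ENTERPRISE_URE)
ratio = p_consumer / p_enterprise

print(f"consumer:   {p_consumer:.4f}")
print(f"enterprise: {p_enterprise:.4f}")
print(f"ratio:      {ratio:.1f}x")
```

Under these assumptions a single full read of a healthy drive is unlikely to hit a URE either way, so a drive that throws frequent read errors is well outside its rated behavior.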

Where do you go from here? Get another machine with a known good motherboard, controller, RAM, etc., and run diagnostics.
 

Author Comment

by:shadowz85
ID: 38773365
I don't believe they are cheap consumer drives. The replacement drive was almost $300 for a 250 GB drive. At the time I placed the order, I wasn't sure if I was dealing with a hardware mirror, and I know they like the drives to be identical, so I used the part no. for the original drive.
I believe that once you boot off the mirror drive, Windows is no longer mirroring. Can you confirm that? The drives aren't even trying to sync and I noticed that the very first time I was booting from the 2nd drive instead of the 1st one.
 
LVL 47

Expert Comment

by:David
ID: 38773445
No. Windows host-based RAID on W2K3 mirrors all writes and load-balances reads starting very early in the boot process. It then syncs up anything that might have changed during the few seconds of booting before the mirroring code kicked in.

The 24 hours is a classic indication of read errors on one of the drives, but that is an issue independent of any munged-up file system you may have. Decent diagnostics will confirm the health of the drives and give you an idea of how many read errors they have had.
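
To put a 24-hour figure in perspective, here is a back-of-the-envelope sketch. The 250 GB capacity comes from this thread; the 50 MB/s sustained throughput for a healthy drive is an assumed round number, not a measurement of these disks.

```python
def resync_hours(capacity_bytes: float, throughput_bytes_per_s: float) -> float:
    """Hours needed to copy the whole drive at a steady throughput."""
    return capacity_bytes / throughput_bytes_per_s / 3600

def effective_mb_per_s(capacity_bytes: float, hours: float) -> float:
    """Average throughput (MB/s) implied by a full copy taking `hours`."""
    return capacity_bytes / (hours * 3600) / 1e6

CAPACITY_BYTES = 250e9        # 250 GB drive from this thread
ASSUMED_HEALTHY_BPS = 50e6    # ASSUMPTION: ~50 MB/s sustained, healthy drive

healthy_hours = resync_hours(CAPACITY_BYTES, ASSUMED_HEALTHY_BPS)
implied_mb_s = effective_mb_per_s(CAPACITY_BYTES, 24)

print(f"healthy resync estimate: {healthy_hours:.2f} h")
print(f"throughput implied by a 24 h resync: {implied_mb_s:.2f} MB/s")
```

If a resync really does take about a day, the implied average throughput is a small fraction of the assumed healthy rate, which fits a drive spending most of its time retrying reads.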

What is make / model of disk?  Just because you paid $300 doesn't mean you got a $300 disk. They aren't making any models of disk drives today that they were making several years ago, so you MUST have gotten an old drive that has been sitting on the shelf degrading.  Disk drives don't have shelf lives like one would think.  They are somewhat like old car batteries.

Anyway, that "new" disk drive is not a new disk. It is an old disk, and I doubt it has any factory warranty remaining.  It could very well be one of the problems you have besides unrecoverable read errors on the other disk, and a slightly munged file system.
 

Author Comment

by:shadowz85
ID: 38776957
Do you have any particular diagnostic tool that you prefer? The drive is either Hitachi or Seagate.
 
LVL 47

Expert Comment

by:David
ID: 38776962
Both Seagate and Hitachi have freebies designed specifically for their disk drives. Just go to their websites.
 

Author Comment

by:shadowz85
ID: 38778509
Disk is a Sun disk. Hitachi HDS722525VLSA80 (250GB - 7200 RPM - SATA Disk)
 
LVL 47

Expert Comment

by:David
ID: 38778537
Go to hds.com and look for the disk diagnostics.
Question has a verified solution.
