Ted Williamson asked:

Dead Dell SAS SCSI Adapter - Reinstall Question

Hello.

I donated a server to a nonprofit 7 years ago. It ran very well until yesterday, when it failed to boot with a "PCIe Training Error:  Slot 1  System Halted" message.

After trial and error and research, I determined that the UCS-51 5/iR RAID adapter was dead.

I was able to find one on PC Liquidators. I'm going to install it later today, but I'm curious whether I will have to rebuild the array (a 2-drive mirrored system). I've mainly worked with SCSI drive arrays in the past.

Will the mirrored configuration be lost when I install the new replacement adapter?

Thanks!

-Ted
ASKER CERTIFIED SOLUTION
Jernej Navotnik
David
The controller will "learn" the configuration from the disk drives, so all your data and RAID sets will still be there.
Check this article on EE -


2010-07-27 at 15:11:11, ID: 2635973
Ted Williamson (ASKER)

Thanks!
As an update, I was able to get the system running after setting the array as active. It is now in a syncing mode. However, the server services and many of the workstation services are not running. Most of the errors appear when I try to start Event Viewer or any other server apps. "The workstation driver is not installed" is the most common.

I cannot start any of the services from the command prompt either.

It's hard to tell if this is from corruption or simply the drive array resyncing. I've never seen this before. Typically the services start but run more slowly during a resync. Naturally this is a single-server environment with one domain controller. The backup has not been running for a while.

It's odd that the data would be corrupted and the controller would go bad at the same time. Perhaps the two are related. The bad controller could be the cause of the corruption on the drives, but it's unlikely that both drives would be corrupted.

Also, in disk management, I see both of the drives.  In a true RAID mirror, I shouldn't.  

I was initially called in to look at the server when the email went down. The server was working but I wasn't able to log in. Once I shut it down and restarted it, that's when I noticed the errors on the drive controller. Eventually the drive controller quit altogether. Now that the new drive controller is working, I see that there may be some corruption damage on the drives.

Is there something I'm missing? Should I prepare for the worst?

Thanks.
Another addendum: the server actually booted from the E drive, not the C drive. I am attempting to synchronize the array from the LSI Config utility now.
Hey again!

Bad day, I see... It would be nice to have a backup. :)

A bad controller can cause data corruption; it's actually quite likely. No worries here, the drives are probably OK. They are a bit old, but that's a different topic.

First, let's wait for the sync to finish. Do you have a RAID manager or similar software to see how much syncing is left, and how the drives are doing?
We need to set the basic things back to normal first.

Second, when that is done, try to reboot and see how many disks you get. Yes, you're correct, there should be only one.

Third, "The workstation driver is not installed" is probably not the worst thing that could happen. If you would like to try something before the reboot (this applies if it is Server 2003; newer versions have this fixed):
  net stop webclient
  net stop mrxdav
  net start mrxdav
  net start webclient

This might sound strange, but have you checked the time? Is it set correctly? If not, check the CMOS battery also. Time is quite important in a server environment.

Last, try this from an elevated command prompt: sfc /scannow

Let me know. I'll be on vacation for the next 4 days, but I will be online for a few minutes a day. Maybe someone else will share his/her thoughts also.


Best regards, Jernej
Thanks, Jernej. It's at 4% and I'm going to go home for a while and come back in a few hours. Thanks for your advice.
The sync worked.  There is only one boot drive visible, but the workstation services and several other services aren't working.  I tried to do these commands:

net stop webclient
net stop mrxdav
net start mrxdav
net start webclient

But the webclient wouldn't start due to a failed dependency group.
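When a start fails on a dependency, it can help to see the current state of each service in the chain before retrying. The following is only a rough sketch, assuming Python happens to be installed on the server (nothing in this thread says it is). It wraps the standard `sc query <name>` command and pulls out the STATE line. The names `mrxdav` and `webclient` come from the commands above; `lanmanworkstation` is an assumption, being the usual internal name of the Workstation service. The helper names `parse_state` and `query_state` are made up for illustration.

```python
import re
import subprocess

# "mrxdav" and "webclient" are from the thread; "lanmanworkstation" is an
# assumed addition (the usual internal name of the Workstation service).
SERVICES = ["mrxdav", "webclient", "lanmanworkstation"]


def parse_state(sc_output):
    """Return the state word (e.g. 'RUNNING') from `sc query` text, or None."""
    match = re.search(r"^\s*STATE\s*:\s*\d+\s+(\w+)", sc_output, re.MULTILINE)
    return match.group(1) if match else None


def query_state(name):
    """Run `sc query <name>` and return the parsed state, or None on failure."""
    try:
        proc = subprocess.Popen(
            ["sc", "query", name],
            stdout=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        # `sc` is not available (for example, not running on Windows).
        return None
    out, _ = proc.communicate()
    return parse_state(out)


if __name__ == "__main__":
    for name in SERVICES:
        print(name, query_state(name) or "UNKNOWN (not found or query failed)")
```

Anything that does not report RUNNING is a candidate for the failed dependency. The same information is available by typing `sc query <name>` directly at the command prompt, which needs no extra software.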

I was able to do a backup to an attached USB drive.

Tomorrow morning I'll attempt to run the SP2 ISO and see if that works.
If you are seeing separate drives for the mirror in Computer Management, then is the RAID being done in software?

On a Dell server with the PERC controller doing the RAID, you would typically see a single drive called something like "dell virtual drive array".
I was able to get everything working. However, each time the server restarts, the drive array becomes unsynchronized. Since the controller card is new, is it possible that there's a system failure on the motherboard? Could the system battery be dead? The clock is accurate, so that might not be the case.
Are you still showing two drives in computer management?

Can you post a screenshot so we can check the config is correct?
There is clearly something amiss here.
When I used the utility to resync, it worked fine and there was just one drive. However, each time it reboots, it sets the array back to resyncing. That's what leads me to believe there's something wrong with the off-line power that keeps the memory intact for the array.
If it's now showing one drive, that makes sense.

The controller you got was presumably not brand new.
I think the PERC 5 has an onboard battery, so that may well be the problem.

but I also noted this

http://www.how2forge.info/dell-r200-pcie-training-error-embedded-io-bridge-device-1/

Your original question noted -> I donated a server to a non profit 7 years ago.  It ran very well until yesterday when it didn't boot due to a "PCIe Training Error:  Slot 1  System Halted"

Could this be your problem perhaps?
If you have it up and working, I would prioritise getting a full backup, as it may well be required.

I assume the RAID had fully synced before your reboots?
Otherwise the process effectively starts again.

According to a post, Dell suggests removing the RAID controller entirely, firing the server up without it, then re-inserting the card (carefully check the seating of the riser, if any).
Examine your new controller for "expanded" or seeping capacitors also.

After that they recommend mobo and RAID card replacement :-(
Sorry, just read my post. Obviously do shut the server off again before inserting the card!
The adapter does not appear to have a battery. The mobo battery is fully charged. The RAID card was just replaced. It now boots up with this error each time (even after running the RAID utility, which takes 5 hours each time to resync):

Integrated RAID exception detected:

Volume 00:000  Is currently in state RESYNCING.  See photo below for more details.  

Thanks.
Cannot see the photo, sorry.
Can you repost?