Solved

Errors with Server Hardware

Posted on 2011-02-18
Last Modified: 2012-05-11
We have two identical servers, Server A and Server B. Server A keeps giving me a blue screen for a driver error, along with the screen attached below. This happens at least once a week. Is it a RAM issue?

Server B had an issue with the hot-swap drive. The drive was taken out and replaced, and now we get the error "no boot filename received". See attached.

Needless to say...HELP!
ServerA.jpg
Question by:renniscom
 

Author Comment

by:renniscom
ID: 34930290
 
LVL 15

Expert Comment

by:Perarduaadastra
ID: 34930524
The first thing I notice is that the servers seem to be rather elderly, if the BIOS dates are anything to go by. The second is that the BIOS version is the first production release (P01) if the board follows the Intel convention of the time. Upgrading to the latest revision might be useful at some point in the proceedings, but be aware that there can be caveats when updating the BIOS over a large number of revisions; always check the release notes that Intel provide with each update in case, for example, version 7 has to be installed before you can update to version 11.
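As a rough illustration of the stepping-stone caveat, here is a minimal Python sketch. The `prerequisites` table and the version numbers are hypothetical, borrowed from the example above rather than from any Intel release notes; the real requirements have to come from the release notes for your board.

```python
def update_path(current, target, prerequisites):
    """Return the versions to install, in order, to get from current to target.

    prerequisites maps a version to the version that must already be
    installed before it can be applied (hypothetical data).
    """
    path = [target]
    # Walk backwards, prepending each required intermediate version
    # that is newer than what is currently installed.
    while path[0] in prerequisites and prerequisites[path[0]] > current:
        path.insert(0, prerequisites[path[0]])
    return path


# Hypothetical rule: version 7 must be installed before version 11.
print(update_path(1, 11, {11: 7}))
```

With that rule, a board still on version 1 would need version 7 first and then version 11, whereas a board already on version 8 could go straight to 11.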

The server A problem does look like faulty RAM, which the system has marked as such, and should be fixed before investigating the driver problem, as dodgy memory can compromise even the best behaved drivers. Is it always the same driver that is implicated, and if so which one is it, and what is the STOP error and heading?

The error on server B is a little simpler - the system can't find a boot device! The boot filename is needed for booting from the network rather than a local disk (or disk array) and by default is the last option in the boot device order. This means that it has tried to boot from the floppy drive, the optical drive, and the hard disk (or disk array) without success; when the last chance network saloon fails to provide boot files the system just gives up.
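The fall-through can be pictured with a small Python sketch. It is only a model of the behaviour described above, not of any real BIOS: the device names, the `BOOT_ORDER` list and the `try_boot` function are all made up for illustration.

```python
# Assumed default order, as described above: network boot comes last.
BOOT_ORDER = ["floppy", "optical", "hard disk", "network"]


def try_boot(boot_order, bootable):
    """Try each device in turn; return (device booted from, devices tried).

    bootable is the set of devices that currently hold usable boot files.
    The first element is None when every device fails.
    """
    tried = []
    for device in boot_order:
        tried.append(device)
        if device in bootable:
            return device, tried
    return None, tried


# Healthy server: the disk array is found before the network is ever tried.
print(try_boot(BOOT_ORDER, {"hard disk"}))

# Server B's situation: nothing is bootable, so the network is tried last
# and fails as well, which is when the boot filename message appears.
print(try_boot(BOOT_ORDER, set()))
```

The point is that the network error is a symptom: it only shows up because every earlier device, including the disk array, has already failed.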

If the drives are hot-swap, go into the RAID adapter's embedded configuration utility (accessible during the POST by pressing a key combination such as Ctrl-A or Ctrl-R; the exact combination will be displayed on screen) and find out what's going on with the drive(s) in the array.

I've just re-read the question, and I note that you refer to "the" hot-swap drive, implying singular; if there is only one drive, and you've replaced it with a new one, that would explain the boot failure, as a new disk is unformatted and contains no files or filesystem. If this is the case, it also means that references to arrays don't apply, as there isn't one.

I really need more information about your server setup, as I'm just guessing at possible scenarios.
 

Author Comment

by:renniscom
ID: 34931186
Server A - I will check the RAM, since I am noticing that the hard drive space has been vanishing.

Server B - Apparently it had a bad hot swap drive, it was replaced, and now we are getting this message where the server does not even boot up.

 
LVL 15

Expert Comment

by:Perarduaadastra
ID: 34932387
If you remove the newly replaced drive does the server boot up again?

What RAID controller are you using, and what RAID level? Are the drives SCSI or IDE?
 

Author Comment

by:renniscom
ID: 34933016
I'll have to try later today. If I am not mistaken, it's RAID 5.

Is there another way to boot it up?
 
LVL 15

Expert Comment

by:Perarduaadastra
ID: 34933089
If you're running RAID 5 then you will have a minimum of three drives in the array. If you only have two, then the array is either RAID 0 or RAID 1. If it's RAID 0 and a drive in it has failed then the array is permanently broken and any data on the remaining drive is useless because half the sectors containing data are on the dead drive. If it's RAID 1 then the array should rebuild by itself, though if the RAID controller is as old as the server then you may have to explicitly tell it to rebuild the array. Again, I need more information to be more specific in trying to help you.
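To make the drive-count reasoning concrete, here is a minimal Python sketch. It assumes the usual minimum drive counts for RAID 0, 1 and 5 and models RAID 0 as simple round-robin striping; the function names are hypothetical and this is not how the controller itself stores data.

```python
# Usual minimum number of drives for each RAID level discussed here.
MIN_DRIVES = {0: 2, 1: 2, 5: 3}


def possible_levels(drive_count):
    """RAID levels (of 0, 1 and 5) that the given number of drives can support."""
    return sorted(level for level, minimum in MIN_DRIVES.items()
                  if drive_count >= minimum)


def stripe(blocks, drive_count):
    """RAID 0 style layout: deal the blocks out to the drives round-robin."""
    drives = [[] for _ in range(drive_count)]
    for index, block in enumerate(blocks):
        drives[index % drive_count].append(block)
    return drives


print(possible_levels(2))  # two drives rule out RAID 5

drive_a, drive_b = stripe(list(range(8)), 2)
print(drive_a, drive_b)    # each drive holds every other block
```

With two drives each one ends up holding alternate blocks, so losing either drive leaves only half of every file, which is why a failed RAID 0 member cannot be recovered from the survivor alone.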

If the drives are SCSI, is the new drive ID set to the same as that of the failed disk? Are there any terminators that need to be transferred from the old drive to the new one? If any of these requirements apply and haven't been met then the array probably won't function at all. Are any error messages from the controller displayed during startup?
 

Author Comment

by:renniscom
ID: 34933220
I just rebooted it, and this is what I get:


IMG00196-20101127-1253--2-.jpg
 
LVL 15

Accepted Solution

by:Perarduaadastra (earned 500 total points)
ID: 34934006
Is this with or without the new drive installed?

You have an Intel SRCU42L integrated controller in your server, which is looking for a hotfix drive, as per section 4.3.3 of this document:

ftp://download.intel.com/support/motherboards/server/srcu42l/tps.pdf

It seems that the term "hotfix" means different things to different people... in any case, the controller isn't seeing the drive. This may be caused by the firmware version of the new drive being different from that of the others, which Intel has flagged up as a possible issue here:

http://www.intel.com/support/motherboards/server/sb/CS-006152.htm

As your controller firmware isn't the latest and nor, I suspect, are your drivers, see this page:

http://www.intel.com/support/motherboards/server/srcu42l/sb/cs-007030.htm

which lists the available firmware and drivers for the controller.

It's also possible that Seagate may have more recent firmware for the drive itself, but you would need to input the serial number of the drive here:

https://apps1.seagate.com/rms_af_srl_chk/

to discover if this is the case.

Hope this helps.

 
LVL 15

Expert Comment

by:Perarduaadastra
ID: 34935050
To amend one of my earlier comments, the SCSI ID of the new drive doesn't have to be the same as the failed one, but it does need to be different from those of the other drives in the array and that of the controller itself.
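The rule can be written down as a quick check. This is a minimal Python sketch; the function name is made up, and the controller ID of 7 in the example is only the conventional default for SCSI host adapters, so confirm the actual value in the controller's setup utility.

```python
def scsi_id_is_usable(new_id, other_drive_ids, controller_id):
    """True if new_id clashes with neither the controller nor another drive."""
    return new_id != controller_id and new_id not in other_drive_ids


# Assumed example: three surviving drives on IDs 0-2, controller on ID 7.
existing = {0, 1, 2}
print(scsi_id_is_usable(3, existing, controller_id=7))  # free ID
print(scsi_id_is_usable(1, existing, controller_id=7))  # clashes with a drive
print(scsi_id_is_usable(7, existing, controller_id=7))  # clashes with the controller
```

Note that the failed drive's old ID is normally free again once that drive has been removed, so reusing it remains the simplest choice.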

Going back to your question, was the drive that failed actually the hotfix spare?

The screenshot that you just posted shows four drives detected by the controller, which presumably are members of the array as they are all in the same LUN. It appears that the controller is configured to expect a hotfix spare to be present, and when one is not detected it doesn't bring up the host drive (the actual volume that the system sees) but reports a failure and waits to be told what to do via the Storage Console. The first link I posted to Intel's PDF on the controller has a lot of additional information; have a look at sections 4.3.4-6 as well and see if any of those scenarios are congruent with yours.

If the new drive is not detected, or prevents the array from working, then I would be looking at its firmware version to see if it differs from that of the other array member drives, and thinking about upgrading the firmware of the RAID controller.
 

Author Comment

by:renniscom
ID: 34939353
Your info has been very detailed and insightful and I appreciate it greatly!

I will be verifying the info you posted tomorrow.

Thank you again.
 

Author Comment

by:renniscom
ID: 34946982
I put the original drive back in, repaired the array, and rebooted. Updated the firmware, and we are back in business.

Thanks again!!
 
LVL 15

Expert Comment

by:Perarduaadastra
ID: 34947287
My pleasure. I'm glad that you're up and running again.