  • Status: Solved
  • Priority: Medium
  • Security: Public

BSOD - 0x000000F4

Hiya,

We recently installed a load of these servers, some running Windows Server 2003 R2 x86 and some running Windows Server 2008 x64.

The ones with 2003 installed have not yet experienced this problem.

The ones running 2008 (4 of them) appear to be OK - can get Windows installed, make them domain controllers, install DNS/DHCP etc. Install updates and patches, reboot. All hunky-dory.

However, after a few days uptime, when rebooting them they will blue screen, with error 0x000000F4. This has happened on all 4 of the DL140's running 2008, for no apparent reason.

Unfortunately for us these are our 4 domain controllers (2 sites, 2 per site) so it is absolutely critical we get this solved. We cannot 'downgrade' them to Windows 2003 as our domain is set to 2008 mode. This has been happening for nearly 3 months now and we just cannot identify the problem. When we think we have found the cause, we repeat the steps and reboot, but then it will be OK.

We have contacted MS and HP who are playing the blame game. Everything they have asked us to do hasn't fixed the problem. The only fix we have found is complete re-install!

Has anyone else experienced this, and more importantly, does anyone have any suggestions as to what we can do (other than send the servers back for alternate models, which will be a real hassle as we would have to say they're not fit for purpose)?

No safe modes work either, as they all produce the same BSOD!

Thanks very much
andyrobjohns
Asked:
 
JoWickermanCommented:
Hi andyrobjohns,

I guess you have looked at this:

To resolve this behavior, use one of the following methods:
  • For Parallel Advanced Technology Attachment (PATA) hard disks, configure your disk drive as master only. For Serial Advanced Technology Attachment (SATA) hard disks, connect the hard disk cable to a master channel SATA connector on the motherboard.
  • Connect another device as a master, such as another disk drive or a CD drive or DVD drive.
  • Change your PATA and SATA IDE cable even if the cable does not appear worn.
  • Install Windows on a new hard disk because it is possible that your hard disk or your Windows installation may be corrupted.

Cheers.
 
andyrobjohnsAuthor Commented:
These servers have SAS RAID controllers in them, with 2 SAS drives in a RAID 1 setup.

It would be very unlucky to have a corrupted installation of Windows more than 20 times (seriously, that is the number of times these 4 servers have been re-installed).

It just seems strange that this only happens with 2008 x64 AND the DL140 G3 servers.
 
JoWickermanCommented:
Oh yeah... Thought so...

And you had the patience to do those installations?

You have not tried the 32 bit edition?
 
exx1976Commented:
+1 try the x86 version.  If the x86 version stays stable, then I'd blame it on an HP driver (first thought is north/south bridge).  If you experience the same instabilities, then I'd blame the hardware itself (doubtful though, since it's 4 servers)..

FWIW, this is yet one more reason I don't like HPAQ..   :-(


HTH,
exx
 
andyrobjohnsAuthor Commented:
Trying the x86 version is the next (and last) thing to try. Are there any known issues running mixed x86/x64 domain controllers in the same domain though?

I also believe it is a driver issue, but proving it is proving (no pun intended!) difficult!

Thanks for replies though.
 
exx1976Commented:
There are no compatibility issues between x86 and x64 DCs that I am aware of.
 
andyrobjohnsAuthor Commented:
Well no luck with the x86 version of Windows 2008; same problem exists.

The annoying thing is it is so random. It crashed when rebooting after installing the BackupExec remote agent, but after re-installing Windows and pushing out the remote agent just now has not given the same result - it rebooted just fine!

Arrrggghhh!!!!
 
exx1976Commented:
You are STILL screwing with that?

If those were IBM boxes, no doubt they would have been replaced long ago.  I know my IBM rep would take care of me.  :-)

Best of luck, and please do post up what the final solution was.  I'm interested to hear.
 
andyrobjohnsAuthor Commented:
That's part of the problem; HP say it's a Microsoft fault and Microsoft say it's an HP fault.

I'm trying to prove who/what is at fault, then can take it further.
 
exx1976Commented:
I'd send the servers back and buy IBMs.

That will solve both issues.  The servers will work, and HP won't be able to give you a hard time about it anymore.
 
andyrobjohnsAuthor Commented:
Again, I need to prove the fault first, as I can't imagine anyone would refund us the cost of the servers because we simply say 'there's a problem with them'!

Us to purchase place: We want a refund on 9 servers as there's a problem with them.
Purchase place to us: What's the problem?
Us to purchase place: Dunno, they keep crashing.
Purchase place to us: Ring HP technical support.
Us to purchase place: Done that, they blame Microsoft software.
Purchase place to us: Ring Microsoft technical support.
Us to purchase place: Done that, they blame HP hardware.
Purchase place to us: Well we won't be able to resell them as new so you need to identify that the problem lies in the hardware.
Us to purchase place: That's just it. We cannot identify the problem. It's just so random.
Purchase place to us: Not our problem!

Aaaaarrrrrrggggggghhhhhhh!
 
exx1976Commented:
Sounds like you need to work with different vendors.

IBM and the VAR that supplies me would easily let me do that.

Good luck.  Sounds like a few too many corners were cut somewhere along the way in the procurement process..
 
andyrobjohnsAuthor Commented:
Apologies, it seems we didn't try with the x86 version when I said we did. Just been speaking to a colleague and he doesn't remember trying it.

So, on Monday I rebooted and it crashed again with the x64 version. I installed the x86 version this time. Rebooted it at least 20 times, installed all available updates, added roles ADDS, DHCP, DNS, file services (DFS Namespace only) etc, x86 version of the latest drivers, and not one crash so far, touch wood.

If this turns out to be the 'solution', who is to blame re x64 version? HP for crap drives, or Microsoft for WHQL'ing them, and who would be responsible for fixing it!
 
andyrobjohnsAuthor Commented:
That last sentence should have said 'crap drivers', not 'crap drives'!
 
exx1976Commented:
HP for crap drivers.  I have several x64 systems running here on various IBM equipment (x3650, x3850, HS20 blades), and no issues yet.

Glad you solved it (with a month-old suggestion - LOL).


-exx
 
andyrobjohnsAuthor Commented:
Yeah this got put on the back-burner due to other things building up..
 
andyrobjohnsAuthor Commented:
Also blaming HP will be difficult; I used the built-in Windows drivers as it had them for everything in the server. After the first crash, I downloaded updated ones from HP's website, which made no difference.
If the problem exists with built-in drivers, HP will pass the buck!
 
exx1976Commented:
Microsoft doesn't write those drivers.  HP supplies them to M$, and M$ chooses to include them with their build for "ease of distribution".  If HP makes a change to the hardware after providing those base drivers to M$ that causes system instabilities, there is nothing M$ can do about it.  What do you expect, that they are going to recall all install media and stamp new stuff because HP has a driver update?  That's what the interweb is for, to get driver updates.

If you are still afraid of HP passing the buck, then like I said, you need either a new hardware vendor, a new VAR, or both.  That is ridiculous.  The system doesn't do what you bought it to do.


-exx
 
andyrobjohnsAuthor Commented:
Well, think I have finally found the cause of this problem: Active Directory Domain Services!!!!
Install Windows Server 2008 x64, using latest drivers from HP's site. Install ADDS alone and promote to DC. Leave for a few days and reboot and it crashes.
Re-install Windows again, but this time install DHCP, DNS and DFS Namespace, without ADDS. Leave for a week and reboot and it comes back OK.
Wonder how Microsoft will deal with this, as it is now clearly their Operating System and its services....
 
exx1976Commented:
I find it INCREDIBLY difficult to believe that it's an ADDS issue that is squarely on Microsoft's shoulders to fix.  Do you think you are the only person in the world to run ADDS on an x64 machine?  Certainly this issue would have already been brought to light and a patch released by MS to remedy the situation.


My money is still on HP's drivers.
 
andyrobjohnsAuthor Commented:
Yeah I know it seems hard to believe it's ADDS. But how come they only crash when ADDS is installed? These servers run for ages with x64 2008 installed, as long as ADDS is not.
I have gone to the extent of using a completely different model server (DL320 G5 instead of DL140 G3), and the problem still exists. I am in the middle of using an oldish non-HP server, so will wait on the outcome of that.
 
andyrobjohnsAuthor Commented:
OK, final update...
I rebuilt the 320 G5 as a new domain controller in a new domain, with completely different network settings so it has no connection to the live domain. Added all the services, configured EXACTLY the same as DC's in the live domain, and it's been up for 10 days with 0 (zero) crashes.
The oldish Fujitsu server, which is a DC in the live domain, has crashed after 3 days uptime.
I think that pretty much rules out hardware and/or drivers 10 million percent!
Now to try to get MS to fix it........................... Is it possible that some form of AD corruption could be causing this? If so, how can that happen...
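
The elimination reasoning in this thread can be sketched as a small consistency check: list each test that was reported, then see which suspected cause predicts every outcome. This is only an illustrative sketch, not a diagnostic tool. The field names, the hypothesis labels, and the assumption that the ADDS-free DL140 G3 was still joined to the live domain are all hypothetical. The x86 run is left out because it was only rebooted over a short period, and the crashes reportedly need a few days of uptime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    hardware: str       # server model as named in the thread
    adds: bool          # Active Directory Domain Services installed
    live_domain: bool   # machine belongs to the live (production) domain
    crashed: bool       # BSOD 0xF4 seen on reboot after some days of uptime


# Outcomes as reported in the thread (see assumptions in the lead-in).
OBSERVATIONS = [
    Observation("DL140 G3", adds=True, live_domain=True, crashed=True),
    # Assumption: this ADDS-free build was still on the live domain.
    Observation("DL140 G3", adds=False, live_domain=True, crashed=False),
    Observation("DL320 G5", adds=True, live_domain=True, crashed=True),
    Observation("DL320 G5", adds=True, live_domain=False, crashed=False),
    Observation("Fujitsu", adds=True, live_domain=True, crashed=True),
]

# Each hypothesis predicts whether a given machine should crash.
HYPOTHESES = {
    "HP hardware or drivers": lambda o: o.hardware.startswith("DL"),
    "ADDS role alone": lambda o: o.adds,
    "DC in the live domain": lambda o: o.adds and o.live_domain,
}


def surviving_hypotheses():
    """Return the hypotheses whose prediction matches every observation."""
    return [
        name
        for name, predict in HYPOTHESES.items()
        if all(predict(o) == o.crashed for o in OBSERVATIONS)
    ]


if __name__ == "__main__":
    print(surviving_hypotheses())
```

With these five observations only the "DC in the live domain" hypothesis fits every result, which matches the suspicion that something specific to the live domain's AD is involved. It does not prove the cause; it only shows which explanations the tests have already ruled out.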
 
Computer101Commented:
PAQed with points refunded (500)

Computer101
EE Admin