Windows Server 2008 R2 Hyper-V cluster node bluescreen assistance - BAD_POOL_HEADER


I have a 4-node Server 2008 R2 failover cluster running Hyper-V servers.  Over the weekend we had a bluescreen out of nowhere; the cluster has been relatively solid up to now.  This is the first time I have seen this, and I was hoping I could be pointed in the right direction.  I did google and found a TechNet forum thread about something that seems to match the issue, but I wanted to get some other eyes on this.  This cluster is an important part of our data center, so I want to be extra careful before hotfixing a node and regretting it later.  Thank you for any assistance you can provide.

Hardware:  HP ProLiant BL480c G1

This is the forum thread I have found so far:


Below is the minidump info:

*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *

The pool is already corrupt at the time of the current request.
This may or may not be due to the caller.
The internal pool links must be walked to figure out a possible cause of
the problem, and then special pool applied to the suspect tags or the driver
verifier to a suspect driver.
Arg1: 0000000000000020, a pool block header size is corrupt.
Arg2: fffffa8023161000, The pool entry we were looking for within the page.
Arg3: fffffa8023161340, The next pool entry.
Arg4: 000000000c340000, (reserved)

Debugging Details:

BUGCHECK_STR:  0x19_20

POOL_ADDRESS:  fffffa8023161000

IRP_ADDRESS:  fffffa8023160fc8

LAST_CONTROL_TRANSFER:  from fffff800019ec6d3 to fffff800018b9880

fffff880`08e62648 fffff800`019ec6d3 : 00000000`00000019 00000000`00000020 fffffa80`23161000 fffffa80`23161340 : nt!KeBugCheckEx
fffff880`08e62650 fffff800`018d8cce : 00000000`a0000003 fffffa80`236bb230 00000000`20206f49 00000000`00000000 : nt!ExFreePoolWithTag+0x18b4
fffff880`08e62700 fffff800`018bc276 : fffffa80`23161040 00000000`00000000 00000000`00000001 fffff8a0`023dc810 : nt!IopCompleteRequest+0x5ce
fffff880`08e627d0 fffff800`01b4622a : fffffa80`1f3146a0 fffff800`019ed400 fffffa80`1e937dc0 00000000`00000000 : nt!IopfCompleteRequest+0x6f6
fffff880`08e628c0 fffff800`01bd08f7 : fffffa80`1f3146a0 fffff880`08e62ca0 fffff880`08e62ca0 fffffa80`236bb230 : nt!WmipIoControl+0xd6
fffff880`08e62a10 fffff800`01bd1156 : 00000000`ffffff01 00000000`000001b8 00000000`00000000 00000000`00000000 : nt!IopXxxControlFile+0x607
fffff880`08e62b40 fffff800`018b8ad3 : fffffa80`1bfc2d80 00000000`00000000 00000000`00000000 fffff800`018b5487 : nt!NtDeviceIoControlFile+0x56
fffff880`08e62bb0 00000000`77bdf72a : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13
00000000`00d8eee8 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x77bdf72a


PROCESS_OBJECT: fffffa801ce56060

FOLLOWUP_NAME:  MachineOwner


IMAGE_NAME:  WmiApSrv.exe


FAILURE_BUCKET_ID:  X64_0x19_20_IMAGE_WmiApSrv.exe

BUCKET_ID:  X64_0x19_20_IMAGE_WmiApSrv.exe

Followup: MachineOwner
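
The bugcheck text above suggests walking the pool links and applying special pool or Driver Verifier to a suspect driver. For reference, a rough sketch of how that might look; the pool and IRP addresses come straight from the dump above, while the driver names passed to verifier are placeholders for whatever third-party drivers you suspect:

```
kd> !analyze -v              (re-run the automated analysis on the dump)
kd> !pool fffffa8023161000   (walk the pool page around the corrupt block)
kd> !irp fffffa8023160fc8    (inspect the IRP that was being completed)
```

```
rem From an elevated prompt on the node (reboot required to take effect):
verifier /flags 0x1 /driver suspect1.sys suspect2.sys
rem 0x1 = Special Pool; check status and clear when done:
verifier /query
verifier /reset
```

Note that special pool and verifier add overhead, so on a production Hyper-V node you would want to do this during a maintenance window with the VMs migrated off.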
Philip Elder, Technical Architect - HA/Compute/Storage, commented:
Out of curiosity, how old is this setup?

Are all nodes running at the same Service Pack and patch level as well as driver updates?

Are all of the nodes and the chassis running the most current firmware?

Yours is the second one of these I've seen in a week, where none were seen prior to that. :S

TNCIT (Author) commented:
Hi Philip!

Thank you for your response.  The answers to your questions are as follows:

It's pretty old.  It used to be a 3-blade cluster with identical hardware.  A newer-generation blade was added as the 4th node later on.

I will admit, the firmware on the chassis and the drivers are all pretty old.  For something as critical as this cluster, we kind of have an 'if it's not broke, don't fix it' mentality.  This did cross my mind, though; I just didn't want to go chasing a white rabbit that might lead me down another hole entirely.  I wanted a little more evidence before using the shotgun-blast approach of updating firmware/drivers across the board with no direction.  If that makes sense.

We do have the nodes at the same service pack and update levels across the cluster.

The nodes do not run anything else except our Hyper-V cluster.
Philip Elder, Technical Architect - HA/Compute/Storage, commented:
Another option would be to evict the node, flatten and re-install it, and bring it back into the cluster.

That may be a bit less extreme, relatively speaking.
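
For what it's worth, the evict/rebuild cycle can be driven from the FailoverClusters PowerShell module on 2008 R2. A rough sketch, assuming the node is named NODE4 (a placeholder) and its VMs have already been Live Migrated elsewhere:

```powershell
Import-Module FailoverClusters

# Pause the node so nothing fails over to it while you work
Suspend-ClusterNode -Name NODE4

# Evict it from the cluster membership
Remove-ClusterNode -Name NODE4

# ...flatten and reinstall the OS, match SP/patch/driver levels, then rejoin:
Add-ClusterNode -Name NODE4
```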

TNCIT (Author) commented:
My apologies for leaving this open for so long; I was out of work for several weeks on a personal matter.

I think we are going to observe and see if this happens again.  It has been up for a month since this occurred with no other bluescreens.  We are slowly going to migrate VMs back and see what happens.
Philip Elder, Technical Architect - HA/Compute/Storage, commented:
Thank you for the points.

I've seen other mention of this. It may actually be a bad update. But, I am not sure which one yet...

