Solved

Server 2008 r2 minidump help

Posted on 2010-11-16
Last Modified: 2012-05-10
Hiya,

We've got a server 2008 R2 server running Symantec Backup Exec 11, the full console for all the endpoints.

We installed a Fibre Channel card and then installed Symantec Backup Exec to back everything up. A few weeks later we started getting bluescreens with IRQL_NOT_LESS_OR_EQUAL.

Attached is one of our minidumps
 minidump.txt
Question by:deepslalli
 
LVL 7

Expert Comment

by:tstritof
Hi,

this could be a conflict with Symantec antivirus software. If you have Symantec/Norton AV installed, remove it and check whether the problems reappear.

Regards,
Tomislav
 

Author Comment

by:deepslalli
Hiya,

Before I go ahead and do that, is there a way we can check the dumps to see if it's Symantec doing it?

Many thanks,
Shaun
 
LVL 27

Expert Comment

by:KenMcF
You can view the minidump with this utility:
http://nirsoft.net/utils/blue_screen_view.html

Also, since Symantec is installed, make sure your symevent.sys driver is up to date.
And update your video drivers.
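
If the dump has already been run through the debugger and saved as text, the fields that name the suspected module can also be pulled out with a few lines of Python. This is only a rough sketch: it assumes the text uses the `KEY:  value` layout that `!analyze -v` prints, and the function name, the list of fields and the sample text are made up for illustration.

```python
import re

# Keys worth a first look in `!analyze -v` text output.
# This selection is an assumption, not an official list.
FIELDS = ("BUGCHECK_STR", "PROCESS_NAME", "MODULE_NAME",
          "IMAGE_NAME", "FAILURE_BUCKET_ID")


def summarize_analysis(text):
    """Return the selected KEY: value pairs found in the report text."""
    summary = {}
    for line in text.splitlines():
        # Match lines such as "IMAGE_NAME:  ntkrnlmp.exe"; keys with an
        # empty value on the same line (e.g. "FAULTING_IP:") are skipped.
        match = re.match(r"^([A-Z_]+):\s*(\S.*)$", line.strip())
        if match and match.group(1) in FIELDS:
            summary[match.group(1)] = match.group(2).strip()
    return summary


# Illustrative sample in the debugger's output format.
SAMPLE = """\
BUGCHECK_STR:  0xA
PROCESS_NAME:  System
FAULTING_IP:
MODULE_NAME: nt
IMAGE_NAME:  ntkrnlmp.exe
"""

if __name__ == "__main__":
    for key, value in summarize_analysis(SAMPLE).items():
        print(f"{key}: {value}")
```

If IMAGE_NAME comes back as a third-party driver, that driver is the first thing to update or remove. If it only names the kernel image, the dump on its own is probably not enough to blame a specific product.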
 
LVL 7

Expert Comment

by:tstritof
Hi,

sorry for the late answer, it's been a long day today. The ntkrnlmp.exe reference in your minidump is not revealing: basically, you can't point your finger at a specific driver or piece of software (at least I can't).

Since you said that everything worked OK for a few weeks and then failed, and since you posted the minidump analyzed by KD, I supposed you had already eliminated the usual suspects (updated drivers, memory test and such).

Maybe I misunderstood, but I thought the BSODs happened only during backups, and that led me to the conclusion that you might have problems with AV, because AVs are extremely unhappy when automated backup processes go poking around the system, creating system state and such. (I've had similar problems with backup on SBS 2003 and NOD32.)

Since you said you were using Backup Exec, I figured you were also using Symantec security tools and guessed they might be the cause.

Finally (and I'm convinced you didn't do any such thing), attempting to use hardware (like PCI video encoding/streaming cards) with Windows 7 or Vista drivers is a certain path to problems (and, from my limited experience, the only thing that actually caused repeatable crashes on W2K8 machines).

Regards,
Tomislav
 

Author Comment

by:deepslalli
Hi KenMcF,

Here is a report from that very useful tool, BlueScreenView. Does it help you both?


Crash-List.mht
 

Author Comment

by:deepslalli
Hi tstritof,

Apologies, maybe I didn't explain very well.

The bluescreens can happen at any time. They seem to happen most when we are actually on the server doing things. They can be simple things, but it's while actually using it via the console or RDP.

I have disabled the Symantec Endpoint virus scanner, and all that is running is the management console to communicate with the other client PCs, so there's no scanner on this machine now.

The backups are set to run at 10:00pm every evening, and it doesn't bluescreen during this.

I haven't tested the memory. What would be the best way to do this? The HP diagnostics CD didn't show anything when we first installed it.

The only hardware we have added recently is the dual-channel fibre card, to connect to the StorageTek L180 tape library.

Cheers guys,
Shaun
 
LVL 7

Accepted Solution

by:
tstritof earned 500 total points
Well,

the first crash in your list seems to have been caused by a driver. The rest just say that the kernel is crashing all over itself.

Regarding the cause (based on your new information):
- to eliminate the AV, disabling it is not enough, you would have to uninstall it, although based on the new info (no crashes during backup, problems when using the console or remote sessions) I'm not sure it's the AV,
- bad memory, a display driver or a NIC driver might be more probable culprits, since you say that most errors happen when actually working on the server.

However, the above are just guesses, so you'll have to perform some sort of elimination process.

Try running memory tests (bad memory may leave warnings in the event log, so check that too):
- built into W2K8 server (close all apps and make sure no one is connected to the computer, then go to Administrative Tools > Windows Memory Diagnostic),
- 3rd party (link here).

To capture full memory dumps consider the directions here.

Also, to try to troubleshoot manually, go to Device Manager and choose View > Resources by connection and also Show hidden devices.

Try locating the device with IRQ 2 in the IRQ group (IRQL 2 is mentioned in your minidump, although an IRQL is a processor priority level rather than a hardware IRQ line, so treat this as a loose lead) or by memory range. However, I think the only proper way to compare memory ranges by this method is to "capture" (screen capture) all displayed memory ranges and then check them against the location reported as the offender in the memory dump of the first subsequent crash. You can find a lot of useful info in this article by Mark Russinovich on TechNet.

Regards,
Tomislav

 
LVL 27

Expert Comment

by:KenMcF
Try this link. There is an updated version of ntoskrnl, and it sounds like it could be the fix you need.


http://support.microsoft.com/kb/979444
 
LVL 13

Expert Comment

by:eatmeimadanish
'EX64' and 'ENG64.SYS'
These are the Norton Antivirus files causing the crashes. Run the Norton Removal Tool http://majorgeeks.com/Norton_Removal_Tool_SymNRT_d4749.html and reinstall.
 

Author Comment

by:deepslalli
Hi,

I've installed the hotfix that KenMcF suggested and rebooted. Should I wait to see if that resolves things before removing Symantec?

Cheers,
Shaun
 
LVL 7

Expert Comment

by:tstritof
Yes, please wait before moving on to the next step. If no BSODs occur within a reasonable time interval you are probably OK.

You could try running memory tests (if you haven't already) since that makes no unnecessary changes to the server configuration.

Regards,
Tomislav
 

Author Comment

by:deepslalli
OK, so the hotfix didn't work. We just had a BSOD, so I will now migrate the Symantec Antivirus to another server and use your removal tool.

Just to be clear, we are using Symantec Endpoint, not Norton. Is this OK?
 
LVL 27

Expert Comment

by:KenMcF
Does the minidump point to the same thing as before?
 
LVL 7

Expert Comment

by:tstritof
Hi,

the best way to remove SEP would be to contact Symantec support and ask for directions specific to your OS version and SEP version.

However, please try to confirm first that your issues started after the SEP installation. Otherwise you could be doing a lot of work (not without risk to your system) and get the BSOD anyway.

Have you tried to record your devices in Device Manager (when viewed in Resources by connection) prior to the last BSOD?

Do you have the last crash dump + minidump? Can you post them?

Have you run a memory test without failures?

Regards,
Tomislav
 

Author Comment

by:deepslalli
I thought things were resolved. However, I had a crash yesterday.

The only Symantec product installed now is Symantec Backup Exec; all the AV stuff has gone.


0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 0000000000000008, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000000, bitfield :
	bit 0 : value 0 = read operation, 1 = write operation
	bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: fffff800018b8873, address which referenced memory

Debugging Details:
------------------


READ_ADDRESS:  0000000000000008 

CURRENT_IRQL:  2

FAULTING_IP: 
nt!CcFlushCache+103
fffff800`018b8873 488b7808        mov     rdi,qword ptr [rax+8]

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  DRIVER_FAULT_SERVER_MINIDUMP

BUGCHECK_STR:  0xA

PROCESS_NAME:  System

TRAP_FRAME:  fffff880021ce8c0 -- (.trap 0xfffff880021ce8c0)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000000 rbx=0000000000000000 rcx=fffff80001a4a540
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
rip=fffff800018b8873 rsp=fffff880021cea50 rbp=fffff880021cec58
 r8=0000000000000000  r9=0000000000000000 r10=0000000000000001
r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei pl zr na po nc
nt!CcFlushCache+0x103:
fffff800`018b8873 488b7808        mov     rdi,qword ptr [rax+8] ds:9320:00000000`00000008=????????????????
Resetting default scope

LAST_CONTROL_TRANSFER:  from fffff800018cdae9 to fffff800018ce580

STACK_TEXT:  
fffff880`021ce778 fffff800`018cdae9 : 00000000`0000000a 00000000`00000008 00000000`00000002 00000000`00000000 : nt!KeBugCheckEx
fffff880`021ce780 fffff800`018cc760 : 00000000`00000000 00000000`00000008 00000000`00000000 fffff800`01a49e80 : nt!KiBugCheckDispatch+0x69
fffff880`021ce8c0 fffff800`018b8873 : 00000000`067d4e01 fffff880`00de1360 00000000`00000000 00000000`00000000 : nt!KiPageFault+0x260
fffff880`021cea50 fffff800`018c154b : 00000000`00000000 fffff800`00000001 00000000`00000001 fffff800`018d5852 : nt!CcFlushCache+0x103
fffff880`021ceb50 fffff800`018c2138 : fffff880`0379bd00 fffff880`021cec58 00000000`00000000 fffff800`00000000 : nt!CcWriteBehind+0x1eb
fffff880`021cec00 fffff800`018db981 : fffffa80`06909540 fffff800`018c1f70 fffff800`01ad61a0 fffffa80`068f9000 : nt!CcWorkerThread+0x1c8
fffff880`021cecb0 fffff800`01b72336 : 8948038b`480674db fffffa80`068f9040 00000000`00000080 fffffa80`068ea400 : nt!ExpWorkerThread+0x111
fffff880`021ced40 fffff800`018ab106 : fffff880`009c6180 fffffa80`068f9040 fffff880`009d0f40 8b490039`4d30ec83 : nt!PspSystemThreadStartup+0x5a
fffff880`021ced80 00000000`00000000 : fffff880`021cf000 fffff880`021c9000 fffff880`021ce9f0 00000000`00000000 : nt!KiStartSystemThread+0x16


STACK_COMMAND:  kb

FOLLOWUP_IP: 
nt!CcFlushCache+103
fffff800`018b8873 488b7808        mov     rdi,qword ptr [rax+8]

SYMBOL_STACK_INDEX:  3

SYMBOL_NAME:  nt!CcFlushCache+103

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: nt

IMAGE_NAME:  ntkrnlmp.exe

DEBUG_FLR_IMAGE_TIMESTAMP:  4c1c42e3

FAILURE_BUCKET_ID:  X64_0xA_nt!CcFlushCache+103

BUCKET_ID:  X64_0xA_nt!CcFlushCache+103

Followup: MachineOwner
---------
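
Reading the four arguments by hand: Arg1 is the address that was touched (0x8), Arg2 is the IRQL at the time (2, which is DISPATCH_LEVEL), bit 0 of Arg3 says whether it was a read or a write, and Arg4 is the instruction that did it. With rax=0 in the trap frame, `mov rdi,qword ptr [rax+8]` is reading offset 8 from a null pointer inside nt!CcFlushCache, which is why the referenced address is so small. Here is a small Python sketch of that decoding. The function name is made up, and treating anything in the first 4 KB as a likely null-pointer access is my own rule of thumb, not something the debugger reports.

```python
def decode_irql_bugcheck(arg1, arg2, arg3, arg4):
    """Describe the four parameters of an IRQL_NOT_LESS_OR_EQUAL (0xA) bugcheck."""
    return {
        "memory_referenced": hex(arg1),
        "irql": arg2,
        # Arg3 bit 0: 0 = read operation, 1 = write operation.
        "operation": "write" if arg3 & 0x1 else "read",
        # Arg3 bit 3: 1 = execute operation (only on chips that report it).
        "execute": bool(arg3 & 0x8),
        "faulting_instruction": hex(arg4),
        # Assumption: an address inside the first page usually means a
        # null base pointer plus a small field offset.
        "looks_like_null_deref": arg1 < 0x1000,
    }


if __name__ == "__main__":
    # Values copied from the Arguments section of the dump above.
    info = decode_irql_bugcheck(0x8, 0x2, 0x0, 0xFFFFF800018B8873)
    for key, value in info.items():
        print(f"{key}: {value}")
```

A null pointer turning up in the cache manager's write-behind path doesn't say which driver handed it over, so a full memory dump is likely to tell more than another minidump.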
