bugsbey

Receiving "critical error" on NetWare Server. Suspect Backup Exec as the cause.

Our Novell NetWare server is giving us a "critical" error message and restarting on its own. This has only happened a few times, but it seems to occur during the Backup Exec server backup job. Does anyone know what I can do to prevent this from happening again?

--- Novell Error Message  ---

[date of error]   4:44:20 pm:    SERVER-5.60-4
   Severity = 4  Locus = 18  Class = 6
   WARNING! Server "server name" has experienced a critical error.  It is going down in 2 minutes.  Save your files and logout.

--- Backup Exec - Job Error Report ---

BREAK: User abort message received.
Error on Backup-to-Disk_1.
This session has been taken offline. Other sessions can
continue job processing. Most error conditions can be corrected
by unloading and reloading the application. If the session does
not restart, you will need to down and reboot the server.
Vendor: VERITAS
Product: VIRTDRV
ID:
Firmware: CAGM
Function: Write(5)
Error: General Error 0 (0x0)(11)
Ghost96

Please post the entire abend log, excluding any sensitive information.  Can you also list the full version of BackupExec.  I see 9.2 - but 9.2.what?

How often does this happen?
Any cause and effect results here that you can think of that might have started triggering the behavior?
What type of disk are you backing up to?

Sometimes the disk will deactivate (USB ones commonly) and crash the entire job, then the server.
bugsbey

ASKER

========================================================================
Job server: "server name"
  Job name: Server
 User name: ADMIN."server name"
  Job type: BACKUP
 Operation: BACKUP
 Submitted: 12/06/05 at 12:44:37p
Policy name: Server
Backup operation started: 05/27/08 at 04:00:07p
========================================================================
 
========================================================================
 Backup method: Full
========================================================================
Pre-scanning device(s) to determine estimated size.
 
"server name".NetWare File System/"server name"/Server Specific Info:
 
Pre-Scan set started: 05/27/08 at 04:00:18p
 
Pre-Scan set ended: 05/27/08 at 04:00:21p
 
"server name".NetWare File System/"server name"/SYS:
 
Pre-Scan set started: 05/27/08 at 04:00:22p
 
Pre-Scan set ended: 05/27/08 at 04:08:19p
Pre-scanning operation completed.
========================================================================
 
Backup set started: 05/27/08 at 04:08:22p
 
Media ID: 38bb8007  Media 1
Media description: "Media Created on Dec 06, 2005"
Set 1 created 05/27/08
BarCode: VL000001
Set name: "BackupJob_4"
Set description: "BackupJob_4"
Backup of "server name".NetWare File System/"server name"/Server Specific Info:
Drive: "Backup-to-Disk_1"
 
  Total directories: 1
        Total files: 0
        Total bytes: 696,685  (0.6 Megabytes)
         Total time: 00:00:03
         Throughput: 232,228 bytes/second  (13.2 Megabytes/minute)
 
Backup set ended: 05/27/08 at 04:08:29p
------------------------------------------------------------------------
 
Backup set started: 05/27/08 at 04:08:31p
 
Media ID: 38bb8007  Media 1
Media description: "Media Created on Dec 06, 2005"
Set 2 created 05/27/08
BarCode: VL000001
Set name: "BackupJob_4"
Set description: "BackupJob_4"
Backup of "server name".NetWare File System/"server name"/SYS:
Drive: "Backup-to-Disk_1"
File \ETC\AUDIT.CTL or one of its streams is in use - SKIPPED.
File \ETC\AUDIT.LOG or one of its streams is in use - SKIPPED.
File \ETC\CONSOLE.LOG or one of its streams is in use - SKIPPED.
File \grpwise\MSLOCAL\0527MTA.001 or one of its streams is in use - SKIPPED.
File \grpwise\po\NGWGUARD.DB or one of its streams is in use - SKIPPED.
File \grpwise\po\wpcsout\ofs\0527POA.001 or one of its streams is in use - SKIPPED.

End of media reached. Trying next media in the Backup-to-Disk Folder.

Media ID: 38bb8007  Media 2
Media description: "Media Created on Dec 06, 2005" created 05/27/08
BarCode: VL000002

End of media reached. Trying next media in the Backup-to-Disk Folder.

Media ID: 38bb8007  Media 3
Media description: "Media Created on Dec 06, 2005" created 05/27/08
BarCode: VL000003

End of media reached. Trying next media in the Backup-to-Disk Folder.

Media ID: 38bb8007  Media 4
Media description: "Media Created on Dec 06, 2005" created 05/27/08
BarCode: VL000004
File \PCADATA\1001\FileName.DDT or one of its streams is in use - SKIPPED.
File \PCADATA\1001\FileName.DKY or one of its streams is in use - SKIPPED.
BREAK: User abort message received.



Error on Backup-to-Disk_1.
This session has been taken offline. Other sessions can
continue job processing. Most error conditions can be corrected
by unloading and reloading the application. If the session does
not restart, you will need to down and reboot the server.
Vendor: VERITAS
Product: VIRTDRV
ID:
Firmware: CAGM
Function: Write(5)
Error: General Error 0 (0x0)(11)
Sense Data:
70 00 be 00 - 00 00 00 00 - 00 00 00 00 - 04 0a 00 00
00 00 00 00 - 00 00 00 00 - 00 00 00 00 - 00 00 00 00


 
  Total directories: 2465
        Total files: 31996
        Total bytes: 13,881,783,425  (13238.7 Megabytes)
         Total time: 00:17:49
         Throughput: 12,985,765 bytes/second  (743.0 Megabytes/minute)
 
Backup set ended: 05/27/08 at 04:27:25p
      Files skipped: 8
------------------------------------------------------------------------
 
========================================================================
      Total devices: 2
        Total bytes: 13,882,480,110  (13239.3 Megabytes)
         Total time: 00:17:52
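
For anyone reading a Backup Exec job log like the one above, here is a rough Python sketch (not part of Backup Exec; the function names are made up for illustration) that lists the files reported as skipped, including entries the log wraps across two lines, and recomputes the bytes-per-second figure from the totals. It assumes the log has been saved as plain text and that the SKIPPED lines are worded exactly as posted here.

```python
import re

# Matches one "in use - SKIPPED." entry once wrapped lines have been rejoined.
SKIPPED_RE = re.compile(
    r"File (\S.*?) or one of its streams is in use\s*- SKIPPED\."
)


def unwrap(log_text: str) -> str:
    """Rejoin the two wrap styles visible in the posted log."""
    # Style 1: the word itself is hyphenated ("SKIP-" / "PED.").
    text = re.sub(r"SKIP-\r?\n\s*PED\.", "SKIPPED.", log_text)
    # Style 2: the line breaks just before " - SKIPPED.".
    text = re.sub(r"is in use\r?\n\s*- SKIPPED\.", "is in use - SKIPPED.", text)
    return text


def skipped_files(log_text: str) -> list[str]:
    """Return the paths of all files the job reported as in use and skipped."""
    return SKIPPED_RE.findall(unwrap(log_text))


def bytes_per_second(total_bytes: int, elapsed: str) -> int:
    """Recompute throughput from 'Total bytes' and an HH:MM:SS 'Total time'.

    Integer division is an assumption; it happens to reproduce the
    figures in the posted log.
    """
    hours, minutes, seconds = (int(part) for part in elapsed.split(":"))
    return total_bytes // (hours * 3600 + minutes * 60 + seconds)


if __name__ == "__main__":
    # Totals copied from the second backup set in the posted log.
    print(bytes_per_second(13_881_783_425, "00:17:49"))  # 12985765
```

Run against the second backup set, the helper gives 12,985,765 bytes/second, the same value the job log reports, so the totals in the log are at least internally consistent.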

bugsbey

ASKER

We are running Backup Exec for NetWare Servers Version 9.20 Revision 1401.
This error has only happened 3 times in the past month. We have been running this setup for a couple years without problems. It seems to happen randomly during the backup job.
We are backing up to a secondary SCSI hard drive.

Ghost96

I'm not trying to be a jerk, but you really didn't answer any of my questions so I can't help you.

(I see you are a beginner so I'll try to break this down further)  Quoting myself:
Please post the entire abend log, excluding any sensitive information.  
---This file can be found in the SYS:SYSTEM folder and is called ABEND.LOG.  You can find the last entry by date and copy and paste the contents, excluding any sensitive information.

Can you also list the full version of BackupExec.  I see 9.2 - but 9.2.what?
---If you are running it on the NetWare server that's crashing, go to the Backup Exec Administration Console, navigate to "About-->Administration Console", and get me the version numbers.

Normally the Workstation-based client holds the same information - just open it and go to Help-->About Administration Console.


How often does this happen?

Any cause and effect results here that you can think of that might have started triggering the behavior?

What type of disk are you backing up to?  (Internal, external, USB, etc.?)
Sometimes the disk will deactivate (USB ones commonly) and crash the entire job, then the server.
[edit] we were posting at the same time [/edit]
How full is the secondary hard drive?
I will still need your Abend.log file.
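
If ABEND.LOG has grown large, a small script can pull out just the most recent entry and mask anything sensitive before posting. The Python sketch below is only an illustration: it assumes you have copied the file to a workstation, and it assumes entries are separated by a line made up entirely of asterisks, so check that against your own file first. The function names are hypothetical.

```python
import re

# ASSUMPTION: each entry is preceded by a separator line consisting only of
# asterisks (at least ten). Adjust this if your ABEND.LOG looks different.
SEPARATOR_RE = re.compile(r"^\*{10,}[ \t]*$", re.MULTILINE)


def last_abend_entry(log_text: str) -> str:
    """Return the text of the final entry, or '' if the log has none."""
    chunks = (chunk.strip() for chunk in SEPARATOR_RE.split(log_text))
    entries = [chunk for chunk in chunks if chunk]
    return entries[-1] if entries else ""


def redact(entry: str, secrets: list[str], placeholder: str = "[redacted]") -> str:
    """Replace each sensitive string (server name, tree name, ...) case-insensitively."""
    for secret in secrets:
        entry = re.sub(
            re.escape(secret), lambda _match: placeholder, entry, flags=re.IGNORECASE
        )
    return entry


if __name__ == "__main__":
    # Hypothetical local copy of SYS:SYSTEM\ABEND.LOG.
    with open("ABEND.LOG", encoding="latin-1") as handle:
        entry = last_abend_entry(handle.read())
    print(redact(entry, ["your-server-name"]))
```

The list passed to `redact` is whatever you consider sensitive; the script does not try to guess.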
bugsbey

ASKER

Ghost96:

Thank you for your help and patience!
I attached the Abend.log file.
The backup drive has 37GB free.
ABEND.LOG
bugsbey

ASKER

I can't think of any reason why it started happening.

Ghost96

Given your abend.log file, this looks like a hardware issue, possibly a faulty system board or something similar.  Have you been using this server long?  What brand of server is it?

NMI errors normally point to the system board or other faulty hardware.  Take a quick look at some of these TIDs.

http://tinyurl.com/6aq266
http://tinyurl.com/5spy7m
http://tinyurl.com/5opo9y
http://tinyurl.com/6syyb9

Is the disk you are backing up to on the same controller as the rest of your system disks?  Or is the backup disk on a totally independent controller?

If it's not, I would swap it out for a new controller and see if your problems persist.  A SCSI controller isn't an expensive card to swap out if it's just for a standard disk.
bugsbey

ASKER

I believe the secondary drive is on the same controller. Is there an easy way to check?
It's a Dell PowerEdge 2400 - P3, 730 MHz, 767 MB RAM

We have been using this server for more than 5 years and always backing up to a DLT tape drive on a separate controller. About a month ago, when we first received this error, we were backing up to tape. I thought maybe the tape drive was causing it, so I switched the backup to the secondary hard drive. I thought that fixed it, then it happened again.

Ghost96

Well, bad RAM can cause this kind of error as well, but it seems more prevalent in the controller or system board.  Since the controller is getting a workout when the system gets backed up, bad memory on the controller or a bad controller itself can easily take a server down.

Best way to see if the drive is on the same controller is to open the server up and physically look at it.

You can also look in MONITOR and select STORAGE DEVICES.  It just indents everything that's under a controller.  So maybe you'll see 2 controllers, and the second one will have just 1 drive listed underneath it.

Look at my screenshot as well.

See the highlighted tape drive at the bottom?  It's indented under the SCSI controller present in the server.  That's the only thing attached to the Adaptec SCSI controller.
example1.jpg
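
If you would rather type the MONITOR storage list into a text file than eyeball the indentation, the grouping can be sketched in a few lines of Python. This is purely illustrative: it assumes one unindented line per controller with every attached device indented beneath it, as described above, and the function names and the sample device names in the usage note are made up.

```python
def group_by_controller(lines: list[str]) -> dict[str, list[str]]:
    """Group indented device lines under the unindented controller line above them."""
    tree: dict[str, list[str]] = {}
    current = None
    for raw in lines:
        name = raw.strip()
        if not name:
            continue  # skip blank lines
        if raw[0] not in " \t":
            # Unindented line: treat it as a new controller.
            current = name
            tree[current] = []
        elif current is not None:
            # Indented line: a device attached to the current controller.
            tree[current].append(name)
    return tree


def shared_controllers(tree: dict[str, list[str]]) -> list[str]:
    """Return the controllers that have more than one device attached."""
    return [name for name, devices in tree.items() if len(devices) > 1]
```

For example, a transcription with "SYSTEM DISK" and "BACKUP DISK" (placeholder names) both indented under the same adapter line would come back from `shared_controllers`, meaning the backup target shares a controller with the system disk.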
bugsbey

ASKER

Yeah, it looks like they are both on the same controller. Is there any way to tell for sure if that's the problem without replacing it?

ID:0 [V319-A0] Adaptec AIC-7890/7891 - Ultra2 SCSI
Capture-5.JPG
Capture.JPG
bugsbey

ASKER

Do you know of a current Adaptec SCSI Controller that can work in place of this old one? I don't think they sell it anymore.
ASKER CERTIFIED SOLUTION
Ghost96

This solution is only available to members.
bugsbey

ASKER

I will probably replace the controller and see if that solves it. I'll follow up on this question if that successfully eliminates the error for a few months.

Thank you Ghost96! You've been very helpful!!!
bugsbey

ASKER

Thank you!!!!