Problem with backups using Brighstor arcserve 11.5 with an HP autoloader 1/8

wyrickits
wyrickits used Ask the Experts™
on
We've been experiencing intermittent problems with our backups.  Some nights they complete successfully.  Other nights they go through partially.  And some nights they don't go through at all.  We are using Brightstor Arcserve 11.5 and an HP autoloader 1/8.

  From we Brightstor logs we usually get one of the two errors.

C3703
Unable to open media   ec=bad device handle
 or
E3719
Unable to write to media.  EC= Hardware error.

On the HP autoloader the drive will SOMETIMES so up as locked or with an error BD 01.  Which is resolved if we reboot the device.


We've worked with HP and they suggested replacing the autoloader, which we went ahead and did, but we are still experiencing the same intermittent issue.  Any ideas on the best way to resolve the issue?  Is it a brightstor issue?  Scsi card or cable? Tapes? Termination?

Thanks for your help!







 
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Lets cover the basics..

1. Disable Removable Storage Service
 
2. Make and model of the SCSI / HBA card that connects the Changer and the drive. Also mention the driver version. Cos depending on the driver used, you may have to disable or enable the tape drive and the changer from the windows device manager

3. Do you have HP Management Agents running on the server ? If yes, the please follow the following document.
https://support.ca.com/irj/portal/anonymous/kbtech?docid=478936&searchID=TEC478936&fromKBResultsScreen=T

4. Any other HP Services running ?

5. Any other hardware(SCSI card) related errors in the Windows Event viewer ?

Author

Commented:
Thank you for your reply.

1.  Removable storage is already disabled

2. The SCSI card in device manager is listed as LSI Adapter, Ultra320 SCSI 2000 series(w/1020/1030)(storPort)  

  3. Yes we do.  I went ahead and followed the documentation and made the changes.  I was working with HP with this same problem. I think the mentioned something about how the tape drive was missing polls.  Maybe this will fix the issue!  Keep your fingers crossed.  :)

4.  Other hp services running are:
           hp proliant remote monitor
          hp proliant system shutdown service
          hp system management homepage
          hp version control agent
5.  Have never seen any erros in event viewer.

Any other suggestions?  or should I just wait and see if the SCSI polling was the issue?  Again the issue is intermittent so I will have to wait for a little whie to report back.

Thanks!

 
Since you have used storport drivers installed for the SCSI card, you can unninstall the OEM Drivers for the changer and the tape drive and can have the default MS drivers loaded. default drivers work fine with storport drivers.

Dabbler
Success in ‘20 With a Profitable Pricing Strategy

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden using our free interactive tool and use it to determine the right price for your IT services. Start calculating Now!

Author

Commented:
Ok I will give that a try on Monday.  The backups are running right now so I don't want to try to change the drivers right now.  Thank you very much for your suggestions.  I will report back early next week.

Author

Commented:
Unfortunately the backups failed last night with the error

C3703
Unable to open media   ec=bad device handle

Any other suggestions?

Thank you
update storport.sys from Microsoft
2003
http://support.microsoft.com/kb/950903 
or
2008
http://support.microsoft.com/kb/968675

If that does not help then, go to the Job log, get the time stamp of the error, go to ARCserveBackup\Log\tape.log at that time and copy that portion of the log here.

Author

Commented:
The hotfix did not resolve the issue.  Here is the log from when the first backup failed last night.
Thank you


2010/02/05 00:07:52 [05e8] =>ABSL:4050 Command:[SPACE               ] <Drive>
2010/02/05 00:07:52 [05e8]   ABSL:4050 Sense Key status: Media Error [03]
2010/02/05 00:07:52 [05e8]   ABSL:4050 Additional sense: Recorded Entity Not Found [14, 00]
2010/02/05 00:07:52 [05e8] Media Error-1, Barcode[1/28/10 6:01 PM], [] in []
2010/02/05 00:23:45 [05e8] =>ABSL:4050 Command:[SPACE               ] <Drive>
2010/02/05 00:23:45 [05e8]   ABSL:4050 Sense Key status: Media Error [03]
2010/02/05 00:23:45 [05e8]   ABSL:4050 Additional sense: Recorded Entity Not Found [14, 00]
2010/02/05 00:23:45 [05e8] Media Error-2, Barcode[1/28/10 6:01 PM], [] in []
2010/02/05 00:39:38 [05e8] =>ABSL:4050 Command:[SPACE               ] <Drive>
2010/02/05 00:39:38 [05e8]   ABSL:4050 Sense Key status: Media Error [03]
2010/02/05 00:39:38 [05e8]   ABSL:4050 Additional sense: Recorded Entity Not Found [14, 00]
2010/02/05 00:39:39 [05e8] Media Error-3, Barcode[1/28/10 6:01 PM], [] in []
2010/02/05 00:55:31 [05e8] =>ABSL:4050 Command:[SPACE               ] <Drive>
2010/02/05 00:55:31 [05e8]   ABSL:4050 Sense Key status: Media Error [03]
2010/02/05 00:55:31 [05e8]   ABSL:4050 Additional sense: Recorded Entity Not Found [14, 00]
2010/02/05 00:55:31 [05e8] Media Error-4, Barcode[1/28/10 6:01 PM], [] in []
2010/02/05 01:11:24 [05e8] =>ABSL:4050 Command:[SPACE               ] <Drive>
2010/02/05 01:11:24 [05e8]   ABSL:4050 Sense Key status: Media Error [03]
2010/02/05 01:11:24 [05e8]   ABSL:4050 Additional sense: Recorded Entity Not Found [14, 00]
2010/02/05 01:11:24 [05e8] Media Error-5, Barcode[1/28/10 6:01 PM], [] in []
2010/02/05 01:27:17 [05e8] =>ABSL:4050 Command:[SPACE               ] <Drive>
2010/02/05 01:27:17 [05e8]   ABSL:4050 Sense Key status: Media Error [03]
2010/02/05 01:27:17 [05e8]   ABSL:4050 Additional sense: Recorded Entity Not Found [14, 00]
2010/02/05 01:27:18 [1748]         DRV:5 [HP      Ultrium 3-SCSI  G54W] recovered [0] error for [reading.].
2010/02/05 01:27:43 [1164] ClientReserveGroup:: Reserving a GROUP = GROUP1. Reserved Drives:1 Free Drives:1 --hJob[p], jID[14344616], mID[16452], jTKN[0], jNUM@3@
2010/02/05 01:27:43 [1164] ClientReserveGroup:: GROUP RESERVED. nReservedDrives = 1, nFreeDrives = 1 --hJob[p], jID[14344616], mID[16452], jTKN[0], jNUM@3@
2010/02/05 01:27:43 [1164] ClientReserveGroup:: nReservedDrives = 1, nFreeDrives = 1 --hJob[p], jID[14344616], mID[16452], jTKN[0], jNUM@3@
2010/02/05 01:28:51 [1164] ClientConnectToTape:: Decremented reserved drives. nReservedDrives = 0, nFreeDrives = 0 --hJob[p], jID[14344616], mID[1046], jTKN[0], jNUM@3@
2010/02/05 01:28:53 [1164] ClientConnectToTape:: usSlotNo = 0, currentDrive = 5, nReservedDrives = 0, nFreeDrives = 0 --hJob[p], jID[14344616], mID[1046], jTKN[0],

Author

Commented:
sorry for the confusion...  The above logs were around midnight.  The following logs are around the time the backups start.  Which is around 6 pm

2010/02/04 14:43:35 [1748]         DRV:5 [HP      Ultrium 3-SCSI  G54W] recovered [0] error for [writing.].
2010/02/04 16:16:07 [0ac0]                 LDN:6 Unit Attention/Not ready not cleared after 25 retries. Returning.
2010/02/04 16:16:19 [0ac0] Library door has been closed, reintializing.
2010/02/04 16:16:24 [1300] =>ABSL:4050 Command:[LOAD                ] <Drive>
2010/02/04 16:16:24 [1300]   ABSL:4050 Sense Key status: Not Ready [02]
2010/02/04 16:16:24 [1300]   ABSL:4050 Additional sense: Medium Not Present [3a, 00]
2010/02/04 20:34:46 [05e8] ClientReserveGroup:: Reserving a GROUP = GROUP1. Reserved Drives:1 Free Drives:1 --hJob[p], jID[14344616], mID[13512], jTKN[0], jNUM@10@
2010/02/04 20:34:46 [05e8] ClientReserveGroup:: GROUP RESERVED. nReservedDrives = 1, nFreeDrives = 1 --hJob[p], jID[14344616], mID[13512], jTKN[0], jNUM@10@
2010/02/04 20:34:46 [05e8] ClientReserveGroup:: nReservedDrives = 1, nFreeDrives = 1 --hJob[p], jID[14344616], mID[13512], jTKN[0], jNUM@10@
2010/02/04 20:35:55 [048c] ClientConnectToTape:: Decremented reserved drives. nReservedDrives = 0, nFreeDrives = 0 --hJob[p], jID[14344616], mID[1043], jTKN[0], jNUM@10@
2010/02/04 20:37:44 [048c] ClientConnectToTape:: usSlotNo = 0, currentDrive = 5, nReservedDrives = 0, nFreeDrives = 0 --hJob[p], jID[14344616], mID[1043], jTKN[0], jNUM@10@, dcb[5]
2010/02/04 20:53:42 [048c] =>ABSL:4050 Command:[SPACE               ] <Drive>
2010/02/04 20:53:42 [048c]   ABSL:4050 Sense Key status: Media Error [03]
2010/02/04 20:53:42 [048c]   ABSL:4050 Additional sense: Recorded Entity Not Found [14, 00]
2010/02/04 20:53:42 [048c] Media Error-1, Barcode[1/28/10 6:01 PM], [] in []
2010/02/04 21:09:35 [048c] =>ABSL:4050 Command:[SPACE               ] <Drive>
2010/02/04 21:09:35 [048c]   ABSL:4050 Sense Key status: Media Error [03]
2010/02/04 21:09:35 [048c]   ABSL:4050 Additional sense: Recorded Entity Not Found [14, 00]
2010/02/04 21:09:35 [048c] Media Error-2, Barcode[1/28/10 6:01 PM], [] in []
2010/02/04 21:25:28 [048c] =>ABSL:4050 Command:[SPACE               ] <Drive>
2010/02/04 21:25:28 [048c]   ABSL:4050 Sense Key status: Media Error [03]
2010/02/04 21:25:28 [048c]   ABSL:4050 Additional sense: Recorded Entity Not Found [14, 00]
2010/02/04 21:25:28 [048c] Media Error-3, Barcode[1/28/10 6:01 PM], [] in []
2010/02/04 21:41:21 [048c] =>ABSL:4050 Command:[SPACE               ] <Drive>
2010/02/04 21:41:21 [048c]   ABSL:4050 Sense Key status: Media Error [03]
2010/02/04 21:41:21 [048c]   ABSL:4050 Additional sense: Recorded Entity Not Found [14, 00]
2010/02/04 21:41:21 [048c] Media Error-4, Barcode[1/28/10 6:01 PM], [] in []
2010/02/04 21:57:14 [048c] =>ABSL:4050 Command:[SPACE               ] <Drive>
2010/02/04 21:57:14 [048c]   ABSL:4050 Sense Key status: Media Error [03]
2010/02/04 21:57:14 [048c]   ABSL:4050 Additional sense: Recorded Entity Not Found [14, 00]
2010/02/04 21:57:14 [048c] Media Error-5, Barcode[1/28/10 6:01 PM], [] in []
2010/02/04 22:13:07 [048c] =>ABSL:4050 Command:[SPACE               ] <Drive>
2010/02/04 22:13:07 [048c]   ABSL:4050 Sense Key status: Media Error [03]
2010/02/04 22:13:07 [048c]   ABSL:4050 Additional sense: Recorded Entity Not Found [14, 00]
2010/02/04 22:13:07 [050c]         DRV:5 [HP      Ultrium 3-SCSI  G54W] recovered [0] error for [reading.].
2010/02/04 22:13:45 [05e8] ClientReserveGroup:: Reserving a GROUP = GROUP1. Reserved Drives:1 Free Drives:1 --hJob[p], jID[14344616], mID[13512], jTKN[0], jNUM@11@
2010/02/04 22:13:45 [05e8] ClientReserveGroup:: GROUP RESERVED. nReservedDrives = 1, nFreeDrives = 1 --hJob[p], jID[14344616], mID[13512], jTKN[0], jNUM@11@
2010/02/04 22:13:45 [05e8] ClientReserveGroup:: nReservedDrives = 1, nFreeDrives = 1 --hJob[p], jID[14344616], mID[13512], jTKN[0], jNUM@11@
2010/02/04 22:14:53 [05e8] ClientConnectToTape:: Decremented reserved drives. nReservedDrives = 0, nFreeDrives = 0 --hJob[p], jID[14344616], mID[1044], jTKN[0], jNUM@11@
2010/02/04 22:14:55 [05e8] ClientConnectToTape:: usSlotNo = 0, currentDrive = 5, nReservedDrives = 0, nFreeDrives = 0 --hJob[p], jID[14344616], mID[1044], jTKN[0], jNUM@11@, dcb[5]
Different problem, so it could be the storport.sys update took care of those errors.

Now what we have is a failure of the tape drive to read what was previously written to the tape.

When the command is a Space the failure is Recorded Entity Not Found.
If the command was a Read for example if a Verify or Restore was running it would be a something like and Unrecoverable Read Failure.

This could be a problem with the tapes, or the drive, or something environmental.

The Tape Engine is in Summary mode so there is more going on.
Detailed out
2010/02/04 20:53:42 [048c] =>ABSL:4050 Command:[SPACE               ]
2010/02/04 20:53:42 [048c]   ABSL:4050 Sense Key status: Media Error [03]
2010/02/04 20:53:42 [048c]   ABSL:4050 Additional sense: Recorded Entity Not Found [14, 00]

Tape Engine sends a SPACE command.
Drive fails to complete the command.
Tape Engine sends a REQUEST SENSE for details on the failure.
The tape drive returns a dump of its registry.
Tape Engine takes note of the Sense Key, Additional Sense Code, Additional Sense Code qualifier.
Sense Key is 03 = Media Error
14, 00 are the other 2.
Get inspired and get a copy of the SCSI Reference Manual for the drive and it will tell you 14 00 means Recorded Entity Not Found.

This is all standard SCSI protocol that all drives and apps use.
For example Sense Key 4 = Hardware error, 5 = Illegal Command.

Bottom Line:
Put these tapes on the side.
LTO drives will tell you when they need cleaning so no need to to that.
take note of the brand, if different brands are used.
take note of the batch, perhaps there is a problem with one batch of tapes.
ck environment, drives will fail if too hot cold dusty.
go to Reports and run a media error report to find out which tapes have had errors.
LTO should not have any Media errors.
out of time hope this helps.

Author

Commented:
We have been using more sony tapes as of late.  We are going to purchase some HP tapes and give those a try.  I will research the error code to see what I can find.

Thank you
My experience has been that HP brand tape drives work best with HP brand tapes.

Author

Commented:
So we purchased new HP tapes; the backups started to run over the weekend and failed in the middle of the session.  We received the same error:

C3703
Unable to open media   ec=bad device handle


I rebooted the server remotely last night to see if I could get the backups to run.  After the reboot I tried running a backup and was still receiving the same error.  I then checked device manager; the tape and the autoloader was missing and not viewable within device manager.

I went and checked on the autoloader this AM.  I tried removing the tapes from the autoloader.  We received a message that says "Device locked".  I then reset the HP autoloader; the tape drive and autoloader were then displayed in device manager.  However, when running a backup within CA the error is still occurring.

I then restarted all of the CA backup services.  Job will still not run; still receiving the same error message.

I also notice that the device is not viewable from within CA groups.  When I go into the CA groups the autoloader and tape drive are not viewable from there.  However they are viewable from device manager and they are also viewable in CA when I click on devices.

  When I try to modify an existing job and select run now I receive "Cannot submit the RUN NOW job due to unavailable group or backup device."

I'm going to try to reboot the server later on today.  

Any other ideas?  I'm stumped!  Again this keeps happening intermittently.

Thank you.

Author

Commented:
ok.  Went ahead and went through the device configuration wizard.  The device is now viewable within groups and the backups are running.  The backups were set to use group 1.  I went ahead and switched the backups and tape drive to use group 0 instead.  Who knows maybe this will resolve the issue.  However, we are still expiriencing the intermittent issues.

Any ideas?

Thx
If the BD 01 error codes are still showing up then install HP Tape Tools and next time this happens use them to troubleshoot. They indicate a communication problem with the drive and there is a good chance this will require help from HP support.

Author

Commented:
We haven't had too much luck with HP.. So far they replaced our autoloader twice and we are still recieving the same error.  I will follow up with them soon.
Roshan MohammedCloud Engineering Officer

Commented:
Run the HP LT&T software and run tests, esp the LTO drive assessment test with a new tape. use this same tape in all the drives and see the results. possibility of a bunch of tapes which are bad, as such intermittently causing issues!
We ended up replacing the scsi controller to fix the issue.
Roshan MohammedCloud Engineering Officer

Commented:
What was the initial scsi controller, and what did you replace with?

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial