Solved

NTBackup restore failure across multiple tapes: Cannot catalog media, cannot restore from previously cataloged media

Posted on 2014-12-17
Last Modified: 2015-01-07
Hi all,

I recently found that all my monthly backup tapes seem to be in a non-restorable state, and I'm kinda starting to get a little freaked out. Any advice or suggestions would be gratefully welcomed.

Background Info:
The backup setup in question is an HP ProLiant DL380 G4 using a single-drive HP StorageWorks Ultrium 960 and Imation LTO3 tapes. All tapes were purchased in the last year, and they have only been written to once each. All backups were done as single-job, single-tape backup sets (no backups that run across 2 or more tapes). The same symptoms are consistent across all of my backup tapes older than 3 months, and those symptoms are:
1) Attempts to restore data from a tape result in a warning message saying that necessary files from the Active Family are offline. The tape in the drive ejects, and ntbackup refuses to use the tape even after reinsertion. All the backup jobs were done with single tape families.
2) I can traverse the files on the backup tapes thanks to the local cached catalogs of the backups, but when I try to run a Catalog job on one of the tapes, it errors out saying there was an unexpected inconsistency on the requested media.

The files on the tapes are backup-to-disk files that are several GB each.
After I first noticed the issue, I did a test backup using the same files and steps as always, but for reasons unknown I am able to restore from the new test backup tape without any issue.

Things tried:
-Un-checking the "use the catalog on the media to speed up restorations" option -> no effect
-Switching the SCSI ports that the tape drive and the MSA harddisk array are connected through -> no effect  
-Purchased a new tape drive cleaning cartridge and cleaned the head -> no effect
-Ran 'rsm view /tlibrary' and then 'rsm.exe refresh /lf"library name"' to refresh RSM -> no effect
-Stopped the Removable Storage service, forced a recreate of the ntmsdata folder and rebooted the server -> all previously cataloged media disappeared, but attempts to catalog my backup tapes fail the same as outlined above.
-Cataloging media on a separate Win 2003 server using NTBackup -> Catalog fails same as above
-Cataloging media on a separate Win 2003 server using Backup Exec 2010 -> Fails saying "the blocksize being used is incorrect"
  - Tried changing the blocksize settings in BE 2010 -> no effect

Any advice would be greatly appreciated.
Question by:sloutz
9 Comments
 
LVL 21

Expert Comment

by:SelfGovern
ID: 40507742
I'd suggest you download HP's free tape drive diagnostic utility -- Library & Tape Tools.  Install it and run the drive diagnostics, and if that finds nothing odd, run tape media tests on a piece of media you can afford to lose (the write tests overwrite any current data; the read tests do not affect data on the tape).  

I am assuming that the backup jobs written to these tapes completed without errors?  If there were errors, can you let us know the error messages?

It is possible that blocksize got changed somehow; have you tried all the possibilities?  

Other than running L&TT, it sounds like you've done most of the reasonable steps.  Let me know what it says, and also what trying as yet untried block sizes does.
 
LVL 1

Author Comment

by:sloutz
ID: 40508314
Thanks SelfGovern. I'll try the L&TT tests today and post the results when it's finished.

Regarding the backup jobs, yes they were completed without errors, and I was unable to find anything in the error logs related to ntbackup.

With regard to the blocksize settings, I only tried the largest 2 (64k and 32k) based on the assumption that large multi-GB files would require a larger blocksize, but I'll give the remaining settings a go after the L&TT just to be thorough.
 
LVL 21

Expert Comment

by:SelfGovern
ID: 40508425
Sounds like a plan.

As a point of interest, large files don't 'require' a large block size, but larger block sizes typically provide better performance on today's systems.  You could successfully back up TB-sized files with a 4K blocksize... it just might take a bit of time.
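
To put rough numbers on that point, here is a minimal arithmetic sketch. The 4 GiB file size is a hypothetical stand-in for one of the backup-to-disk files, and the block sizes are simply the ones mentioned in this thread; it only counts blocks and says nothing about actual throughput on any particular drive.

```python
def blocks_needed(file_bytes: int, block_bytes: int) -> int:
    """Number of fixed-size blocks needed to hold file_bytes.

    Uses integer ceiling division, so a partial last block still counts.
    """
    return -(-file_bytes // block_bytes)


if __name__ == "__main__":
    file_bytes = 4 * 1024**3  # hypothetical 4 GiB backup-to-disk file
    for block_kib in (4, 32, 64):
        count = blocks_needed(file_bytes, block_kib * 1024)
        print(f"{block_kib:>2} KiB blocks: {count:,} blocks")
```

Any of these block sizes can hold the file; the smaller ones just mean many more blocks for the drive and host to handle, which is where the extra time would come from.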
 
LVL 1

Author Comment

by:sloutz
ID: 40508732
Hi SelfGovern,
The Read/Write test of L&TT went fine, but the drive diagnostic was far from pretty. The results are below, and items marked with ** were highlighted in red.  I did a little reading but didn't find much other than that this can indicate a general device failure. Since I can still perform test backups and restores without issue, I'm reluctant to put the sole blame on my tape device, but I've arranged to borrow a similar device from a friend on Monday to get a "second opinion", so to speak.

All my catalog trials with the varying block sizes ended the same way, sadly. Interesting note on blocksize and file size relationships though. I'll be sure to remember that.

If you (or anyone else) happens to think of something else I can try over the weekend, I'm all ears~

|__ Analysis Results
    ||__ LTO Drive Assessment Test, version V23.01.2013
    ||__ Test run: Fri Dec 19 11:10:40 2014
    ||__ Drive serial number: HU10606
   ** ||__ There was an unexpected error condition on a Receive Diagnostic command
   ** ||__ Sense Key 0x00, Sense Code 0x0000 (No additional sense information) Error Code: 0x00 GOOD
    ||__ Sense Key 0x05, Sense Code 0x2600 (Operator selected invalid field in parameter list)
   ** ||__ There was an unexpected error condition on a Receive Diagnostic command
    ||__ Sense Key 0x05, Sense Code 0x2600 (Operator selected invalid field in parameter list) Error Code: 0x1802 DI_INVALID_PARAMETER
    ||__ This test requires that the Removable Storage service is not running.
    ||__ Please stop RSM (Computer Management/Services and Applications/Services)
    ||__ and then re-run the test.
    ||__ Test time: 1:53
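
For anyone reading the sense data in the log above, here is a small sketch that splits the values into the standard SCSI sense key and ASC/ASCQ pair. It assumes that L&TT's four-digit "Sense Code" is the ASC byte followed by the ASCQ byte (which matches the descriptions printed in the log), and the lookup tables are deliberately limited to the values that appear in this thread. The function name `decode_sense_line` is made up for this example.

```python
import re

# Standard SCSI sense keys (only the ones relevant to this thread).
SENSE_KEYS = {
    0x00: "NO SENSE",
    0x03: "MEDIUM ERROR",
    0x04: "HARDWARE ERROR",
    0x05: "ILLEGAL REQUEST",
}

# (ASC, ASCQ) pairs seen in the log above.
ASC_ASCQ = {
    (0x00, 0x00): "NO ADDITIONAL SENSE INFORMATION",
    (0x26, 0x00): "INVALID FIELD IN PARAMETER LIST",
}

# Assumed log format: "Sense Key 0xKK, Sense Code 0xAAQQ"
SENSE_LINE = re.compile(r"Sense Key 0x([0-9A-Fa-f]{2}), Sense Code 0x([0-9A-Fa-f]{4})")


def decode_sense_line(line):
    """Return (sense key name, ASC, ASCQ, description), or None if no sense data is found."""
    match = SENSE_LINE.search(line)
    if match is None:
        return None
    key = int(match.group(1), 16)
    code = int(match.group(2), 16)
    asc, ascq = code >> 8, code & 0xFF  # high byte = ASC, low byte = ASCQ (assumption)
    return (
        SENSE_KEYS.get(key, "UNKNOWN"),
        asc,
        ascq,
        ASC_ASCQ.get((asc, ascq), "not in this table"),
    )


if __name__ == "__main__":
    print(decode_sense_line("Sense Key 0x05, Sense Code 0x2600"))
```

Read this way, the red lines are ILLEGAL REQUEST responses (the drive rejecting a diagnostic command's parameters) rather than MEDIUM ERROR or HARDWARE ERROR reports, so on their own they do not say much about whether the drive can read the old tapes.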
 
LVL 21

Expert Comment

by:SelfGovern
ID: 40509198
Did you verify that RSM was not running at the time you were running this test?

If not, and the drive is still under support, I'd recommend you call HP.

It will be interesting to see what happens with a different tape drive.  I have seen very rare cases where it seems that a drive has slowly drifted out of spec, and can read tapes it's writing now, but not ones it wrote a while back -- possibly coincident with poor quality media.  If another drive can read the old tapes but has trouble with the new tapes, this might be the issue.
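
The cross-drive check described above can be written down as a tiny decision helper. This is only a sketch of the reasoning: the first branch is the case described in the paragraph above, while the wording of the other two branches is an assumption added for completeness, and the function name is made up for this example.

```python
def interpret_cross_drive_test(other_reads_old: bool, other_reads_new: bool) -> str:
    """Interpret what a second drive's results suggest about the original drive.

    other_reads_old: second drive can read tapes the original drive wrote months ago
    other_reads_new: second drive can read tapes the original drive wrote recently
    """
    if other_reads_old and not other_reads_new:
        # The case described above: old tapes fine, recent tapes troublesome.
        return "consistent with the original drive having drifted out of spec"
    if other_reads_old:
        # Assumption: old tapes are good, so suspicion stays on the original drive.
        return "old tapes are readable; original drive is suspect, drift not confirmed"
    # Assumption: if a second drive cannot read the old tapes either,
    # the test does not single out the original drive.
    return "inconclusive for the drive; media or tape contents are suspect"


if __name__ == "__main__":
    print(interpret_cross_drive_test(other_reads_old=True, other_reads_new=False))
```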

And on a side note, an LTO-5 tape drive will read LTO-3 media (although it is only able to write to LTO-4 and LTO-5 tapes).  It also brings you 4x the capacity, and the ability to encrypt your tapes with no performance or capacity hit... so if this drive is on its last legs or questionable, consider upgrading to an LTO-5.
 
LVL 1

Author Comment

by:sloutz
ID: 40512243
This time around I disabled RSM instead of just stopping it, then rebooted, but when running L&TT I still got similar errors, with the new addition that the tape I am using registers as write protected even though it physically is not (the little switch on the front is in the open state, the same as on all new tapes).  This drive is haunted.

I'll post an update in a few hours after I get my hands on the secondary device.

|__ Test 'LTO Drive Assessment test' started on device 'HP Ultrium 3-SCSI' at address '2/0.5.0'
    |__ Test aborted
    |__ Operations Log
    |    |__ LTO Drive Assessment Test Options
    |    |__ Test Coverage : Default
    |    |__ Allow Overwrite : True
    |    |__ executing LTO Drive Assessment Test...
    |    |__ adjusting boost value...
    |    |__ erasing ...
    |    |__ soft unload ...
    |    |__ loading ...
    |    |__ writing wrap 0 (1.8 m/sec.)
    |    |__ writing wrap 0 (1.8 m/sec.)
    |    |__ soft unload ...
    |    |__ loading ...
    |    |__ erasing ...
    |    |__ checking tape load ...
    |    |__ Aborted
    |__ Analysis Results
        |__ LTO Drive Assessment Test, version V23.01.2013
        |__ Test run: Mon Dec 22 10:17:28 2014
        |__ Drive serial number: HU10606CB0
**        |__ There was an unexpected error condition on a Receive Diagnostic command
        |__ Sense Key 0x05, Sense Code 0x2600 (Operator selected invalid field in parameter list) Error Code: 0x1802 DI_INVALID_PARAMETER
**        |__ There was an unexpected error condition on a Receive Diagnostic command
        |__ Sense Key 0x05, Sense Code 0x2600 (Operator selected invalid field in parameter list) Error Code: 0x1802 DI_INVALID_PARAMETER
        |__ Data Cartridge Information:
        |__     Vendor: Unknown
        |__     Format: Unknown
        |__     Serial Number: Unknown
        |__     Barcode: Unknown
**        |__ The cartridge currently loaded in the drive is write protected.
**       |__ This test cannot be performed on a write protected cartridge.
        |__ Please replace the cartridge with a writeable data cartridge and re-run the test.
        |__ Test time: 1:48
 
LVL 1

Author Comment

by:sloutz
ID: 40516008
It looks like my tape drive was the culprit all along.
I have only tested with 2 so far, but both tapes were able to be cataloged and restored from using the LTO4 tape drive I borrowed.
It's crazy to see that my old drive can read backups it has recently written without issue, but that it sees tapes it wrote more than 3 months ago as unusable or corrupt.
 
LVL 21

Accepted Solution

by:
SelfGovern earned 2000 total points
ID: 40516707
Sometimes a drive slowly drifts out of spec -- alignment, for instance -- and for whatever reason, the drive's internal sensors aren't able to catch it.  Note: this is pretty rare in my experience.
It's like a gun with a scope that used to be dead on but is a bit loose and has been drifting off-target.  If you've been continually using the rifle as it gets worse, you'll compensate and shoot dead-on (i.e., your drive can read the tapes as they are now).  But if a gunsmith were to 'fix' your scope, you'd find that you couldn't hit the target anymore, even though you could before, when the scope was sighted in correctly (i.e., you can't hit the targets or read the tapes you used to be able to read).

HP has tools like TapeAssure (free), and especially TapeAssure Advanced (for which they charge), which can actively monitor tape drives for things like compression, speed, wear, and possible failures.  Other vendors may have similar diagnostics.  It's worthwhile to keep an eye on tape drives using the tools available.
 
LVL 1

Author Closing Comment

by:sloutz
ID: 40537025
I'm in the process of procuring new equipment, but as that will take a bit of time I'm closing out this ticket for now.
Thanks for the support and the interesting analogy to help explain my equipment's unique malfunction.
Question has a verified solution.
