Solved

Backup tapes do not write up to their capacity

Posted on 2011-05-01
Medium Priority
2,921 Views
Last Modified: 2012-05-11
Hey everyone,

I have this weird issue with a backup library and ArcServe 11.5. I have a graphics department working with their own server. Quite often, they have to back up their data to tapes. On the hardware side, they've got the HP MSL2024 tape library with two LTO3 tape drives and LTO3 tapes. As backup software, they are using ArcServe 11.5 SP4.

Now, quite often (if not all the time) the data written to the tapes is way under the LTO3 native capacity of 400GB (800GB with 2:1 compression). This recent job can be used as an example (log attached). Out of the tape's capacity of 400GB, only 276GB were actually written before ArcServe requested the next tape in sequence. I have disabled file estimation so that ArcServe would write up until the real end of the tape, to no avail.

Any ideas? I've been battling with this for quite a while now. I've even sent the library for a checkup and it returned clean. I often perform head cleaning (last one done yesterday).
ArcServeLog.txt
Question by:KeterHD
21 Comments
 
LVL 4

Author Comment

by:KeterHD
ID: 35503072
In the meantime, I ran another backup on the same tape, and now it backed up 283GB...
 
LVL 56

Expert Comment

by:andyalder
ID: 35503280
You sent the library off for a checkup? HP's Library and Tape Tools is a free download that will fully test the library, upgrade the firmware etc and it will test it using your server so as to eliminate any hardware problem with your server's controller and SCSI cable as well as the library. Not only that but it will do a test using your media and it could easily be your media that is at fault.
 
LVL 17

Expert Comment

by:Gerald Connolly
ID: 35503651
Don't overuse the cleaning tapes; they are very abrasive!

Not an expert on Arcserve, but are you sure your tapes don't already have something on them, a failed backup maybe?
 
LVL 4

Author Comment

by:KeterHD
ID: 35503733
andyalder: Of course, mate ;) I use LT&T and so far they don't seem to show much of an issue. I've upgraded the firmware for both tapes and the library itself. Due to the fact we have a service agreement for the library, I sent it off for advanced checkup, which returned nothing.

connollyg: I set all backup jobs to overwrite the tapes. Also, I've erased and formatted the tapes - to no avail, unfortunately.
 
LVL 56

Expert Comment

by:andyalder
ID: 35503747
Can you get the statistics from a tape using Arcserve or L&TT? They might show a large number of soft write errors, which would mean you won't get the full capacity: when read-after-write fails, the drive writes the data again further down the tape rather than going back to re-write it, which would stop it streaming.

I presume there aren't any other multiplexed jobs using the same tape, as they would use space on it and wouldn't appear in this job log.
 
LVL 4

Author Comment

by:KeterHD
ID: 35503786
andyalder: Will check it out. Do you remember where exactly the setting is? If not, it's okay, I'll find it. Don't think I have the option in AS, but LT&T surely must have the option.

And yes, there's only one job running at a time.
 
LVL 4

Author Comment

by:KeterHD
ID: 35503789
Rather, it should be L&TT.
 
LVL 21

Expert Comment

by:SelfGovern
ID: 35504724
This behaviour is often a symptom of a tape head going (gone?) bad.  In essence, when the tape head deteriorates, it writes more and more bad blocks.  Each bad block has to be written again until it is 'right'; each failed write skips the badly written chunk of tape and writes the next sector.   So lots of bad writes mean lots of skipped tape which means way less usable capacity which means... maybe only 283GB on a 400GB tape.
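
As a rough illustration of that arithmetic, here is a minimal Python sketch. It assumes a deliberately simplified model in which every rewritten block wastes a stretch of tape equal to one good block, and it ignores compression entirely. The function names are made up for this example and are not part of ARCserve or L&TT.

```python
def effective_capacity_gb(native_gb: float, rewrite_fraction: float) -> float:
    """Capacity left when rewrite_fraction of the tape length is used up by
    blocks that had to be written again (simplified model, no compression)."""
    if not 0.0 <= rewrite_fraction < 1.0:
        raise ValueError("rewrite_fraction must be in [0, 1)")
    return native_gb * (1.0 - rewrite_fraction)


def implied_rewrite_fraction(written_gb: float, native_gb: float) -> float:
    """Fraction of the tape that must have been skipped under the same model,
    given how much data actually fit before end-of-tape."""
    if native_gb <= 0:
        raise ValueError("native_gb must be positive")
    return 1.0 - written_gb / native_gb


if __name__ == "__main__":
    # The two results reported in this thread, against LTO-3 native capacity.
    for written in (276, 283):
        lost = implied_rewrite_fraction(written, 400)
        print(f"{written} GB written: roughly {lost:.0%} of the tape skipped")
```

Under this model, the two jobs reported above would mean that somewhere around 30% of the tape length was consumed by rewrites, which is far more than a healthy drive should need.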

This could also be caused by a dirty head (but you've cleaned it) or bad media.  If this happens to all tapes, it's probably the drive problem.

Library and Tape Tools has both media and drive tests.   I encourage you to run them again to check what it says about the drive in particular.   You can generate a "support ticket" which will help identify the actual problems.

HP also has TapeAssure, another free utility that can monitor tape use in real time, including performance and compression.   It's a bit more challenging to install; you'll also need to install CommandView Tape Library (free) to show the results.   See http://www.hp.com/go/tapeassure
TapeAssure is also really good in complex environments, as it lets you monitor how much each individual tape drive is being used (hey, why is this drive only used for three hours a day, but the other seven are being used 20 hours a day?).

 
LVL 4

Author Comment

by:KeterHD
ID: 35504868
@SelfGovern: Thanks a lot. In general terms, I am aware of these issues. My main problem right now is how to find the necessary data. For example, where can I see the soft write errors? As far as I can see, ArcServe does not log those, can I see it from within L&TT?
 
LVL 21

Expert Comment

by:SelfGovern
ID: 35505968
Run the drive test in L&TT.   It will show you what happens during that test run.

You can also generate a support ticket in L&TT.  That will tell you what's been going on with the drive (such as, "tape head is at 48% of margin.  Replace tape drive.")

If it is the tape drive that needs to be replaced -- LTO-4 will read and write LTO-3 tapes, doubling your capacity per tape, and possibly increasing performance by up to 50% (if your disks can feed the data fast enough).   LTO-5 can read LTO-3 tapes, but cannot write them.   Capacity is 1.5TB native (almost 4x your current tapes) and performance is 140MB/sec native (75% faster than LTO-3).
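
For reference, the generation figures and compatibility rules mentioned above can be written down as a small Python sketch. The capacities and speeds are the published native figures for LTO-3, LTO-4 and LTO-5; the read/write rule (read two generations back, write one generation back) holds for the generations discussed here. The class and function names are only illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LtoGeneration:
    number: int
    native_gb: int        # native (uncompressed) capacity of the tape
    native_mb_per_s: int  # native transfer rate of the drive


GENERATIONS = {
    3: LtoGeneration(3, 400, 80),
    4: LtoGeneration(4, 800, 120),
    5: LtoGeneration(5, 1500, 140),
}


def can_read(drive_gen: int, tape_gen: int) -> bool:
    """A drive reads its own generation and the two before it."""
    return 0 <= drive_gen - tape_gen <= 2


def can_write(drive_gen: int, tape_gen: int) -> bool:
    """A drive writes its own generation and the one before it."""
    return 0 <= drive_gen - tape_gen <= 1


def speedup_percent(new_gen: int, old_gen: int) -> float:
    """Native speed gain of one drive generation over another, in percent."""
    new, old = GENERATIONS[new_gen], GENERATIONS[old_gen]
    return (new.native_mb_per_s / old.native_mb_per_s - 1) * 100


if __name__ == "__main__":
    for drive in (4, 5):
        print(f"LTO-{drive} drive with LTO-3 tape: "
              f"read={can_read(drive, 3)}, write={can_write(drive, 3)}, "
              f"speed gain over LTO-3: {speedup_percent(drive, 3):.0f}%")
```

Note that capacity belongs to the tape generation, not the drive: an LTO-3 cartridge still holds 400GB native in a newer drive.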

In addition, both LTO-4 and LTO-5 support hardware encryption at the drive, for no loss of performance or compression.   I don't remember if your version of ARCserve handles encryption, but whether it does or not, the MSL Encryption Kit from HP will work with LTO-4 and LTO-5 drives to encrypt data regardless of the backup application.

Oh -- with some applications, go to a written piece of media, right click and select "properties" to get the write error info, space used, etc., at least in rudimentary form.
 
LVL 56

Expert Comment

by:andyalder
ID: 35506329
Umm, I don't think you'll find an LTO4 drive will double the writeable capacity of an LTO3 tape, after all an LTO3 drive may have to read it again.
 
LVL 21

Expert Comment

by:SelfGovern
ID: 35507727
Andy, you're correct -- native capacity of LTO-4 tape is twice that of an LTO-3 tape, and sticking an LTO-n tape into an LTO-(n+1) drive doesn't change the tape's capacity.   Sloppy choice of words on my part.

If the LTO-3 is swapped out for an LTO-4, the questioner would be able to read and write to his LTO-3 tapes, and use of LTO-4 tapes would double the capacity (and potentially increase performance).
 
LVL 4

Author Comment

by:KeterHD
ID: 35510495
Wait, wait, wait... Maybe it's the bad night's sleep I had, but is your suggestion to swap the LTO3 tape drives for LTO4 tape drives?

Overall, it might not be such a bad idea, although, to be honest, I am thinking of switching the backup method for a hard drive based one (for example, buy some small server with disks inside, then buy a few external HDDs and have double backups of everything).
 
LVL 4

Author Comment

by:KeterHD
ID: 35510521
Above are results of DAT and support tickets for both drives. The results aren't perfect (is it possible some of the errors are due to SCSI?), but I'm having a hard time deciphering all this information.
 
LVL 21

Expert Comment

by:SelfGovern
ID: 35513288
One of the main things to look for is "margin".  I can't tell you how they came up with that word, but 100% is factory new excellent condition.  Over time the drive degrades until it gets to 0% margin, which represents normal end-of-life.   At some point after 0% margin, you'll start to get an unacceptable error count, and margin continues to go more and more negative.

Your drive 1 shows consistent negative margins, and should be replaced.
Your drive 2 is a bit more interesting; I see one negative margin warning in the support ticket, but other things look good... so this might be one to float by HP.

You can run your backups to disk; many people are.   Be aware, though, that disk is a poor choice for long-term storage of backup data, or archival use.   (You don't want to keep spending money on electricity to power the disks for years, and you want your archive offsite where one disaster won't take out everything.)
 
LVL 4

Author Comment

by:KeterHD
ID: 35513309
@SelfGovern: When I was talking about disks, I meant external disks that would be kept in a safe place. In such a scenario, are they worse than tapes? Does HDD quality deteriorate over time faster than that of tapes?
 
LVL 4

Author Comment

by:KeterHD
ID: 35514605
Also, I am not entirely sure I can see the negative margins you are talking about. Could you point me to an example?
 
LVL 21

Accepted Solution

by:
SelfGovern earned 2000 total points
ID: 35517809
The important section from the drive assessment test:
    |    |__        1.8 m/sec. tape speed: Effective capacity: 72.3%   Margin: -51.2%   (1.5/1.5 GB written using 171.5 metres of tape)
    |    |__                  Channel variation: 36.0%   Channel variation margin: -44.0%
    |    |__        2.1 m/sec. tape speed: Effective capacity: 71.6%   Margin: -56.1%   (1.5/1.5 GB written using 172.9 metres of tape)
    |    |__                  Channel variation: 43.4%   Channel variation margin: -73.6%
    |    |__        2.4 m/sec. tape speed: Effective capacity: 76.9%   Margin: -20.9%   (1.2/1.2 GB written using 117.8 metres of tape)
    |    |__                  Channel variation: 35.2%   Channel variation margin: -40.6%
    |    |__        2.7 m/sec. tape speed: Effective capacity: 73.0%   Margin: -46.5%   (1.2/1.2 GB written using 122.2 metres of tape)
    |    |__                  Channel variation: 40.5%   Channel variation margin: -62.0%
    |    |__        3.0 m/sec. tape speed: Effective capacity: 69.9%   Margin: -25.3%   (1.2/1.2 GB written using 126.3 metres of tape)
    |    |__                  Channel variation: 46.7%   Channel variation margin: -86.7%
    |    |__        3.4 m/sec. tape speed: Effective capacity: 67.9%   Margin: -35.4%   (1.2/1.2 GB written using 128.9 metres of tape)
    |    |__                  Channel variation: 49.5%   Channel variation margin: -98.1%
    |    |__        3.7 m/sec. tape speed: Effective capacity: 61.6%   Margin: -33.7%   (1.2/1.2 GB written using 139.2 metres of tape)
    |    |__                  Channel variation: 55.4%   Channel variation margin: -100.0%
    |    |__        4.0 m/sec. tape speed: Effective capacity: 52.5%   Margin: -69.8%   (1.2/1.2 GB written using 159.1 metres of tape)
    |    |__                  Channel variation: 65.8%   Channel variation margin: -100.0%
    |    |__        forward direction: Effective capacity: 47.6%   Margin: -100.0%   (3.9/3.9 GB written using 580.4 metres of tape)
    |    |__                  Channel variation: 109.8%   Channel variation margin: -100.0%
    |    |__        reverse direction: Effective capacity: 81.6%   Margin: 33.0%   (6.2/6.2 GB written using 557.5 metres of tape)
    |    |__                  Channel variation: 6.2%   Channel variation margin: 75.3%
    |    |__ Overall drive margin: -30.7%
    |    |__ Worst-case margin (forward direction): -100.0%
    |    |__ Worst-case channel variation margin (forward direction): -100.0%
    |    |__ The LTO Drive Assessment Test has checked the history and operation of the selected drive, and
    |    |__ problems have been reported.
    |    |__ The drive is no longer recommended for use.
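
If you want to pull the margins out of a report like the one above without reading every line, something along these lines should work. It is only a sketch: the regular expression is written against the line layout shown in this excerpt, which is an assumption about the format and not a documented L&TT interface, and the function names are made up.

```python
import re

# Matches the per-test lines, e.g.
# "|__   forward direction: Effective capacity: 47.6%   Margin: -100.0%   (...)"
TEST_LINE = re.compile(
    r"\|__\s+(?P<label>.+?): Effective capacity: (?P<capacity>[\d.]+)%"
    r"\s+Margin: (?P<margin>-?[\d.]+)%"
)


def parse_margins(report: str) -> dict:
    """Map each test label to (effective capacity %, margin %)."""
    results = {}
    for line in report.splitlines():
        match = TEST_LINE.search(line)
        if match:
            results[match.group("label").strip()] = (
                float(match.group("capacity")),
                float(match.group("margin")),
            )
    return results


def negative_margins(report: str) -> list:
    """Labels of the tests whose margin is below zero."""
    return [label for label, (_, margin) in parse_margins(report).items()
            if margin < 0]


# Three lines copied from the excerpt above, used as sample input.
SAMPLE = """\
    |    |__        forward direction: Effective capacity: 47.6%   Margin: -100.0%   (3.9/3.9 GB written using 580.4 metres of tape)
    |    |__        reverse direction: Effective capacity: 81.6%   Margin: 33.0%   (6.2/6.2 GB written using 557.5 metres of tape)
    |    |__ Overall drive margin: -30.7%
"""

if __name__ == "__main__":
    for label in negative_margins(SAMPLE):
        print("negative margin:", label)
```

The "Channel variation" and "Overall drive margin" lines are skipped on purpose here, since they do not carry an effective capacity figure; extending the pattern to cover them is straightforward.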

As for storing disks on a shelf -- nobody that knows hard disk technology would recommend this for more than a short period of time.  There is a low-level firmware process in spinning hard drives that periodically reads each sector of a disk and checks for signal strength.  If it's low, the sector is re-written and checked again.  If too low, the data is copied to a reserved area of the disk, and the original sector is remapped to the new location.   You as a user won't ever know that this has happened.

If the drive is sitting without power on a shelf, that process won't happen, and the bits on the disk are free to flip from 1 to 0 (or 2/3, even, possibly :)  ).  You won't ever know until you try to read the disk, and find some or all of your data gone.   I can't tell you if that will be likely to happen in six months, or a year, or five years, or never for a particular disk, because nobody knows.  I don't believe that a single disk vendor publishes a "data lifetime (unpowered)" statistic for their disks, because they don't test it.   Disk is not designed for long-term powered-off storage.  

Tape has huge redundancy built in.   I've heard that you can cut over an inch out of an LTO tape and still recover 100% of the data.  And tape is tested and rated for 20 - 30 years life sitting on a shelf without more than basic climate control (and sometimes not even that -- see this story on 40-year-old NASA tapes that were kept in a garage for over 20 years of that time, yet were read with 100% success: http://www.nasa.gov/topics/moonmars/features/LOIRP/  )
 
LVL 4

Author Comment

by:KeterHD
ID: 35695699
That's some excellent insight, thanks a lot, mate.

The company that provides us service for the library claims the issue might be with the SCSI controller. I guess I'll have to haul the library to another set of servers I've got, connect it there and see if it resolves the issue. I'll keep this updated as soon as possible.
 
LVL 4

Author Comment

by:KeterHD
ID: 35718624
Hey, thanks a lot for the help. Apparently - yes, the issue is with the library. When I connect another separate backup tape I've got, it backs up just fine, using same SCSI cables, power cable and same cartridge.