datasafe asked:
LTO-4 Drive Capacity Problem

We are running Backup Exec 12.5 and have recently moved from LTO-3 to LTO-4. All works fine except for the tape capacity. All the data is pre-compressed, so we do not expect more than 800GB, but tapes are being ejected as full at between 600GB and 650GB during backup! For example, with new HP media, the statistics report the following:

Used capacity: 780GB

Bytes written: 584GB

Bytes read: 192KB

Mounts: 4

Seeks: 15

Errors: All  0



Tony Giangreco replied:
I would format those tapes, re-inventory them, and then run a test backup on one. Check to see if it lists a compression ratio or similar specs so you can verify you are getting the correct compression and the tape is actually the correct capacity.
The size they quote on these tapes is the compressed size.
The LTO-4 uses 2 to 1 compression.

So really.... it is a 400 GB tape.
If you store a very big text document it would store about 800 GB of text onto that tape, but anything that is already compressed will give you less and less storage space.
To be honest, if you are getting 600-700 GB then that is pretty good for pre-compressed files. I would expect it to be closer to 500.
datasafe replied:
LTO-4 tapes are 1600GB assuming 2:1 compression and 800GB with no compression. We should be getting 800GB or 780GB, depending on how one counts.
We do not expect any compression. The files we are backing up to tape have already been compressed.  The compression ratio is 1:1 i.e. no compression.  However, we have been running with hardware compression on (never made a difference with LTO-3) and are now testing with it turned off.
Thomas Rush replied:
From the report, which shows 584GB written and 780GB used, I infer the following:
1) Your tapes are being correctly seen and used at their 800GB capacity (the difference between 800 and 780 is the difference between decimal and binary GB, or close enough).
2) You don't say which brand of tape drive you're using.  Some manufacturers have better adaptive write speed algorithms than others.  The behaviour you're seeing could be because you're not feeding data to the tape drive fast enough, so that the buffer empties, the tape is stopped and rewound, and then repositioned so that it can start writing again.  This always uses a bit of "empty" space in between the last write's end and the current write's beginning.  If it happens often enough, you end up with a potentially significant loss of usable tape capacity.   Which brand of tape drive are you using?  What write performance are you seeing?
3) It's also possible that you've got marginal or bad media, or a marginal or dirty tape head.  With a new drive and new media I consider this unlikely, but it's worth running diagnostics to check (the write errors may be hidden from the backup application, since on a second or third try the write is succeeding).   To check, download your tape drive vendor's diagnostics.  If HP, look for Library and Tape Tools -- it's free and has a complete set of diagnostics.
You'll want to run the media test (on a scratch tape; data will be overwritten) and the drive tests.  Let me know the results.
The standard spiel for LTO is that it does compression, and if your data is uncompressible and compression is enabled it will take more space on tape than the original size of the data (i.e. the compression overhead), although I don't think anyone would expect the compression overhead to be quite that big (25% in this case).

Have you tried disabling the compression on the drive?
To Connollyg: Your information is incorrect for LTO drives. As the data comes into an LTO drive, it is directed to two separate, simultaneous paths: one through the compression chip, and one without compression. The size of the resultant blocks is compared; if a block compresses, the compressed data is used; if the block does not compress, the uncompressed data is used. (This information is correct for HP LTO drives; I suppose it is possible, but I believe it unlikely, that other manufacturers do not have the split data paths to ensure that uncompressible data is never expanded because of compression metadata. If someone knows of other manufacturers who do it differently, I'd appreciate the info!)

Older tape drives (DAT, AIT, DLT) either used the compression chip for all data in a job, or for none of the data in that job (according to job settings) -- so if compression was turned on, they would inflate the non-compressible data because of the compression metadata that was added to each block.
@SelfGovern - Well, you learn something new every day! Although this extract from the LTO entry on Wikipedia shows what you describe, it seems to imply that there is a small fixed overhead.

The LTO specification describes a Data Compression method LTO-DC, also called Streaming Lossless Data Compression (SLDC).[56] It is very similar to the algorithm ALDC[57] which is a variation of LZS (a patent-encumbered algorithm controlled by Hi/Fn[58]).
The primary difference between ALDC and SLDC is that SLDC does not apply the compression algorithm to uncompressible data (i.e. data that is already compressed or sufficiently random to defeat the compression algorithm). Every block of data written to tape has a header bit indicating whether the block is compressed or raw. For each block of data that the algorithm works on, it saves a copy of the raw data. After applying the compression function to the data, the algorithm compares the "compressed" data block to the raw data block in memory and writes the smaller of the two to tape. Because of the pigeonhole principle, every lossless data compression algorithm will end up increasing the size of some inputs. The extra bit used by SLDC to differentiate between raw and compressed blocks effectively places an upper bound on this data expansion.
LTO-DC achieves an approximately 2:1 compression ratio when applied to the Calgary Corpus. This is inferior to slower algorithms such as gzip, but similar to lzop and the high speed algorithms built into other tape drives. It should be noted that plain text, raw images, and database files (TXT, ASCII, BMP, DBF, etc.) typically compress much better than other types of data stored on computer systems. In contrast, encrypted data and pre-compressed data (PGP, ZIP, JPEG, MPEG, MP3, etc.) would normally increase in size, if data compression was applied. In some cases this data expansion could be as much as 15%. With the SLDC algorithm, this significant expansion is avoided.
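The per-block decision the quote describes (compress, compare, keep the smaller, and flag the choice with a header bit) can be sketched in a few lines of Python. This is only an illustration: zlib stands in for the LZS-style SLDC codec, and the 64 KiB block size is an assumed figure, not the LTO block size.

```python
import zlib

BLOCK = 64 * 1024  # assumed block size, for illustration only

def sldc_like_write(data: bytes) -> list[tuple[bool, bytes]]:
    """Mimic SLDC's per-block choice: store each block compressed only
    if compression actually made it smaller. The boolean plays the role
    of the one-bit header that marks a block as compressed or raw."""
    out = []
    for i in range(0, len(data), BLOCK):
        raw = data[i:i + BLOCK]
        packed = zlib.compress(raw)  # stand-in for the LZS-style codec
        if len(packed) < len(raw):
            out.append((True, packed))   # header bit set: compressed
        else:
            out.append((False, raw))     # header bit clear: raw
    return out
```

Run against highly compressible input, every block comes back flagged compressed; run against random (already-compressed-like) input, every block is stored raw, so the worst-case expansion is just the header bit per block rather than 15%.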
Thanks for all the comments. After extensive testing we have come to the conclusion that there is something wrong with the tape drive. It will back up 400GB (compressed) without a problem, and fast, if we use LTO-3 tapes, but only 600+ GB using an LTO-4 tape.
Not necessarily the tape drive, this could be an issue with what you are backing up.

LTO drives have minimum supply speed requirements which typically show up when moving from one generation to another. Being able to read the info off disk and supply the data at a rate suitable for an LTO-3 does not mean it will be OK for an LTO-4. This normally manifests itself as slow (often very slow) performance and reduced capacity, both due to a process commonly known as shoe-shining.

Shoe-shining occurs when data is written to the tape drive (or, more properly, its cache) slower than the drive writes it from the cache to tape, so the cache becomes empty. This forces the tape drive to stop streaming, i.e. to come to a stop, but this obviously doesn't happen suddenly. To maximise capacity, the drive has to reverse the tape back past the point where it ran out of data, so that when data does arrive in the cache it can get back up to streaming speed before reaching that point, where it then starts writing again until the data runs out once more. This to-ing and fro-ing of the tape is likened to the action used to polish shoes (when sitting on one of those shoe-shining stands) and severely reduces the writing speed, usually to below 1MB/s.

Even though LTO drives have methods to alleviate this issue by reducing the speed of the tape (aka data-rate matching), they all have a minimum speed below which shoe-shining occurs: with LTO-3 it's 20-27 MB/s and with LTO-4 it's 33-40 MB/s.

So you need to look at what you are backing up and what performance you get out of it.
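That check can be sketched in a couple of lines. The floor figures below are the lower ends of the ranges quoted above; real drives vary by model, so treat the constants as illustrative assumptions.

```python
# Data-rate-matching floors quoted in the discussion (MB/s, native).
# Lower end of each quoted range; actual figures vary by drive model.
MIN_NATIVE_MB_S = {"LTO-3": 20, "LTO-4": 33}

def keeps_streaming(generation: str, supply_mb_s: float) -> bool:
    """True if the host feeds data fast enough to avoid shoe-shining."""
    return supply_mb_s >= MIN_NATIVE_MB_S[generation]
```

The point of the generation jump: a host that comfortably streams an LTO-3 drive at, say, 25 MB/s will shoe-shine an LTO-4 drive at the same supply rate.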
Conollyg is right that feeding data to a tape drive too slowly will result in a loss of capacity... but 25%(+) seems an unusually high penalty.  

To DataSafe (OP) -- to confirm it's a problem with the drive and not just the speed you're feeding it data, use HP's Library and Tape Tools (link above) and run a write test with large blocks of noncompressible data written from memory to the tape drive. This should sustain streaming speeds, and if you can write 800GB you can verify that the drive is not the problem.

You might want to monitor backup speed in your backup application to see how fast you're reading data during backup.   If it's under the drive's minimum adaptive write speed, you'll have somewhat of a problem (although I don't expect the capacity loss to be anywhere near 25%).

If feed speed is your problem, see what happens when you turn compression OFF. This is counter-intuitive, but here's my thinking: compression forces the disk to supply data faster to keep the tape drive streaming. If you're experiencing buffer under-run, it may be because 2:1 compressible data (for instance) kicks the tape drive's minimum streaming speed up to 66-80MB/sec, and this is faster than many people's disks can get backup data to the tape drive. Turning compression *off* means that the tape drive is writing at exactly its native speed, so 33-40 MB/sec or more will keep it streaming. If turning compression off clears up your capacity problem, you can see where your bottleneck is and start working on improving that component's speed. (Observation: it's usually the disk, caused by a combination of tiny files, deeply nested directories, long file names, fragmentation, and contention from other processes. Some of those you can do something about; some you can't, other than by treating the symptoms: moving to an image (vs. file) backup, or implementing a D2D2T backup strategy where you first back up to disk or a virtual tape library, and once you have that nice contiguous, huge backup file, you copy it to tape; that copy should go much faster.)
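The arithmetic behind that advice, under the assumption that compressible data multiplies the host-side feed requirement by the compression ratio:

```python
def host_min_supply_mb_s(native_floor_mb_s: float,
                         compression_ratio: float) -> float:
    """Minimum host-side feed rate (MB/s) needed to keep the drive
    streaming: with 2:1 compressible data the drive consumes host
    bytes twice as fast as its native tape speed."""
    return native_floor_mb_s * compression_ratio

# The quoted 33-40 MB/s LTO-4 native floor becomes 66-80 MB/s
# with 2:1 data, but stays at 33-40 MB/s with compression off.
```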
Have installed Tape Tools and run a number of tests - all passed and no problems I can see.

We are effectively running D2D2T.

Ran another test backup today. BE Monitor reported 711GB stored and ejected the tape as full (there was 770GB to backup). Media properties showed 663GB written and 780GB used capacity! The media contains no other backups. The data rate is 5400MB/min which is good - way above shoe-shining rates. Compression is off (1:1) and there were no hardware or software errors reported.

Why am I still missing 70GB of storage space? Is this a BE 12.5 issue?
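For reference, the figures in this post line up as follows (a quick sanity check, assuming decimal units throughout):

```python
# Figures reported in the post above
used_capacity_gb = 780   # media properties: usable capacity
stored_gb = 711          # BE Monitor: stored at eject
rate_mb_per_min = 5400   # reported backup data rate

shortfall_gb = used_capacity_gb - stored_gb  # the ~70GB in question
rate_mb_s = rate_mb_per_min / 60             # 90 MB/s, well above the
                                             # ~33-40 MB/s LTO-4 floor
```

So the supply rate rules out shoe-shining, which is why the shortfall points at the drive or the software rather than the feed speed.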
The tape drive is being replaced by the supplier: after numerous tests, including HP Tape Tools with compression switched off, it never stored more than 670GB on a new HP tape.
datasafe replied:
Because after much investigation it was the only complete solution to the problem.