Need to speed up Backup Exec when backing up small image files

We are using Backup Exec 12.5, backing up to disk on a data vault.  Most backup jobs move very quickly (1800 MB/min).  However, we store over a terabyte of image files (mostly TIFF) that are all less than 200 KB.

When Backup Exec hits these folders it slows down to a crawl.  Is there a way to speed up backing up of all of these small files?  Looking for any suggestions.

Serge Fournier (Analyst Programmer) Commented:
Do you have many files in the same subdirectory?

Never put more than 500 files in a directory.

Beyond that, accessing those files slows to a crawl, even for Backup Exec.
keagle79 (Author) Commented:
No, actually they are split up into many directories with no more than 200 files each.  They are scanned documents that are separated by date.
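
To check the files-per-directory question rather than estimate it, a short script can walk the tree and report any directory over a chosen limit. This is only a sketch: the function names are made up for illustration, and the 500-file limit is the rule of thumb suggested above, not a documented Backup Exec threshold.

```python
import os
from collections import Counter


def files_per_directory(root):
    """Return a Counter mapping each directory under root to its file count."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        counts[dirpath] = len(filenames)
    return counts


def directories_over(root, limit):
    """List (directory, count) pairs whose file count exceeds limit, largest first."""
    counts = files_per_directory(root)
    return sorted(
        ((d, n) for d, n in counts.items() if n > limit),
        key=lambda pair: pair[1],
        reverse=True,
    )
```

Calling `directories_over(r"D:\Scans", 500)` (the path is a placeholder) would list the directories worth splitting up, if any.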
Thomas Rush Commented:
Small files will always be a problem -- I have seen fast servers slow to under 10MB/sec when sequentially reading a bunch of 10K files.

Things to do:
- Put the files as close to the root directory as possible, without being in root, to minimize directory tree walking.
- Use the fastest possible source disk -- RAID 1+0 on a *good* *hardware* RAID controller
- Keep the disk defragmented (hopefully by a process that is not running during the backup!)
- Keep other applications from accessing this disk while backup is running, including background processes like indexing and defrag.
- If it's not a huge amount of data (i.e., tens to 100GB, vs. 100s of GB to TB+), you could put this data on SSD (solid state disk), which will give you the fastest possible read times, since there's no physical movement involved in reads.  The challenge is that SSD has a high, but limited, number of write cycles... so if these files are rewritten very often, you might wear out SSD disks faster than you'd like.  (And remember to TURN OFF defragmentation processes on SSD!)
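
To see whether any of the changes above actually help, it is useful to measure how fast the source disk can deliver the small files, independent of Backup Exec. The sketch below simply reads every file under a folder and times it; the function name is hypothetical, and it only approximates what a backup agent does (it ignores security descriptors, open-file handling and so on).

```python
import os
import time


def measure_read_throughput(root, chunk_size=1024 * 1024):
    """Read every file under root and return (file_count, total_bytes, seconds)."""
    file_count = 0
    total_bytes = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(chunk_size)
                        if not chunk:
                            break
                        total_bytes += len(chunk)
            except OSError:
                continue  # skip files that cannot be opened or read
            file_count += 1
    elapsed = time.perf_counter() - start
    return file_count, total_bytes, elapsed
```

Run it against one representative date folder before and after a change, and divide `total_bytes` by `seconds` to get a rate. Keep in mind that a second run over the same folder may be served from the operating system's file cache and look much faster than the disk really is.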

If none of those help (and they may not help much; this small-file slowdown is inherent to how NTFS/FAT file systems handle large numbers of small files), then the question is, "How often do you have to restore single files?"  If the answer is "Rarely" or "Never", then the best solution is to put the files on a disk (stripe set, probably) of their own, and perform an image backup.  An image backup will read the disk sectors sequentially, and can give much faster speeds than a file-by-file backup (which you're doing now) that has to read the directory, walk the tree, read one file, go back to root, read the directory, walk the tree, read one file....
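
A rough way to see why file-by-file backup loses to an image backup here is to split the job into two parts: the time to move the bytes, and a fixed cost paid once per file for the directory lookup and open. The model below is a deliberate simplification, and the per-file overhead is a number you would have to measure or guess for your own disk, not a known Backup Exec figure.

```python
def estimate_backup_seconds(total_bytes, avg_file_bytes,
                            per_file_overhead_s, sequential_bytes_per_s):
    """Return (transfer_seconds, overhead_seconds) for a file-by-file backup.

    transfer_seconds: time to stream the data at the sequential rate.
    overhead_seconds: assumed fixed cost per file times the number of files.
    """
    file_count = total_bytes / avg_file_bytes
    transfer_seconds = total_bytes / sequential_bytes_per_s
    overhead_seconds = file_count * per_file_overhead_s
    return transfer_seconds, overhead_seconds
```

For scale, 1 TB of 200 KB files is about five million files. If each one cost a hypothetical 10 ms of seek and lookup time, that alone would be about 50,000 seconds, roughly 14 hours, before any data is transferred. An image backup avoids the per-file term entirely, which is where the speedup comes from.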

The problem with image backup is that restores will take significantly longer... but if restores are rare and this is just for archive in case of disaster, then that is probably not a problem.  Note that, even if there is "some" other data on this disk that does need occasional single-file restores, you will still back that up as part of the disk image (which gets *everything*), but you can also back up those other files separately as part of a file-by-file backup (specify that particular directory in a standard backup).

If an image backup is not practical for some reason, you've got one other choice, which is to use a disk target, then move that to tape.  With D2D2T (Disk to Disk to Tape), you use disk as the first target of your backup job, which creates a huge single file that is mostly contiguous (and is your backup job in the same format as if it had been written to tape)... then step 2 is to use your backup application to copy that to physical tape.  Since it's coming from a huge file (hopefully close to root!), you can get good backup speeds to tape (but this will not improve the original backup speed, since source disk is the bottleneck).

If you're going to use D2D2T, the cheapest method is to use the backup application to create a D2D target on your server's hard disk.  Make sure it's big enough to hold the complete backup job.  Problems are that there is much more server overhead, you have to manage the space manually, and it's a server-by-server task, not something you can do for all servers easily.

Then the more expensive but much more scalable solution is to purchase a D2D backup system (a type of virtual tape library, or VTL), a typically Linux-based appliance that mimics multiple tape libraries and acts as a backup target for multiple servers at once.  The best VTLs allow you to perform some sort of automigration, where the D2D system itself can copy or move the data to physical tape, so you don't have to go back over the network.  Different VTLs are available; they can have either iSCSI (simple, free, decent performance) or Fibre Channel (more expensive, high performance) connectivity to your servers.

I'm pretty sure those are your options.  If you do look at VTLs or D2D backup systems, please consider the HP D2D2500 or D2D4000 series (see ).  Obligatory Disclaimer: Yes, I do work for HP -- but everything in this email up to this paragraph is as vendor-neutral as you can get.
