Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17


Backup Solution for 4-6 TB of data

Posted on 2010-11-18
Medium Priority
Last Modified: 2012-06-21
I have run into many situations where creative/marketing companies have 3TB or 4TB or more worth of uncompressable graphic files that need backing up and archiving. There are plenty of solutions such as NAS devices for network storage but the issue is the off-site/DR portion of the backup plan.

Does anyone have any recommendations on how to backup this amount of data off-site? I realize bandwidth isn't fast enough for nightly to a datacenter or on-line backup solution, and those are very expensive as well. We have entertained having a couple NAS devices and physically rotating them, but that can easily mess up the scheduled jobs and counts on user intervention, which isn't reliable.

Any ideas?

Question by:Dopher
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions

Expert Comment

ID: 34164503
The most important thing is defining the need correctly;
if you need archiving yoru solution is different and if you nered backing up solution would be different;
One of the important parameter that you should check is change ratio of the data. If most of the data is not changing you should look at the deduplication solutions such as datadomain (it has built-in replication feature), puredisk etc. deduplication would dramaticaliy shorten your backup time and decrease your bandwith usage.

LVL 63

Expert Comment

ID: 34164838
Or I would do a full backup over a weekend, and then just nightly incrementals.

Eventually, think of  a monthly full, and then daily incrementals, and keep the monthly for xx months depending on retention policies ( if needed )

You may need to do a one time archive of everything, and then store it permanently offsite, and then do incrementals. Doing an archive once or twice may be a good solution to keep the incrementals to a reasonable amount.

I hope this helps !
LVL 42

Expert Comment

ID: 34169798
Assuming that you're using NAS for active data, you can use Jungle Disk as a front end to S3 for cloud storage. That's probably as cheap as you can get. A cloud backup provider would work.

You can also look at Nasuni which is a NAS like appliance as a front end to cloud storage. Take a look at their website, . They even have a simple price calculator.
Prepare for your VMware VCP6-DCV exam.

Josh Coen and Jason Langer have prepared the latest edition of VCP study guide. Both authors have been working in the IT field for more than a decade, and both hold VMware certifications. This 163-page guide covers all 10 of the exam blueprint sections.

LVL 21

Accepted Solution

SelfGovern earned 1000 total points
ID: 34172424
Yep... everything to the cloud... and then how long does it take to restore your 3TB of data when your server crashes?    The point being that the goal is not "Back up my data", the goal is, "Being able to restore my data should my server crash or data be lost."

With that in mind --

To me, "Archive" means you've got files that are going to be accessed rarely or not at all in the course of normal business, but need to be available for historical purposes, legal discovery, etc.

A "Backup", on the other hand, is a process where you take working or current data and make safe copies of it, with the ability to use those copies to do restores, possibly promoting the "backup" to part of your "archive" down the road.

Here's what I think -- 4TB isn't a lot of data in the big scheme of things, I work with companies that have orders of magnitude more than that... but it's a pretty big deal without the infrastructure and plan to deal with it.  It's almost too bad you don't have 10x that data, because that's where these systems called Hierarchical Storage Management systems start to be cost-effective (or maybe you can find one targeted at smaller environments?  The idea is, you have some data online, and the rarely-accessed data on tape, but it's still accessed through the file system, buy keeping a stub and pointer to the rest of the data.)

Questions we really need to have answered:
How much of that data doesn't change?
How much is the data growing on a monthly or yearly basis?
How much data can you afford to lose (i.e., could you afford to lose a day's worth, or only an hour's worth, or ...?
How is the data stored now?   direct attach, SAN, or... ?   How many servers are involved, and how much data does each server "own"?
What is your backup window?
How are you doing your backups today?
And... what's your budget?  How much is protecting the data worth to you?

The solution that I often recommend for situations similar to yours is a D2D2T, or disk to disk to tape, solution using incremental forever and synthetic full backups.   It works like this --

- Your first backup is a full backup to disk
- After that, you only run incremental backups to disk, which means your daily backups are much smaller than otherwise
- Periodically you'll run a process to create a synthetic full backup and put it on physical tape -- this means that the backup application uses the information in its catalog to create a full backup physical tape from that collection of backup data, and the end result just as if you'd done a full backup to tape in the first place.

Because you're only backing up changed files -- depending on how and how much things change! -- your backup window and the amount of disk space required can be pretty small, even with 4TB total data.  Yet, you have the benefit of a weekly full backup set for off-site preparedness and long-time archive.  You can use a tape library with one or two LTO-5 tape drives to create the synthetic full tapes.   LTO-5 has a native capacity of 1.5TB/tape, so you should be looking at about three tapes for your full set.   If you rotate those in a traditional GFS rotation, where you keep a month's worth of weekly tapes, and a year's worth of monthly tapes, and however long of yearly tapes, you'll have a great solution.   LTO-5 also allows you to use hardware encryption in the tape drive to ensure no one can read the data unless they have the proper encryption key.   See or other vendors for libraries in this class.

Two products that have this functionality are HP's Data Protector ( ) and IBM's Tivoli Storage Manager.  You should be able to download the products and use them free for 30- or 60-days for evaluation purposes.

If you go with any kind of backup to disk (even this D2D2T solution), don't scrimp on the disks.   You'll want a fast solution to be able to support the streaming speeds you'll need to create the tape copies.
LVL 42

Expert Comment

ID: 34174275
My understanding of Nasuni and other services like it is that it is a NAS device that you can access now, without having to wait for restoring several TB of data. The most active files are going to be cached locally, which improves performance. You can treat it like a lower performing NAS tier, but it's faster than directly storing the data in the cloud.
LVL 17

Expert Comment

by:Gerald Connolly
ID: 34177273
What is the loss of this data worth?

There are lots of ways to backup/archive this data but without more details of the current configuration, current backup hardware and regime, recovery requirements, recovery time requirements, data growth, budget, etc etc

NB Its starting to look like paid consultancy! (Not me, as we 6 hours of timezone apart).

4TB a day its not too bad, its only a 50 MB/sec over a 24 hour period, so realtime replication is a possibility (50MB/sec is around 500Mbits/sec or 50% of a Gigabit pipe - definitely achievable). but its about the total amount of data and how long it will take to sync and more importantly how long it will take to restore.

Author Closing Comment

ID: 34183803
Thanks everyone for the great insight! We have decided to check out a NAS capable of Amazon S3 synchronization. This should offer a cost-effective solution that will offer on-line access as well as maintain an archived copy in the cloud.

Featured Post

Comprehensive Backup Solutions for Microsoft

Acronis protects the complete Microsoft technology stack: Windows Server, Windows PC, laptop and Surface data; Microsoft business applications; Microsoft Hyper-V; Azure VMs; Microsoft Windows Server 2016; Microsoft Exchange 2016 and SQL Server 2016.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Learn how the use of a bunch of disparate tools requiring a lot of manual attention led to a series of unfortunate backup events for one company.
Employees depend heavily on their PCs, and new threats like ransomware make it even more critical to protect their important data.
This tutorial will walk an individual through the steps necessary to enable the VMware\Hyper-V licensed feature of Backup Exec 2012. In addition, how to add a VMware server and configure a backup job. The first step is to acquire the necessary licen…
This tutorial will show how to configure a single USB drive with a separate folder for each day of the week. This will allow each of the backups to be kept separate preventing the previous day’s backup from being overwritten. The USB drive must be s…

715 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question