• Status: Solved
  • Priority: Medium
  • Security: Public

How to archive lots of files (terabytes)?


I'm looking for the best way to archive documents (of any type). The documents must be stored on a server, and the client PCs must be able to consult the archived documents (for example, through a web service).

I have already found two solutions:

- Save the files into database BLOBs.
Problems:
       * I have to archive terabytes of files, so the database will probably slow down.
       * We would have only one large database file, which is not easy to back up or to transfer to another server or to several servers.

- Append the files one after another into a large container file until it reaches a certain size. Once the maximum size is reached, we create another container file. By storing the offset and the length of each file, we will be able to retrieve the right file.
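To make the second approach concrete, here is a minimal Python sketch of it. Everything in it is an assumption for illustration: the `ContainerArchive` class, the `container_NNNNNN.bin` naming, the tiny 64-byte size limit, and the in-memory index (a real system would persist the index, for example in a small database, and use a limit in the gigabyte range).

```python
import os
import tempfile

MAX_CONTAINER_BYTES = 64  # hypothetical, tiny limit so the demo rolls over quickly


class ContainerArchive:
    """Append files to numbered container files and remember where each one landed."""

    def __init__(self, directory, max_bytes=MAX_CONTAINER_BYTES):
        self.directory = directory
        self.max_bytes = max_bytes
        self.index = {}    # name -> (container number, offset, length)
        self.current = 0   # number of the container currently being filled

    def _path(self, number):
        return os.path.join(self.directory, f"container_{number:06d}.bin")

    def add(self, name, data):
        path = self._path(self.current)
        size = os.path.getsize(path) if os.path.exists(path) else 0
        # Start a new container when this file would push the current one over the limit.
        if size > 0 and size + len(data) > self.max_bytes:
            self.current += 1
            path = self._path(self.current)
            size = 0
        with open(path, "ab") as f:
            f.write(data)
        self.index[name] = (self.current, size, len(data))

    def read(self, name):
        number, offset, length = self.index[name]
        with open(self._path(number), "rb") as f:
            f.seek(offset)
            return f.read(length)


def demo():
    with tempfile.TemporaryDirectory() as d:
        archive = ContainerArchive(d)
        archive.add("a.txt", b"x" * 40)
        archive.add("b.txt", b"y" * 40)  # does not fit in container 0
        archive.add("c.txt", b"z" * 10)  # appended after b.txt in container 1
        return archive.index, archive.read("b.txt"), archive.read("c.txt")
```

Because each container is an ordinary file of bounded size, containers can be backed up or moved to other servers one at a time, which addresses the second problem listed for the BLOB approach.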

If someone has a better solution, please let me know.

Any ideas are welcome!!!

1 Solution
Duncan Meyers commented:
An ATA SAN disk array would fit the bill. An EMC CX300 with 3 racks of ATA disk (and one of Fibre Channel) could offer a potential 14.4TB of raw storage (that is, 45 discs of 320GB each; no allowance has been made for parity discs, RAID type, etc.). That is a lot by any standard...
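As a rough check on that figure, the arithmetic can be sketched in Python. The raw number comes straight from the comment above; the RAID 5 layout with groups of 5 discs (4 data + 1 parity) is purely an assumption to show how parity reduces usable space, not a statement about how a CX300 would actually be configured.

```python
def raw_capacity_tb(disks, disk_gb):
    """Raw capacity in TB (decimal: 1 TB = 1000 GB)."""
    return disks * disk_gb / 1000


def raid5_usable_tb(disks, disk_gb, group_size):
    """Usable TB if the discs are split into RAID 5 groups, one parity disc per group.

    Discs left over after forming whole groups are ignored (assumed to be spares).
    """
    groups = disks // group_size
    return groups * (group_size - 1) * disk_gb / 1000


raw = raw_capacity_tb(45, 320)        # 14.4 TB, the figure quoted above
usable = raid5_usable_tb(45, 320, 5)  # 11.52 TB under the assumed 4+1 layout
```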


Backing it all up will be a challenge, though. Unfortunately, backup technology lags behind storage technology... LTO-3 tape drives are about as good as it gets at the moment.

If you have a need for true archival storage then have a look at this: http://www.emc.com/products/systems/centera.jsp?openfolder=platform 
If you need the storage space, a NAS is more suitable.  I have both a SAN and a NAS.  A NAS is less expensive, and the data you're archiving is probably less important than the data you're not archiving.  With the NAS, you don't need to purchase HBAs for all servers to connect to it; you can use your existing infrastructure.  Also, the NAS doesn't really require much management at all, and it doesn't require another interface to learn.  If you're using Microsoft now, you can purchase an HP NAS preloaded with Microsoft Storage Server.  Also, if you need to, you can even run databases directly off of the NAS.

As for managing the files and database, Veritas just bought KVS and renamed the product Veritas Enterprise Vault: http://www.veritas.com/Products/www?c=pdtfeatures&refId=322&psId=9609

One of the modules you're looking for is File Server Archiving:
     "Enabling “out-of-the-box” archiving support for any application that generates and stores
      information in standard files, Enterprise Vault File System Archiving saves critical space on
      file servers by seamlessly moving files to alternative storage devices without impact on the
      end-user. Using policy-based settings, Enterprise Vault enables you to archive files, by age,
      size or other criteria. Items are compressed, stored and indexed centrally by Enterprise
      Vault, and can be searched as well as retrieved transparently by users via optional placeholders."

This software is excellent.  You can set up policies to move data based on size, age, or other criteria, and leave a placeholder for the file so that the move is transparent to your users.  No user training involved!  You can even set it to delete data based on policies, which helps with regulatory requirements.
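To illustrate the idea of policy-based movement with placeholders, here is a generic Python sketch. It is not Enterprise Vault's API or behavior; the function names, the `.archived.txt` placeholder suffix, and the default thresholds are all hypothetical, and it assumes the archive directory lies outside the source directory.

```python
import os
import shutil
import time

SECONDS_PER_DAY = 86400


def should_archive(size_bytes, mtime, now, min_age_days=365, min_size_bytes=0):
    """Policy check: the file is old enough and large enough to be moved."""
    age_days = (now - mtime) / SECONDS_PER_DAY
    return age_days >= min_age_days and size_bytes >= min_size_bytes


def archive_tree(source_dir, archive_dir, now=None, **policy):
    """Move files matching the policy to archive_dir and leave a text placeholder behind."""
    now = time.time() if now is None else now
    moved = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            if name.endswith(".archived.txt"):
                continue  # skip placeholders left by earlier runs
            path = os.path.join(root, name)
            st = os.stat(path)
            if not should_archive(st.st_size, st.st_mtime, now, **policy):
                continue
            rel = os.path.relpath(path, source_dir)
            target = os.path.join(archive_dir, rel)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            shutil.move(path, target)
            # The placeholder records where the original file went.
            with open(path + ".archived.txt", "w") as f:
                f.write(target + "\n")
            moved.append(rel)
    return sorted(moved)
```

A call such as `archive_tree("/data/share", "/mnt/nas/archive", min_age_days=365)` (paths hypothetical) would move everything not modified in the last year. A commercial product makes the placeholder transparent to users; the plain text file here only shows the principle.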

Then, you can back up the NAS on a different schedule.  I automate the movement of my data and schedule it so that the moved data on the NAS only needs to be backed up once a month.  Some other parts of the NAS are being used for other things.

1. Store least important data on least expensive storage; NAS
2. Automate data movement; Veritas Enterprise Vault
3. Back up the NAS on a different schedule.
Hi.  Have you narrowed your choices yet?
