How to archive lots of files (terabytes)?

Posted on 2005-04-18
Last Modified: 2010-04-03

I'm looking for the best way to archive documents (of any type). The documents must be stored on a server, and the client PCs must be able to consult the archived documents (for example, through a web service).

I have already found two solutions:

- Save the files into database blobs.
Problems:
       * I have to archive terabytes of files --> the database will probably slow down.
       * We would only have one large database file --> not easy to back up or to transfer to another server or to several servers.

- Append many files one after another into a large file until it reaches a certain size. Once the maximum size is reached, we create another large file. By storing the offset and the length of each file, we will be able to retrieve the right file.
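
The second idea (large container files plus an offset and a length per document) could look roughly like the sketch below. This is only an illustration under stated assumptions: the class name `PackStore`, the `pack-NNNNNN.bin` file naming, and the in-memory index are all hypothetical. A real system would keep the index in a database and would need locking and crash recovery.

```python
import os
from dataclasses import dataclass


@dataclass
class Entry:
    container: str  # path of the large file that holds the document
    offset: int     # byte position where the document starts
    length: int     # size of the document in bytes


class PackStore:
    """Append documents to large container files and remember where they are."""

    def __init__(self, directory, max_size):
        self.directory = directory
        self.max_size = max_size  # hypothetical size limit per container file
        self.index = {}           # document name -> Entry (a database in practice)
        self.current = 0          # number of the container currently being filled

    def _path(self, number):
        return os.path.join(self.directory, f"pack-{number:06d}.bin")

    @staticmethod
    def _size(path):
        return os.path.getsize(path) if os.path.exists(path) else 0

    def add(self, name, data):
        path = self._path(self.current)
        size = self._size(path)
        # Switch to the next container if this document would exceed the limit.
        if size > 0 and size + len(data) > self.max_size:
            self.current += 1
            path = self._path(self.current)
            size = self._size(path)
        with open(path, "ab") as f:
            f.write(data)
        entry = Entry(path, size, len(data))
        self.index[name] = entry
        return entry

    def read(self, name):
        entry = self.index[name]
        with open(entry.container, "rb") as f:
            f.seek(entry.offset)
            return f.read(entry.length)
```

Because each container is an ordinary file of bounded size, containers can be backed up or moved to another server one at a time, which addresses the concern raised about a single large database file.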

If someone has a better solution, please let me know.

Any ideas are welcome!!!

Question by:davyberroho
    LVL 30

    Expert Comment

    by:Duncan Meyers
    An ATA SAN disk array would fit the bill. An EMC CX300 with 3 racks of ATA disk (and one of fibre channel) could offer a potential 14.4TB of raw storage (that is, 45 discs of 320GB each; no allowance has been made for parity discs, RAID type, etc.). That is a lot by any standard...
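
To show how the raw figure relates to usable space, here is a small arithmetic sketch. The raw calculation (45 discs of 320GB) comes from the comment above. The RAID 5 layout with groups of five discs is purely an assumption for illustration, not a statement about how a CX300 would actually be configured.

```python
def raw_capacity_gb(disks, disk_gb):
    """Raw capacity with no allowance for parity or spares."""
    return disks * disk_gb


def raid5_usable_gb(disks, disk_gb, group_size):
    """Usable capacity if discs are split into RAID 5 groups of group_size.

    Each group gives up the equivalent of one disc to parity. Discs that
    do not fill a complete group are ignored (for example, kept as spares).
    """
    groups = disks // group_size
    return groups * (group_size - 1) * disk_gb


print(raw_capacity_gb(45, 320) / 1000)     # 14.4 (TB raw)
print(raid5_usable_gb(45, 320, 5) / 1000)  # 11.52 (TB usable, assumed 4+1 groups)
```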

    Backing it all up will be a challenge, though. Unfortunately, backup technology lags behind storage technology... LTO-3 tape drives are about as good as it gets at the moment.

    If you have a need for true archival storage then have a look at this:
    LVL 13

    Accepted Solution

    If you need the storage space, a NAS is more suitable.  I have both a SAN and a NAS.  A NAS is less expensive, and the data you're archiving is probably less important than the data you're not archiving.  With the NAS, you don't need to purchase HBAs for all servers to connect to it; you can use your existing infrastructure.  Also, the NAS doesn't really require much management at all, and it doesn't require another interface to learn.  If you're using Microsoft now, you can purchase an HP NAS preloaded with Microsoft Storage Server.  Also, if you need to, you can even run databases directly off of the NAS.

    As for managing the files and database, Veritas just bought KVS and renamed the product Veritas Enterprise Vault:

    One of the modules you're looking for is File System Archiving:
         "Enabling “out-of-the-box” archiving support for any application that generates and stores
          information in standard files, Enterprise Vault File System Archiving saves critical space on
          file servers by seamlessly moving files to alternative storage devices without impact on the
          end-user. Using policy-based settings, Enterprise Vault enables you to archive files, by age,
          size or other criteria. Items are compressed, stored and indexed centrally by Enterprise
          Vault, and can be searched as well as retrieved transparently by users via optional placeholders."

    This software is excellent.  You can set up policies to move data based on size, age, or other criteria, and leave a placeholder for each file so that the move is transparent to your users.  No user training involved!  You can even set it to delete data based on policies, which is useful for regulatory compliance.
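
To make the idea of a policy concrete, here is a minimal sketch of selecting files by age and size, moving them to cheaper storage, and leaving a stub behind. It is not how Enterprise Vault works internally. The function names, the `.archived.txt` stub convention, and the rule "old enough AND large enough" are assumptions made for this example, and `archive_root` is assumed to lie outside `root`.

```python
import os
import shutil
import time


def select_for_archive(root, min_age_days, min_size_bytes, now=None):
    """Return the files under root that are old enough and large enough."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400  # 86400 seconds per day
    selected = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            if info.st_mtime <= cutoff and info.st_size >= min_size_bytes:
                selected.append(path)
    return sorted(selected)


def archive_file(path, root, archive_root):
    """Move one file under archive_root and leave a small text stub behind."""
    relative = os.path.relpath(path, root)
    target = os.path.join(archive_root, relative)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.move(path, target)
    # The stub only records where the file went; a real product would use
    # a placeholder that retrieves the file transparently.
    with open(path + ".archived.txt", "w", encoding="utf-8") as stub:
        stub.write(f"Moved to {target}\n")
    return target
```

A scheduled job could call `select_for_archive` with, say, a 365-day age and a 1 MB size threshold, then pass each result to `archive_file`.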

    Then, you can back up the NAS on a different schedule.  I automate the movement of my data and schedule it so that I only need to back up the moved data on the NAS once a month.  Some other parts of the NAS are being used for other things.

    1. Store the least important data on the least expensive storage: a NAS.
    2. Automate data movement: Veritas Enterprise Vault.
    3. Back up the NAS on a different schedule.
    LVL 13

    Expert Comment

    Hi.  Have you narrowed your choices yet?
