File Archiving and Compression Best Practices

Posted on 2011-02-16
Last Modified: 2013-11-14
I have been given the assignment of recommending a file archiving and compression procedure. Our files are on a dedicated Linux server running MySQL 5.5 and PHP 5.3.5.

Our files need to be archived and stored so they can be accessed when necessary, but they shouldn't be accessible to anyone without permission (this part may be a separate question). My questions are:

Is there software for handling file archiving that you have used and would recommend?

Is it best to archive manually, or to use automated software?

What is the best format for compressing files without data loss?

Please include a reason for your recommendation.
Question by: jeremyjared74

Expert Comment

ID: 34907161
I prefer tar.gz. To handle the archiving I use rsync and a bash script that automates it. I run an uncompressed rsync daily (so only new files get copied) and a weekly backup.
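
For illustration, the weekly tar.gz step could look something like the sketch below. This is only a minimal example, not the script the poster actually uses: the function name make_weekly_archive and both directory arguments are made up here, and it assumes a tar that supports the -z (gzip) option, as GNU and BSD tar do.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): pack one directory
# into a dated .tar.gz file and print the path of the archive created.
make_weekly_archive() {
    src_dir=$1     # directory to archive, no trailing slash
    dest_dir=$2    # directory that receives the .tar.gz file
    stamp=$(date +%Y-%m-%d)
    archive="${dest_dir}/$(basename "$src_dir")-${stamp}.tar.gz"

    mkdir -p "$dest_dir" || return 1
    # -C switches to the parent directory so the archive stores
    # relative paths instead of the full absolute path
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" || return 1
    echo "$archive"
}
```

Calling make_weekly_archive "/path/to/dir1" "/path/to/backups" would then leave a file such as dir1-YYYY-MM-DD.tar.gz in the backup directory. Old archives are not pruned in this sketch.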

Accepted Solution

rationalboss earned 275 total points
ID: 34907211
How about just using .htaccess on a directory with the following?
Deny From All


The directory will not be accessible via http://, but server-side scripts can still read the files, for example with file_get_contents() in PHP, and other languages can do the same.

Don't archive manually if you have several files. PHP has a class for zipping (ZipArchive).

To recursively zip directories, you can check this:

ZIP is fine unless every byte counts on your server. ZIP compression is lossless. Other formats such as RAR and 7z might save you a few more bytes, but there are not as many resources available for them as there are for ZIP. I suggest going with ZIP :)
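
If you want to convince yourself that nothing is lost, a round-trip check is easy to script. The sketch below is only an illustration: it uses gzip rather than zip (gzip is more likely to be installed on a bare server, and both normally use DEFLATE), and the function name verify_lossless is made up. It compresses a file, decompresses it again, and compares the result with the original byte for byte.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if a file survives a gzip round trip
# unchanged, non-zero otherwise.
verify_lossless() {
    original=$1
    work=$(mktemp -d) || return 1

    # compress, decompress, then compare the restored copy byte for byte
    gzip -c "$original" > "$work/copy.gz" &&
        gzip -dc "$work/copy.gz" > "$work/restored" &&
        cmp -s "$original" "$work/restored"
    status=$?

    rm -rf "$work"
    return $status
}
```

For example, verify_lossless "/path/to/file" && echo "identical" prints the message only when the restored copy matches the original.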

Assisted Solution

florjan earned 225 total points
ID: 34907344
This is the rsync code I use on our school server. Crontab runs it once a day at 5 am (if you need any help with crontab, just say). Also, if you want extra security you can use chattr +i, so even root has to remove the flag before being able to edit anything. Save file as and modify as needed.
#!/bin/bash
## where to store backup, with trailing slash
## (placeholder value: the real path was missing from the post as saved)
BACKUP_DIR="/path/to/backup/"
## if you want more security
#chattr -i "$BACKUP_DIR"

## copy one source directory into the backup directory
## (the two assignments and the closing brace are reconstructed, they were missing from the post as saved)
archive() {
    from="$1"
    target="${BACKUP_DIR}$2"
    echo "Archiving \"${from}\" \"${target}\""
    rsync --archive --cvs-exclude --one-file-system --delete --quiet \
        "${from}" "${target}"
}

## as many entries as you want to backup (if you have more than 1 folder), no trailing slash
archive "/path/to/dir1"        "."
archive "/path/to/dir2"        "."
## optional if you need anything removed (on our school server we don't want moodle sessions)
/bin/rm -Rf "${BACKUP_DIR}moodledata/sessions/"

## if you want to mod permissions on files, like we want read access to people in group webadmin so they can restore backup in case of a problem but no write (that's up to root)
/bin/chmod -R 640 "${BACKUP_DIR}dir1/"
/bin/chmod -R 640 "${BACKUP_DIR}dir2/"
/bin/chown -R root:webadmin "${BACKUP_DIR}dir1/"
/bin/chown -R root:webadmin "${BACKUP_DIR}dir2/"
## directories need the execute bit again so they can be entered
/usr/bin/find "${BACKUP_DIR}dir1/" -type d -exec chmod a+x {} \;
/usr/bin/find "${BACKUP_DIR}dir2/" -type d -exec chmod a+x {} \;
## if you want more security
#chattr +i "$BACKUP_DIR"


And I do not recommend making the stored backup accessible via http or https. If a user is authorized, he probably has access to the server via an ssh client. If you also need code for the weekly backup and the mysql backup, just say.
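
For reference, the once-a-day 5 am schedule mentioned above could be written as a crontab entry roughly like the one below. The script path is a placeholder, since the filename the script was saved under is not given in the post.

```
# minute hour day-of-month month day-of-week  command
0 5 * * * /path/to/backup-script.sh
```

The entry would go into the crontab of the user that runs the backup (root here, since the script changes ownership), edited with crontab -e.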

Author Comment

ID: 34907938
Thank you for the quick responses. I will look into each suggestion and decide which fits my situation. I will leave the question open long enough to review the suggestions (and maybe get a few more). I am leaning toward rationalboss's suggestion, but I would like to give florjan's code a shot.
Again, thanks for the quick replies.

Author Comment

ID: 34932669
Both of the experts were very helpful with their descriptive and thorough suggestions. I appreciate it very much; you saved me unknown amounts of time and headaches.

Question has a verified solution.