Link to home
Start Free TrialLog in
Avatar of mandhjo
mandhjoFlag for United States of America

asked on

Uploaded File Storage Options/Comparisons

Hello all.  I haven't been around EE for a while, but was active a few years ago.  Anyway, I've done some searching on the following and I felt it would be good to have one discussion that covered the pros and cons of different approaches for storage of files uploaded to a web server.  So, here's the situation...

1.  All users who upload/view files are registered.
2.  Users can upload files which can be accessed by themselves and other registered members (no public access to the files).
3.  Directory level security is not possible, ie, restrict access to a directory by user account

Question: What is the best approach for storing the files?  The two that I understand are (1) store the files in a directory or (2) store the files in a database in binary format.

What are the pros and cons (good points and bad points) of each approach?  What is most widely used?

Thanks in advance.  Ultimately, I'll be implementing this in ASP.NET 2.0, but I believe the audience to discuss this is much wider than that subset of developers/experts.

Thanks.  I'm looking forward to helpful comments/answers.
Avatar of nschafer
nschafer
Flag of United States of America image

Hi mandhjo,

Generally I avoid storing files in the database.  I do however frequently store file information including the path to files in a database.  I do this so that I have greater flexibility in the storage locations of my files.  I prefer to keep the database as small as possible and let the file system handle files.

But to address your question as fully as possible:

Storing Files in the Database
Pros:
   Security.  - You must be authenticated to the server to access the files, since you specified that directory level security is not possible, this provides some protection against those that have physical access to the server.

  Backup - I'm actually not 100% sure about this, but I belive that when you backup the database it will backup the embeded files.  I haven't tested this, but can see no reason why this would not be the case.   This could also be considered a CON though as this will increase the size and time of your database backup.

  Replication - You can replicate the database, including the files to another SQL server.  This gives you a very nice safety net if something happens to your database or file system.  You could of course do this seperately for the files and database using the file system for the files, but in my opinion it would be difficult to guarantee that they stay in sync.

  Portability - You can move the database to another server or network and all of the files move with it.  You don't have to worry about moving the files seperatly.


Cons:
  Database Size - The database size will be increased by the size of the image files.  If you are storing large numbers of files this could become an issue

  Backup - As mentioned in the PRO's section the larger database size could cause backup time and media to be difficult to maintain.

  Flexibility - I could be wrong here and will be happy if someone points out that I am, but in my understanding it is not possible to split a database between multiple storage units.  If you have a large number of files (or even a few very large files) this could cause problems with space on the SQL server.  An example here is one system I operate has a bit over 4 TB of image files currently.  Those image files are split up over several RAIDs and have Backups on other systems which also use multiple RAIDs.  Each RAID constitues its own file system therefor in a Windows Environment a seperate drive letter (we actually use linux for these systems but the point is the same).  None of these is the same as the SQL server.  This allows me a great deal of flexibility in storing files and choices of hardware (for example I don't need to purchase a single 4+TB server, or an expensive SAN to store these images)

  Performance - I've heard a lot on both sides of this issue and I haven't run any significant testing of my own so here I'm more parroting the opinions of others than what I have solidly tested.  That being said the concensus of the articles I have read on this issue appear to state that having the SQL server process the data for the images provides a performance hit.  How big of a hit is dependant on many things:  The number, type and size of files being stored.  The relative performance of the computer storing the files in a file system method vs the performance of the SQL server.  The utilization of the SQL server.   I'm sure I've left out lots of variables that help determine performance.

  Ease of Use - It is generally considered easier to access files that are stored in a file system rather than to have to stream them out from the SQL server.  This is a bit more subjective though.  I don't think that it is all that difficult to stream a file from the SQL server and frankly often stream files even when they are on the file system, but if the images are being used in web pages it is certainly easier to use them in the case of something like an <img> tag.



 
   

Neal.
Avatar of mandhjo

ASKER

Thanks, Neal.  I appreciate your very thorough comments and will reward you.  I'd like to leave the question unanswered at this time so as to peak the curiousity of other experts.

One additional comment, the files aren't limited to images...could be anything (which I recognize brings up additional concerns from a hacking perspective).

One follow up question, how would you go about "securing" the file locations?  One potential solution would be to:

1  have one big fat directory
2  turn off browsing/listing
3  have the default doc re-direct them elsewhere
4  hash up the file names and only serve up the ones that the user has access to (using the db data)

Can you think of another solution or add depth to what I've suggested above?  Thanks.
ASKER CERTIFIED SOLUTION
Avatar of nschafer
nschafer
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial