Link to home
Start Free TrialLog in
Avatar of BespokePaul
BespokePaulFlag for Bermuda

asked on

Best Practice SQL2005 table design for document images

We have been developing a number of in-house document management solutions over the last couple of years and are very comfortable with the storing of documents in the SQL2005 image field type.

However we are about to extend our solution to a department that recevies / generates documents at a volume that is at least a magnitude above what we have been accomodating to date.

I am interested to see if someone can recomend some best practice table design guidelines for the storing of large volumes of records with image data.

For each document record we will have a single image field and a number of fields used to categorise and index the document (doc title, refenernce number, source, notes, etc).

My question is; anticipating that we will have high volumes of documents should I look to design my table structures to split the document table into two seperate tables with a one to one relationship, one with the document attributes, and one with just the document image?

Would this help with index maintenance and storage if I were to place the "Document Image" table on a seperate file group?

Is the TEXTIMAGE_ON option on the Create Table statement good enough for achieving the same thing?

Is there perhaps a better design approach that someone is aware of?

We are utilising SQL2005 Standard, there are no immediate plans to move to SQL2008.
Thanks in advance!
Avatar of 8080_Diver
8080_Diver
Flag of United States of America image

SQL Server 2005 has an improvement over SS2000 when it comes to handling embedded images such as you are looking at.  SQL Server can automatically handle the data in a manner that SQL Server had to have application assistance with . . . you actually store the image on a file server and a pathed filename to that location in the database.  This lets you have the best of both worlds, in a sense.  By storing the data in this manner, your table isn't bulked up beyond belief, yet you have access to the data as though the image is actually in the table.
Think about it for a moment, if you have, say, 1,000,000 entries and each entry has a 3 megabyte image in it, you just consumed 3 million megabytes of storage and you don't have an index or any other data yet.
Look into the way SQL Server 2005 handles the VarBinary(MAX) columns and look into the settings for the manner in which it is stored.
Avatar of Anthony Perkins
>>SQL Server 2005 has an improvement over SS2000 when it comes to handling embedded images such as you are looking at. <<
I suspect you are talking about the FILESTREAM data type, but that is a SQL Server 2008 feature.  The hybrid version you are referring to did not exist in SQL Server 2005.
For information on the new SQL Server 2008 data types see here:
SQL Server 2008: The New Data Types
http://www.simple-talk.com/sql/learn-sql-server/sql-server-2008-the-new-data-types/
Avatar of BespokePaul

ASKER

THanks everyone, I am aware of the new filestream data type in MS SQL2008. however it is not an option at this point as we are restricted for the near term to MS SQL 2005.

What I am trying to determine is the best way to struct the image fields/tables so that they can be easily scaled and also to potentially improve the SQL DB engines ability to search the index fields without having the large image files "gumming up" the works.
I also agree wth 8080_Diver:
what we do is to store the images in a file server and store the image path in the database. We have two tables one with index data and the doc ID (a unique identifier) other with the image path and doc ID. With this method we can even use SQL express to handle high volumes.
ASKER CERTIFIED SOLUTION
Avatar of 8080_Diver
8080_Diver
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thank you for all the valuable comments.

Backups are not an issue for us, neither is network capacity. My biggest concern is interactive performance. While I recognise the benefits in storing documents outside of the database, I personally do not like that solution for a number of reasons.

I have applied normal best practice for table design, and understand about how to tune indexes. My question was directed towards how it is best to store images within a database table considering options like partitioning and using differing fiels groups etc.

As an interesting note for the group, I did discover a table hint that when used on creation automatically stores image files in a seperate named filegroup. I am going to use this approach for my solution as it will keep the files that store all the indexing data of the image and actual database indexes seperate from the large images.
I would raise the point that the request for "Best Practices" does not necessarily mean that one's plan has to be included in the "Best Practices".  I believe that, in general, there would be agreement that the "Best Practice" for handling images in SS2005 (and SS2000) would be to not store them in the database.  
The fact that the requested "Best Practices" contradict the OP's desires doesn't mean that they are not "Best Practices".
>>I believe that, in general, there would be agreement that the "Best Practice" for handling images in SS2005 (and SS2000) would be to not store them in the database.<<
Absolutely.  Only in very rare cases should the database be used to store images.  It is just a nightmare ready to happen.  Unfortunately you only learn this with experience.  On the bright-side, if you ever implement it in the database you will become a very stong advocate for saving binary data, where it belongs:  In the file system.
Amen! to that! ;-)
3) Award points to 8080_Diver #24224477