• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 404
  • Last Modified:

which database is better to store unstructured documents

would you please let me know which database is better to store pdf, xls, doc, xml files?
I have read about filestream and filetables on sqlserver2012 but still needing a clear understanding of each database advantages just to start with the best solution. We intend to store all of our accounting and invoicing documents.

still undefined, is the issue related to storing the file into the database or store just a "pointer" to the file that may be saved on NTFS

I appreciate your help
smile be brave
0
smilebebrave
Asked:
smilebebrave
  • 3
  • 3
  • 2
  • +5
12 Solutions
 
PortletPaulCommented:
What are you evaluation criteria?
Must it be from only those 3 databases?

What is the scale of this need?
e.g. 100 documents a month or 100 per hour?
Will you be spooling invoice images (pdf/tiff/whatever) at this solution? at what rate?
How many invoices per month do you generate?
How big will this get?

That;s just a few that popped into the head. There isn't much detail in the question, can you share more about the requirements?

e.g. Do you need version control?
will you only use NTFS? what about Linux?

How vital is cost?
will you rule out open source?
0
 
Dave BaldwinFixer of ProblemsCommented:
The number 1 problem with storing documents in a database is that they have to be extracted back to files for the original applications to make use of them.  And then you have to restore them to the database after they are edited.
0
 
tliottaCommented:
As effectively noted by PortletPaul, there is no "best". There is rarely any "best" anything in this industry and it rarely makes sense to ask about it. It's a subjective evaluation.

You need to list your criteria/requirements and prioritize them or assign weights. Then match them with capabilities of the products and judge which is the best fit.

Criteria might include cost (acquisition, installation, training, implementation, operation, maintenance and support), platform and application compatibility, data volumes (both absolute and on-going), familiarity for all involved, and others.

"Best" is different for me at different sites. In many cases, it'll be DB2; but PostgreSQL can work very well in certain cases. In other cases... other DBMSs are needed.

It depends.

Tom
0
Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

 
Éric MoreauSenior .Net ConsultantCommented:
FileTable is only available on SQL Server 2012 and UP.

FileStream can be used starting with SQL Server 2008.

FileTable is really FileStream 2.0 and is easier to use.

I wrote an article about the FileTable (http://emoreau.com/Entries/Articles/2012/12/Microsoft-SQL-Server-2012-FileTable.aspx) and another one about the Filestream (http://emoreau.com/Entries/Articles/2009/09/Microsoft-SQL-Server-2008--The-FileStream-feature.aspx)
0
 
Kent OlsenData Warehouse Architect / DBACommented:
Good comments.  :)

One critical criteria is how many documents you intend to store and how they will be accessed.

If new documents come in at a modest pace of about 1 per second, your system will be recording 86,400 documents per day.  That's a lot of files to be added to a file system.  You'll probably want to keep that many documents inside the database.

Your application will be responsible for moving the document from the client side to the database.  You won't be able to treat the document as a column value in a SELECT or INSERT statement.

And like Tom, I'm a big DB2 proponent.  But all of the major engines should be able to do this on a modest scale.
0
 
sdstuberCommented:
The best platform is the one that matches your staff's skill set.

People can learn new skills; but if you're already a sqlserver shop with a sqlserver staff, then going with oracle or db2 is not likely your best bet.
0
 
smilebebraveAuthor Commented:
I thank you all for your feedback, please find below additional information:


Prefered databases: db2, oracle, sql serve

about 25,000 documents per month with size ranging from 220Kb to 4 MB

No version control is required.
We can consider NTFS and Linux
open source may be consider

best regards
smile be brave
0
 
Kent OlsenData Warehouse Architect / DBACommented:
25,000 per month is about 800 per day.  All of the major DBMS should handle that without breathing hard.

But that is more than I'd want to put into the file system.  I'd strongly consider storing the documents as BLOB objects in the database.

In the arena of "paid for licensed products", you should be able to develop and test a small application with the DB2, Oracle, or SQL Server free version.  Note that some of the limitations may impact your testing, but the schemas would be remarkably similar so you might even be able to test several.

DB2 Express places no limits on the size of the database, but limits resources to 4GB and 4 cores.
Oracle limits the size of the database to 1GB to 4GB, depending on the version.
SQL Server limits the database to 4GB to 10GB and 4 cores depending on the version.


Kent
0
 
sdstuberCommented:
oracle size limit for the free version  is 4gb for 10gXE  and 11gb for 11gXE
0
 
Kent OlsenData Warehouse Architect / DBACommented:
>> oracle size limit for the free version  is 4gb for 10gXE  and 11gb for 11gXE

Cool.  I didn't know that it had been bumped up at far...
0
 
Scott PletcherSenior DBACommented:
More critically, SQL Server Express limits RAM to 1GB (last I checked).  You can get around the db size limitations, but the RAM limitation is unavoidable.  I don't know if/what the RAM limitations on the other DBMSs are.
0
 
Éric MoreauSenior .Net ConsultantCommented:
latest version of SQL express is 10gb
0
 
Scott PletcherSenior DBACommented:
Not according to MS documentation it isn't.  Remember, I said RAM was limited.  The db size can easily be gotten around anyway :-) .
0
 
smilebebraveAuthor Commented:
your comments are extreamly helpfull, thank you.

If I decide to  use sql server, is the filestream and filetable functionality widely used by .NET developers? or definitely storing a document as BLOB is most used at the moment?

best regards
smile be brave
0
 
Éric MoreauSenior .Net ConsultantCommented:
>>If I decide to  use sql server, is the filestream and filetable functionality widely used by .NET developers? or definitely storing a document as BLOB is most used at the moment?

the feature is only available since SQL Server 2008 and not everybody is aware of if. It is one of these unknown gem.
0

Featured Post

Upgrade your Question Security!

Add Premium security features to your question to ensure its privacy or anonymity. Learn more about your ability to control Question Security today.

  • 3
  • 3
  • 2
  • +5
Tackle projects and never again get stuck behind a technical roadblock.
Join Now