We are developing a web-based application (ASP.NET 3.5 with SQL Server 2005) where we need to handle and search inside documents uploaded by users.
We have the following flow, which is working fine:
1. The user uploads a document (doc / xls / pdf etc.).
2. It's stored in a particular folder on the server.
3. Our code reads it, extracts all the text inside it, and stores that text in a database field.
4. Later, whenever a user searches for a string, we look for that string in the database field and display the search results. (Note: some documents are ~5 MB in size and have 450+ pages of text in them, and all of this gets extracted and stored in the db field.)
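To make the flow above concrete, here is a minimal sketch of the store-and-search part. It is written in Python with an in-memory SQLite database purely for illustration; our real code is ASP.NET against SQL Server. The table name, column names, and function names are hypothetical, and the substring match via LIKE is an assumption about how such a lookup is typically written, not a copy of our query.

```python
import sqlite3


def create_store():
    """Create an in-memory database with one row per uploaded document.

    Hypothetical schema: the real application uses SQL Server 2005.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        " id INTEGER PRIMARY KEY,"
        " file_name TEXT NOT NULL,"
        " extracted_text TEXT NOT NULL)"
    )
    return conn


def store_document(conn, file_name, extracted_text):
    """Save the text already extracted from an uploaded file (step 3)."""
    cur = conn.execute(
        "INSERT INTO documents (file_name, extracted_text) VALUES (?, ?)",
        (file_name, extracted_text),
    )
    conn.commit()
    return cur.lastrowid


def search_documents(conn, term):
    """Return file names whose stored text contains the term (step 4)."""
    # Escape LIKE wildcards so the user's input is matched literally.
    escaped = (
        term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT file_name FROM documents"
        " WHERE extracted_text LIKE ? ESCAPE '\\'"
        " ORDER BY id",
        ("%" + escaped + "%",),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = create_store()
    store_document(conn, "report.pdf", "Quarterly revenue grew by 10%")
    store_document(conn, "notes.doc", "Meeting notes about hiring")
    print(search_documents(conn, "revenue"))
```

In this sketch every search has to scan the full stored text of every row, since a pattern with a leading wildcard cannot use an ordinary index. That is the cost we are worried about as the number and size of documents grow.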
Our main question is: is this the right approach? If not, what's the best approach in this scenario?
If there are many documents with a lot of text inside them, will it adversely affect our database performance?
Since it's a web-based app running on shared hosting, we may not be able to use any standard library like Lucene.