Solved

Parsing Word Documents to a database field

Posted on 2008-10-17
Last Modified: 2009-01-02
Hi there, I have a classic ASP (VBScript) site using MS SQL 2000. The database doesn't have full-text search enabled, as it's a remote database.

What I would like to do is create some code, ideally on the database, that searches the profile table for all records where cvparse = n and CV is not null, then for each record parses all of the information from the CV document (doc, pdf), stores it in CVdetail, and updates cvparse from n to y.
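
For illustration only, the batch described above could be sketched roughly like this. It is not a tested solution for SQL 2000: it uses Python with an in-memory SQLite table standing in for the real profile table, the `id` column is an assumption, the CV column is assumed to hold the file bytes, and `extract_text` is a hypothetical placeholder for whatever real .doc/.pdf parser ends up being used.

```python
import sqlite3


def extract_text(cv_bytes):
    # Hypothetical placeholder: real .doc/.pdf parsing needs a dedicated
    # parser. Here the bytes are simply decoded so the loop can be shown.
    return cv_bytes.decode("utf-8", errors="ignore")


def parse_pending_cvs(conn):
    """Fill CVdetail for every unparsed profile and flag it as parsed."""
    rows = conn.execute(
        "SELECT id, CV FROM profile WHERE cvparse = 'n' AND CV IS NOT NULL"
    ).fetchall()
    for profile_id, cv in rows:
        conn.execute(
            "UPDATE profile SET CVdetail = ?, cvparse = 'y' WHERE id = ?",
            (extract_text(cv), profile_id),
        )
    conn.commit()
    return len(rows)


def demo():
    # In-memory stand-in for the real table; the column layout is assumed.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE profile ("
        "id INTEGER PRIMARY KEY, CV BLOB, CVdetail TEXT, cvparse TEXT)"
    )
    conn.executemany(
        "INSERT INTO profile (CV, cvparse) VALUES (?, ?)",
        [(b"Python developer", "n"), (None, "n"), (b"Already done", "y")],
    )
    count = parse_pending_cvs(conn)
    rows = conn.execute(
        "SELECT id, CVdetail, cvparse FROM profile ORDER BY id"
    ).fetchall()
    return count, rows
```

Only the row that has a CV and is still flagged n gets touched; rows with no CV, or already flagged y, are left alone. Running it once a day would be left to an external scheduler, assuming the host permits one.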

I'd like it to run automatically every day.

Is this possible?

thank you
Question by:garethtnash
3 Comments
 
Accepted Solution

by:
Mark Wills earned 500 total points
ID: 22747597
Yes, it is possible to an extent, but you really need to consider whether to do it at all. It is best to allow several "key tags" to be maintained and then search on those; the database then holds a fully qualified path name to the original document. PDFs can be challenging to parse, so again, when "submitting" a document to "file", categorise it or write a summary prior to committing. If you are trying to do this in unattended mode, it will become difficult, and again the PDFs will be the challenge. These types of document can very quickly choke any chance of performance if held inside the database. If you parse documents for search criteria, you will have to create noise words / a thesaurus to make sure that entire documents do not become part of the key criteria and indexed lookups. Is this an automatic or operator-invoked task, and are the files bibliographic in nature, so that you can auto-generate several attributes (e.g. known content such as law documents / specification sheets etc.)?
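
As a rough sketch of the "key tags" and noise-word idea, assuming the document text has already been extracted: the snippet below drops noise words and keeps the most frequent remaining words as tags. The `NOISE_WORDS` list and the `key_tags` function are made up for illustration; a real list would be much longer and tuned to the documents.

```python
import re
from collections import Counter

# Illustrative noise-word list only; a real one would be far longer.
NOISE_WORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "with", "for", "on", "is"}
)


def key_tags(text, limit=5):
    """Return up to `limit` most frequent non-noise words, most common first."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in NOISE_WORDS)
    return [word for word, _ in counts.most_common(limit)]
```

For example, `key_tags("Experienced SQL developer with SQL and ASP skills. ASP and SQL projects.", limit=2)` gives `['sql', 'asp']`. Storing a handful of tags like these next to the file path keeps the searchable data small compared with holding whole documents in the database.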
 

Author Comment

by:garethtnash
ID: 22752673
You've completely lost me, but I can change the upload to only accept .doc or .docx, and the documents are CVs.
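
If uploads were limited to .docx, the text could in principle be pulled out without Word installed, because a .docx file is a zip archive whose main text lives in `word/document.xml`. The following is a minimal Python sketch under that assumption; it does not handle the older binary .doc format, and `make_sample_docx` is only a hypothetical helper that builds a stripped-down archive for the demonstration, not a complete Word file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(data):
    """Return the paragraph text of a .docx file given as bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more w:t runs.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text:
            paragraphs.append(text)
    return "\n".join(paragraphs)


def make_sample_docx(paragraphs):
    # Hypothetical demo helper: writes only word/document.xml, which is
    # enough for docx_to_text but is not a complete, openable Word file.
    body = "".join(
        f"<w:p><w:r><w:t>{p}</w:t></w:r></w:p>" for p in paragraphs
    )
    xml = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main">'
        f"<w:body>{body}</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    return buffer.getvalue()
```

The returned string is what would go into CVdetail. Headers, footers and other parts of the archive are ignored in this sketch.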

Any advice?

Thank you
