Solved

extract words from a document and store in a database field

Posted on 2013-11-06
3
601 Views
Last Modified: 2013-11-08
I would like to store the words from a word document, .doc or .docx, in a database field in order to speed up searches.

I have used the textstream command in access 10 vba to convert my text documents but wondered about the best way to extract the words from a Word document.

I thought about opening the .doc and saving it as a .txt and then using the textstream command to extract the data but wondered if there was a quicker, cleaner method of doing this?
0
Comment
Question by:Nemetona
3 Comments
 
LVL 31

Accepted Solution

by:
Helen_Feddema earned 300 total points
ID: 39627696
I would do the searching in Word, which has superior methods for finding text strings.  Make a Word macro to do the searching, and store the found words in your Access database.  You can use code like the following (from a Word VBA procedure) to work with an Access database.

Public Sub OpenAnotherDatabase()
'Created by Helen Feddema 14-Feb-2010
'Last modified by Helen Feddema 14-Feb-2010

   Dim appAccess As New Access.Application
   Dim strDBNameAndPath As String
   Dim dbs As DAO.Database
   Dim rst As DAO.Recordset
   Dim dbe As DAO.DBEngine
   
   'Change to your db name and path
   strDBNameAndPath = "G:\Documents\Access 2002-2003 Databases\General.mdb"
   appAccess.Visible = True
   appAccess.OpenCurrentDatabase filepath:=strDBNameAndPath, _
      exclusive:=False
      
   'Run a procedure
   'appAccess.Run "PrintOrdersReport"
   
   'Run a macro
   'appAccess.DoCmd.RunMacro "mcrPrintOrdersReport"
   
   'Run an action query
   'appAccess.DoCmd.OpenQuery "qryDeleteSomeOrders"
   
   'Run SQL code
   strSQL = "DELETE tblOrders.ShippedDate FROM tblOrders WHERE ShippedDate = #8/4/1994#;"
   Debug.Print "SQL string: " & strSQL
   'appAccess.DoCmd.RunSQL strSQL
   
   'Iterate through a recordset
   Set dbe = appAccess.DBEngine
   Set dbs = dbe.OpenDatabase(strDBNameAndPath)
    
   Set rst = dbs.OpenRecordset("tblCategories")
   Do Until rst.EOF
      Debug.Print rst![CategoryName]
      rst.MoveNext
   Loop
   rst.Close
    
   Set dbs = Nothing
   Set appAccess = Nothing
   
End Sub

Open in new window

0
 
LVL 14

Assisted Solution

by:DrTribos
DrTribos earned 200 total points
ID: 39628933
I'm not sure about textstream, never used it... you could perhaps load all the words from your document into a scripting dictionary and then move them straight into your database.  

I am assuming you will not be searching for phrases, just words.  You could exclude duplicates...  and for that mater also exclude words shorter than 4 characters etc...
0
 

Author Comment

by:Nemetona
ID: 39633937
Thanks for your responses, I obviously did not explain my problem to well and have since found a work around using saveas2
0

Featured Post

Comprehensive Backup Solutions for Microsoft

Acronis protects the complete Microsoft technology stack: Windows Server, Windows PC, laptop and Surface data; Microsoft business applications; Microsoft Hyper-V; Azure VMs; Microsoft Windows Server 2016; Microsoft Exchange 2016 and SQL Server 2016.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

It is often necessary in this forum and others to illustrate Word fields as text with the field delimiters replaced with the curly brackets that the delimiters resemble when field codes are being displayed on the document. This means that the text c…
Shortcuts in Word Just the other day I had a training for Microsoft and they wanted me to show how well the new Windows and Office behaved on a touch device, which by the way is great, but it was only then that I realized that using keyboard shortc…
Learn how to number pages in an Access report over each group. Activate two pass printing by referencing the pages property: Add code to the Page Footers OnFormat event to capture the pages as there occur for each group. Use the pages property to …
This Experts Exchange video Micro Tutorial shows how to tell Microsoft Office that a word is NOT spelled correctly. Microsoft Office has a built-in, main dictionary that is shared by Office apps, including Excel, Outlook, PowerPoint, and Word. When …

914 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now