j_stone

Visual Basic spell check with a twist / an OCR problem

I know that you can use the Microsoft Word spell checker from Visual Basic, but I am not sure that what I need to do is possible:

I am working on an OCR application that recognizes forms based on the text on them and bookmarks them in a PDF.

The program works pretty well on high-quality scans, but when the scan quality degrades, so does the output of the OCR engine I am using (MODI).

I need a way to automatically replace words that are not in the dictionary with suggested words of equal length (these seem to be the closest to the actual pre-OCR words) or, failing that, with the first suggested word.

Any idea how I can do spell checking and replacing without user intervention? (The text isn't saved anywhere; it is just used as a tool for recognizing a given page.)  I have looked for OCR programs that have this feature. LEADTOOLS seems to do what I want, and its OCR is better than MODI, but I don't really have several thousand dollars to spend on a hobby program.

So what I am looking for is a spell checker that I can force matches on, or a free OCR program that will do best matching.
Corey Scheich

I found this example a long, long time ago. I stripped some application-specific code out of it but didn't have time to test it.  I hope it gets you close.

corey2



' Notes:
'       1) in the VBA editor, the following must be checked
'           in the "Tools -> References" dialog:
'
'               Microsoft Word 10.0 Object Library
'
'       2) based on MSDN article:
'           "Give Excel the Power of Word Spelling and Grammar Checking"
'               Charlie Kindschi
'               Microsoft Corporation
'               January 1998
'
'       3) all UI for spell checker is provided by Microsoft Word
 
Option Explicit
 
Function StartWord() As Word.Application
    Dim wdApp                       As Word.Application
    
    Set wdApp = New Word.Application
    
    ' have to show so user can interact with spell
    ' checker dialog
    wdApp.Visible = True
    wdApp.WindowState = wdWindowStateMinimize
    
    Set StartWord = wdApp
End Function
 
Sub StopWord _
( _
    wdApp As Word.Application _
)
    wdApp.Quit
    Set wdApp = Nothing
End Sub
 
' Returns the text after the user has run it through Word's
' spelling dialog; wdApp is passed in because it is not global.
Function CheckNoteText _
( _
    wdApp As Word.Application, _
    ByVal CheckThis As String _
) As String
    Dim wdDoc                       As Word.Document
    Dim wdSpell                     As Word.Dialog
    Dim nRetval                     As Long
 
    ' Fall back to the original text if the dialog is cancelled
    CheckNoteText = CheckThis
 
    ' Document object required in order to use Show method of
    ' Dialogs collection.
    Set wdDoc = wdApp.Documents.Add
    Set wdSpell = wdApp.Dialogs(wdDialogToolsSpellingAndGrammar)
    
    ' Pass the contents of note to Word and check spelling
    wdApp.Selection.Text = CheckThis
    nRetval = wdSpell.Show
         
    ' Check if Cancel button was clicked in Spelling dialog
    ' box. Return text from Word document to the caller
    If Len(wdApp.Selection.Text) <> 1 Then
        CheckNoteText = wdApp.Selection.Text
    End If
        
    wdDoc.Close wdDoNotSaveChanges
End Function
 
Sub main()
    Dim wdApp                       As Word.Application
    Dim sChecked                    As String
    
    Set wdApp = StartWord
    
    sChecked = CheckNoteText(wdApp, "Chk teh Splling")
    Debug.Print sChecked
    
    StopWord wdApp
End Sub
'-----------------------------------

j_stone

ASKER

I can already spell check the document, but that requires user intervention, which really won't work for a 150-page TIFF file... Your code seems to require the same interaction.

Corey Scheich

When it comes up with a list of 20 suggestions, how do you suggest one gets forced?  Just take the first one?
Perhaps you can duplicate the AutoCorrect options: create a dictionary of AutoCorrect items.  This will involve a learning curve: as you spell check OCR output, add the common, repeatable errors to the dictionary.  Then walk your document word by word, replacing anything found in the AutoCorrect keys with the matching AutoCorrect value.
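
A rough, untested sketch of that AutoCorrect-style lookup, assuming the corrections are kept in a late-bound Scripting.Dictionary. The function name ApplyCorrections and the sample entries are made up for illustration; they are not part of Word or MODI.

```vb
Option Explicit

' Replaces every space-delimited word that appears as a key in
' oFixes with the stored correction. oFixes is expected to be a
' Scripting.Dictionary (late bound, so no reference is required).
Function ApplyCorrections(ByVal sText As String, oFixes As Object) As String
    Dim vWords()                    As String
    Dim i                           As Long

    vWords = Split(sText, " ")
    For i = LBound(vWords) To UBound(vWords)
        If oFixes.Exists(vWords(i)) Then
            vWords(i) = oFixes.Item(vWords(i))
        End If
    Next i

    ApplyCorrections = Join(vWords, " ")
End Function

Sub DemoCorrections()
    Dim oFixes                      As Object

    Set oFixes = CreateObject("Scripting.Dictionary")

    ' Hypothetical OCR misreads collected from earlier scans
    oFixes.Add "teh", "the"
    oFixes.Add "lnvoice", "Invoice"

    ' Prints: the Invoice number
    Debug.Print ApplyCorrections("teh lnvoice number", oFixes)
End Sub
```

The lookup is exact and case-sensitive by default, so it only helps with misreads that repeat in the same form from scan to scan.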
j_stone

ASKER

Corey2:
>When it comes up with a list of 20 suggestions how do you suggest one gets forced?  Just take the first one?

Either the first choice or the first choice with the same number of characters (the latter seems to be correct more often).

The program is supposed to represent phrases, so if a word is not in the dictionary it was most likely not OCR'ed correctly.
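
A minimal, untested sketch of that selection rule, assuming a reference to the Microsoft Word Object Library and a running Word.Application with at least one document open (as far as I know, GetSpellingSuggestions fails when no document is open). PickReplacement, AutoCorrectWord and CorrectText are illustrative names, not Word APIs; CheckSpelling and GetSpellingSuggestions are the Word calls being relied on.

```vb
Option Explicit

' Pure helper: returns the first candidate with the same length as
' sWord, else the first candidate, else sWord itself.
Function PickReplacement(ByVal sWord As String, vCandidates As Variant) As String
    Dim i                           As Long

    PickReplacement = sWord
    If Not IsArray(vCandidates) Then Exit Function
    If UBound(vCandidates) < LBound(vCandidates) Then Exit Function

    PickReplacement = vCandidates(LBound(vCandidates))
    For i = LBound(vCandidates) To UBound(vCandidates)
        If Len(vCandidates(i)) = Len(sWord) Then
            PickReplacement = vCandidates(i)
            Exit Function
        End If
    Next i
End Function

' Asks Word for suggestions without showing any dialog.
Function AutoCorrectWord(wdApp As Word.Application, ByVal sWord As String) As String
    Dim oSuggestions                As Word.SpellingSuggestions
    Dim sNames()                    As String
    Dim i                           As Long

    AutoCorrectWord = sWord
    If wdApp.CheckSpelling(sWord) Then Exit Function

    Set oSuggestions = wdApp.GetSpellingSuggestions(sWord)
    If oSuggestions.Count = 0 Then Exit Function

    ReDim sNames(1 To oSuggestions.Count)
    For i = 1 To oSuggestions.Count
        sNames(i) = oSuggestions.Item(i).Name
    Next i

    AutoCorrectWord = PickReplacement(sWord, sNames)
End Function

' Corrects a whole line of OCR text, one space-delimited word at a time.
Function CorrectText(wdApp As Word.Application, ByVal sText As String) As String
    Dim vWords()                    As String
    Dim i                           As Long

    vWords = Split(sText, " ")
    For i = LBound(vWords) To UBound(vWords)
        If Len(vWords(i)) > 0 Then
            vWords(i) = AutoCorrectWord(wdApp, vWords(i))
        End If
    Next i

    CorrectText = Join(vWords, " ")
End Function
```

Keeping the selection rule in PickReplacement, separate from the Word calls, means it can be tried on hand-written suggestion lists before Word is involved. Punctuation attached to a word is not stripped here, so real OCR text would probably need some cleanup first.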
ASKER CERTIFIED SOLUTION
c0ldfyr3

j_stone

ASKER

Just got a chance to try this tonight... found the code hidden in what c0ldfyr3 wrote.