scbdpm asked:

Find text in PDF then return the text in that line

I am using the attached code that I found elsewhere here to search a PDF file.

The search itself works; however, I would then like to grab the text of the line in the document where the search string was found.

How can I do that?
Sub Search(strSearch As String, strFileName As String)
    'IAC objects
    Dim gAvDoc As Object
    
    'variables
    Dim Resp 'For message box responses
    Dim gPDFPath As String
    Dim sText As String 'String to search for
    Dim sStr As String 'Message string
    Dim foundText As Integer 'Holds return value from "FindText" method
        
    'hard coding for a PDF to open, it can be changed when needed.
    gPDFPath = "C:\Documents and Settings\eko013\My Documents\EKO013.pdf"
    gPDFPath = strFileName
 
    'Initialize Acrobat by creating App object
    Set gApp = CreateObject("AcroExch.App")
    gApp.Hide
        
    'Set AVDoc object
    Set gAvDoc = CreateObject("AcroExch.AVDoc")
        
    ' open the PDF
    If gAvDoc.Open(gPDFPath, "") Then
        'sText = "enter your searchstring here"
        'FindText params: StringToSearchFor, caseSensitive (1 or 0), WholeWords (1 or 0), ResetSearchToBeginOfDocument (1 or 0)
        sText = strSearch
        foundText = gAvDoc.FindText(sText, 1, 0, 1) 'Returns -1 if found, 0 otherwise
        
    Else
        ' if failed, show error message
        Resp = MsgBox("Cannot open " & gPDFPath, vbOKOnly)
    End If
    If foundText = -1 Then
        'compose a message
        sStr = "Found " & sText
        Resp = MsgBox(sStr, vbOKOnly)
    Else
        ' if failed, show error message
        Resp = MsgBox("Cannot find " & sText, vbOKOnly)
    End If
    gApp.Show
    gAvDoc.BringToFront
End Sub


ASKER CERTIFIED SOLUTION
puppydogbuddy

This solution is only available to members.
scbdpm (ASKER)

That appears to find the text. However, to clarify, what I mean by "grab the text that is in the line of the document" is that I would like to put that line of text into a string that I can then process.

puppydogbuddy

I don't understand why you are not already getting it. This part of the above code should display it (sText) for you:

If foundText = -1 Then
              'compose a message
               sStr = "Found " & sText
               Resp = MsgBox(sStr, vbOKOnly)
     
scbdpm (ASKER)

With that code, all I'm getting back is what I am searching for, not the text that is in that line of the document.
For instance, if I search for Florida, all that is returned is "Found Florida", but what I want is the line where it finds Florida: "There are palm trees in Florida".

puppydogbuddy

OK, I know what you want, but I will have to think about how to do it. I will get back to you.
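
One possible direction, offered only as a sketch: a PDF has no built-in notion of a "line", so the function below returns the words surrounding the first match instead. It assumes a full Acrobat install (not Reader) and uses the Acrobat JavaScript bridge (GetJSObject with getPageNumWords and getPageNthWord). The function name FindSurroundingWords and the wordsEachSide parameter are hypothetical names chosen for this example, and it only matches a single word, not a phrase.

```vb
' Sketch only: returns the words around the first occurrence of searchWord,
' joined by single spaces, or "" if the word is not found or the file
' cannot be opened. Requires Acrobat (not Reader).
Function FindSurroundingWords(pdfPath As String, searchWord As String, _
                              Optional wordsEachSide As Long = 5) As String
    Dim avDoc As Object, pdDoc As Object, jso As Object
    Dim pageNum As Long, wordNum As Long, wordCount As Long
    Dim k As Long, firstWord As Long, lastWord As Long
    Dim result As String

    Set avDoc = CreateObject("AcroExch.AVDoc")
    If avDoc.Open(pdfPath, "") Then
        Set pdDoc = avDoc.GetPDDoc
        Set jso = pdDoc.GetJSObject      ' bridge to the Acrobat JavaScript API

        For pageNum = 0 To pdDoc.GetNumPages - 1
            wordCount = jso.getPageNumWords(pageNum)
            For wordNum = 0 To wordCount - 1
                ' True strips punctuation and white space from the word
                If StrComp(jso.getPageNthWord(pageNum, wordNum, True), _
                           searchWord, vbTextCompare) = 0 Then
                    ' Clamp the window of words to the bounds of the page
                    firstWord = wordNum - wordsEachSide
                    If firstWord < 0 Then firstWord = 0
                    lastWord = wordNum + wordsEachSide
                    If lastWord > wordCount - 1 Then lastWord = wordCount - 1

                    For k = firstWord To lastWord
                        result = result & jso.getPageNthWord(pageNum, k, True) & " "
                    Next k
                    FindSurroundingWords = Trim$(result)
                    avDoc.Close True     ' close without saving
                    Exit Function
                End If
            Next wordNum
        Next pageNum
        avDoc.Close True
    End If
End Function
```

A call such as FindSurroundingWords("C:\some\file.pdf", "Florida", 5) would then give a string that can be processed further; the path here is a placeholder.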
scbdpm (ASKER)

OK, thanks. BTW, can you please test my original code?
I could swear it was working; now it's returning 'found' even if the text isn't in the document.

puppydogbuddy

One of the reasons I modified your original code is that it didn't work as is, because of the code lines that were commented out and the absence of an input box to enter the search word or phrase.

Are you asking because there was something that worked in your original code that is not working in the code I wrote for you? It might be that I omitted the following line from your original code because it was commented out.  

'FindText params: StringToSearchFor, caseSensitive (1 or 0), WholeWords (1 or 0), ResetSearchToBeginOfDocument (1 or 0)

If that is the problem, rather than reverting to your original code, I would add the line without the comment symbol into my revised code as shown below:

' open the PDF
    If gAvDoc.Open(gPDFPath, "") Then
        sText = InputBox("enter your searchstring here")
        FindText params: StringToSearchFor, caseSensitive (1 or 0), WholeWords (1 or 0), ResetSearchToBeginOfDocument (1 or 0)
        foundText = gAvDoc.FindText(sText, 1, 0, 1) 'Returns -1 if found, 0 otherwise
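
To make the four FindText arguments easier to read, they could be wrapped in a small helper. This is only a sketch: PdfContainsText, matchCase, and wholeWords are hypothetical names, the argument order is taken from the comment in the original code, and the document is always searched from the beginning.

```vb
' Hypothetical wrapper around AVDoc.FindText; avDoc must already be open.
Function PdfContainsText(avDoc As Object, searchText As String, _
                         Optional matchCase As Boolean = False, _
                         Optional wholeWords As Boolean = False) As Boolean
    ' Argument order as described in the original code comment:
    ' search string, case sensitive (1/0), whole words (1/0), reset to start (1/0)
    PdfContainsText = CBool(avDoc.FindText(searchText, _
                                           IIf(matchCase, 1, 0), _
                                           IIf(wholeWords, 1, 0), _
                                           1))
End Function
```

Returning a Boolean means the caller can write "If PdfContainsText(gAvDoc, sText) Then" instead of comparing against -1.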
scbdpm (ASKER)

I am testing the original code again (before your additions/changes) and it is always coming back as true, regardless of whether the search string is in the document.

puppydogbuddy

Keep in mind that before I modified your code, you or someone else on your end had commented out portions of the code, rendering that portion of your code unexecutable. If you are not familiar with comment lines, they are the lines that have a single quote as the first character on the line. Note the single quote before the word FindText below. To make it executable again, remove the single quote:

>>>>>>>>>>> 'FindText params: StringToSearchFor, caseSensitive (1 or 0), WholeWords (1 or 0), ResetSearchToBeginOfDocument (1 or 0)

 
scbdpm (ASKER)

These don't have any effect.

puppydogbuddy

OK, the following code should work in a similar, but better, fashion than your original code. For example, my code provides an input box to enter the search string. If you are equally or more satisfied with my code, then I will proceed with trying to come up with a method of capturing the entire string that includes the search string. If I can't capture the entire sentence that includes the search string, would you be satisfied if I am able to capture, say, 30 characters preceding the search string and 30 characters after the end of the search string?

See attached code.
Sub Search(strFileName As String)
    'IAC objects
    Dim gAvDoc As Object
    
    'variables
    Dim Resp 'For message box responses
    Dim gPDFPath As String
    Dim sText As String 'String to search for
    Dim sStr As String 'Message string
    Dim foundText As Integer 'Holds return value from "FindText" method
        
    'hard coding for a PDF to open, it can be changed when needed.
    gPDFPath = "C:\Documents and Settings\eko013\My Documents\EKO013.pdf"
    gPDFPath = strFileName
 
    'Initialize Acrobat by creating App object
    Set gApp = CreateObject("AcroExch.App")
    gApp.Hide
        
    'Set AVDoc object
    Set gAvDoc = CreateObject("AcroExch.AVDoc")



' open the PDF
    If gAvDoc.Open(gPDFPath, "") Then
        sText = InputBox("enter your searchstring here")
        sText = Trim(sTest)
        foundText = gAvDoc.FindText(sText, 1, 0, 1) 'Returns -1 if found, 0 otherwise
         If foundText = -1 Then
              'compose a message
               sStr = "Found " & sText     
               Resp = MsgBox(sStr, vbOKOnly)
          Else
               ' if failed, show error message
               Resp = MsgBox("Cannot find " & sText, vbOKOnly)
          End If
    Else
         ' if failed, show error message
          Resp = MsgBox("Cannot open " & gPDFPath, vbOKOnly)
    End If
    gApp.Show
    gAvDoc.BringToFront

End Sub
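
The fallback mentioned in this post, capturing about 30 characters on either side of the search string, could look roughly like the sketch below. It is plain string handling and assumes the text has already been extracted from the PDF into a string by some other means; ExtractContext and padChars are hypothetical names used only for illustration.

```vb
' Hypothetical helper: returns up to padChars characters on either side of
' the first (case-insensitive) occurrence of searchText in sourceText,
' or "" if searchText is empty or not found.
Function ExtractContext(sourceText As String, searchText As String, _
                        Optional padChars As Long = 30) As String
    Dim pos As Long, startPos As Long, endPos As Long

    If Len(searchText) = 0 Then Exit Function
    pos = InStr(1, sourceText, searchText, vbTextCompare)
    If pos = 0 Then Exit Function

    ' Clamp the window to the bounds of the source string
    startPos = pos - padChars
    If startPos < 1 Then startPos = 1
    endPos = pos + Len(searchText) - 1 + padChars
    If endPos > Len(sourceText) Then endPos = Len(sourceText)

    ExtractContext = Mid$(sourceText, startPos, endPos - startPos + 1)
End Function
```

For a short source string such as "There are palm trees in Florida", searching for "Florida" with the default padding returns the whole sentence, because the window is clamped to the ends of the string.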


scbdpm (ASKER)

No, that doesn't work. It is still returning -1 when the text is clearly not in the document.

puppydogbuddy

Part of your problem is with the FindText method (lines 28 and 29 of the code that I posted). See the link for documentation of the FindText method.
                  http://msdn.microsoft.com/en-us/library/ms536422.aspx

Change this:
        sText = Trim(sTest)
        foundText = gAvDoc.FindText(sText, 1, 0, 1) 'Returns -1 if found, 0 otherwise

To the following and make sure that foundText has been declared as Boolean
        sText = Trim(sText)
        foundText = gAvDoc.FindText(sText, 1, 0, 1) 'Returns -1 if found, 0 otherwise


If the above change does not fix it, remove the following line of code.
                  sText = Trim(sText)
scbdpm (ASKER)

With all due respect, are you even testing this before posting?
None of the options you listed in your last post work! They all return -1 even if the text being searched for doesn't exist in the document.
I've looked for my last name in the document, which is clearly not there, and I am getting -1!
scbdpm (ASKER)

OK, with hat in hand I come offering an apology.
I was wrong to post the last comment.
Apparently, the issue is ME!
I just upgraded my Adobe Pro to 9 (from 6) and FindText is working!

puppydogbuddy

<<<with all due respect, are you even testing this before posting?>>>
Remember that I am volunteering my time to help as many people as I can w/o hurting my clients. I do not always have the time to test, and am dependent on feedback from your testing. At any rate, I told you what was wrong and provided you a reference, but forgot to change it in the code. According to the reference source, your FindText call has too many parameters and one wrong value. Change it as indicated below and look at the reference source:

To the following and make sure that foundText has been declared as Boolean
        sText = Trim(sText)
        foundText = gAvDoc.FindText(sText, 1, 2) 'Returns -1 if found, 0 otherwise


If the above change does not fix it, remove the following line of code.
                  sText = Trim(sText)
 
scbdpm (ASKER)

You might have posted before seeing my follow-up. Please read it.

I do appreciate the assistance. However, I don't see this as 'volunteering', as you will be compensated for your assistance by being awarded the points from this question.

puppydogbuddy

And what do the points do for me or any of the EE volunteers? They don't even buy a cup of coffee. The points are intended as a competitive measurement of how much each EE expert has contributed to the forum.

At any rate, did you look at the reference source I gave you for the FindText method and at the parameter options? Are you now getting the return values you wanted?