uconnfb13
asked on
Using VBScript to Read data within PDF in order to name file.
Experts,
I am hoping you can help me with an issue. I have about 200 single-page invoices in PDF format, each unique, with an invoice#, a person's name, etc. They come out of Access as one document. I can split them into single pages very easily using Adobe Pro, but of course it just names them invoice_1, invoice_2, invoice_3, etc.
My goal is to rename each file to the actual invoice# by parsing the PDF file and pulling the invoice#. I have been able to find code that will do a "find" within a PDF (which I will attach). However, I will have no idea what the invoice number will be, so that won't work on its own. My hope is to find the text "Invoice#" and then pull the 8 characters to the right of it, giving me the invoice# that I need, and then use VBScript to rename the file, which I already know how to do. I just can't seem to figure out how to do the text manipulation within VBScript with the Adobe object. If it were a TextStream object it would be a piece of cake. I know there are a few ways to convert the PDF to text and then read it, but I was hoping there was a simpler way. Also, if I could do it without actually having Adobe open on my screen, that would be a bonus. Another point: I have access to Adobe Pro and Adobe Reader, and either is fine.
Any help on this is much appreciated. It is my first time working with a PDF in VBScript.
Thanks,
Mike
Option Explicit
Dim accapp, acavdocu
Dim pdf_path, bReset, Wrd_count
pdf_path="C:\LS\Test\Invoices\02_2011_PDF\rpt_Invoice_1.pdf"
'AcroExch is acrobat application object
Set accapp=CreateObject("AcroExch.App")
accapp.Show()
'Need to create one AVDoc object per displayed document
Set acavdocu=CreateObject("AcroExch.AVDoc")
'Opening the PDF
If acavdocu.Open(pdf_path,"") Then
acavdocu.BringToFront()
bReset=1 : Wrd_count = 0
'FindText finds the specified text, scrolls so that it is visible, and highlights it
Do While acavdocu.FindText("Invoice#", 1, 1, bReset)
bReset=0 : Wrd_count=Wrd_count+1
'Wait 0, 200
Loop
End If
accapp.CloseAllDocs()
accapp.Exit()
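MsgBox "The word 'Invoice#' was found " & Wrd_count & " times"
'Release both objects (the original cleared an undeclared name, which fails under Option Explicit)
Set acavdocu=Nothing : Set accapp=Nothing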
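For the extraction step described in the question (find "Invoice#", then take the 8 characters to its right), here is a rough sketch rather than a tested solution. It assumes Acrobat Pro is installed (as far as I know, Reader does not expose the AcroExch automation objects), that the invoice text is real text and not a scanned image, and that Acrobat's word tokenizer keeps "Invoice#" and the number next to each other. ExtractAfterLabel and ReadPageText are names made up for this example. ReadPageText goes through AcroExch.PDDoc and its JavaScript bridge (GetJSObject, getPageNumWords, getPageNthWord), so no document window should need to be shown.

```vbscript
Option Explicit

' Hypothetical helper: returns up to tokenLength characters that follow
' label in pageText, skipping any spaces right after the label.
' Returns "" when the label is not found.
Function ExtractAfterLabel(pageText, label, tokenLength)
    Dim pos, rest
    ExtractAfterLabel = ""
    pos = InStr(1, pageText, label, vbTextCompare)
    If pos = 0 Then Exit Function
    rest = LTrim(Mid(pageText, pos + Len(label)))
    ExtractAfterLabel = Left(rest, tokenLength)
End Function

' Hypothetical helper: concatenates every word on one page (0-based index)
' into a single string. Assumes Acrobat Pro's AcroExch.PDDoc is available.
Function ReadPageText(pdfPath, pageIndex)
    Dim pdDoc, jso, i, wordCount, text
    text = ""
    Set pdDoc = CreateObject("AcroExch.PDDoc")
    If pdDoc.Open(pdfPath) Then
        Set jso = pdDoc.GetJSObject
        wordCount = jso.getPageNumWords(pageIndex)
        For i = 0 To wordCount - 1
            ' Third argument False asks Acrobat not to strip punctuation
            ' and trailing whitespace, so the "#" should survive.
            text = text & jso.getPageNthWord(pageIndex, i, False)
        Next
        pdDoc.Close
    End If
    Set jso = Nothing : Set pdDoc = Nothing
    ReadPageText = Trim(text)
End Function
```

With both functions in the script, ExtractAfterLabel(ReadPageText(pdf_path, 0), "Invoice#", 8) should give the string to use as the new file name. It is worth checking for an empty result before renaming, since a missing label returns "". If the tokenizer splits the label differently on your invoices, dumping the ReadPageText output once with MsgBox will show what the label actually looks like.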
ASKER CERTIFIED SOLUTION
Hope that helps