HeitmanProgrammers
asked on
Word To PDF Programmatically
How can I export a Word document to PDF using the Acrobat SDK 6.0 and VB.NET? Eventually my objective is to open documents in multiple formats and create one PDF file from them. I know I have to add the Adobe Acrobat 6.0 reference, but from there it is confusing which way to proceed.
Any help is welcome
You can certainly try to implement the functionality that is in DocConverter, but this requires that you understand _ALL_ the file formats that you want to convert.
Just in case I did not make myself clear: You can do this programmatically. You can automate Word by using the published Word API in your own program to write out a PostScript file, and then use the published Distiller API in your own program to convert the PostScript file to PDF.
There is nothing in the Acrobat SDK that helps you to convert to PDF (with the exception of PostScript to PDF with Distiller).
If you really mean "something that is not dependent on receiving specific format file", your task is impossible: Any format can be converted to PDF by printing to PDF, but this requires that you have the application that supports this specific format. You cannot take a format that you've never seen before and automatically convert it to PDF.
Even if you use DocConverter, which supports 280 different file formats, there are plenty of formats out there that it cannot convert. It's impossible to cover all possible formats in an automatic fashion.
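The two-step route described above (Word writes a PostScript file, Distiller turns it into a PDF) might look roughly like this in VB.NET. It is a minimal sketch under stated assumptions, not a tested implementation for Acrobat 6.0: it relies on late binding (Option Strict Off), assumes Word and Acrobat Distiller are installed, and assumes a PostScript printer driver is installed under the name "Adobe PDF". The module and helper names are made up for illustration.

```vbnet
Option Strict Off   ' late binding on the COM objects below
Imports System.IO

Module WordToPdfSketch

    ' Hypothetical helper: same folder and base name as the source, new extension.
    Function WithExtension(ByVal sourcePath As String, ByVal ext As String) As String
        Return Path.ChangeExtension(sourcePath, ext)
    End Function

    ' Step 1: have Word print the document to a PostScript file.
    ' psPrinterName must name an installed PostScript printer (assumption).
    Sub WordToPostScript(ByVal docPath As String, ByVal psPath As String, ByVal psPrinterName As String)
        Dim word As Object = CreateObject("Word.Application")
        Try
            word.Visible = False
            word.ActivePrinter = psPrinterName
            Dim doc As Object = word.Documents.Open(docPath)
            ' Background:=False keeps PrintOut from returning before spooling is done.
            doc.PrintOut(Background:=False, PrintToFile:=True, OutputFileName:=psPath)
            doc.Close(False)
        Finally
            word.Quit()
        End Try
    End Sub

    ' Step 2: hand the PostScript file to Distiller.
    ' The empty job-options argument is assumed to select Distiller's current settings.
    Function PostScriptToPdf(ByVal psPath As String, ByVal pdfPath As String) As Boolean
        Dim distiller As Object = CreateObject("PdfDistiller.PdfDistiller.1")
        distiller.FileToPDF(psPath, pdfPath, "")
        ' Judge success by the output file rather than by the return code.
        Return File.Exists(pdfPath)
    End Function

    ' Runs both steps for one Word file and removes the intermediate .ps file.
    Sub ConvertWordFile(ByVal docPath As String)
        Dim psPath As String = WithExtension(docPath, ".ps")
        WordToPostScript(docPath, psPath, "Adobe PDF")
        If Not PostScriptToPdf(psPath, WithExtension(docPath, ".pdf")) Then
            Throw New IOException("Distiller did not produce a PDF for " & docPath)
        End If
        File.Delete(psPath)
    End Sub

End Module
```

Each additional source format needs its own step 1, driven by the application that understands that format; step 2 stays the same.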
ASKER
I certainly do not have any problem with creating a PostScript file and using that and Distiller to create the PDF; however, I don't know what the user will select, unless I limit it to certain formats. Let me ponder this.
ASKER
The sole purpose of the SDK is to be able to manipulate files programmatically, correct? If so, how can you not open a Word file and save it as PDF?
No, the only purpose of the SDK is to sell more copies of Acrobat: As I said before, it's mostly documentation about how to automate (or program with/for) Acrobat. For most things in the SDK you need a full version of Acrobat, and the SDK only tells you how to use the APIs in Acrobat.
You can open some file types in Acrobat (Word's DOC format is one of them). If you want to limit the task to just those file formats that are supported by Acrobat directly, you can certainly do this programmatically. You do need the full version of Acrobat, however. The easiest way to do this is by using the Visual Basic/JavaScript bridge. Look into the directory C:\Program Files\Adobe\Acrobat 6.0 SDK\Documentation\JavaScript (this is part of the SDK). You will find a file (AcroJS.pdf) that describes the JavaScript API, and one file (VBJavaScript.pdf) that describes this VB/JS bridge.
The process is to get a handle to the app object, and then call the app.openDoc() method (e.g. app.openDoc("/c/test.doc");). Once you have the doc object for the new document (which is now in PDF format), you can save the document with the doc.saveAs() method and close the document. This will handle all those formats directly supported by Acrobat.
ASKER
I know how to get a handle on the app object; however, I don't see an OpenDoc method. Also, I got hold of the menuItemExecute method, with which I could bring up a dialog box (File > Create PDF > From Multiple Files); however, I can't get my program to click the Browse button so that I can browse for a file. I looked into the SendKeys() method, which sends keystroke combinations to the active application (which in my case is my VB.NET Windows application). Is there a way I can change which application is active and then use SendKeys() to click the Browse button?
About what you said: I looked into the VB/JS bridge; however, that doesn't tell me a whole lot. It is what I know already.
Maybe this will help. Here's an example using VBScript.
http://www.suodenjoki.dk/us/productions/articles/word2pdf.htm
A few years back we evaluated technology from a company called Outside In or so (they changed their name later) that converts between file formats. We wanted all documents to be in PDF. We ended up having the users upload PostScript documents and used Ghostscript (it's free and runs from the command line) to convert .ps to .pdf.
Ghostscript was pretty good, and this was on the server side.
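For the Ghostscript route, a VB.NET program can shell out to the Ghostscript console executable. The sketch below is only an illustration: the module and function names are made up, and the executable path passed in as gsExe (for example gswin32c.exe from a 32-bit Windows install) is an assumption about your setup. The switches used are the standard Ghostscript ones for a non-interactive pdfwrite run.

```vbnet
Imports System.Diagnostics

Module GhostscriptSketch

    ' Builds the argument string for a non-interactive PostScript-to-PDF run.
    ' Both paths are quoted so that folders with spaces survive.
    Function BuildGsArguments(ByVal psPath As String, ByVal pdfPath As String) As String
        Return "-dBATCH -dNOPAUSE -sDEVICE=pdfwrite " & _
               "-sOutputFile=""" & pdfPath & """ """ & psPath & """"
    End Function

    ' gsExe: path to the Ghostscript console executable (assumption: installed separately).
    ' Returns True when Ghostscript exits with code 0.
    Function ConvertPsToPdf(ByVal gsExe As String, ByVal psPath As String, ByVal pdfPath As String) As Boolean
        Dim info As New ProcessStartInfo(gsExe, BuildGsArguments(psPath, pdfPath))
        info.UseShellExecute = False
        info.CreateNoWindow = True
        Dim gs As Process = Process.Start(info)
        gs.WaitForExit()
        Return gs.ExitCode = 0
    End Function

End Module
```

A call might look like ConvertPsToPdf("C:\gs\bin\gswin32c.exe", "C:\in\report.ps", "C:\out\report.pdf"), where all three paths are placeholders.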
ASKER
I actually got it to export files to PDF programmatically using the SDK. Using command-line arguments, I pass in a list of files (in different formats, e.g. doc, xls, pdf), which are all merged into one PDF. It works as long as the files are on my local computer. However, when a file is out on the network, Acrobat opens the file without a problem and then just hangs. In other words, if I copy the same file to my local computer, it works as it should.
Below is the code that I am using. It creates a PDF from the first file passed in and from there just inserts pages into the file already created. The lines where it hangs are marked with a trailing comment starting with ***.
Try
    '*****************************************************
    Dim cmdLine() As String
    Dim splitArgs() As String
    Dim i As Integer
    Dim k As Integer
    Dim filepath As String
    '*****************************************************
    Dim AcroExchApp As Acrobat.CAcroApp
    Dim PDDoc As Acrobat.CAcroPDDoc
    Dim avdoc As Acrobat.CAcroAVDoc
    Dim avdoc2 As Acrobat.CAcroAVDoc
    Dim AcroExchPDDocSource As Object
    Dim insertpddoc As Acrobat.CAcroPDDoc
    Dim numberOfPages As Integer
    '*****************************************************
    cmdLine = Environment.GetCommandLineArgs()
    'get the product version number (defined in the assembly file)
    'if there are any arguments present
    If (cmdLine.Length > 1) Then
        For i = 0 To cmdLine.Length - 1
            splitArgs = Split(cmdLine(i), "/")
            For k = 0 To splitArgs.Length - 1
                Select Case UCase(Mid(splitArgs(k), 1, 1))
                    Case "F"
                        filepath = Trim(UCase(Mid(splitArgs(k), 2, splitArgs(k).Length)))
                End Select
            Next k
        Next i
    End If
    numberOfPages = -1
    If (System.IO.File.Exists("c:\convertedfile.pdf") = True) Then
        Kill("c:\convertedfile.pdf")
    End If
    Dim filename() As String
    filename = Split(filepath, ",")
    For i = 0 To filename.Length - 1
        ' Create our Exchange application object (this starts Exchange)
        AcroExchApp = CreateObject("AcroExch.App")
        ' And our PDDoc object
        PDDoc = CreateObject("AcroExch.PDDoc")
        insertpddoc = CreateObject("AcroExch.PDDoc")
        avdoc = CreateObject("AcroExch.AVDoc")
        avdoc2 = CreateObject("AcroExch.AVDoc")
        If (i = 0) Then
            avdoc.Open(filename(i), filename(i)) '***********THIS IS WHERE THE CODE HANGS IF IT'S THE FIRST FILE
            avdoc = AcroExchApp.GetActiveDoc
            If avdoc.IsValid Then
                PDDoc = avdoc.GetPDDoc
                ' Fill in pdf properties.
                PDDoc.SetInfo("Title", "My Title")
                PDDoc.SetInfo("Author", "The Author")
                PDDoc.SetInfo("Subject", "The Subject")
                PDDoc.SetInfo("Keywords", "Keywords")
                If PDDoc.Save(1 Or 4 Or 32, "c:\convertedfile.pdf") <> True Then
                    MsgBox("Failed to save file")
                End If
                numberOfPages += PDDoc.GetNumPages
                PDDoc.Close()
            End If
            'Close the PDF
            avdoc.Close(True)
            closeWord()
        Else
            avdoc2.Open(filename(i), "file#" & i) '***************THIS IS WHERE THE CODE HANGS IF IT'S NOT THE FIRST FILE
            avdoc2 = AcroExchApp.GetActiveDoc
            If insertpddoc.Open("c:\convertedfile.pdf") = False Then
                MsgBox("failed insertpddoc")
            End If
            If (avdoc2.IsValid) Then
                PDDoc = avdoc2.GetPDDoc
                If insertpddoc.InsertPages(numberOfPages, PDDoc, 0, PDDoc.GetNumPages, 0) <> True Then
                    MsgBox("failed insert page")
                End If
                If (insertpddoc.Save(1 Or 4 Or 32, "c:\convertedfile.pdf") <> True) Then
                    MsgBox("Failed save after insert")
                End If
                numberOfPages += PDDoc.GetNumPages
                PDDoc.Close()
                insertpddoc.Close()
            End If
            PDDoc.Close()
            insertpddoc.Close()
            avdoc2.Close(True)
            closeWord()
        End If
        PDDoc = Nothing
        avdoc = Nothing
        avdoc2 = Nothing
        insertpddoc = Nothing
        AcroExchApp.Exit()
        AcroExchApp = Nothing
    Next
    'Cleanup
Catch ex As Exception
    MsgBox(ex.GetBaseException.ToString)
End Try
ASKER
I worked out a workaround for my issue with the network file hanging after it is opened. My logic is simple:
- check whether the file passed in is a network file
- if it is a network file, then
  - copy the file to a temporary directory on the local hard drive
  - do what needs to be done with the file (converting it to PDF)
  - delete the file from the temporary directory
If anyone has better logic, please feel free to comment. What I need to figure out now is how to add a digital signature to the file. Does anyone have any input on that?
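The copy-to-local workaround listed above could be sketched in VB.NET as follows. This is only an illustration of those steps: the module and function names are made up, the network check covers UNC paths (\\machine\path\file) only and not mapped drive letters, and the actual conversion is passed in as a delegate rather than shown. Action(Of String) requires .NET 2.0 or later.

```vbnet
Imports System.IO

Module NetworkFileSketch

    ' True for UNC paths such as \\machine\share\file.doc.
    ' Mapped drive letters are not detected here (a deliberate simplification).
    Function IsUncPath(ByVal filePath As String) As Boolean
        Return filePath.StartsWith("\\")
    End Function

    ' Copies a network file into the local temp folder and returns the copy's path.
    ' Local files are returned unchanged.
    Function LocalWorkingCopy(ByVal filePath As String) As String
        If Not IsUncPath(filePath) Then Return filePath
        Dim localPath As String = Path.Combine(Path.GetTempPath(), Path.GetFileName(filePath))
        File.Copy(filePath, localPath, True)   ' True = overwrite a stale copy
        Return localPath
    End Function

    ' Runs the supplied conversion on a local copy and removes the copy afterwards,
    ' even if the conversion throws.
    Sub ConvertViaLocalCopy(ByVal filePath As String, ByVal convert As Action(Of String))
        Dim workPath As String = LocalWorkingCopy(filePath)
        Try
            convert(workPath)
        Finally
            If workPath <> filePath Then File.Delete(workPath)
        End Try
    End Sub

End Module
```

Two network files with the same name would overwrite each other in the temp folder, so a real version would probably add a unique subfolder or prefix per file.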
I did a Google search and came across these.
Setting it manually:
http://www.planetpdf.com/enterprise/article.asp?ContentID=6396
http://www.adobe.com/epaper/tips/acr5digsig/page2.html
Setting it programmatically (the first article is in C++; I guess once you understand how it's done, it's easy to implement in .NET):
http://codeproject.com/useritems/PdfDigiPad.asp
http://www.15seconds.com/issue/040225.htm
How do you specify the files on the network (e.g. do you have a mapped network drive, or do you use the \\machine\path\file.pdf syntax)?
ASKER
khkremer:
Yes, the network path is specified as \\machine\path\file.pdf
Avi247:
I looked at the links you provided; however, I couldn't find anything useful in them. Is there anything else?
Try to map a drive letter to your share and see if this still fails. I think it's not the location where these files are stored, but how you access them (e.g. UNC vs. mapped network drive).
I did some tests here, and I can access information on a network drive without any problems with either the UNC syntax or the mapped network drive. What error are you getting?
ASKER
As I mentioned earlier, I don't have any problem getting to the file; however, the conversion to PDF hangs as soon as it opens the document (the one over the network).
Do you have write access to the network folder? The conversion process may try to write a file to the same directory (AFAIK it will not actually write anything, but it may test for write access). If that's the case, you could check for this and only copy your files to your local temp. directory if you don't have write access.
As I said, it works for me: I can convert Word documents on both a remote and a local drive. I am, however, doing it slightly differently: I'm using the JSO object (the VB/JS bridge):
Dim jso As Object
Dim Doc As Object
...
If PDDoc.Create() Then
Set jso = PDDoc.GetJSObject
Set Doc = jso.App.openDoc(FileName)
MsgBox (Doc.numPages)
End If
I only use the MsgBox to verify that the document was actually converted and opened.
ASKER
Yes, I do have write access to that folder. I am actually the admin on that computer. It just seems very odd. Let me play around with it and see if I can find a better solution. Otherwise, I will just stick with what I had. Thanks a lot for your help, Khkremer.
ASKER
From the SDK documentation, I found a Core API object which will hopefully enable me to add a digital signature to the created document. It is:
PDPermReqObjSignature
'this will add digital signature to a document
ASKER
Does anyone know which namespace it's in, or how I get to it from VB.NET?
Anything that is in the "Core API" is only available to Acrobat plug-ins. A plug-in needs to be written in C/C++, so this function will not be available to your VB.NET application. AFAIK there is no API for VB.NET programs to add signatures to a PDF document.
You need to use a 3rd party component to sign documents. Here is one example:
http://www.pdfstore.com/details.asp?ProdID=540
ASKER
Hmmm... that's not a good thing. I looked at the link that you sent me; however, it doesn't say anything about adding a digital signature. I will look at it in detail and post back. Thank you very much for your help so far, Khkremer, I really appreciate it.
Sorry about this. I thought that this component would also do digital signatures.
This leaves me with just one recommendation for a pretty expensive package:
http://www.example-code.com/vb/vbPdfDetachedSig.asp
You might be able (never tried this myself) to use JavaScript: You can add a signature to a document with JavaScript. However, this can only be done when the JS is executed during a menu event, a batch process, or when the application is initialized. You will not be able to do this from your VB program - at least not directly: You can create JavaScript that adds a menu item to the Acrobat menu, and you can execute a menu item from your VB code. So, if you create a menu item that signs a document, and your VB app executes this menu item, you can actually sign a document from a VB app.
ASKER
khkremer, thank you very much for your help. I need software or hardware to handle digital signatures. My boss does not want to have a signature JPEG lying around on the network (which is what we are using right now); instead, he would like a unique digital signature based on Windows authentication (the username), and to secure that. I hope that makes sense. Do you know of any hardware or software that would let me accomplish what he requires?
Take a look at this application: http://www.ascertia.com/products/pdfsigner/
I don't know it, I know almost nothing about it, have never used it, and almost don't dare to recommend it.
Is this enough of a disclaimer? :-)
ASKER
It seems pretty kewl, but it looks like something that I could already do with Acrobat 6.0 Standard. The product demo showed typical digital signature creation using Acrobat 6.0 Professional. Am I wrong? Yeah, how about that disclaimer!!! :)
You are probably not wrong :-)
Have you looked into the JavaScript option?
ASKER
I have not looked into the JavaScript option, since this is going to be either a Windows application or a console application. Thanks for all your help, Khkremer, I really appreciate it.
I'm talking about Acrobat JavaScript: You can create a JavaScript that adds a menu item that, when executed, creates a signature field. Your application would then execute this menu item.
ASKER
I will look into that. Do you have anything to get me started, or is there documentation in the SDK?
The document "Acrobat JavaScript Scripting Reference" (which is part of the SDK, but can also be downloaded from Adobe's web site) contains all information about how to create menu items and how to create a signature annotation. I've never done the latter part.
ASKER
Awesome. Thanks a lot for pointing that out. I wish I could award you more points :) Have a good one.
You can always open a new question :-) And, I may just have the script for you :-)
ASKER
I will if I can't figure it out on my own. :-{)
Please help me if you can
I want to install Acrobat Professional 7 on a network PC and set up a few network folders where users can save their MS Office 2002 documents (mainly Word, Excel and Visio). I would then write some code to convert the MS Office documents to PDF and save them back to the network folder for the users to access. Security is not my concern here.
I know MS Office 2007 has an add-in to do this job, but we have Office 2002, so that option is not suitable.
I read your post shown below, and I would be grateful if you could give me the source code and advise whether implementing my idea will have any hidden technical issues. The purpose is to reduce the cost of Acrobat licences.
ASKER
I will find the code and upload it tomorrow.
HP
Please don't ask questions in an already closed question - be fair to everybody and put up some points and ask a new question.
ASKER
As far as printing the file goes, I want the user to browse to any file and convert it to PDF. If I use the PostScript functionality, I would have to write a function for each supported file type in order to convert it to PDF. Therefore, I want to create something that is not dependent on receiving a specific file format; instead, I want the user to be able to take any logical format (a format that can be converted to PDF) and create a PDF.
Any suggestions?