HeitmanProgrammers

Word To PDF Programmatically

How can I export a Word document to PDF using the Acrobat SDK 6.0 and VB .NET? Eventually my objective is to open documents in multiple formats and create one PDF file from them. I know I have to add a reference to Adobe Acrobat 6.0, but from there it is not clear to me how to proceed.

Any help is welcome
ASKER CERTIFIED SOLUTION
Karl Heinz Kremer
HeitmanProgrammers

ASKER

I don't know if I made myself clear, but I would like to do this programmatically. I do have the Standard version of Acrobat, and it will not be for server use. It will mostly be a stand-alone application.

As far as printing the file goes, I want the user to browse to any file and convert it to PDF. If I use the PostScript functionality, I would have to write a function for each supported file type in order to convert it to PDF. Therefore, I want to create something that does not depend on receiving a specific file format; instead, I want the user to be able to take any logical format (a format that can be converted to PDF) and create a PDF.

Any suggestions?
You can certainly try to implement the functionality that is in DocConverter, but this requires that you understand _ALL_ the file formats that you want to convert.

Just in case I did not make myself clear: You can do this programmatically. You can automate Word by using the published Word API in your own program to write out the PostScript file, and then you would use the published Distiller API in your own program to convert the PostScript file to PDF.

There is nothing in the Acrobat SDK that helps you to convert to PDF (with the exception of PostScript to PDF with Distiller).

If you really mean "something that is not dependent on receiving a specific file format", your task is impossible: Any format can be converted to PDF by printing to PDF, but this requires that you have the application that supports this specific format. You cannot take a format that you've never seen before and automatically convert it to PDF.

Even if you use DocConverter, which supports 280 different file formats, there are plenty of formats out there that it cannot convert. It's impossible to cover all possible formats in an automatic fashion.
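The Word-to-PostScript-to-Distiller route described above could be sketched roughly as follows in VB.NET. This is only an illustration under assumptions, not a tested implementation: the module and routine names are made up, the printer name "Adobe PDF" is an assumption about which PostScript printer is installed, and the ProgID "PdfDistiller.PdfDistiller.1" and its FileToPDF call should be checked against the Distiller API documentation for your Acrobat version.

```vb
Option Strict Off   ' the COM objects below are late bound

Imports System
Imports Microsoft.VisualBasic

Module WordToPdfSketch
    ' Hypothetical helper: prints a Word document to a PostScript file,
    ' then asks Distiller to turn that file into a PDF.
    Sub ConvertWordToPdf(ByVal docPath As String, ByVal psPath As String, ByVal pdfPath As String)
        ' Assumption: a PostScript printer with this name is installed.
        Const PostScriptPrinter As String = "Adobe PDF"

        Dim word As Object = CreateObject("Word.Application")
        Try
            word.ActivePrinter = PostScriptPrinter
            Dim doc As Object = word.Documents.Open(docPath)
            ' Print to a file instead of the device; Background:=False
            ' makes the call wait until printing has finished.
            doc.PrintOut(Background:=False, PrintToFile:=True, OutputFileName:=psPath)
            doc.Close(False)   ' close without saving changes
        Finally
            word.Quit()
        End Try

        ' Distiller's automation object converts PostScript to PDF.
        ' An empty job options string uses Distiller's current settings.
        Dim distiller As Object = CreateObject("PdfDistiller.PdfDistiller.1")
        distiller.FileToPDF(psPath, pdfPath, "")
    End Sub
End Module
```

Every other source format would need its own "print to PostScript" step through the application that understands it, which is the limitation pointed out above.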
I certainly do not have any problem with creating a PostScript file and using that and Distiller to create the PDF; however, I don't know what the user will select, unless I limit it to only certain formats. Let me ponder this.
The sole purpose of the SDK is to be able to manipulate files programmatically, correct? If so, how can you not open a Word file and save it as PDF?
No, the only purpose of the SDK is to sell more copies of Acrobat: As I said before, it's mostly documentation about how to automate (or program with/for) Acrobat. For most things in the SDK you need a full version of Acrobat, and the SDK only tells you how to use the APIs in Acrobat.
You can open some file types in Acrobat (Word's DOC format is one of them). If you want to limit the task to just those file formats that are supported by Acrobat directly, you can certainly do this programmatically. You do need the full version of Acrobat, however. The easiest way to do this is by using the Visual Basic/JavaScript bridge. Look into the directory C:\Program Files\Adobe\Acrobat 6.0 SDK\Documentation\JavaScript (this is part of the SDK). You will find a file (AcroJS.pdf) that describes the JavaScript API, and one file (VBJavaScript.pdf) that describes this VB/JS bridge.
The process is to get a handle to the app object, and then call the app.openDoc() method (e.g. app.openDoc("/c/test.doc");). Once you have the doc object for the new document (which is now in PDF format), you can save the document with the doc.saveAs() method and close the document. This will handle all those formats directly supported by Acrobat.
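As a rough VB.NET illustration of the steps just described (get to the JavaScript app object, openDoc, saveAs, close), something like the following might work. It is a sketch under assumptions: it needs a full Acrobat installation, it relies on late binding through the object returned by GetJSObject, the module and routine names are invented, and the exact behavior of each call should be verified against AcroJS.pdf and VBJavaScript.pdf.

```vb
Option Strict Off   ' the JavaScript bridge object is late bound

Imports System
Imports Microsoft.VisualBasic

Module AcrobatJsBridgeSketch
    ' Hypothetical helper. Both paths use Acrobat's device-independent
    ' form, e.g. "/c/test.doc" and "/c/test.pdf".
    Sub ConvertWithAcrobat(ByVal sourcePath As String, ByVal pdfPath As String)
        Dim pdDoc As Object = CreateObject("AcroExch.PDDoc")
        If Not pdDoc.Create() Then
            Throw New InvalidOperationException("Could not create an Acrobat PDDoc.")
        End If

        ' The JSObject exposes the Acrobat JavaScript object model to VB.
        Dim jso As Object = pdDoc.GetJSObject()

        ' Acrobat converts the source file while opening it.
        Dim converted As Object = jso.app.openDoc(sourcePath)
        converted.saveAs(pdfPath)
        converted.closeDoc(True)   ' True = close without prompting to save

        pdDoc.Close()
    End Sub
End Module
```

This only covers formats Acrobat can open directly; anything else still has to go through the owning application.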
I know how to get a handle to the app object; however, I don't see an OpenDoc method. Also, I got hold of the menuItemExecute method, with which I can bring up a dialog box (File > Create PDF > From Multiple Files); however, I can't get my program to click the Browse button so I can browse for a file. I looked into the sendKeys() method, which sends keystroke combinations to the active application (which in my case is my VB .NET Windows application). Is there a way to change which application is active and then use sendKeys() to click the Browse button?

About what you said: I looked into the VB/JS bridge; however, that doesn't tell me a whole lot beyond what I know already.
Maybe this might help. Here's an example using VBScript.
http://www.suodenjoki.dk/us/productions/articles/word2pdf.htm

A few years back we evaluated technology from a company called Outside In or so (they changed their name later) that converts between file formats. We wanted all documents to be in PDF. We ended up having users upload PostScript documents and used Ghostscript (it's free and command-line driven) to convert .ps to .pdf. Ghostscript was pretty good, and this was on the server side.
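For reference, the Ghostscript conversion mentioned above can be driven from VB.NET by starting the command-line executable. The sketch below is an assumption-laden illustration: the function name is made up, the caller has to supply the path to the Ghostscript console executable (commonly gswin32c.exe on Windows), and the switches shown are the usual pdfwrite invocation.

```vb
Imports System
Imports System.Diagnostics

Module GhostscriptSketch
    ' Hypothetical helper: runs Ghostscript to convert a .ps file to .pdf.
    ' Returns True when Ghostscript exits with code 0.
    Function PsToPdf(ByVal gsExe As String, ByVal psPath As String, ByVal pdfPath As String) As Boolean
        ' -dBATCH/-dNOPAUSE run without prompting; pdfwrite is the PDF output device.
        Dim args As String = String.Format(
            "-dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=""{0}"" ""{1}""",
            pdfPath, psPath)

        Dim info As New ProcessStartInfo(gsExe, args)
        info.UseShellExecute = False
        info.CreateNoWindow = True

        Using proc As Process = Process.Start(info)
            proc.WaitForExit()
            Return proc.ExitCode = 0
        End Using
    End Function
End Module
```

This replaces only the Distiller step; the source document still has to be printed to PostScript first.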


I actually got it to export files to PDF programmatically using the SDK. Using command-line arguments, I pass in a list of files (in different formats, e.g. doc, xls, pdf), which are all merged into one PDF. It works as long as the file is on my local computer. However, when the file is out on the network, it opens the file without a problem but then just hangs. In other words, if I copy the same file to my local computer, it works like it should.

Below is the code that I am using. It creates a PDF from the first file passed in, and from there it inserts the pages of each further file into the PDF already created. The lines where it hangs are marked with **********.

Try
            '********************************************************************
            Dim cmdLine() As String
            Dim splitArgs() As String
            Dim i As Integer
            Dim k As Integer
            Dim filepath As String
            '********************************************************************
            Dim AcroExchApp As Acrobat.CAcroApp
            Dim PDDoc As Acrobat.CAcroPDDoc
            Dim avdoc As Acrobat.CAcroAVDoc
            Dim avdoc2 As Acrobat.CAcroAVDoc
            Dim AcroExchPDDocSource As Object
            Dim insertpddoc As Acrobat.CAcroPDDoc
            Dim numberOfPages As Integer
            '********************************************************************
            cmdLine = Environment.GetCommandLineArgs()
            'the file list is expected as a single argument of the form /Ffile1,file2,...
            'if there are any arguments present
            If (cmdLine.Length > 1) Then
                For i = 0 To cmdLine.Length - 1
                    splitArgs = Split(cmdLine(i), "/")
                    For k = 0 To splitArgs.Length - 1
                        Select Case UCase(Mid(splitArgs(k), 1, 1))
                            Case "F"
                                filepath = Trim(UCase(Mid(splitArgs(k), 2, splitArgs(k).Length)))
                        End Select
                    Next k
                Next i
            End If

            numberOfPages = -1
            If (System.IO.File.Exists("c:\convertedfile.pdf") = True) Then
                Kill("c:\convertedfile.pdf")
            End If
            Dim filename() As String
            filename = Split(filepath, ",")

            For i = 0 To filename.Length - 1


                ' Create our Exchange application object (this starts Exchange)

                AcroExchApp = CreateObject("AcroExch.App")

                ' And our PDDoc object


                PDDoc = CreateObject("AcroExch.PDDoc")
                insertpddoc = CreateObject("AcroExch.PDDoc")
                avdoc = CreateObject("AcroExch.AVDoc")
                avdoc2 = CreateObject("AcroExch.AVDoc")

                If (i = 0) Then
                    avdoc.Open(filename(i), filename(i))           '***********THIS IS WHERE THE CODE HANGS IF IT'S THE FIRST FILE

                    avdoc = AcroExchApp.GetActiveDoc


                    If avdoc.IsValid Then

                        PDDoc = avdoc.GetPDDoc

                        ' Fill in pdf properties.
                        PDDoc.SetInfo("Title", "My Title")
                        PDDoc.SetInfo("Author", "The Author")
                        PDDoc.SetInfo("Subject", "The Subject")
                        PDDoc.SetInfo("Keywords", "Keywords")

                        If PDDoc.Save(1 Or 4 Or 32, "c:\convertedfile.pdf") <> True Then
                            MsgBox("Failed to save file")
                        End If

                        numberOfPages += PDDoc.GetNumPages
                        PDDoc.Close()

                    End If

                    'Close the PDF
                    avdoc.Close(True)
                    closeWord()
                Else

                    avdoc2.Open(filename(i), "file#" & i)            '***************THIS IS WHERE THE CODE HANGS IF IT'S NOT THE FIRST FILE
                    avdoc2 = AcroExchApp.GetActiveDoc

                    If insertpddoc.Open("c:\convertedfile.pdf") = False Then
                        MsgBox("failed insertpddoc")
                    End If

                    If (avdoc2.IsValid) Then
                        PDDoc = avdoc2.GetPDDoc
                        If insertpddoc.InsertPages(numberOfPages, PDDoc, 0, PDDoc.GetNumPages, 0) <> True Then
                            MsgBox("failed insert page")
                        End If

                        If (insertpddoc.Save(1 Or 4 Or 32, "c:\convertedfile.pdf") <> True) Then
                            MsgBox("Failed save after insert")
                        End If
                        numberOfPages += PDDoc.GetNumPages
                    End If
                    ' Close both documents exactly once, whether or not the insert succeeded
                    PDDoc.Close()
                    insertpddoc.Close()
                    avdoc2.Close(True)
                    closeWord()
                End If
                PDDoc = Nothing
                avdoc = Nothing
                avdoc2 = Nothing
                insertpddoc = Nothing
                AcroExchApp.Exit()
                AcroExchApp = Nothing
            Next
        'Cleanup
        Catch ex As Exception
            MsgBox(ex.GetBaseException.ToString)
        End Try
I actually worked out a fix for my issue with the network file hanging after it opened. My logic was simple:

-check if the file passed in is a network file
     if it is a network file then
           copy the file to a temporary directory on the local hard drive
           do what needs to be done with the file (converting to PDF)
           delete the file from the temporary directory
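The copy-to-local workaround above could look roughly like this in VB.NET. It is a sketch only: the module and routine names are hypothetical, the convert callback stands in for whatever does the actual PDF conversion, and DriveInfo requires .NET 2.0 or later.

```vb
Imports System
Imports System.IO

Module NetworkFileSketch
    ' True for UNC paths (\\machine\share\file) and for mapped network drives.
    Function IsNetworkPath(ByVal filePath As String) As Boolean
        Dim fullPath As String = Path.GetFullPath(filePath)
        If New Uri(fullPath).IsUnc Then Return True
        Return New DriveInfo(Path.GetPathRoot(fullPath)).DriveType = DriveType.Network
    End Function

    ' Runs convert on a local copy when the source is on the network,
    ' and removes the temporary copy afterwards.
    Sub ConvertViaLocalCopy(ByVal filePath As String, ByVal convert As Action(Of String))
        If Not IsNetworkPath(filePath) Then
            convert(filePath)
            Return
        End If

        Dim localCopy As String = Path.Combine(Path.GetTempPath(), Path.GetFileName(filePath))
        File.Copy(filePath, localCopy, True)   ' overwrite a stale copy if present
        Try
            convert(localCopy)
        Finally
            ' Delete the temporary copy even if the conversion throws.
            File.Delete(localCopy)
        End Try
    End Sub
End Module
```

Two files with the same name from different shares would collide in the temp directory, so a unique subfolder per run may be safer.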

If anyone has a better approach, please feel free to comment. Now what I need to figure out is how to add a digital signature to the file. Does anyone have any input on that?
I did a Google search and came across these.
Manually setting it.
http://www.planetpdf.com/enterprise/article.asp?ContentID=6396
http://www.adobe.com/epaper/tips/acr5digsig/page2.html

Programmatically setting it. The first article is in C++. I guess once you understand how it's done, it's easy to implement in .NET.
http://codeproject.com/useritems/PdfDigiPad.asp
http://www.15seconds.com/issue/040225.htm
How do you specify the files on the network (e.g. do you have a mapped network drive, or do you use the \\machine\path\file.pdf syntax)?
khkremer:

Yes, network path is specified as \\machine\path\file.pdf

Avi247:

I looked at the links that you provided; however, I didn't find them very useful. Is there anything else?
Try to map a drive letter to your share and see if this still fails. I think it's not the location where these files are stored, but how you access them (e.g. UNC vs. mapped network drive).
I did some tests here, and I can access information on a network drive without any problems with either the UNC syntax or the mapped network drive. What error are you getting?
As I mentioned earlier, I don't have any problem getting to the file; however, the conversion to PDF hangs as soon as it opens the document (the one over the network).
Do you have write access to the network folder? The conversion process may try to write a file to the same directory (AFAIK it will not actually write anything, but it may test for write access). If that's the case, you could check for this and only copy your files to your local temp. directory if you don't have write access.
As I said, it works for me: I can convert Word documents both on a remote and a local drive. However, I'm doing it slightly differently: I'm using the JSO object (the VB/JS bridge):


    Dim jso As Object
    Dim Doc As Object
...
    If PDDoc.Create() Then
        Set jso = PDDoc.GetJSObject
        Set Doc = jso.App.openDoc(FileName)
        MsgBox (Doc.numPages)
    End If

I only use the MsgBox to verify that the document was actually converted and opened.
Yes, I do have write access to that folder. I am actually the admin on that computer. It just seems very odd. Let me play around with it and see if I can find a better solution. Otherwise, I will just stick with what I had. Thanks a lot for your help, Khkremer.
From the SDK documentation, I found a Core API object which will hopefully enable me to add a digital signature to the created document. It is:

PDPermReqObjSignature

'this will add a digital signature to a document
Does anyone know which namespace it's in, or how I get to it from VB .NET?
Anything that is in the "Core API" is only available to Acrobat plug-ins. A plug-in needs to be written in C/C++, so this function will not be available to your VB.NET application. AFAIK there is no API for VB.NET programs to add signatures to a PDF document.

You need to use a 3rd party component to sign documents. Here is one example:
http://www.pdfstore.com/details.asp?ProdID=540

Hmmm... that's not a good thing. I looked at the link that you sent me; however, it doesn't say anything about adding a digital signature. I will look at it in detail and post back. Thank you very much for your help so far, Khkremer, I really appreciate it.
Sorry about this. I thought that this component would also do digital signatures.
This leaves me with just one recommendation for a pretty expensive package:
http://www.example-code.com/vb/vbPdfDetachedSig.asp

You might be able (never tried this myself) to use JavaScript: You can add a signature to a document with JavaScript. However, this can only be done when the JS is executed during a menu event, a batch process, or when the application is initialized. You will not be able to do this from your VB program - at least not directly: You can create JavaScript that adds a menu item to the Acrobat menu, and you can execute a menu item from your VB code. So, if you create a menu item that signs a document, and your VB app executes this menu item, you can actually sign a document from a VB app.
Khkremer, thank you very much for your help. I need software or hardware to handle digital signatures. My boss does not want to have a signature JPEG lying around on the network (which is what we are using right now); instead he would like to have a unique digital signature using Windows authentication (username) and secure that. I hope that makes sense. Do you know of any hardware or software that would let me accomplish what he requires?
Take a look at this application: http://www.ascertia.com/products/pdfsigner/ 
I don't know it, I know almost nothing about it, have never used it, and almost don't dare to recommend it.
Is this enough of a disclaimer? :-)
It seems pretty kewl, but it seems like something that I could already do with Adobe 6.0 Standard. The product demo showed typical digital signature creation using Adobe 6.0 Professional. Am I wrong? Yeah, how about that disclaimer!!! :)
You are probably not wrong :-)

Have you looked into the JavaScript option?
I have not looked into the JavaScript option, since this is going to be either a Windows application or a console application. Thanks for all your help, Khkremer, I really appreciate it.
I'm talking about Acrobat JavaScript: You can create a JavaScript that adds a menu item that, when executed, creates a signature field. Your application would then execute this menu item.
I will look into that. Do you have anything to get me started, or is there documentation in the SDK?
The document "Acrobat JavaScript Scripting Reference" (which is part of the SDK, but can also be downloaded from Adobe's web site) contains all information about how to create menu items and how to create a signature annotation. I've never done the latter part.
Awesome. Thanks a lot for pointing that out. I wish I could award you more points :) Have a good one.
You can always open a new question :-) And, I may just have the script for you :-)
I will if I can't figure it out on my own. :-{)
Please help me if you can

I want to install Acrobat Professional 7 on a network PC and set up a few network folders where users can save their MS Office 2002 documents (mainly Word, Excel and Visio). Then I want to write some code to convert the MS Office documents to PDF and save them back to the network folder for the users to access. Security is not my concern here.

I know MS Office 2007 has an add-in to do this job, but we have Office 2002, so that option is not suitable.

I read your post shown below, and I would be grateful if you could give me the source code and advise whether implementing my idea will have any hidden technical issues. The purpose is to reduce the cost of Acrobat licences.
I will find the code and upload it tomorrow.

HP
Please don't ask questions in an already closed question - be fair to everybody and put up some points and ask a new question.