• Status: Solved
  • Priority: Medium
  • Security: Public

Why do some merges produce repeated copies of the first PDF while others do not?

I have VB.NET code that takes a folder and merges all the PDFs in it into one PDF file.
I have not had any problems when I download random PDF files, place them all in one folder, and merge them. But when I use files I generated with iTextSharp, the combined file is the first file repeated once for every PDF file in the folder. If I merge the same files manually in Adobe Acrobat, I get the same result: the first file repeats for every PDF file in the folder.

All of the PDF files have the same page size, and each contains only one page.
I guess this boils down to: what setting in a PDF file prevents it from merging with other files, and how do I change that setting in iTextSharp? Any ideas would be greatly appreciated.

Note: I'm using Adobe Acrobat to merge the PDF files (not iTextSharp).
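
One quick check before looking for a PDF "setting" is whether the generated files really differ on disk, since a generator that writes the same content to every file would produce the same symptom in any merge tool. The following is a minimal sketch, not a diagnosis: it groups the PDFs in a folder by SHA-256 hash and reports byte-identical files. The module name, the function names, and the folder path are assumptions made for illustration, and the check cannot detect files that differ in bytes but render the same page.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Security.Cryptography

Module PdfDuplicateCheck

    ' Returns the SHA-256 hash of a file as a hex string.
    Function HashFile(ByVal filePath As String) As String
        Using sha As SHA256 = SHA256.Create()
            Using stream As FileStream = File.OpenRead(filePath)
                Return BitConverter.ToString(sha.ComputeHash(stream)).Replace("-", "")
            End Using
        End Using
    End Function

    ' Groups the PDFs in a folder by content hash. Any group with more
    ' than one entry holds byte-identical files.
    Function FindIdenticalFiles(ByVal folder As String) As Dictionary(Of String, List(Of String))
        Dim groups As New Dictionary(Of String, List(Of String))
        For Each pdfPath As String In Directory.GetFiles(folder, "*.pdf")
            Dim key As String = HashFile(pdfPath)
            If Not groups.ContainsKey(key) Then
                groups(key) = New List(Of String)
            End If
            groups(key).Add(pdfPath)
        Next
        Return groups
    End Function

    Sub Main()
        ' Hypothetical folder; point this at the folder being merged.
        For Each entry As KeyValuePair(Of String, List(Of String)) In FindIdenticalFiles("C:\PDFs\")
            If entry.Value.Count > 1 Then
                Console.WriteLine("Identical content: " & String.Join(", ", entry.Value.ToArray()))
            End If
        Next
    End Sub

End Module
```

If every file lands in its own group, the files are at least different at the byte level, and the cause has to be sought elsewhere.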

Asked by: JohnnyBCJ
1 Solution
 
oobaylyCommented:
Can you show the code you're using to merge the files? It vaguely sounds like you're referencing the first file every time, though I'd imagine you've already checked for that.
 
JohnnyBCJAuthor Commented:
I can't believe it's in the code, because if I merge the files using Acrobat I get exactly the same results as with the code. That is why I'm posting here at Experts Exchange: to see if anyone knows of a property in the PDF file (or any other reason) that would cause the first file to repeat itself for every PDF file in the folder, even when I'm using Adobe Acrobat!

Here is the code.

Option Explicit On ' Force variable declaration
Imports System.IO
 
Public Class frmCombinePDFs
 
    Public Const PDF_WILDCARD = "*.pdf"
    Public Const PDF_DIRECTORY = "C:\PDFs\"
    Public Const JOIN_FILENAME = "complete.pdf"
    Dim DM As New DMClass  
 
   Private Sub frmCombinePDFs_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        JoinAllAcrobatDocsInDir()
    End Sub
 
    ' Author : Planet PDF
    ' E-Mail: info@artspdf.com
    ' Date : 08 March 1998
    ' Description: JoinAllAcrobatDocsInDir
    ' This vb method uses IAC to join all PDF documents in a folder 
    ' to a predetermined filename
    ' This method / function should be extended to suit the requirements
    ' of an organisation
    Sub JoinAllAcrobatDocsInDir()
 
        Dim AcroExchApp As Object, AcroExchPDDoc As Object, AcroExchInsertPDDoc As Object
        Dim strFileName As String, strPath As String
        Dim iNumberOfPagesToInsert As Integer, iLastPage As Integer
        AcroExchApp = CreateObject("AcroExch.App")
        AcroExchPDDoc = CreateObject("AcroExch.PDDoc")
 
        ' Show the Acrobat Exchange window
        AcroExchApp.Show()
 
        ' Set the directory / folder to use
        strPath = PDF_DIRECTORY
 
        ' Get the first pdf file in the directory
        strFileName = Dir(strPath + PDF_WILDCARD, vbNormal)
 
        ' Open the first file in the directory
        AcroExchPDDoc.Open(strPath + strFileName)
 
        ' Get the name of the next file in the directory [if any]
        If strFileName <> "" Then
            strFileName = Dir()
 
            ' Start the loop.
            Do While strFileName <> ""
 
                ' Get the total pages less one for the last page num [zero based]
                iLastPage = AcroExchPDDoc.GetNumPages - 1
                AcroExchInsertPDDoc = CreateObject("AcroExch.PDDoc")
 
                ' Open the file to insert
                AcroExchInsertPDDoc.Open(strPath + strFileName)
 
                ' Get the number of pages to insert
                iNumberOfPagesToInsert = AcroExchInsertPDDoc.GetNumPages
 
                ' Insert the pages
                AcroExchPDDoc.InsertPages(iLastPage, AcroExchInsertPDDoc, 0, iNumberOfPagesToInsert, True)
 
                ' Close the document
                AcroExchInsertPDDoc.Close()
 
                ' Get the name of the next file in the directory
                strFileName = Dir()
            Loop
 
            ' Save the entire document as the JOIN_FILENAME using SaveFull [0x0001 = &H1]
            AcroExchPDDoc.Save(&H1, strPath + JOIN_FILENAME)
 
        End If
 
        ' Close the PDDoc
        AcroExchPDDoc.Close()
 
        ' Close Acrobat Exchange
        AcroExchApp.Exit()
    End Sub
End Class

 
oobaylyCommented:
If the problem also occurs when you create a new PDF document from several existing PDFs in the Acrobat application, that implies an issue with Acrobat itself rather than with the code. I've never come across that behaviour in Acrobat before (though we're still using 7 as it does what we need).

I'm afraid I'm not an Acrobat guru, so I'll be of limited use. I will suggest some changes to the script, though, as the use of Dir() makes it pretty awkward to read (and, in my personal opinion, nasty, but then that's 20th-century VB code for you :-)
Dim files As FileInfo() = New DirectoryInfo(strPath).GetFiles("*.pdf")
If (files.Length > 0) Then
  '' Create code
 
  For Each file As FileInfo In files
    '' Use file.FullName in place of strFileName
    '' Append code
 
  Next
 
  '' Save code
 
End If

 
JohnnyBCJAuthor Commented:
I converted the code to the following.
Now I'm getting an error:
The server threw an exception. (Exception from HRESULT: 0x80010105 (RPC_E_SERVERFAULT))

On the line:
AcroExchPDDoc.Close()

 Private Sub JoinPDFsInDir()
        Dim AcroExchApp As Object, AcroExchPDDoc As Object, AcroExchInsertPDDoc As Object
        Dim strFileName As String, strPath As String
        Dim iNumberOfPagesToInsert As Integer, iLastPage As Integer
        AcroExchApp = CreateObject("AcroExch.App")
        AcroExchPDDoc = CreateObject("AcroExch.PDDoc")
        ' Show the Acrobat Exchange window
        AcroExchApp.Show()
        ' Set the directory / folder to use
        strPath = PDF_DIRECTORY
        ' Get the first pdf file in the directory
        strFileName = Dir(strPath + PDF_WILDCARD, vbNormal)
        ' Open the first file in the directory
        AcroExchPDDoc.Open(strPath + strFileName)
 
 
 
 
        Dim files As FileInfo() = New DirectoryInfo(strPath).GetFiles("*.pdf")
        If (files.Length > 0) Then
            '' Create code
            strFileName = files.First.FullName
 
            For Each file As FileInfo In files
                '' Use file.FullName in place of strFileName
                '' Append code
                ' Get the total pages less one for the last page num [zero based]
                iLastPage = AcroExchPDDoc.GetNumPages - 1
                AcroExchInsertPDDoc = CreateObject("AcroExch.PDDoc")
                ' Open the file to insert
                AcroExchInsertPDDoc.Open(file.FullName)
                ' Get the number of pages to insert
                iNumberOfPagesToInsert = AcroExchInsertPDDoc.GetNumPages
                ' Insert the pages
                AcroExchPDDoc.InsertPages(iLastPage, AcroExchInsertPDDoc, 0, iNumberOfPagesToInsert, True)
                ' Close the document
                AcroExchInsertPDDoc.Close()
                ' Get the name of the next file in the directory
                'strFileName = Dir()
            Next
            ' Save the entire document as the JOIN_FILENAME using SaveFull [0x0001 = &H1]
            AcroExchPDDoc.Save(&H1, strPath + JOIN_FILENAME)
        End If
        ' Close the PDDoc
        AcroExchPDDoc.Close()
        ' Close Acrobat Exchange
        AcroExchApp.Exit()
    End Sub

 
oobaylyCommented:
You probably shouldn't be acting on the AcroExch objects outside of the If block, as there's nothing to be done there. Did you step through the code to confirm that any files are being returned?

The suggestion I made about updating the code was merely to improve readability, rather than functionality. The following compiles, but I can't verify that it executes properly. I still believe you'll need to look into the Acrobat issues anyway.
    Private Sub JoinPDFsInDir()
      Dim files As FileInfo() = New DirectoryInfo(PDF_DIRECTORY).GetFiles(PDF_WILDCARD)
      If (files.Length > 0) Then
        '' Create code
        Dim AcroExchApp As Object, AcroExchPDDoc As Object, AcroExchInsertPDDoc As Object
        AcroExchApp = CreateObject("AcroExch.App")
        AcroExchPDDoc = CreateObject("AcroExch.PDDoc")
        ' Show the Acrobat Exchange window
        AcroExchApp.Show()
 
        ' Open the first file 
        AcroExchPDDoc.Open(files(0).FullName)
 
        ' Append the remaining files
        For i As Integer = 1 To files.Length - 1
          ' Get the total pages less one for the last page num [zero based]
          Dim iLastPage As Integer = AcroExchPDDoc.GetNumPages - 1
          AcroExchInsertPDDoc = CreateObject("AcroExch.PDDoc")
 
          ' Open the file to insert
          AcroExchInsertPDDoc.Open(files(i).FullName)
 
          ' Get the number of pages to insert
          Dim iNumberOfPagesToInsert As Integer = AcroExchInsertPDDoc.GetNumPages
 
          ' Insert the pages
          AcroExchPDDoc.InsertPages(iLastPage, AcroExchInsertPDDoc, 0, iNumberOfPagesToInsert, True)
 
          ' Close the document
          AcroExchInsertPDDoc.Close()
 
        Next 
        ' Save the entire document as the JOIN_FILENAME using SaveFull [0x0001 = &H1]
        AcroExchPDDoc.Save(&H1, PDF_DIRECTORY + JOIN_FILENAME)
 
        ' Close the PDDoc
        AcroExchPDDoc.Close()
        ' Close Acrobat Exchange
        AcroExchApp.Exit()
      End If
    End Sub
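
Two details of the file list may be worth checking when stepping through code like this. The .NET documentation does not guarantee the order of the names returned by GetFiles, and the code in this thread saves complete.pdf into the same folder it reads from, so a second run would pick up the previous output as an input. The following is a minimal sketch of a helper that sorts the list and skips the output file; the module and function names are assumptions, not part of any posted code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module PdfFileList

    ' Returns the full paths of the PDFs in a folder, sorted by path,
    ' leaving out the merged output file from an earlier run.
    Function GetSourcePdfs(ByVal folder As String, ByVal outputName As String) As List(Of String)
        Dim result As New List(Of String)
        For Each pdfPath As String In Directory.GetFiles(folder, "*.pdf")
            If Not String.Equals(Path.GetFileName(pdfPath), outputName, StringComparison.OrdinalIgnoreCase) Then
                result.Add(pdfPath)
            End If
        Next
        ' GetFiles does not promise any order, so sort explicitly.
        result.Sort(StringComparer.OrdinalIgnoreCase)
        Return result
    End Function

    Sub Main()
        ' Print the list to confirm which files would be merged, and in what order.
        For Each pdfPath As String In GetSourcePdfs("C:\PDFs\", "complete.pdf")
            Console.WriteLine(pdfPath)
        Next
    End Sub

End Module
```

Printing the list this way also answers the question of whether any files are being returned at all, without involving Acrobat.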

 
JohnnyBCJAuthor Commented:
Sorry for the delay.
I solved the issue using a 3rd party program. I do appreciate the help you offered me.
 
JohnnyBCJAuthor Commented:
Thanks for your help!