Combine PDFs into one file per subfolder across multiple folders

I have one folder called PDFFILES, and in that folder there are many subfolders.
Each subfolder may contain 1, 2, or 3 PDFs.

I want to run a script that will look in the main PDFFILES folder.
Then it needs to look into each subfolder and, if it finds more than one PDF there, join them all into a single PDF file per subfolder.

Can this be done?
Many thanks for all your assistance.
Asked by 100questions

Lee W, MVP (Technology and Business Process Advisor) commented:
It can be, but not DIRECTLY in a vbscript as far as I can tell (without potentially purchasing third party software).  At one client, I have a vbscript that launches a free command line tool to do this.  The vbscript locates the files I need, copies them to a working directory after renaming so I can assemble in a specific order, and assembles them into a single PDF.  Your circumstance may be easier if the files are all in the same directory and ALL need to be combined.

100questions (Author) commented:
Thanks kindly. To clarify, each subfolder would contain files whose names start with the same 5 digits, such as 12345Q.pdf, 12345A.pdf, and perhaps even 12345C.pdf, and I want them all combined into one file called 12345.pdf.

What software do you recommend, even if I need to purchase it?

Bill Prew (IT / Software Engineering Consultant) commented:
PDFtk is a great free / low cost option for merge and split type manipulations and has command line support.

You could do everything you want from a fairly simple BAT script, if you want to go that route.

https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/


»bp

Lee W, MVP (Technology and Business Process Advisor) commented:
PDFtk is what I've been using as the command line tool to combine things with the vbScript.

Bill Prew (IT / Software Engineering Consultant) commented:
Just to give you an idea, here's an approach in BAT script, pretty simple, you just have to edit the SET statements near the top...

@echo off
setlocal EnableDelayedExpansion

rem Define location of files and folders
set BaseDir=B:\EE\EE29072521\PDFFILES
set DestDir=B:\EE\EE29072521\PDFMERGE
set PDFtk=C:\_pf\PDFtk\bin\pdftk.exe

rem Look at each subfolder in the base source folder
for /d %%D in ("%BaseDir%\*.*") do (

    rem Look at the PDF files in this folder and get a name from one
    for %%F in ("%%~D\*.pdf") do (
        set Name=%%~nF
    )

    rem Remove the rightmost character for the name to get the name for the merged PDF file
    set Name=!Name:~0,-1!

    rem Merge all PDF files into one and store in the destination folder
    "%PDFtk%" "%%~D\*.pdf" cat output "%DestDir%\!Name!.pdf"
)

»bp

Joe Winograd, Fellow & MVE (Developer) commented:
I wrote a program called CAPTAIN, which is an acronym for Combine All PDFs Together with Agreeing Initial Names. CAPTAIN is an enhancement of a program that I first discussed in my EE article, How to Combine-Merge PDF Files in Many Subfolders (which is an offshoot of my earlier EE article, How To Combine-Merge-Append a Large Batch of TIFF Files — both of which are programmed in the AutoHotkey language).

As stated at the article (both articles, actually), I removed the source code to make major changes to the program. My initial intention was to re-attach the source code there, but as mentioned in a more recent comment at the article, I've decided not to post the full program (I plan to rewrite the article as a "design roadmap" when I have some spare time). The "major changes" to that program have resulted in CAPTAIN, for which the Quick Start Guide is attached (recently updated). Most of us don't like to read manuals, but I encourage you to read the Quick Start Guide — it's only four pages and will give you a good sense of what it can do.

In your case, you would use the command line interface (CLI), which offers the RS (Recurse Subfolders) option, which I'll probably add to the GUI in the future (but right now it's available only in the CLI). You would, of course, specify 5 as the number of leading characters that need to match.

I'm not ready yet for broad distribution of CAPTAIN, but let me know if it interests you and I'll get you a copy to try. Regards, Joe
Attachment: CAPTAIN_v1.4_Quick_Start_Guide.pdf

100questions (Author) commented:
Thanks. I would definitely like to try CAPTAIN.

100questions (Author) commented:
Thanks Bill. I tried your script, but I have a few questions. It's not merging the files at all, and I wanted to know whether the PDFFILES and PDFMERGE folders need to be different.
Can the D in "%%~D\*.pdf" be changed to Z?

Bill Prew (IT / Software Engineering Consultant) commented:
It worked as expected in a test here. Did you get any errors?

> I wanted to know whether the PDFFILES and PDFMERGE folders need to be different.

No, they don't need to be different; I set it up that way to be flexible.

> Can the D in "%%~D\*.pdf" be changed to Z?

Why do you want to change that? It's just the reference to the loop variable %%D; it doesn't relate to the actual file name.


»bp

Joe Winograd, Fellow & MVE (Developer) commented:
Out of my office now on my mobile. Will reply properly when I return, probably in an hour or two.

100questions (Author) commented:
Hi Bill, the script does not work, then. It says it can't find the path. It looks like it identified the files in each folder, but the part right after that is not working. Could it be the PDFtk path that you specified?

Bill Prew (IT / Software Engineering Consultant) commented:
Well, you would need to change that PDFtk path to be wherever you installed the software, which I'm sure is different than where I did...
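
The "can't find the path" error above is consistent with one of the configured paths not existing on the machine that runs the script. A minimal sketch of a preflight check follows, written in Python rather than BAT purely for illustration. The function name missing_paths is hypothetical, and the paths in the usage lines are the placeholders from Bill's script, not values to copy.

```python
from pathlib import Path


def missing_paths(base_dir, pdftk_exe):
    """Return a list of human-readable problems; an empty list means both paths exist."""
    problems = []
    if not Path(base_dir).is_dir():
        problems.append(f"base folder not found: {base_dir}")
    if not Path(pdftk_exe).is_file():
        problems.append(f"pdftk executable not found: {pdftk_exe}")
    return problems


if __name__ == "__main__":
    # Placeholder paths taken from the script in this thread; edit them for your machine.
    for problem in missing_paths(r"B:\EE\EE29072521\PDFFILES", r"C:\_pf\PDFtk\bin\pdftk.exe"):
        print(problem)
```

Running a check like this before the merge loop makes it clearer which of the SET values needs editing.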


»bp

100questions (Author) commented:
Hi Bill. OK, I changed the path, but now it puts the combined file in the destination folder rather than in the original subfolder where the source files are, and the original files are still there; they are not deleted.

Bill Prew (IT / Software Engineering Consultant) commented:
Okay, try this adjustment.  Test carefully!

@echo off
setlocal EnableDelayedExpansion

rem Define location of files and folders
set BaseDir=B:\EE\EE29072521\PDFFILES
set PDFtk=C:\_pf\PDFtk\bin\pdftk.exe

rem Look at each subfolder in the base source folder
for /d %%D in ("%BaseDir%\*.*") do (

    rem Look at the PDF files in this folder and get a name from one
    for %%F in ("%%~D\*.pdf") do (
        set Name=%%~nF
    )

    rem Remove the rightmost character for the name to get the name for the merged PDF file
    set Name=!Name:~0,-1!

    rem Merge all PDF files into one and store it in the same subfolder
    "%PDFtk%" "%%~D\*.pdf" cat output "%%~D\!Name!.pdf"

    rem Delete all but the merged PDF file: rename it out of the way, delete the sources, then rename it back
    ren "%%~D\!Name!.pdf" "!Name!.xxx"
    del /q "%%~D\*.pdf"
    ren "%%~D\!Name!.xxx" "!Name!.pdf"
)

»bp

Joe Winograd, Fellow & MVE (Developer) commented:
> the original files are still there they are not deleted

I've had many users ask for this feature in my programs that combine files and I always recommend against it strongly. The reason is that if the combining/merging process goes haywire, you don't want to delete the source files. My advice is always to wait until you know that the combining process has been 100% successful before deleting the source files, which means doing the deletions in a separate step after the combining process. All of that said, since you really seem to want a "Delete the source files after they are combined" option, I put it in the GUI with this dialog:

[Image: CAPTAIN delete source files option]
Note that I made the No button the default so that an accidental Enter key won't select the "Delete the source files after they are combined" option. I added it to the CLI via a K (Keep) or D (Delete) option as the second parameter on the command line, moving the Source and Destination folders to the third and fourth parameters.

Also, I added the Recurse Subfolders feature to the GUI with this dialog:

[Image: CAPTAIN GUI recurse subfolders]
So, you'll be able to use the GUI for your purposes if you feel more comfortable with that (rather than the CLI). Regards, Joe
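
Joe's advice above, to delete source files only once the merge is known to have succeeded, can be sketched as a separate step. This is a hypothetical Python illustration, not CAPTAIN's actual logic: delete_sources_if_merged and its size check are assumptions chosen for the example, and a non-empty output file is only a weak signal that the merge went well.

```python
from pathlib import Path


def delete_sources_if_merged(merged, sources):
    """Delete the source files only if the merged file exists and is non-empty.

    Returns the list of files deleted (empty if the check failed).
    """
    merged = Path(merged)
    if not merged.is_file() or merged.stat().st_size == 0:
        return []  # merge looks failed or incomplete: keep every source
    deleted = []
    for src in map(Path, sources):
        # Never delete the merged file itself, even if it appears in the list.
        if src.resolve() != merged.resolve() and src.is_file():
            src.unlink()
            deleted.append(src)
    return deleted
```

A stricter check, such as opening the merged PDF and comparing page counts, would be safer. That is left out here because it depends on tooling the thread does not discuss.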

100questions (Author) commented:
Thanks Joe. I don't have the CAPTAIN program to try yet, but thanks for the screenshots.

100questions (Author) commented:
Thanks Bill. I tried the script; however, sometimes a subfolder will contain only one file, with no letter A to Z appended to the 5 digits, like 12345.pdf. When the script finds a file like this, it takes off the last digit.
Only if it finds two files that start with the same 5 digits within the same subfolder, and those files have a letter at the end, should it join the files and remove the letter at the end.
So in a practical scenario, let's say in one subfolder it finds 98765A.pdf, 98765B.pdf, and also 98765.pdf. It should join all three into 98765.pdf and delete the other files.

Joe Winograd, Fellow & MVE (Developer) commented:
You're welcome, 100questions (???). I'm not ready for distribution to everyone on the Internet (yet), so I'll write to you in the EE Message system to make arrangements for you to get it. I need to do some additional testing/QA, especially of the new features, create an installer and upload it, and update the Quick Start Guide, but I should be able to do all that in the next few hours. Regards, Joe

Bill Prew (IT / Software Engineering Consultant) commented:
Okay, give this small change a try.

@echo off
setlocal EnableDelayedExpansion

rem Define location of files and folders
set BaseDir=B:\EE\EE29072521\PDFFILES
set PDFtk=C:\_pf\PDFtk\bin\pdftk.exe

rem Look at each subfolder in the base source folder
for /d %%D in ("%BaseDir%\*.*") do (

    rem Look at the PDF files in this folder and get a name from one
    for %%F in ("%%~D\*.pdf") do (
        set Name=%%~nF
    )

    rem Keep the first 5 characters of the name to get the name for the merged PDF file
    set Name=!Name:~0,5!

    rem Merge all PDF files into one temporary file in the same subfolder
    "%PDFtk%" "%%~D\*.pdf" cat output "%%~D\!Name!.tmp"

    rem Delete all but the merged PDF file...
    del /q "%%~D\*.pdf"
    ren "%%~D\!Name!.tmp" "!Name!.pdf"
)
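
The substring change in the script above, from !Name:~0,-1! to !Name:~0,5!, swaps "drop the last character" for "keep the first five characters". A small Python sketch of the difference, using file names from this thread (the helper names are hypothetical):

```python
def drop_last_char(stem):
    # Comparable to the batch substring !Name:~0,-1!
    return stem[:-1]


def first_five(stem):
    # Comparable to the batch substring !Name:~0,5!
    return stem[:5]


for stem in ("12345A", "12345"):
    print(stem, "->", drop_last_char(stem), "vs", first_five(stem))
```

For 12345 (no letter suffix), dropping the last character gives 1234, which matches the behaviour reported earlier in the thread. Taking the first five characters gives 12345 for both names.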

»bp

Joe Winograd, Fellow & MVE (Developer) commented:
Hi 100questions,
I just sent you a PM via the EE Message system. Looking forward to hearing back from you. Regards, Joe

100questions (Author) commented:
Bill, thanks for this change; however, it now combines everything it finds in a subfolder, regardless of whether the first 5 digits are the same or not.

Bill Prew (IT / Software Engineering Consultant) commented:
All of my scripts so far have assumed that only one set of files to be combined exists in each subfolder. Are you saying now that is not true, and that there could be:

11111.pdf
22222.pdf
22222A.pdf
33333A.pdf
33333B.pdf

all in the same subfolder, and you want three different PDFs created?

11111.pdf
22222.pdf
33333.pdf
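
The scenario described above amounts to grouping file names by their first five characters. A minimal Python sketch of that grouping follows, assuming (as in this thread) that the base name is always exactly the first five characters. group_by_prefix is a hypothetical helper, not part of any script posted here.

```python
from collections import defaultdict
from pathlib import PurePath


def group_by_prefix(filenames, prefix_len=5):
    """Map each leading prefix to the sorted list of PDF file names that share it."""
    groups = defaultdict(list)
    for name in filenames:
        path = PurePath(name)
        if path.suffix.lower() == ".pdf":
            groups[path.stem[:prefix_len]].append(name)
    return {prefix: sorted(names) for prefix, names in sorted(groups.items())}


files = ["11111.pdf", "22222.pdf", "22222A.pdf", "33333A.pdf", "33333B.pdf"]
for prefix, names in group_by_prefix(files).items():
    print(f"{prefix}.pdf <- {', '.join(names)}")
```

With the five example files, this yields three groups (11111, 22222, and 33333), one per merged PDF.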


»bp

Joe Winograd, Fellow & MVE (Developer) commented:
> all in the same subfolder, and you want three different PDFs created?
>
> 11111.pdf
> 22222.pdf
> 33333.pdf

As you can see from the Quick Start Guide that I posted, that's what CAPTAIN does, so I hope the answer to Bill's question is Yes. :)

100questions (Author) commented:
Hi Bill, yes, exactly.

Bill Prew (IT / Software Engineering Consultant) commented:
Okay, this seems to do that here, give it a try.

@echo off
setlocal EnableDelayedExpansion

rem Define location of files and folders
set BaseDir=B:\EE\EE29072521\PDFFILES
set PDFtk=C:\_pf\PDFtk\bin\pdftk.exe

rem Look at each subfolder in the base source folder
for /d %%D in ("%BaseDir%\*.*") do (

    rem Clear list of base names to process
    set BaseNames=
    set LastName=

    rem Look at the PDF files in this folder and get a list of unique base names
    for /f "tokens=*" %%F in ('dir /b /a-d /on "%%~D\*.pdf"') do (
        set Name=%%~nF
        set Name=!Name:~0,5!
        if "!LastName!" EQU "" (
            set BaseNames=!Name!
            set LastName=!Name!
        ) else (
            if "!Name!" NEQ "!LastName!" (
                set BaseNames=!BaseNames!,!Name!
                set LastName=!Name!
            )
        )
    )

    rem Process list of base names, merging the PDF files for each into a single file
    for %%N in (!BaseNames!) do (
        rem Merge all PDF files with this base name into one temporary file in the same subfolder
        "%PDFtk%" "%%~D\%%N*.pdf" cat output "%%~D\%%N.tmp"

        rem Delete all but the merged PDF file...
        del /q "%%~D\%%N*.pdf"
        ren "%%~D\%%N.tmp" "%%N.pdf"
    )

)
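
For comparison, the per-group step in the script above (merge to a .tmp file, delete the sources, rename) can be sketched in Python. This is an illustration under assumptions, not a replacement for Bill's tested script: build_pdftk_command and merge_group are hypothetical names, the pdftk path must be supplied by the caller, and the inputs are listed explicitly instead of being passed as a wildcard. The command follows pdftk's "input files, cat, output, file" form, the same one the BAT script uses.

```python
import subprocess
from pathlib import Path


def build_pdftk_command(pdftk_exe, inputs, output):
    """Build the argument list for: pdftk <inputs...> cat output <output>."""
    return [str(pdftk_exe), *map(str, inputs), "cat", "output", str(output)]


def merge_group(pdftk_exe, folder, prefix):
    """Merge every <prefix>*.pdf in folder into <prefix>.pdf, via a temporary file."""
    folder = Path(folder)
    inputs = sorted(folder.glob(f"{prefix}*.pdf"))
    if not inputs:
        return None  # nothing to merge for this prefix
    tmp = folder / f"{prefix}.tmp"
    # check=True raises CalledProcessError on a non-zero exit code,
    # so the deletions below are skipped if pdftk reports a failure.
    subprocess.run(build_pdftk_command(pdftk_exe, inputs, tmp), check=True)
    for src in inputs:
        src.unlink()
    final = folder / f"{prefix}.pdf"
    tmp.replace(final)
    return final
```

Unlike the BAT version, this sketch stops before deleting anything when pdftk exits with an error, which is in line with the caution about deleting source files raised earlier in the thread.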

»bp

Joe Winograd, Fellow & MVE (Developer) commented:
> Tested and working solution provided.

Yes, deserves to be a solution. Haven't tested it myself, but have no doubts that it works — 100% trust Bill's comment on that.

> Since Joe's beta test solution was done outside the question

I don't know what "outside the question" means. I am simply recommending a software package that needs to be downloaded and installed, same as I and other members have done thousands of times here at EE. My list of recommended software products includes AutoHotkey, GIMP, GraphicsMagick, ImageMagick, IrfanView, the NirSoft utilities, PaperPort, PDFtk, Power PDF, the Xpdf utilities — the list is huge. This one, CAPTAIN, happens to be my own product. I don't want to publish the download site publicly, but am happy to provide the download link to any EE member who requests it via PM. I don't think that is any more "outside the question" than providing a download link to other software packages — the only difference is in public exposure.

CAPTAIN does exactly what was requested in this question and has been confirmed in production usage by many users, as well as my own internal testing and QA prior to releasing it to users over a several year period. As such, I am objecting to Bill's close. I would be happy to see his #a42399975 post as the Accepted Solution, but believe that my #a42395610 post (the one with the CAPTAIN Quick Start Guide attached) is worthy as an Assisted Solution.

Btw, CAPTAIN is not "beta test" (Bill's words). It is in production usage by many users. I do currently have an enhanced beta version of CAPTAIN that is capable of combining PDF files or TIFF files, but that's a post for another day. :)  Regards, Joe

Joe Winograd, Fellow & MVE (Developer) commented:
I recommend closing this in a different way from the previous suggestion, as explained in my #a42415506 post.