
How To Combine-Merge-Append a Large Batch of TIFF Files

Joe Winograd, Fellow&MVE
50+ years in computer industry. Everything from development to sales. CIO. Document imaging. EE MVE 2015, EE MVE 2016, EE FELLOW 2017.
Update 21-May-2015: I temporarily removed the source code and the code snippets to make major changes to the program. Regards, Joe

INTRODUCTION

The inspiration for this Article was a fascinating question here at Experts Exchange on combining TIFF files. Since it is in an area of extreme interest to me (Document Imaging) and since the solution involves two of my all-time favorite freeware products – IrfanView for the TIFF image processing and AutoHotkey for the scripting – I decided to publish the solution as an Article, with a lot more detail put into it than a typical response to a question.

INSTALLATION INSTRUCTIONS

The original poster (OP) of the problem (KHMaddox) said he has no programming experience at all, so I made the solution suitable for such a user. All you need to do is download and install the two freeware products, IrfanView and AutoHotkey, and then run the script attached to this Article, as follows:

(1) Install AutoHotkey – http://ahkscript.org (also, see my EE article: AutoHotkey - Getting Started)

Click the Download button at the page above, save the install file, and run it.

(2) Install IrfanView – http://www.irfanview.com/

Click the Download button at the page above, save the install file, and run it. The script assumes that IrfanView is installed in the default location. If it isn't, you'll need to modify the script accordingly. As a side note, if you'd like to have PDF support in IrfanView, click the PlugIns link at the page above, save the install file, and run it. PDF capability is not needed for the solution in this Article, but as long as you're installing IrfanView, it's a great feature to have.

(3) Run Program – Combine-TIFF.ahk

Download the attached script called Combine-TIFF.ahk and then run it by simply double-clicking on it in Windows Explorer (or whatever file manager you use). Since its file type is AHK, AutoHotkey will be launched to process it. If you prefer, the file may be turned into an executable via the AutoHotkey compiler, which is installed during the standard installation of AutoHotkey. If you right-click on an AHK file in Windows Explorer (or whatever file manager you use), there will be a context menu pick called Compile Script:

[Screenshot: AutoHotkey Compile Script]
Select that and it will create an EXE file, which is a stand-alone, no-install executable of the AHK program.

PROBLEM DESCRIPTION

The OP's problem is that a folder has many TIFF files with a specific naming convention, as follows:

1979_00032983_0001.tif
1979_00032983_0002.tif
1979_00032984_0001.tif
1979_00032984_0002.tif
1979_00032984_0003.tif
1979_00032985_0001.tif

When files have the first 13 characters the same, as in the case of the first two files above, they should be in the same file. The idea is to take those two files and create a new file with the file name being just the first 13 characters. In other words, put [1979_00032983_0001.tif] and [1979_00032983_0002.tif] (in that order) into a file named [1979_00032983.tif]. Likewise, using the example above, put [1979_00032984_0001.tif], [1979_00032984_0002.tif], and [1979_00032984_0003.tif] (in that order) into [1979_00032984.tif]. Even items with just one file in the list should be processed, so using the example above, put [1979_00032985_0001.tif] into [1979_00032985.tif].

Note that the input TIFF files may have any number of pages in them.

The OP specifically had 13 lead-in characters, but I generalized the solution so it handles any number of lead-in characters that are needed to match (the user enters the number). This number may be as small as 1, such as with these source files:

11.tif
12.tif
21.tif
22.tif
23.tif
31.tif

The above would result in the output files:

1.tif
2.tif
3.tif
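
As an illustration only, the grouping rule can be sketched in a few lines of Python (this is not the article's AutoHotkey script). The function name `group_by_lead_in` and the fixed `.tif` output extension are assumptions made for this sketch; it sorts the file names and uses the first N characters of each name as the name of its output file.

```python
def group_by_lead_in(file_names, num_first_chars):
    """Map each output file name to the ordered list of source files it absorbs.

    Hypothetical helper for illustration. It assumes every source name is
    longer than num_first_chars; shorter names would need separate handling.
    """
    if num_first_chars < 1:
        raise ValueError("num_first_chars must be at least 1")
    groups = {}
    # Sorting by name keeps pages in order (_0001 before _0002, and so on).
    for name in sorted(file_names):
        output_name = name[:num_first_chars] + ".tif"
        groups.setdefault(output_name, []).append(name)
    return groups


if __name__ == "__main__":
    names = ["11.tif", "12.tif", "21.tif", "22.tif", "23.tif", "31.tif"]
    for output_name, sources in group_by_lead_in(names, 1).items():
        print(output_name, "<-", ", ".join(sources))
```

Called with the OP's six files and 13 lead-in characters, the same function produces the three groups described earlier in this section.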

Before proceeding with my solution, I want to say that there are many GUI tools out there that will do this, although I've never tried any of them. A web search for "merge tiff" or "combine tiff" will give plenty of hits, such as these:

    http://www.tiff-split-combine.com/
    http://www.pdf-tiff-tools.com/TIFF-Combiner.html

Some of them even claim to have batch capability, but since I haven't tried any of them myself, I don't know if they can really do what the OP requested, and I don't know the cost. So I decided to proceed with my home-brew solution, where the two tools needed are freeware.

HOW THE PROGRAM WORKS

For those interested in understanding how the script works, the remainder of the Article shows the entire script broken down into code snippets, with a description of what each snippet does, including screenshots where appropriate.

Code snippet:
 
SetBatchLines,-1 ; run at maximum speed


What it does: Sets the script to run at maximum speed, i.e., no "sleeping" will occur in the program.

Code snippet:
 
temporarily removed


What it does: Checks to see if IrfanView is installed. It looks in the standard location for 32-bit Windows C:\Program Files\IrfanView\i_view32.exe and the standard location for 64-bit Windows C:\Program Files (x86)\IrfanView\i_view32.exe. If it doesn't find IrfanView in either folder, it displays a message and exits.

[Screenshot: IrfanView not found]
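
A minimal Python sketch of this check follows (the original AutoHotkey snippet is not shown, so this is only an illustration). The two paths are the ones named in the description; the function name `find_irfanview` is hypothetical.

```python
import os

# The two default install locations named in the description above.
DEFAULT_IRFANVIEW_PATHS = (
    r"C:\Program Files\IrfanView\i_view32.exe",
    r"C:\Program Files (x86)\IrfanView\i_view32.exe",
)


def find_irfanview(candidates=DEFAULT_IRFANVIEW_PATHS):
    """Return the first candidate path that exists as a file, or None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    exe = find_irfanview()
    if exe is None:
        print("IrfanView not found in either default location; exiting.")
    else:
        print("Using", exe)
```

If IrfanView is installed somewhere else, passing a different list of candidate paths is the simplest adjustment.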
Code snippet:
 
temporarily removed


What it does: Warns the user that existing files in the destination folder will be overwritten with no warning, and then gives the user the opportunity to exit or continue.

[Screenshot: Existing files overwritten]
Code snippet:
 
temporarily removed


What it does: Initializes some variables.

Code snippet:
 
temporarily removed


What it does: Asks the user for the number of lead-in characters that need to match.

[Screenshot: Enter number characters need to match]
If the entry is not an integer and/or not greater than zero, it displays a message and gives the user the opportunity to try again or exit.

[Screenshots: NumFirstChars must be integer; NumFirstChars must be at least 1]
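
The validation rule (an integer, and at least 1) can be sketched in Python as follows. This is an illustration under assumptions, not the removed AutoHotkey code; the helper name `parse_bounded_int` and the wording of its messages are made up for the example.

```python
def parse_bounded_int(text, minimum=1, maximum=None):
    """Return (value, None) on success or (None, error_message) on failure.

    Hypothetical helper: the caller decides whether to re-prompt or exit.
    """
    try:
        value = int(text.strip())
    except ValueError:
        return None, "must be an integer"
    if value < minimum:
        return None, "must be at least {}".format(minimum)
    if maximum is not None and value > maximum:
        return None, "must be at most {}".format(maximum)
    return value, None


if __name__ == "__main__":
    for entry in ("13", "abc", "0"):
        print(repr(entry), "->", parse_bounded_int(entry))
```

The same helper would also cover a bounded entry such as the compression method, by calling it with `minimum=0` and `maximum=7`.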
Code snippet:
 
temporarily removed


What it does: Asks the user to enter the full path of the source folder. It allows the user to navigate/browse to it or type/paste it in. It looks for an ending backslash on the path name and if one was not entered, it appends one (in other words, it works whether or not the user includes the ending backslash in the path).

[Screenshot: Navigate to or type-paste Source folder]
It then checks to see if a source folder was entered, and if so, if the folder exists. If either is not true, it gives the user the opportunity to exit or continue. Note: whether or not the source folder can be reported as null with the Browse For Folder dialog depends on the operating system, so the program checks for it.

[Screenshots: Source folder must be specified; Source folder does not exist]
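
The trailing-backslash handling and the two folder checks can be sketched in Python like this. It is a hedged illustration only; the function names are hypothetical, and the messages simply reuse the dialog titles shown above.

```python
import os


def normalize_folder(path):
    """Append a trailing backslash if the user did not type one."""
    if not path.endswith("\\"):
        path += "\\"
    return path


def check_source_folder(path):
    """Return an error message, or None if the folder is usable."""
    if not path.strip():
        return "Source folder must be specified"
    if not os.path.isdir(path):
        return "Source folder does not exist"
    return None


if __name__ == "__main__":
    entered = r"C:\scans"
    problem = check_source_folder(entered)
    if problem:
        print(problem)
    else:
        print("Source folder:", normalize_folder(entered))
```

Checking for an empty entry before normalizing avoids turning a blank entry into a bare backslash, which would otherwise look like a valid root path.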
Code snippet:
 
temporarily removed


What it does: Asks the user to enter the full path of the destination folder. It allows the user to navigate/browse to it or type/paste it in or create it. It looks for an ending backslash on the path name and if one was not entered, it appends one (in other words, it works whether or not the user includes the ending backslash in the path).

[Screenshot: Navigate to or type-paste or create Destination folder]
It then checks to see if a destination folder was entered, and if so, if the folder exists, giving the user the opportunity to create it, exit, or try again to enter the name. Note: whether or not the destination folder can be reported as null with the Browse For Folder dialog depends on the operating system, so the program checks for it.

[Screenshots: Destination folder must be specified; Destination folder does not exist]
Code snippet:
 
temporarily removed


What it does: Asks the user to confirm that the chosen source and destination folders are correct, providing the option at this point to continue or exit.

[Screenshot: Confirm folders]
Code snippet:
 
temporarily removed


What it does: Asks the user to enter compression method and checks that it is correctly specified. It keeps trying until the user enters a valid number or decides to exit.

[Screenshots: Enter compression method; Compression must be specified; Compression must be integer; Compression must be 0-7]
Code snippet:
 
temporarily removed


What it does: Initializes variables that are used to track operational statistics, which will be reported in the Operation Completed dialog.

Code snippet:
 
temporarily removed


What it does: Loops through all of the files in the source folder, sorted in file name order (ascending), ignoring any non-TIFF files. It counts the number of TIFF files and the number of non-TIFF files, and it accumulates the total size (in bytes) of the TIFF files. It displays a dialog box with a green progress bar that moves to the right during processing, also showing the name of the file currently being processed.

[Screenshot: Progress Bar]
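
The scanning pass (without the progress bar) might look like the following Python sketch. It is not the article's code: the function name is hypothetical, and treating both `.tif` and `.tiff` as TIFF extensions is an assumption, since the description does not say which extensions the script accepts.

```python
import os

# Assumption: both spellings of the extension count as TIFF.
TIFF_EXTENSIONS = {".tif", ".tiff"}


def scan_source_folder(folder):
    """Return (sorted TIFF names, non-TIFF file count, total TIFF bytes)."""
    tiff_names = []
    non_tiff_count = 0
    total_bytes = 0
    # sorted() gives ascending name order; note it is case-sensitive,
    # which may differ from the ordering the original script used.
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        if not os.path.isfile(full_path):
            continue  # skip subfolders
        extension = os.path.splitext(name)[1].lower()
        if extension in TIFF_EXTENSIONS:
            tiff_names.append(name)
            total_bytes += os.path.getsize(full_path)
        else:
            non_tiff_count += 1
    return tiff_names, non_tiff_count, total_bytes


if __name__ == "__main__":
    names, skipped, size = scan_source_folder(".")
    print(len(names), "TIFF files,", skipped, "non-TIFF files,", size, "bytes")
```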
Code snippet:
 
temporarily removed


What it does: Checks to see if the current file in the list has the same lead-in characters as the previous file. If it doesn't, then it represents a new output file, and the program copies the current file in the source folder to a new file in the destination folder, using a file name composed of just the lead-in characters. If the copy fails, it displays the error code and all of the relevant parameters to assist in troubleshooting the problem. Also, it counts the number of TIFF files created and accumulates the total size (in bytes) of the TIFF files created.

Code snippet:
 
temporarily removed


What it does: The current file has the same lead-in characters as the previous file, so it appends the current file to the combined file in the destination folder. To do this, it (i) runs IrfanView with the "/multitif" parameter to create a temporary file; (ii) deletes the existing combined file in the destination folder; and (iii) renames the temporary file to have the same name as the just-deleted combined file. If the IrfanView call or the file deletion or the file renaming fails, it displays the error code and all of the relevant parameters to assist in troubleshooting the problem.
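
The three-step append can be sketched in Python as shown below. This is an illustration under assumptions, not the removed AutoHotkey snippet. The parenthesized `/multitif=(output,input1,input2)` form is written from memory of IrfanView's command-line options file and should be checked against your installed version; the function names and the injectable `run` parameter are hypothetical, and paths containing spaces may need different quoting.

```python
import os
import subprocess


def build_multitif_arg(temp_path, combined_path, source_path):
    """Build the /multitif argument: output file first, then inputs in page order."""
    # Existing combined pages come before the newly appended file.
    return "/multitif=({},{},{})".format(temp_path, combined_path, source_path)


def append_to_combined(irfanview_exe, dest_folder, combined_name, source_path,
                       run=subprocess.run):
    """Append source_path to the combined TIFF via a temporary file."""
    combined_path = os.path.join(dest_folder, combined_name)
    temp_path = os.path.join(dest_folder, "temp.tif")
    arg = build_multitif_arg(temp_path, combined_path, source_path)
    # (i) run IrfanView to write the merged pages to the temporary file
    result = run([irfanview_exe, arg])
    if result.returncode != 0:
        raise RuntimeError("IrfanView failed ({}): {}".format(result.returncode, arg))
    # (ii) delete the old combined file, (iii) rename the temporary file
    os.remove(combined_path)
    os.rename(temp_path, combined_path)
```

The order of the inputs matters: listing the existing combined file before the current source file is what keeps the pages in ascending order.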

Code snippet:
 
temporarily removed


What it does: Finalizes and formats all of the statistics from the operation and displays them in an Operation Completed dialog box.

[Screenshot: Operation Completed]
It also asks if the user wants to save the statistics in a text file. If the user says Yes, it creates a file with the name Operational_Statistics_YYYY-MM-DD_HH.MM.SS.txt in the destination folder (where YYYY-MM-DD_HH.MM.SS are the ending date and time of the run).

[Screenshot: Operational Statistics saved]
The text file looks like this:

Beginning date and time: 2013-03-18_21.06.15
Compression method used: ITU-T Group 4
Number of source files processed: 23
Size of source files (bytes): 324,806
Number of destination files created: 6
Size of destination files (bytes): 324,670
Number of non-TIFF files ignored: 3
Ending date and time: 2013-03-18_21.06.17
Elapsed time (minutes:seconds): 0:2
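
The statistics file name and the elapsed-time line shown above can be reproduced with a short Python sketch. The function names are hypothetical, and the unpadded seconds simply mirror the sample output; this is an illustration, not the original script.

```python
from datetime import datetime


def stats_file_name(ended_at):
    """Build Operational_Statistics_YYYY-MM-DD_HH.MM.SS.txt from the ending time."""
    return ended_at.strftime("Operational_Statistics_%Y-%m-%d_%H.%M.%S.txt")


def format_elapsed(started_at, ended_at):
    """Format elapsed time as minutes:seconds, unpadded as in the sample."""
    total_seconds = int((ended_at - started_at).total_seconds())
    minutes, seconds = divmod(total_seconds, 60)
    return "{}:{}".format(minutes, seconds)


if __name__ == "__main__":
    start = datetime(2013, 3, 18, 21, 6, 15)
    end = datetime(2013, 3, 18, 21, 6, 17)
    print(stats_file_name(end))
    print("Elapsed time (minutes:seconds):", format_elapsed(start, end))
```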

I hope this helps the OP as well as other EE members. Although I did a bit of generalization, I realize that the solution is still rather specific to the OP's requirements. However, by providing the source code, I'm sure that other folks with similar needs will be able to modify the program to suit their purposes.

If you find this article to be helpful, please click the thumbs-up icon below. This lets me know what is valuable for EE members and provides direction for future articles. Thanks very much! Regards, Joe
46 Comments

Expert Comment

by:KHMaddox
flawless... amazingly simple & fast
I used it on 2 machines:
1) XP Pro v2002 SP3 32 bit- external hard drive connected via USB
Combined 14,788 files into 4,297 files; Completion time: 16 minutes
2) W7 Pro SP1 32 bit - mapped network folder on server
Combined 14,788 files into 4,297 files; Completion time: < 15 minutes

Great job!

Author Comment

by:Joe Winograd, Fellow&MVE
@KHMaddox,
Thanks for your kind words and for taking the time to report your results – both are much appreciated! I did not test it on anywhere near that number of files, so that is great data to have. Thanks, Joe

Author Comment

by:Joe Winograd, Fellow&MVE
This note documents the changes made in Version 2 of the program, which I will be submitting for review and approval later today. I'd like to thank the EE Page Editor, @Qlemo, for his helpful comments on improving the updated version of the article, as well as his efforts on the original article. He went through all of my code snippets to add the Style Code format, which was a lot of work – and I really appreciate it!

I'd also like to thank the OP, Kit Maddox (@KHMaddox), for posting the original question addressed by this article, as well as providing feedback all along that improved the program. This includes the biggest change in Version 2, viz., providing a compression capability on the created TIFF files. Kit also gathered and reported extremely valuable performance data on utilizing the solution with a large number of files.

Here are brief descriptions of the changes in V2:

(1) V1 did not send any compression setting to IrfanView, which means that it defaulted to the Packbits method. V2 prompts the user to enter the compression method desired, allowing all of the methods supported by IrfanView. Kit Maddox reported a big improvement in file sizes with ITU-T Group 4 compression. He said that a file created in V1 had a size of 2,162 KB, while the same file created in V2 with Group 4 compression was just 187 KB, an order of magnitude smaller.

(2) V1 checked for the number of first characters needed to match being 2 or more, meaning that 1 was considered invalid, which is incorrect. V2 fixes the range check to allow any integer greater than zero for the number of lead-in characters needed to match.

(3) Most of the error conditions in V1 resulted in exiting the program, such as an invalid number of lead-in characters or a non-existent source folder. This means the user had to start over and specify all of the parameters again. V2 improves upon this by asking if the user wants to either try again or exit the program when an entry error is detected.

(4) Dialog boxes were improved in appearance, utilizing standard Windows icons, such as the white on blue question mark, the black on yellow exclamation point, the white on blue letter "i" (information), and the white on red large X (error/stop).

I will be happy to work on a V3 if any EE members find bugs or require enhancements – just let me know. Regards, Joe

Author Comment

by:Joe Winograd, Fellow&MVE
This note documents the changes made in Version 3 of the program, which I will be submitting for review and approval later today. I'd like to thank again the OP, Kit Maddox (@KHMaddox), for his feedback on the program, which resulted in these enhancements:

(1) Improved the Operation Completed dialog box to contain statistics of the run, as follows:

Beginning date and time
Compression method used
Number of source files processed
Size of source files
Number of destination files created
Size of destination files
Number of non-TIFF files ignored
Ending date and time
Elapsed time

(2) Added a progress bar during operation to assure the user that processing is taking place. In addition to a standard, green bar moving to the right, the dialog box shows the name of the file currently being processed.

(3) Instead of the user having to type in (or copy/paste) the source and destination folder names, the program now uses a standard Windows "Browse For Folder" dialog that allows the user to navigate to and select the folder. But it also still allows typing in (or copying/pasting) the folder names, and in the case of the destination folder, provides a Make New Folder button.

(4) Added a parameter to the program that will make it run at maximum speed (for those familiar with AutoHotkey scripts, it now runs the SetBatchLines command with a value of -1).

I will be happy to work on a V4 if any EE members find bugs or require enhancements – just let me know. Regards, Joe

Expert Comment

by:KHMaddox
XP Pro v2002 SP3 32 bit & external hard drive connected via USB

source folder was on the external hard drive
destination folder was on the desktop

Result: flawless

[Screenshot: v3 result]
Great job!!!

Author Comment

by:Joe Winograd, Fellow&MVE
@KHMaddox,
Thank you again for your kind words and for taking the time to report your results. It is extremely valuable to have data on the program's performance with more than 34,000 source TIFF files in excess of 1.4GB. Thanks, Joe

Author Comment

by:Joe Winograd, Fellow&MVE
This note documents the changes made in Version 4 of the program, which I will be submitting for review and approval later today or tomorrow. Here are the changes:

(1) Corrected a serious issue due to the way the "/append" parameter works in the IrfanView command line. When I wrote the program, I thought that the "/append" option would append the entire source TIFF file to the destination TIFF file. For example, I thought that

i_view32.exe c:\12345678_0001.tif /append=c:\12345678.tif

would append all of the pages of [c:\12345678_0001.tif] to [c:\12345678.tif]. I have just discovered that it does not. It appends only the first page of [c:\12345678_0001.tif] to [c:\12345678.tif].

I don't know if this is considered to be a "bug" or "feature" of IrfanView, but my experimentation has confirmed this behavior. Of course, if all of the source TIFF files are one page, this "/append" behavior is fine and Version 3 of the program works correctly. However, this is not the case if source files have more than one page.

To overcome this issue with the "/append" option, I modified the program to use the "/multitif" command line option, which allows all of the files in the operation (source and destination) to be multipage TIFF files. Experimentation with numerous multipage TIFF files has confirmed that "/multitif" works correctly.

(2) Provided an option to save the operational statistics (displayed in the Operation Completed dialog) in a plain text file.

(3) Added a Title to all message boxes (MsgBox commands).

(4) Added checks for error codes on calls to routines that return an "ErrorLevel" value, such as FileCopy, FileDelete, FileMove, and IrfanView. If a non-zero value is returned, the message box displays all of the relevant parameters to assist in troubleshooting the problem.

(5) Changed the file type of the source code to AHK. Experts Exchange now allows AHK files to be uploaded, so uploading the program as a TXT file (then renaming it to AHK after downloading) is no longer required.

Regards, Joe

Expert Comment

by:Jon Pitzer
Joe,

What a great utility!  It does the job with minimal effort and is very simple to use.  I did a quick test on 5 single page TIF files and it created a combined file.  However, the pages were combined in reverse order so you had to go to page 5 to see page 1 of the document.  Did I do something wrong?  The files are named...

00000225.001.tif
00000225.002.tif
00000225.003.tif
00000225.004.tif
00000225.005.tif

Any help you can provide would be greatly appreciated!

Thank you,
N3627L

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Jon,

First, thanks for the compliment — much appreciated!

You caught me just as I was about to shut down the computer and head out for a long car trip (7-8 hours), so taking a close look at this will have to wait until tomorrow. It's supposed to combine the files in file name order (sorted ascending), so off the top of my head I don't know why it would combine them in reverse order.

I'm going offline now. Will get back with you tomorrow. Regards, Joe

Update1: I was able to reproduce the problem. Will try to fix it today.

Update2: Found the bug — combining the files in the wrong order. The fix is to change this line:
multiparam:=DestFolder . "temp.tif," . SourceFolder . FileNameCurrent . "," . DestFolder . FirstCharsCurrent . ".tif"


to this:
multiparam:=DestFolder . "temp.tif," . DestFolder . FirstCharsCurrent . ".tif" . "," . SourceFolder . FileNameCurrent


Thanks for reporting the problem! Please confirm that the above change fixes it. Regards, Joe

Expert Comment

by:Jon Pitzer
Joe,

Sorry for the delay.  I just tried the modified code and it works brilliantly.  Thank you very much!!  Awesome program!

Thanks,
Jon Pitzer

Author Comment

by:Joe Winograd, Fellow&MVE
Jon,
Thanks again for the kind words and for confirming that the modified code works for you. I really appreciate it! I'd also appreciate it if you click the big green "Vote this article as helpful" button at the end of the article. :)  Regards, Joe

Expert Comment

by:jpas84
Joe,

I tried the utility with some files that were extracted from a document storage system. The pre-merged tif files display ok in Windows Photo viewer, but after the merge nothing but blank pages are displayed in the viewer. When I tested tif files from other sources, they merge and display ok. I'm assuming there's an issue with the files I'm trying to merge. Any suggestions on what to check, or how to verify the initial tif file? Attached is a sample of a doc file I'm trying to merge.
Regards,
JPS
T01-a-dummy1.tif

Expert Comment

by:jpas84
I neglected to mention in the issue above that the TIF files are tagged with "Pixel Translations Inc., PIXTIFF Version 55.0.218.709" in the file. Maybe that is causing the issue?

Author Comment

by:Joe Winograd, Fellow&MVE
Hi JPS,

Well, this is a strange one! I've never seen this before. When IrfanView opens it, the page is blank:

[Screenshot: IrfanView shows page as blank]
Every Microsoft product I tried opens it fine, including Paint, Picture Manager, Photo Gallery, and Windows Photo Viewer:

[Screenshot: Windows Photo Viewer shows page as not blank]
Most third-party products that I tried open it fine, too, including FineReader, OmniPage, PaperPort, Power PDF, and XnView. The only other products I tried (besides IrfanView) that open it as a blank page are GIMP and ImageMagick. GIMP gives an indication of the problem with the error message, "Read error on strip 0; got 53745 bytes, expected 53746". So it seems that IrfanView, ImageMagick, and GIMP are detecting a problem in the TIFF file that is causing it to be displayed as blank, while all the other products are ignoring (or not finding) the error and happily displaying the page fine.

My program relies on using the command line call of IrfanView, so the only hope for fixing this is to (1) have Irfan Skiljan, the author of IrfanView, modify IrfanView so it works on this file or (2) use a different piece of software to merge the TIFF files. I'll send an email today to Irfan Skiljan and will also research other command line tools that can merge TIFFs.

I don't know if this is a viable work-around for you, but I discovered that after saving the file with another piece of software, then IrfanView can display it! Specifically, I used PaperPort to do a Save As to a TIFF Group 4 file and IrfanView opened it fine. Weird!

Thanks for reporting this problem. I'll do my best to come up with a solution. Regards, Joe

Author Comment

by:Joe Winograd, Fellow&MVE
Hi JPS,

I have some confirmation on my comments above. I searched the IrfanView Forum and found this thread:
https://irfanview-forum.de/showthread.php?t=10172

Although the thread talks about the later pages of a multi-page TIFF, the real issue is that a particular page (even the only page in a one-page TIFF, like yours) could be an improperly formatted page. This is why GIMP reported the error mentioned in my previous post, and is the same type of error reported by LibTIFF in the IrfanView Forum thread. In fact, I ran a LibTIFF tool on your file and it came back with the same message as GIMP, namely:

Read error on strip 0; got 53745 bytes, expected 53746

A key comment at the IrfanView Forum thread is this:
Some programs/libs may ignore such file errors, some not ...
This echoes my previous comment that other products are ignoring (or not finding) the error and happily displaying the page fine.

I wrote an email to Irfan Skiljan before finding that thread at the IrfanView Forum. However, since that thread is from more than five months ago, I don't think we'll see a fix in IrfanView, as the Moderator at that thread sent the TIFF file to Irfan Skiljan, but his response was clearly that the problem lies with the file, not with IrfanView.

I'm doing research to find a solution. I just tested a command line tool called NConvert (free for non-commercial use) that read your file fine and converted it to a Group 4 TIFF which IrfanView handled fine. Of course, that means each TIFF file will be processed twice — once by NConvert and then by IrfanView. It will be better if I can find a command line tool that performs the combine/merge/append and is able to handle your TIFFs without conversion. Regards, Joe

Expert Comment

by:jpas84
Thanks for looking at this Joe. When I return from vacation I can check if it's as easy as adding one byte to the end of the tif file before closing it as IrfanView seems to be expecting one more byte than it receives.  jps

Author Comment

by:Joe Winograd, Fellow&MVE
JPS,

You're welcome — happy to do it. If you have the ability to add a byte, that may work — definitely worth a try.

I received this reply from Irfan Skiljan today:
Newer LibTIFF versions (which I use in IrfanView) are more strict regarding TIF compression errors :)
=> the error is in the TIF, the LibTIFF says:

"error: Read error on strip 0; got 53745 bytes, expected 53746"
(maybe are there other errors too, the library will stop on the first bigger error)
-----

You can use IrfanView 4.33, this version uses an older LibTIFF and this error will be ignored :)
----

Conclusion: the problem is in the program which created the TIF.
----
I don't like the idea of dropping back to Version 4.33, which is almost three years old (released 28-Mar-2012). If you can add the byte, that would be ideal. If not, I'm already looking into other possible solutions. Have a great vacation! Regards, Joe

Expert Comment

by:jpas84
In this situation, adding a null byte to the end of each extracted tiff file as it's created fixed the issue. IrfanView merged the files and they view ok. On to the next issue. Thanks,  jps
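
For readers facing the same "got 53745 bytes, expected 53746" error, the one-byte workaround described in this comment could be sketched in Python as below. This is only an illustration with a hypothetical function name. It assumes, as in this case, that the short strip is the last data in the file and is exactly one byte short; it is not a general TIFF repair, so work on copies of the files.

```python
def append_null_byte(path):
    """Append a single zero byte to the end of the file at path."""
    with open(path, "ab") as handle:
        handle.write(b"\x00")
```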

Expert Comment

by:jpas84
Is there a script similar to this one that prompts for file name characters to match, source  directory, and destination directory for appending text files?

Author Comment

by:Joe Winograd, Fellow&MVE
JPS,

If I'm understanding correctly, you want everything this script does, except that instead of source TIFF files, you want it to work on text files. So it would:

(1) Ask the user for the number of lead-in characters that need to match.

(2) Ask the user to enter the full path of the source folder.

(3) Ask the user to enter the full path of the destination folder.

(4) Not ask for the TIFF compression parameter, as it is irrelevant for text files.

Is that what you want? Regards, Joe

Expert Comment

by:jpas84
Yes, your description matches my question. A copy command can be used to merge all of the files in a directory, but this would have the advantage of being able to merge related files in a directory with multiple output files. It's an idea that others may find it useful. I'm not certain if I'd be able to use it for my current project or not. It'll depend on how much I need to manipulate the files to be merged. Thanks again, jps

Author Comment

by:Joe Winograd, Fellow&MVE
> It's an idea that others may find it useful.

I agree — interesting idea! While I'm at it, I'm thinking that I can do the same for PDF files.

For text files, I would probably switch from the IrfanView command line call to the FileAppend command, which is built into AutoHotkey. For PDF files, I would probably switch the IrfanView command line option from "/multitif=" to "/multipdf=", which has the same syntax as the "/multitif=" option, but operates on PDF files instead of TIFF files. I would also consider using PDFtk Server, which has worked very well in two other articles/programs that I published here at EE, How to Combine-Merge PDF Files in Many Subfolders and How To Split-Rename-Move a Batch of PDF Files Based on Contents of the Files.

The new UI would look something like this:

[Screenshot: Proposed new UI]
Regards, Joe

Expert Comment

by:jpas84
That'd be a nice way to fit all three features into one program. jps

Expert Comment

by:Nathan Patterson
Joe - I realize this is some time later, but I am looking for this exact script!  Appreciate any thoughts or suggestions on how to tackle this.  I have Autohotkey, Irfanview, and Ghostscript installed.

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Nathan,
I'll send you a message tomorrow with some thoughts on how we may tackle this. Regards, Joe

Expert Comment

by:Nathan Patterson
Thank you so much.  I am fairly confident this is possible but am at a bit of a loss for how to start.  I am pretty good at tweaking scripts and macros but absolutely horrid at creating from whole cloth, so appreciate any and all insight you can offer.  As background:

I have a folder containing 2000 or so single page TIFFs. The TIFFs have a variable number of common file names that indicate how the files should be combined, with a trailing "." that acts as the page delimiter. The common file names are different lengths.  I am trying to parse through the folder and combine the TIFFs based on the common file name.  I am using Windows 7.

For example, we have the following files in the folder: sample.1.tif, sample.2.tif, samplefoo.1.tif, samplefoo.2.tif, samplefoo.3.tif.  After running the batch process, we would have the following multi-page TIFFs (or PDFs, etc.) in the folder: sample.tif (containing sample.1.tif and sample.2.tif) and samplefoo.tif (containing samplefoo.1.tif, samplefoo.2.tif, and samplefoo.3.tif).
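
One possible way to express this variable-length grouping is sketched below in Python. It is a hedged illustration, not code from the article: the pattern assumes names of the form `base.N.tif` (or `.tiff`) where N is a page number, and the function name is hypothetical. Pages are ordered numerically, so page 10 sorts after page 2.

```python
import re

# Assumed naming convention: <base>.<page number>.tif or .tiff
PAGE_PATTERN = re.compile(r"^(?P<base>.+)\.(?P<page>\d+)\.tiff?$", re.IGNORECASE)


def group_by_page_delimiter(file_names):
    """Map each output name to its source files, ordered by page number."""
    pages = {}
    for name in file_names:
        match = PAGE_PATTERN.match(name)
        if match is None:
            continue  # ignore files that do not follow the convention
        entry = (int(match.group("page")), name)
        pages.setdefault(match.group("base"), []).append(entry)
    return {
        base + ".tif": [name for _, name in sorted(entries)]
        for base, entries in sorted(pages.items())
    }


if __name__ == "__main__":
    sample = ["sample.1.tif", "sample.2.tif", "samplefoo.1.tif",
              "samplefoo.2.tif", "samplefoo.3.tif"]
    for output_name, sources in group_by_page_delimiter(sample).items():
        print(output_name, "<-", ", ".join(sources))
```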

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Nathan,
Thanks for the clarification on your requirements. I'll give this some thought and will send you a message soon. Regards, Joe

Expert Comment

by:Alok Purohit
Hello Joe,

This is a really educational and great article on automating things. Since you have removed the code, I am sending this request asking you to share it so that I can learn how to implement this.

regards,
Alok

Expert Comment

by:Joe E
Hi all. I was looking for a solution to this exact issue. I was initially looking at using Autohotkey and PDFTK, but this script seems to do exactly what I'm trying to accomplish. The intention seemed to be to repost the code snippets, but I'm not seeing them anywhere. Will the examples be reposted?

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Joe E,
It doesn't use PDFtk, because PDFtk is for PDF files, not TIFF files. I do have numerous similar AutoHotkey programs that use PDFtk on PDF files, such as the one described in my EE article, How to Combine-Merge PDF Files in Many Subfolders. But the program discussed in this article uses the /multitif option of IrfanView.

Yes, my intention was to re-post the code, but I have decided not to do that. Instead, I am going to rewrite this article — and the PDF article mentioned above — as "roadmap design" articles, with enough information to help folks write their own programs/scripts. I will send you a message here at EE with some other thoughts. Regards, Joe
1

Expert Comment

by:Joe E
I got ahead of myself on the explanation, but yes, my intention was to start with TIFFs. Thanks for the update.
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
You're welcome, Joe E, and thanks to you for joining EE today and reading my article. Regards, Joe W
0

Expert Comment

by:KHMaddox
Joe,
I can't seem to find the current version of the "Combine-TIFF.ahk" to download... please help!
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Kit,
Great to hear from you after such a long time — almost five years! Amazing how time flies. As you can see from my comments above, I decided not to re-post the source code, but instead will rewrite this article (when I have the time) as a "design roadmap" paper, with enough information to help folks write their own programs/scripts.

I haven't worked on this program in a few months, but I can pull something together for you. I won't be ready to share anything publicly for a while, but if you can be a beta tester for the new version of the program, that will be terrific. I'll write you a PM in the EE Message System to discuss this further. Regards, Joe
0

Expert Comment

by:Deacon Aspinwall
Hi Joe,

First off, I want to thank you for doing this routine. This should be a lifesaver for me! I have nearly the same issue that Kit does.

I have 16000+ jpeg images that I'd like to combine into single-paged pdfs based on the unique 12-digit value at the beginning of each file name (see screenshot below).

I'm running W7 64bit. I have Irfanview and AHK installed.

I also have no programming experience (I wish!). I'd be really grateful if you could help me out by sending me the script!!
Screenshot of data
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Deacon,
First, thanks for joining EE today, reading my article, and endorsing it. Welcome aboard!

This article is about merging TIFF files into TIFF files. Your requirement is to merge JPG files into PDF files. While you are right that there is a similar issue to Kit's (in that the leading characters of the file names must match), the program that works for Kit will not work for you, as it deals exclusively with TIFF files.

Before going any further, I want to be sure that I understand your requirements. So, a few questions:

(1) From your screenshot, you have these JPG files:

571914100044.jpg
571914100044-barn.jpg
571914100044-eq shed2.jpg
571914100044-eq stg shed.jpg
571914100044-grn bins.jpg
571914100044-NV.jpg

I think you're saying that you want each of those files converted into a PDF page (one PDF page for each JPG), and then the six PDF pages combined into one, six-page PDF file with a file name of 571914100044.pdf — is that right?

(2) Are all 16,000 JPG files in a single folder? If not, are all the folders underneath a single, root folder? If not, what does the folder/subfolder structure look like?

(3) Do you want the PDF output files stored back into the same source folder(s) as the JPG input files, or into different destination folder(s)?

I'm sure other questions will come up before we nail down the exact specs, but those are a good start. Regards, Joe
0

Expert Comment

by:Deacon Aspinwall
Thank you so much for your quick response, Joe! If it'd make it easier, I could simply batch convert the .jpg file to .tiff through Irfanview. This is just how they came to me.

In answer to your questions:
1) That is correct. I'd like each file converted into a merged PDF page. The example you provide is exactly right.
2) All the files are in a single folder.
3) When doing it the long way, I have been putting the multipage PDFs in a "Converted" folder in the same root folder as the jpegs. That would be the ideal destination folder.

Thanks a million!
Deacon
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Deacon,
Responses to your responses:

> If it'd make it easier, I could simply batch convert the .jpg file to .tiff through Irfanview.

That would help, since the logic in my existing program, called MergeTIFF, would not have to be changed to allow the input files to be JPGs. That said, I'm considering an enhancement that would allow the input files to be any image file type that ImageMagick and IrfanView allow (those are the two engines that MergeTIFF uses).

> I'd like each file converted into a merged PDF page. The example you provide is exactly right.

Re your second sentence, that's good, since the current merging logic already handles file names like that. However, re your first sentence, the current MergeTIFF creates only TIFF files, not PDF files, so it would have to be changed to create PDFs. This is another enhancement that I'm considering, but just as you said earlier that you could simply batch convert the JPGs to TIFFs with IrfanView, you could also batch convert MergeTIFF's output TIFFs to PDFs with IrfanView (as long as you have IrfanView's Plugins installed, which are required to create PDFs with IrfanView). With those techniques on the input and output side of things, you could use the current MergeTIFF as is. However, I'm not yet ready to expose the program publicly on the Internet, so I'll write you a PM in the EE Message System to discuss this further, as I did recently with Kit.

> All the files are in a single folder.

That's good, although MergeTIFF already supports recursion into subfolders to an unlimited depth.

> When doing it the long way, I have been putting the multipage PDFs in a "Converted" folder in the same root folder as the jpegs.

MergeTIFF already supports that, allowing you to specify the destination folder (when not recursing into subfolders, which works for you, since all your input files are in a single folder).

Regards, Joe
1

Expert Comment

by:Deacon Aspinwall
Hi Joe,

Thank you very much for your response. So, to sum up: I will batch convert the JPEGs to TIFFs, run your MergeTIFF program to create the multipage TIFFs, then batch convert the TIFFs (which are now multipage) to PDF. I look forward to applying your program!

Thanks again,
Deacon
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Deacon,
Yes, that's a perfect summary! I'll send you a PM in the EE Message System soon to discuss how we may move forward. Regards, Joe
0

Expert Comment

by:Deacon Aspinwall
Hi Joe,

I can't thank you enough for helping me with your MergeTIFF program. For others reading this and in need of help with a similar problem, Joe isn't ready to release his program publicly here, but I encourage you to contact him directly. Using MergeTIFF was a resounding success, and I would certainly encourage anyone who needs such a program to contact him about it.

Joe is really thorough and easy to work with, quick to respond, and can break things down for even technological troglodytes like myself to understand.

Thanks again,
Deacon
1
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Deacon,
You're very welcome, and my thanks to you for the compliments — I really appreciate hearing them! I'm glad to know that MergeTIFF worked well for you and was a resounding success — music to my ears! Regards, Joe
0

Expert Comment

by:Nathan Emch
Your script sounds like exactly what we need, but I do not see it available for download anywhere on this page.  Am I missing something, or did your "Combine-TIFF.ahk" script get pulled from this page?
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Nathan,
Yes, I pulled the script from the article. My initial intention was to re-post the code, but I have decided not to do that. Instead, I am going to rewrite this article, and a similar article on merging PDF files, as "roadmap design" articles, with enough information to help folks write their own programs/scripts. However, I have enhanced the program over the years into what I now call MergeTIFF™. As you can see in the comments above from May of this year, Kit Maddox and Deacon Aspinwall had great results with MergeTIFF. But, as also mentioned above, I'm not yet ready to expose the program publicly on the Internet, so I'll write you a PM in the EE Message System to discuss this further, as I did with both Kit and Deacon. Regards, Joe
0

Expert Comment

by:Brandon G
Hello Joe,
This sounds like a magical solution and exactly the type of resource needed to process the nearly 1 million individual TIF image files that I have, which are named much as your article describes, i.e., "tdr-2772-134446_page_1.tif", "tdr-2772-134446_page_2.tif", "tdr-2772-134446_page_3.tif", and "tdr-2772-134446_page_4.tif". All the files are in a single location, and I am looking to merge the multiple pages into a single multi-page TIF, just as you describe.

Is there any way you can be of assistance, or that I can be a part of the beta testing?

Any assistance is much appreciated. Thank you in advance.
Brandon
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Brandon,
Thanks for joining Experts Exchange today, reading my article, and endorsing it. You're right — MergeTIFF does exactly what you want. For example, by specifying 15 as the number of first characters that need to match, it will merge these files...

tdr-2772-134446_page_1.tif
tdr-2772-134446_page_2.tif
tdr-2772-134446_page_3.tif
tdr-2772-134446_page_4.tif

...into this file:

tdr-2772-134446.tif

I'll write you a PM in the EE Message System to discuss this further, as I've done with many EE members for this program and several other programs. I should point out that this is now an acceptable method, compliant with EE's Terms of Use, since EE removed the Gigs product and the Hire Me feature. Regards, Joe
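To illustrate the "first N characters must match" rule described above, here is a small Python sketch. It is not MergeTIFF's code; the names `group_by_prefix` and `natural_key` are hypothetical, and two behaviors are assumptions of this example: files whose names are shorter than the prefix length are skipped, and pages within a group are ordered so that embedded numbers sort numerically.

```python
import os
import re
from collections import defaultdict


def natural_key(name):
    """Sort key that compares digit runs as numbers (page_2 before page_10)."""
    parts = re.split(r"(\d+)", name)
    # With a capturing group, odd positions hold the digit runs.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def group_by_prefix(file_names, prefix_length):
    """Map '<first prefix_length characters>.tif' to the files sharing that prefix."""
    groups = defaultdict(list)
    for name in file_names:
        stem, _ext = os.path.splitext(name)
        if len(stem) < prefix_length:
            continue  # assumption: names shorter than the prefix are ignored
        groups[stem[:prefix_length] + ".tif"].append(name)
    return {out: sorted(names, key=natural_key) for out, names in groups.items()}


files = [
    "tdr-2772-134446_page_10.tif",
    "tdr-2772-134446_page_2.tif",
    "tdr-2772-134446_page_1.tif",
]
merged = group_by_prefix(files, 15)
print(merged)
```

With a prefix length of 15, all three example files fall under the single output name tdr-2772-134446.tif, with page 10 placed after page 2.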
1
