KHMaddox


Merge TIFF files with the same part of a filename

I am using Windows.
I have an identical problem to the one found here:
https://www.experts-exchange.com/questions/26864426/Merge-TIFF-files-with-the-same-part-of-a-filename.html

The problem is that I have no programming experience at all.

Therefore, I am not looking for code but rather I am looking for a program that will accomplish this.


eg:
I have a directory with thousands of TIFF files. Examples of the file names are:

1979_00032983_0001.tif
1979_00032983_0002.tif
1979_00032984_0001.tif
1979_00032984_0002.tif
1979_00032984_0003.tif
1979_00032985_0001.tif
1979_00032986_0001.tif
1979_00032986_0002.tif
and so on...

As you can see, the first 13 characters of some of the filenames are exactly the same. These files actually belong together as one file. I need a script that joins 2 (or more) TIFF files that share the same first 13 characters and removes the second underscore and everything after it from the name. So based on the above example, the final output should look like this:

1979_00032983.tif
1979_00032984.tif
1979_00032985.tif
1979_00032986.tif
and so on...

Thanks in advance.

-kitmaddox@yahoo.com
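For readers who do want to see the idea in code, here is a minimal Python sketch of the grouping rule described above, assuming the shared part is always the first 13 characters. The function name `group_by_prefix` and the `prefix_len` parameter are illustrative and do not come from any tool mentioned in this thread. The sketch only works out which pages belong together; it does not touch any image data.

```python
from collections import defaultdict


def group_by_prefix(filenames, prefix_len=13):
    """Map each output name (prefix + '.tif') to its sorted list of page files."""
    groups = defaultdict(list)
    for name in filenames:
        groups[name[:prefix_len] + ".tif"].append(name)
    # Plain sorting keeps the pages in order because the counter is zero-padded.
    return {out: sorted(pages) for out, pages in sorted(groups.items())}


if __name__ == "__main__":
    sample = [
        "1979_00032983_0001.tif",
        "1979_00032983_0002.tif",
        "1979_00032984_0001.tif",
        "1979_00032984_0002.tif",
        "1979_00032984_0003.tif",
        "1979_00032985_0001.tif",
    ]
    for out, pages in group_by_prefix(sample).items():
        print(out, "<-", ", ".join(pages))
```

Each key of the result is the name of a merged file and each value lists the pages it should contain, in page order.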
Joe Winograd

What version of Windows (2K, XP, Vista, W7, W8) are you using? 32-bit or 64-bit?
Hi KHMaddox,

I guess you want to merge the TIFs to one TIF with multiple pages, not into one TIF with one larger page containing all merged TIFs side by side, right?

If so, I think you can do it with ImageMagick and a small script. To do so you need to install ImageMagick on your computer; you can download a setup at e.g. http://mirror.checkdomain.de/imagemagick/binaries/.

After you have installed it with the default options, you can use its convert tool from a batch file. The easiest way is to create a batch file (e.g. called mergetif.bat) in the directory with the TIFs, open it in a text editor and paste this code:
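@echo off
rem Call :adjoin once for every TIF file in the current directory.
for %%i in (*.tif) do call :adjoin %%i
goto :EOF

:adjoin
rem Build the target name from the first 13 characters of the source name.
set fname=%1
set fname=%fname:~0,13%.tif
rem Merge into the target if it already exists, otherwise start it as a copy.
if exist %fname% (
	convert %1 %fname% -adjoin %fname%
) else (
	copy %1 %fname%
)
goto :EOF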


Save the file, open a command line window, go to the directory with the batch file and start it. This should do the job.

If you like, it can be made more convenient by adding a path argument so you can call the batch file from anywhere.

Further, take care to first manually delete all files which were previously created by the batch file; otherwise the TIFs may appear more than once in the result. This could also be done in the script. If you want that but don't know how to do it, please tell me and I can add it.

Hope that helps,

ZOPPO
KHMaddox

ASKER

thank you for the feedback...
I will be back at that computer later this morning and I will:
1. answer joewinograd: I know it's W7, but don't know if it is 32-bit or 64-bit.
2. try the process described by Zoppo
joewinograd: W7 Pro SP1 32 bit

Yes, I want to merge the TIFs to one TIF with multiple pages
ok, then you really should try what I suggested. I tested it and it works in the way I think you need it to work ...
Zoppo,

I just wanted to document how it's progressing for your info as well as for the next person with this issue:

I installed "ImageMagick-6.7.9-10-Q16-windows-static.exe" from your link above.
I then copied one of the folders (which contains 14,788 tif files) to my desktop to use as a test.
I then pasted your script from above into notepad and saved it in the new  folder as "mergetif.bat"
I then doubleclicked mergetif.bat and it appears to be processing;
-  a window popped up which has the following header "C:\Windows\system32\cmd.exe"
-  in this window, the following appears:
 
convert.exe: Unknown field with tag 32934 (0x80a6) encountered. 'TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/824
~or~
convert.exe: Unknown field with tag 292 (0x80a6) encountered. 'TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/824
~or~
convert.exe: Unknown field with tag 33000 (0x80a6) encountered. 'TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/824


I will keep you updated...

again, I cannot thank you enough for your assistance!!

-KHMaddox
Hi Kit,
The reason I asked about your version of Windows is that I was working on a script for you, and knowing that you're not a programmer, I wanted to include everything in the script so that you don't have to edit it.

There are many GUI tools out there that will do what you want, although I've never personally tried any of them. A Google search for "merge tiff" or "combine tiff" will give you plenty of hits, such as these:
http://www.tiff-split-combine.com/
http://www.pdf-tiff-tools.com/TIFF-Combiner.html

Some of them even claim to have batch capability, but since I haven't tried any of them myself, I don't know if they can really do what you want. In any case, I agree with Zoppo's approach – a script to run through all of the files in the folder and an image processing tool that can combine TIFFs. I chose different tools from Zoppo (a different scripting language and a different imaging tool) and hadn't quite finished the script for you when I saw Zoppo's excellent reply. So I'll stop working on my script now, assuming you can get Zoppo's solution to work. But if you can't for some reason, let me know, and I'll be happy to finish my script for you. Regards, Joe
Thanks Joe! I will certainly let you know what happens!
-Kit
Zoppo,

This didn't appear to work.
Once it completed processing
It did not combine the files with the same prefix into one document, but rather
I now have the same files, but when I open one, it now contains two copies of the same image.
eg:
"00032983_0001.tif" was a single page and now contains the same page twice.
"00032983_0002.tif" was a single page and now contains the same page twice.
"00032984_0001.tif" was a single page and now contains the same page twice.
"00032984_0002.tif" was a single page and now contains the same page twice.

whereas, what I am looking for is a file named
"00032983.tif" (which contains "00032983_0001.tif" and "00032983_0002.tif" )
"00032984.tif" (which contains "00032984_0001.tif" and "00032984_0002.tif" )

did I do something wrong in the steps described in my post on 2012-10-09 at 11:18:13?

Also, just kinda fyi... these files are copies of documents and they can range from one page to >100 pages...  
I don't mean to give you the impression that that they will all be just 2 pages.
eg:
"00032985.tif" (should contain "00032985_0001.tif" and "00032985_0002.tif" and "00032985_0003.tif" )
Kit,
I really hope that you can get Zoppo's solution to work, but if not, I wanted to let you know that I continued the development of my solution (the problem really piqued my interest) and I decided to write it up into a comprehensive EE Article. I generalized the solution a bit, so that it doesn't have to be a 13-character lead-in for matching the file names (that's a user-specifiable number), and I also made the source folder and destination folder user-specifiable. I wouldn't say it's the most robust program in the world, but I did put in some error checking, such as making sure the source and destination folders exist (if the destination folder doesn't exist, the program offers to create it). Anyway, it was a fun mini-project, and I'll let you know when the Article is published (should be just a day or two – the EE Page Editors must review and approve it before it's published, and they're usually very fast). Regards, Joe
Hi,

the problem is that I assumed the prefix is always 13 characters long, since you mentioned this length in the question. So I wrote a script which iterates through all files, groups those with the same 13-character prefix, and creates the new TIF's name from these 13 characters.

In your test the filenames (without the extension) have exactly 13 characters, so the names of the source files and the destination files are equal.

I'll try to change the script so it can handle any *_????.tif and find the length dynamically, but this may take some time; it's not as easy as my first script.

ZOPPO
ok, here's the changed script. It first collects the filenames of the form '*_????.tif' (so that files created during the batch run are not used as input again), then cuts off the '_????' part and calls the convert command. I think this should work for both of the shown cases. If you have other cases where it doesn't work, please tell me.
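@echo off

rem List of source files, collected before any merged file is written.
set files=

rem Keep only names that contain an underscore, four characters and ".tif".
for /F %%i in ('dir /b ^| findstr "_....\.tif"') do call :collect %%i
for %%i in (%files%) do call :adjoin %%i
goto :EOF

:collect
set files=%files%;%1
goto :EOF

:adjoin
rem Strip the trailing "_????.tif" (9 characters) to get the target name.
set fname=%1
set fname=%fname:~0,-9%.tif
if exist %fname% (
	convert %1 %fname% -adjoin %fname%
) else (
	copy %1 %fname%
)
goto :EOF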


Just replace the content of the batch file you created with this code.

Hope that helps,

ZOPPO


PS: I didn't test it with a large number of files. As you said, you have thousands of files at once, so it may not be possible (I'm not sure whether this can happen) to store all filenames in the variable %files%. If this happens, please tell me; then I'll change the script so that the filenames are first written into a temporary file instead of a variable.
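The suffix-based matching described above can also be sketched in Python. This is only an illustration of the idea, not a port of the batch file: the regular expression and the name `group_by_suffix` are assumptions, and the sketch merely decides which pages go into which output file. Because it keys on the trailing '_????.tif' part, the number of leading characters does not matter.

```python
import re
from collections import defaultdict

# A name qualifies if it ends in an underscore, exactly four characters and ".tif".
PAGE_SUFFIX = re.compile(r"^(?P<base>.+)_(?P<page>.{4})\.tif$", re.IGNORECASE)


def group_by_suffix(filenames):
    """Group page files by everything in front of the trailing '_????.tif'."""
    groups = defaultdict(list)
    for name in filenames:
        match = PAGE_SUFFIX.match(name)
        if match is None:
            continue  # not a page file, e.g. an already merged output or a .bat file
        groups[match.group("base") + ".tif"].append(name)
    return {out: sorted(pages) for out, pages in groups.items()}


if __name__ == "__main__":
    names = [
        "00032985_0002.tif",
        "00032985_0001.tif",
        "1979_00032983_0001.tif",
        "1979_00032983.tif",   # merged output from an earlier run, skipped
        "mergetif.bat",        # not a TIF at all, skipped
    ]
    for out, pages in group_by_suffix(names).items():
        print(out, "<-", pages)
```

Holding the names in a Python list also sidesteps the question of how many names fit into a single environment variable.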
Addition to my last comment: I added a counter to this script so you can simply check if the correct amount of files was handled:
@echo off

set files=
for /F %%i in ('dir /b ^| findstr "_....\.tif"') do call :collect %%i
set /A count=0
for %%i in (%files%) do call :adjoin %%i
echo %count% files processed.
goto :EOF

:collect
set files=%files%;%1
goto :EOF

:adjoin
set /A count=%count%+1
set fname=%1
set fname=%fname:~0,-9%.tif
if exist %fname% (
	convert %1 %fname% -adjoin %fname%
) else (
	copy %1 %fname%
)
goto :EOF


Zoppo, I will test it later this am & let you know what I find - thanks for all your hard work!
joewinograd, I'm very glad that this topic is so interesting to you and I am looking forward to reading your article... please be sure to post a link in this thread or email me a link so I can find it.
-kitmaddox@yahoo.com
Dear Zoppo,

WOW... we are so close!!!

the result of your Post on 2012-10-10 at 02:56:13
is that it does combine the files and properly names them
but the pages are reversed...
meaning that the result of 00032985.tif is that its order of pages is 3,2,1 instead of 1,2,3...

also, you mentioned that there is a way to delete the parent files as they are no longer needed.

can you please re-write to incorporate these two changes?

I just gotta say that I am truly amazed at how diligently you and joewinograd have worked on this... can't begin to thank you enough!

-Kit
Hi. Nice to hear this. Unfortunately I'm just leaving the office, so I can't do it before tomorrow morning (CET). I'll contact you as soon as I have more ...
no worries... thanks!
Hi Kit,
I'd like to make a recommendation – do NOT have the script automatically delete the source files. If something goes awry with the process, you could be left in a worst case situation – the new files not being created correctly and the old files being deleted. My strong advice is to make sure that the new files are correct after the script terminates, and then, and only then, delete the source files – in a separate process. It is simple to use Windows Explorer to delete a folder. If you don't know how to delete folders/files in Windows, I could add a new prompt in my script to do it, or even safer, write a completely separate script to do it, but I strongly believe that it should NOT be part of the standard operation of the main script. You must validate that the destination files are correct before deleting the source files. Remember Murphy's Law! :)

Btw, I have not heard back yet from EE's Page Editors regarding publication of my article. But I'm glad to see that you're making great progress with Zoppo. Regards, Joe
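As a rough illustration of the "validate first, delete later" advice, the following Python sketch lists merged files that should exist but are missing or empty in the destination folder. It is an assumption-laden example: `missing_outputs` is a hypothetical helper, it presumes the '_????.tif' naming used in this thread, and it only checks that each output is present. It does not open the TIFFs or count pages, so a manual spot check is still needed.

```python
import re
from pathlib import Path

# Page files end in an underscore, four characters and ".tif".
PAGE_SUFFIX = re.compile(r"^(.+)_.{4}\.tif$", re.IGNORECASE)


def missing_outputs(source_dir, dest_dir):
    """Return the sorted names of expected merged files that are absent or empty."""
    expected = set()
    for path in Path(source_dir).iterdir():
        match = PAGE_SUFFIX.match(path.name)
        if path.is_file() and match:
            expected.add(match.group(1) + ".tif")
    missing = []
    for name in sorted(expected):
        out = Path(dest_dir) / name
        if not out.is_file() or out.stat().st_size == 0:
            missing.append(name)
    return missing
```

An empty result means every group of source pages has a non-empty output file; anything else is a reason to leave the source files alone.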
Joe,

That's excellent advice... it's good to have experienced people looking out for us novices and keeping us away from the landmines!

What are your thoughts about putting the concatenated files in a new folder instead of having them mixed in with the originals?
Kit,
I think that putting them in a separate folder is the way to go. In fact, my script prompts for the source folder and destination folder, as follows:
[screenshots]
If the source folder doesn't exist, it gives an error message and exits:
[screenshot]
If the destination folder doesn't exist, it offers to create it:
[screenshot]
Also, as I mentioned earlier, my script prompts for the number of lead-in characters needed to match, for those situations where it is something other than 13:
[screenshot]
Regards, Joe
dang that's cool Joe!

whenever you're ready, I would be very happy to give it a test run... just let me know how to download it.

Zoppo's fix has been very educational for me and I am looking forward to using his method.
It has really opened my eyes to what can be done with this .bat stuff - I had no idea!!!

cheers!
-Kit
Yes, Zoppo's code is very clever! He's using standard Windows/DOS batch files, which in an expert's hands like his can do powerful stuff. I chose a different scripting language called AutoHotkey. I also chose a different TIFF image processing program called IrfanView. These are two of my all-time favorite freeware packages that I've been using for many years. You may get a head-start on my solution by downloading and installing both of them:

http://www.irfanview.com/
http://www.autohotkey.com/

When you install them, accept all of the defaults, including the installation folder. In particular, my script looks for the IrfanView program in the default location and if it doesn't find it there, it exits – a programmer could change that in my script, but I know that's not your cup of tea. :)   Regards, Joe
lol... no it's not my cup-o-tea...
got 'em both installed
Well, as long as you have them installed, you may as well try my script. If you have any problems with it, please let me know so that I may update/correct my Article before it is published.

The file type of AutoHotkey scripts is AHK. However, Experts Exchange does not allow files with a file type of AHK to be uploaded. So I have attached the script with a file type of TXT. After downloading it, rename it to a file type of AHK (let me know if you don't know how to rename it and I'll walk you through it). In other words, the file should be renamed to [Combine-TIFF.ahk]. After renaming it, you may run it by simply double-clicking on it in Windows Explorer or whatever file manager you use. Since its file type is AHK, AutoHotkey will be launched to process it. Cheers, Joe
Joe,

works great!

fyi: I still had the "mergetif.bat" file that Zoppo has been working on for me in my test folder with 26 .tif files - the result was flawless except that it also created a file in the destination folder called "mergetif.tif" which does nothing when you try to open it... not a problem for me at all as my actual folders only contain .tif's... just so you know what happens in this case.

I will try it on larger folders next (the first one contains 14,788 .tif files) and I will let you know how it does.

here's some before & after screenshots (i created a folder called "MERGED" prior to executing your program).

Excellent, Kit! Thanks for validating that the script works. I trust that you looked at the combined files it created in the MERGED folder and verified that all of the input files (with all of their pages) made it into the appropriate output files, and in the correct sequence. Right? Regards, Joe
Joe,

It combined 14,788 files into 4,297 files - took approximately 15 minutes - of course, I could not check each one to verify accuracy, but I did spot check 30 files and they all appear to be correct - totally amazing!!!
here's some before & after pics:


Now I can't wait to see how Zoppo's solution works!

Oh, and thanks for the feedback on what it did with the [mergetif.bat] file. I didn't think to test it with anything other than .TIF files in the source folder. That's a good catch! I should put some code in there to ignore everything in the source folder that isn't a .TIF file. Cheers, Joe
14,788 files into 4,297 files in 15 mins?! Very cool! Thanks for the data – extremely valuable!
by the way, that large folder was on  a server (not my local machine) and your program didn't even hiccup... just thought you would want to know that it works on a mapped drive too:

source:
Y:\Texas, Panola County\Panola Instruments\2_Panola Deed Records 1979-2010\1979\0003

destination:
Y:\Texas, Panola County\Panola Instruments\2_Panola Deed Records 1979-2010\1979\0003\1979merged
Yes, that's very good to know. I tested it on a small number of files on a local hard drive, so your experience with a large number of files on a mapped network drive is really helpful.
Kit,
I added some code that checks the file type in the source folder and ignores any file that does not have the .TIF file type. It is case insensitive, so .tif and .TIF are both fine (or any combination of upper and lower case letters in TIF). The revised script is attached as [Combine-TIFF-v2.txt]. Of course, as before, change it to [Combine-TIFF-v2.ahk] after downloading. I tested it here, but would appreciate it if you would take it for a test drive, too. Regards, Joe
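The case-insensitive file type check described above is simple to express in Python. This is a sketch of the idea only, not the code in the attached AutoHotkey script, and `is_tif` is an illustrative name.

```python
from pathlib import Path


def is_tif(name):
    """True when the file type is .tif in any mix of upper and lower case."""
    return Path(name).suffix.lower() == ".tif"


if __name__ == "__main__":
    for candidate in ["00032983_0001.tif", "SCAN.TIF", "mergetif.bat", "notes.txt"]:
        print(candidate, "->", "process" if is_tif(candidate) else "ignore")
```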
Hi again. I have now changed the batch file. It now sorts the files in the correct order. Furthermore, you can pass a destination directory as an optional argument. If you don't pass one, the files are created in the current folder; otherwise they are created in the given one.
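@echo off

set files=
rem Optional first argument: destination directory.
set dest=%1

rem Without an argument, fall back to the current folder ("."); an empty
rem value would make the target path start with "\" (the drive's root).
if %dest%x==x set dest=.
if exist %dest% goto :start

mkdir %dest% 2>nul
if %ERRORLEVEL%==0 goto :start

echo Error: Couldn't create directory %dest%
goto :EOF

:start
rem Collect the names in descending order: convert puts each new page in
rem front of the pages merged so far, so page 0001 has to be added last.
for /F %%i in ('dir /b /o-n ^| findstr "_....\.tif"') do call :collect %%i
set /A count=0
for %%i in (%files%) do call :adjoin %%i
echo %count% files processed.
goto :EOF

:collect
set files=%files%;%1
goto :EOF

:adjoin
set /A count+=1
rem Target = destination folder + source name without its "_????.tif" tail.
set fname=%dest%\%1
set fname=%fname:~0,-9%.tif
if exist %fname% (
	convert %1 %fname% -adjoin %fname%
) else (
	copy %1 %fname%
)
goto :EOF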


To pass a destination you can either call it via the command line, like mergetif d:\merged_tiffs\, or you can create a shortcut, e.g. on the desktop, and set the argument as a command option in the shortcut's properties.

ZOPPO
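Why the sort direction matters can be shown with a small Python simulation. It assumes that convert writes its input images in the order they are given on the command line, so "convert NEW EXISTING -adjoin EXISTING" places the new page in front of the pages merged so far; that assumption matches the reversed 3,2,1 result reported earlier. The function name `merge_prepending` is illustrative.

```python
def merge_prepending(pages_in_processing_order):
    """Simulate the merge step: every newly processed page lands in front."""
    merged = []
    for page in pages_in_processing_order:
        merged = [page] + merged
    return merged


pages = ["0001", "0002", "0003"]

# Processing in ascending name order, as the earlier versions did.
ascending = merge_prepending(pages)

# Processing in descending name order, as produced by "dir /b /o-n".
descending = merge_prepending(sorted(pages, reverse=True))

print("ascending input :", ascending)
print("descending input:", descending)
```

Feeding the pages in descending order lets the prepending cancel out, so the merged file ends up in page order without changing the convert call.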
Addition: If needed, I can also add a parameter for the source directory. Furthermore, it might be a good idea to clear the Archive attribute of source files once they have been processed. That makes it possible to distinguish between processed and unprocessed files in the source directory when the batch file is executed multiple times: only files with the Archive attribute set are processed, which avoids source files being added multiple times to already existing destination files.
thanks!!!
I will test them later this am when I get back to that computer and let you know what I find.
Kit,
No rush. Just make sure that you have a few non-TIF test files in the source folder, and verify that they stay in the source folder AND do not appear in the destination folder. Thanks, Joe
Hi Kit,
The article was reviewed and published:
https://www.experts-exchange.com/Web_Development/Document_Imaging/A_10745-How-To-Combine-Merge-Append-TIFF-Files-in-Batch-Mode.html

I would appreciate it if you would read it and give me some feedback, including any suggestions for improvements. Thanks much, Joe
Zoppo,
re: Zoppo Posted on 2012-10-11 at 01:15:22
flawless... thanks for the assistance!!!!
Joe,
This is in regards to the new version you created, “Combine-TIFF-v2.AHK”

Re: non-TIF test files in the source folder;
I put several files with various extensions (.doc, .jpg, .bat, .txt) in the source folder along with the .tif files.
The .tif files are processed correctly and the NON-TIF files DO stay in the source folder and they do NOT appear in the destination folder and they are unchanged.
Great job!

Also,

Observation about versions 1 AND 2;
Upon completion, the file size is large by comparison to the size of the parent files.
(note: the same is true when using Zoppo's method)

eg:
Original:
00032983_0001.tif = 122Kb
00032983_0002.tif =   69Kb

Version 2 output:
00032983.tif = 2,162Kb

As a test, I combined the originals in Acrobat, then saved them as .pdf, then changed the file extension to .tif, and the output was:
00032983 - manual merge via acrobat.tif = 159Kb

Any idea what is causing the large file size or how to fix it?
I suspect it will fill up the hard drive in a hurry by multiplying the storage by approx 10x.

Here’s a screenshot of the test folder:

Also,

I brought some files home on an external hard drive to test them with the new version. (My home computer is XP Pro v2002 SP3)

I mapped the external hard drive and left the files on it for the test
Result:
1. it works on an external hard drive connected via USB
2. Combined 14,788 files into 4,297 files
    2.a.Completion time: 16 minutes
3. non-TIF test files in the source folder are ignored


Cheers,
Kit
Joe,
re: your article

when I click the link above (by: joewinograd Posted on 2012-10-11 at 11:35:20)

I get the following:

Experts Exchange Unauthorized Access
Permission Denied
This article is currently still in progress and yet to be approved.
Kit,
Thanks for verifying that it handles non-TIF files correctly. Interesting that the processing time on a network drive is about the same as a USB drive – not what I would have predicted.

Re the large file size, TIFF files may be compressed. There are various compression techniques, such as Huffman, ITU-T Group 3, ITU-T Group 4, LZW, Packbits, and others. My program called IrfanView without any compression parameter, so that's probably why you're seeing the large file sizes. Attached is a new version using ITU-T Group 4 compression (my personal favorite). Let me know if it makes a difference. We could easily experiment with other compression techniques to see which produces the best results. When I get a chance later today, I'll add an input dialog box asking the user which compression method to use (or none).

Re the article, I decided to make some cosmetic changes (nothing substantive – just formatting), but any change to an article triggers a Page Editor review prior to re-publication. EE likes to keep standards high on published articles, which is a good thing, but in this case makes the article inaccessible to you. I'll let you know when it is re-published. They're usually pretty fast. Regards, Joe
Kit,
I just determined that when sending no TIFF compression parameter to IrfanView, it defaults to the Packbits method. My guess is that you'll see smaller files with Group 4 compression, and the good news is that Group 4 is "lossless" (as opposed to "lossy") compression, so the compressed file will not lose any quality. I revised the script to prompt for compression method desired, allowing for all of the methods that are supported by IrfanView, as follows:

0=None
1=LZW
2=Packbits
3=Group3
4=Group4
5=Huffman
6=JPG
7=ZIP

The revised script is attached. Regards, Joe
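For illustration, the numbered choices above could be validated and turned into a command-line switch along these lines. This is a Python sketch, not the attached AutoHotkey script, whose source is not shown in this thread. The `/tifc=` switch is taken from IrfanView's documented command-line options as far as I recall them, so treat it as an assumption and check the options file shipped with your IrfanView version; `compression_option` is a hypothetical helper name.

```python
# Compression codes exactly as listed in the post above.
TIFF_COMPRESSION = {
    0: "None",
    1: "LZW",
    2: "Packbits",
    3: "Group3",
    4: "Group4",
    5: "Huffman",
    6: "JPG",
    7: "ZIP",
}


def compression_option(choice):
    """Validate the user's choice and return the matching switch text."""
    code = int(choice)  # raises ValueError for non-numeric input
    if code not in TIFF_COMPRESSION:
        raise ValueError(f"compression must be one of {sorted(TIFF_COMPRESSION)}")
    # Assumed IrfanView switch for the TIF save compression.
    return f"/tifc={code}"


if __name__ == "__main__":
    print(compression_option("4"), "selects", TIFF_COMPRESSION[4])
```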
ASKER CERTIFIED SOLUTION
Joe Winograd
Joe,

Your article was very well written.... Concise yet thorough.

Also, I thought it was very intuitive on your part to have it add the "\" at the end of the file path if the user doesn't... I was opening the folder then using CTRL+C to copy the path then pasting it to the program... after reading your article, I realized that this method didn't  contain the "\" at the end.... I remember thinking "nice catch Joe!"

I look forward to reading the edits... should I post a comment in that thread to make sure it emails me when you've made edits? I'm new to EE, so I don't want to be inappropriate by posting a random comment there if that's considered a Faux pas.

Cheers,
Kit
Kit,
Thanks for the kind words – much appreciated! Posting a comment there is fine. I've received several comments on the various articles I've written – definitely NOT a faux pas! Regards, Joe
Kit,
I'm working on a revision of the article to include the Compression parameter, as well as some stylistic changes suggested by the Page Editor. Also, one of the EE Admins recommended that I include a comment to the revision that explains the changes, and while writing that, I came across the need to use "he/she" and "him/her" constructs, which I don't like much. Since "Kit" could be either, I'm hoping you won't mind telling me if you're male or female, so that I may use the correct pronouns. Thanks, Joe
lol... Mr. Kit H. Maddox... pleasure to meet you, Joe :)
Pleasure here, as well, Mr. Kit. :)
Btw, did the Group 4 compression result in smaller files?
Hm - sorry, KHMaddox, I don't want to complain, but I would like to know if the accepted solution is really the best for your needs.

As far as I can see it cannot work with different file name patterns but only files with a fixed number of leading characters (13 in the sample). Since you already posted two samples with different patterns, I thought that might not be OK for you. My suggested solution can handle any kind of file name pattern as long as the name ends with '_xxxx.tif', so a mix like this is also processed correctly with one call of the script:
   00032985_0001.tif
   00032985_0002.tif
   00032985_0003.tif
   1979_00032983_0001.tif
   1979_00032983_0002.tif
   1979_00032985_0001.tif
   1979_00032985_0002.tif
   1979_00032985_0003.tif


Also, my script can run unattended, which might be interesting if you e.g. want to let it run automatically once every hour.

ZOPPO
Joe,
Yes, the compression resulted in smaller file sizes
eg:
previously (in my post on 2012-10-11 at 12:29:16)
Version 2 output:
00032983.tif = 2,162Kb

Now with V4 and using "4=Group4" output:
00032983.tif = 187Kb

A TREMENDOUS improvement in file size!!!

hope that helps!
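The figures quoted in this thread can be put side by side with a few lines of Python. The sizes are the ones reported above (122 KB and 69 KB for the two source pages, 2,162 KB for the earlier output, 187 KB with Group 4); `size_ratio` is just an illustrative helper.

```python
def size_ratio(merged_kb, source_kbs):
    """Return the merged size divided by the combined size of the source pages."""
    return merged_kb / sum(source_kbs)


pages_kb = [122, 69]  # the two original single-page files

# Output written without a compression parameter (Packbits, per Joe's finding).
print(round(size_ratio(2162, pages_kb), 1))

# Output written with "4=Group4".
print(round(size_ratio(187, pages_kb), 2))
```

The first ratio comes out at roughly 11 times the size of the originals, while the Group 4 output is about the same size as the two source pages together.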
Zoppo,

Please forgive my ignorance about using this site... I got an email from EE asking for action on this question... tried to answer it using my cell phone as I was away from my computer and couldn't make it distribute the credits equally between you and Joe.

Although I did not test your solution on folders containing files with different naming conventions, I was extremely pleased with the fact that your process did combine the files flawlessly in folders containing files with the same naming conventions.

I can definitely see how your process would be extremely beneficial to users with this issue or the need for recurring merging as in your example of running it every hour.

I have unsuccessfully tried to edit my acceptance of Joe's solution in an effort to credit your solution as well.

Please accept my most humble apology and know that I greatly appreciate your hard work on this topic.

Sincerely,
Kit Maddox
No need to apologize; it's up to you to select one or more comments as the answer. I just wanted to ensure you see the difference between the two possible solutions and choose the right one for your needs.

BTW, it was more fun than hard work since some of the things I used here I didn't ever use before, so I learned something too :o)

Have a nice day,

ZOPPO
cool! you're awesome Z
Hi Zoppo,
Re your comment, "As far as I can see it cannot work with different file name patterns but only files with a fixed number of leading characters (13 in the sample)."

Please take a look at my post on 2012-10-10 at 12:58:03, ID: 38482963. Note my comment, "Also, as I mentioned earlier, my script prompts for the number of lead-in characters needed to match, for those situations where it is something other than 13:" followed by a screen capture of a prompt box entitled NumFirstChars.

Btw, if there were some way for me to share the points with you, I'd be happy to do it. Regards, Joe
Sorry, I guess I didn't explain what I meant properly. What I meant is that if you have, e.g., files with three different counts of leading characters, let's say 8, 10 and 13, then you have to execute your script three times, entering a different value each time. The script I wrote handles this in one call, since it searches for the trailing part (_????.tif), so it doesn't matter how many leading characters exist ...

Don't worry about the points. KHMaddox could ask the site moderators to change it if he likes. But IMO it's absolutely OK when only the chosen solution is accepted by the asker when two working solutions exist but are different. None of my comments was of any help for KHMaddox in using your script ...

ZOPPO
Zoppo,
Thanks for the explanation. Regards, Joe
Kit,
Your comment in the find/copy/paste thread saying that the "Operation Completed" box is very helpful when doing thousands of files inspired me to improve it. The new one looks like this:
[screenshot]
In addition, when doing thousands of files that can take many minutes, perhaps even hours, a progress bar would be very comforting to assure the user that processing is taking place. So the new version has this, too (the progress bar is updated as each file is processed):
[screenshot]
As you can see, I tested it on only 26 files (23 TIFF and 3 non-TIFF), so I'd really appreciate it if you could let it rip on your 14,000+ files and let me know if the final statistics look right and if the progress bar updates properly.

Eventually, I'll modify the Article to be Version 3 (of the Article), but for now I'm attaching the new script here as V5 to keep the numbering consistent within this thread. Thanks, Joe
Kit,
One more thing. Your comment about copying/pasting the folder names made me realize that having the user type the folder names is not a good approach, so I changed the program to utilize a standard Windows dialog for browsing and selecting folders (and in the case of the destination folder, the ability to create it). The screens look like this:
[screenshots]
The script for this is attached as V6 and includes the enhancements that are in V5. Looking forward to your feedback on this new version. Thanks, Joe
XP Pro v2002 SP3 32 bit- external hard drive connected via USB
source folder was on the external hard drive
destination folder was on desktop

Compression used: 1=LZW
Result: flawless (also, it DID ignore the non .tif files)
Source folder size: 1.38GB (including 238KB of non .tif files)    
Destination folder size: 2.38GB

Joe,

V6 observation:

I usually have the source and destination folders open when I start "Combine-TIFF-v?.ahk"

V6 doesn't allow me to copy and paste the folder locations as the previous versions do.

Since my folders are buried deep in the file tree, browsing for them takes considerable additional time.... that being said, I definitely see the benefit of being able to browse, but for me this is not necessarily an "improvement"

cheers!
Kit
31,856 TIFF files?! Great stuff! I wonder what the resulting size would be with Group 4 compression...perhaps you can find that out on your next coffee break. :)

Your note about the compression method used gave me another idea...I should report that in the Operation Completed dialog...also, the total size of all the source TIFF files and destination TIFF files.

Good point on the new browsing method for folders. I'll change it so it has the best of both worlds – the ability to browse to it as well as type it in (copy/paste). Will work on V7 during the weekend. Regards, Joe
sweet!!!
here's some data:

XP Pro v2002 SP3 32 bit- external hard drive connected via USB

source folder was on the external hard drive
Source folder size: 1.38GB (including 238KB of non .tif files)
Source folder .tif count: 31,856
Source folder non-.tif count: 4

Compression used: 4=ITU-T Group 4

Result: flawless

destination folder was on desktop
Destination folder size: 1.37 GB
Destination folder .tif count: 6,759
Destination folder non-.tif count: 0
completion time: 37 min 3 sec
Kit,
That's awesome data! Thanks so much for gathering it. Btw, is the Progress Bar working for you?

I thought I wasn't going to be able to get to V7 until the weekend, but managed to find the time to do it today. Here are the changes:

(1) Ability to navigate to or type in the Source folder:
User generated image
(2) Ability to navigate to or type in or create the Destination folder:
User generated image
(3) Enhanced statistics in the Operation Completed dialog:
User generated image
Also, I added a parameter that should make the program run faster, although that may cause it to "hog" resources, especially the CPU.

Please let me know if everything works correctly. Thanks, Joe
Joe,

haven't tested V7 yet but wanted to answer your progress bar question... yes.. that is a VERY beneficial change!

will let you know how V7 does

cheers!
Kit
Kit,
Very glad to hear that! Looking forward to your feedback on V7. Have a nice weekend. Cheers, Joe
Kit,
I just realized that V7 doesn't put separator commas in the number-of-files statistics. Doesn't matter here with 20+ files, but does matter with your tens-of-thousands of files. The attached V8 fixes that, so please use V8 in your next test – and I'd appreciate it if you'd post the V8 Operation Completed dialog. Thanks, Joe
will do!
noticed that you added an ending "\" at the end of the folders selected confirmation dialogue box...
i had wondered about that :)

User generated image
Do you think that's a good idea? I could display it as entered (with or without the slash – whatever the user entered)...would you prefer that?
oh no... with the slash is proper... I was just saying that I noticed it
OK, glad you noticed. :)   The program needs the backslash during execution, so it simply appends it when it's not entered, but it could easily display either one in the folders-confirmation dialog. But since you're happy with seeing the backslash, I'll leave it that way.
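The backslash handling described above can be illustrated with a minimal sketch. It is written in Python rather than the AutoHotkey used by the actual program, and the function name `with_trailing_backslash` is hypothetical, not taken from Joe's script:

```python
def with_trailing_backslash(folder: str) -> str:
    """Append a backslash to the folder path only when it is missing.

    Illustrative only: the real program is an AutoHotkey script and may
    handle this differently.
    """
    return folder if folder.endswith("\\") else folder + "\\"


# Both forms of user input end up in the same normalized shape.
print(with_trailing_backslash("C:\\scans"))
print(with_trailing_backslash("C:\\scans\\"))
```

Normalizing once, right after the folder is entered, means every later step can build a full file path by simple concatenation of folder and file name.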
V8 result = flawless
copy & paste feature works great!!!
Thanks Joe!

User generated image
Kit,
You're welcome. And thanks to you for making the full run and posting the results. I trust that the results reported in the Operation Completed dialog match what you're seeing on the system.

Btw, is this going to be a repetitive process for you or a one-time effort? If repetitive, I'm curious what program/process is creating the TIFF files with that naming convention. Regards, Joe
repetitive ... I do title research for oil companies primarily in East Texas and this is an archive of the deed records that I am developing... this will allow me to review documents at home instead of going to the courthouse...  the clerk scans the documents and saves them as individual .tif files.... that's why the volume is so great
Ah, that explains it. The clerk's scanner/scanning software is creating those TIFFs with that naming convention. I'm glad to know the program will be of value to you on an on-going basis...makes the effort to fine-tune it very worthwhile. Regards, Joe
definitely! :)
Hi Kit,
I updated the article to V3, which contains all of the changes from V8 in this thread. Even though the article was republished, the link is the same:
https://www.experts-exchange.com/Web_Development/Document_Imaging/A_10745-How-To-Combine-Merge-Append-TIFF-Files-in-Batch-Mode.html

There are a few very small differences between the V8 here and the V3 there – minuscule changes, but I'd really appreciate it if you would head over to the article and test the V3 there on your 31,000+ files. If you would post the Operation Completed dialog from a 31,000+ run, that would be awesome! Thanks much, Joe
Oh, and like before, the file name over there doesn't have V3 in it...it's still called just [Combine-TIFF.txt]. This is what the EE Page Editor recommended...replace the old file with the new one, keeping the name the same. Regards, Joe
cool.... will do!
Kit (and anyone else using the program),

Later today or tomorrow, I'm going to update the article and program. There will be several changes, but the most important is to correct a serious issue due to the way the "/append" parameter works in the IrfanView command line. When I wrote the program, I thought that the "/append" option would append the entire source TIFF file to the destination TIFF file. For example, I thought that

i_view32.exe c:\12345678_0001.tif /append=c:\12345678.tif

would append all of the pages of [c:\12345678_0001.tif] to [c:\12345678.tif]. I have just discovered that it does not. It appends only the first page of [c:\12345678_0001.tif] to [c:\12345678.tif].

I don't know if this is considered to be a "bug" or "feature" of IrfanView, but my experimentation has confirmed this behavior. Of course, if all of the source TIFF files are one page, this "/append" behavior is fine and the current program works. However, this is not the case if source files have more than one page.

To overcome this issue with the "/append" option, I modified the program to use the "/multitif" command line option, which allows all of the files in the operation (source and destination) to be multipage TIFF files. My experimentation with numerous multipage TIFF files has confirmed that "/multitif" works correctly.
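As a rough illustration of the idea (not Joe's AutoHotkey code), the Python sketch below groups source files by everything before the last underscore and builds one "/multitif" command line per group. The function names, folder paths, and especially the parenthesized `/multitif=(result,source1,source2,...)` form are assumptions: the syntax is recalled from IrfanView's command-line options file (i_options.txt) and should be checked there before use. Paths containing spaces may also need handling that this sketch does not attempt.

```python
from collections import defaultdict
from pathlib import PureWindowsPath


def group_by_prefix(filenames):
    """Map each destination name (e.g. 1979_00032983.tif) to its sorted sources."""
    groups = defaultdict(list)
    for name in filenames:
        path = PureWindowsPath(name)
        if path.suffix.lower() not in (".tif", ".tiff"):
            continue  # non-TIFF files are ignored
        prefix, sep, _page = path.stem.rpartition("_")
        if not sep:
            continue  # no underscore, so no page suffix to strip
        groups[prefix + ".tif"].append(name)
    # Zero-padded page numbers sort correctly as plain strings.
    return {dest: sorted(srcs) for dest, srcs in sorted(groups.items())}


def multitif_command(dest, sources, src_dir, dest_dir, exe="i_view32.exe"):
    """Build one command line; the /multitif=(...) form is an assumption."""
    files = [dest_dir + dest] + [src_dir + s for s in sources]
    return f"{exe} /multitif=({','.join(files)})"


names = ["1979_00032983_0002.tif", "1979_00032983_0001.tif",
         "1979_00032985_0001.tif", "notes.txt"]
for dest, sources in group_by_prefix(names).items():
    print(multitif_command(dest, sources, "E:\\scans\\", "D:\\out\\"))
```

Building the whole list of pages for a destination file up front, rather than appending one file at a time, is what lets multipage source files be carried over intact under the behavior described above.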

To stay consistent in this thread, I attached the revised program to this post as [Combine-TIFF-v9.ahk], but I'll still call it [Combine-TIFF.ahk] at the article. Note that EE now allows AHK files to be uploaded, so uploading as TXT (and renaming to AHK after downloading) is no longer required. There are other changes in the program, which I'll document at the article, but I wanted everyone on this thread to have the new version ASAP.

Kit, even if all of your files are one-page TIFFs and you don't need the new version, I'd really appreciate it if you'd run it on one of your folders with 30,000+ source files and let me know the results. Thanks, Joe
will do!
also, I need to look at the "cancel" feature... seems like I hit cancel & it ran anyway... been meaning to try to re-create the issue so this will give me a chance to do that as well.

cheers!
Kit
Hmmm, sounds like a bug. Depending on which dialog box you're talking about, hitting Cancel should either offer the ability to retry or exit the program...and it should be very clear which one it is going to do when you hit Cancel. If you can reproduce the problem and let me know where it's happening, I'm sure I can fix it. Btw, I submitted the changes to the article yesterday, but it is still in Editor Review, which makes it inaccessible to members. I'm hoping for a quick review and re-publishing. The only change you should notice when running the new script is that it will offer to save the operational statistics in a text file, as shown in this screenshot:
User generated image
If you say yes, you'll see this:
User generated image
The text file it creates looks like this:

Beginning date and time: 2013-03-18_21.06.15
Compression method used: ITU-T Group 4
Number of source files processed: 23
Size of source files (bytes): 324,806
Number of destination files created: 6
Size of destination files (bytes): 324,670
Number of non-TIFF files ignored: 3
Ending date and time: 2013-03-18_21.06.17
Elapsed time (minutes:seconds): 0:2

Regards, Joe
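For readers curious how a statistics file like the one above might be produced, here is a minimal Python sketch. It is not the AutoHotkey code from the program; the function name `format_report` and its parameters are hypothetical, and the layout simply mirrors the sample shown in the post, including comma separators in the counts and sizes and the unpadded minutes:seconds value.

```python
from datetime import datetime


def format_report(start, end, compression, n_src, src_bytes,
                  n_dest, dest_bytes, n_ignored):
    """Return the statistics text, mirroring the sample layout in the thread."""
    stamp = "%Y-%m-%d_%H.%M.%S"
    minutes, seconds = divmod(int((end - start).total_seconds()), 60)
    lines = [
        f"Beginning date and time: {start.strftime(stamp)}",
        f"Compression method used: {compression}",
        f"Number of source files processed: {n_src:,}",  # ',' adds separators
        f"Size of source files (bytes): {src_bytes:,}",
        f"Number of destination files created: {n_dest:,}",
        f"Size of destination files (bytes): {dest_bytes:,}",
        f"Number of non-TIFF files ignored: {n_ignored:,}",
        f"Ending date and time: {end.strftime(stamp)}",
        f"Elapsed time (minutes:seconds): {minutes}:{seconds}",
    ]
    return "\n".join(lines)


report = format_report(
    datetime(2013, 3, 18, 21, 6, 15), datetime(2013, 3, 18, 21, 6, 17),
    "ITU-T Group 4", 23, 324806, 6, 324670, 3,
)
print(report)
```

The `:,` format specifier is what supplies the separator commas, which matters once the file counts reach the tens of thousands.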
cool
There's a bug in the v9 above related to saving the statistics in a text file. I fixed it in the script attached to the article and am also attaching it here as v10. So the [Combine-TIFF-v10.ahk] attached here is identical to the [Combine-TIFF.ahk] that will appear in the re-published article. Regards, Joe
Kit (and anyone else who is interested),

The article was just re-published. Link is the same:
https://www.experts-exchange.com/Web_Development/Document_Imaging/A_10745-How-To-Combine-Merge-Append-TIFF-Files-in-Batch-Mode.html

Please head on over there and let me know what you think. I'm curious about the performance of the "/multitif" parameter in the new version compared with "/append" in the previous version. Kit, if you could turn the new version loose on your 30,000+ files and post the operational statistics, that would be awesome. Thanks, Joe