Link to home
Start Free TrialLog in
Avatar of satzm
satzm

asked on

Batch Rename PDF images to Tiff

Experts: I have 3000 .tif (xxxxxx.tif) images that accidentally had their suffixes changed to .PDF's by the sender. Those docs were mixed with actual pdfs. XXXXXX.pdf.

I have identified the incorrect documents and created an Excel list of the 3000+ files incorrectly named on my server.

I need to take my excel spreadsheet of correct files names (.TIF) and somehow "find" the incorrectly named files (PDF) on my server, and rename them via batch to the corrected name, xxxxxx.tif as set out in the spreadsheet.

Any suggestions?

Thanks as always!
Avatar of Joe Winograd
Joe Winograd
Flag of United States of America image

> excel spreadsheet of correct files names (.TIF)

Does the Excel spreadsheet have the fully qualified filenames? Give us an example of exactly what's in the spreadsheet...something like

H:\folder\subfolder\abcd.tif

or just

abcd.tif

Thanks, Joe
ASKER CERTIFIED SOLUTION
Avatar of kaufmed
kaufmed
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
> There is also an assumption that you have included the full file path and name in each cell.

kaufmed, what would the code be if that isn't the case? Thanks, Joe
It would have to be known where the files were located. If the files are in the same directory as the spreadsheet, then (AFAIK) it should be of no consequence; if, however, the files are in some other directory, then the code can't possibly know what to do. One way or another, you have to tell the code where to find the files.

**edit

I suppose, conditions warranting, you could take an approach like this:

Sub RunMe()
    Const SOURCE_FOLDER = "C:\path\to\folder\containing\files"

    Set startCell = Range("A1")
    Set endCell = startCell.Offset(3000, 0)
    
    For Each cell In Range(startCell, endCell)
        originalFilename = cell.Value

        If Not IsEmpty(originalFilename) And Len(originalFilename) > 0 Then
            Set fso = CreateObject("Scripting.FileSystemObject")
            Set sourceFolder = fso.GetFolder(SOURCE_FOLDER)
            
            targetPath = sourceFolder.Path & "\" & originalFilename
            
            If fso.FileExists(targetPath) Then
                Set file = fso.GetFile(targetPath)
    
                file.Name = Replace(file.Name, ".pdf", ".tif")
            End If
        End If
    Next

End Sub

Open in new window


...but this does assume that all of the files are in that one SOURCE_FOLDER directory.
kaufmed, great stuff! We'll have to wait for the OP to clarify, but I have a different interpretation of this comment:

I need to take my excel spreadsheet of correct files names (.TIF)

I took this to mean that his Excel spreadsheet has filenames with extensions of TIF, not PDF, i.e.,

C:\path\to\folder\containing\files\abcde.tif

not

C:\path\to\folder\containing\files\abcde.pdf

Shouldn't be tough to change your code to handle it if that's the case. Regards, Joe
Ah, yeah I missed that. It shouldn't be too difficult. I'll rework it later tonight. Good eye  = )
kaufmed,
Your first set of code works a charm! Before you spend time reworking it, I'd be tempted to have the OP do a Replace All of ".tif" with ".pdf" in his spreadsheet and then run your first set of code as-is (a simple Replace All will work as long as no files have a second ".tif" anywhere in the file name). Regards, Joe
Avatar of satzm
satzm

ASKER

Wow, you guys are amazing: Yes, the spreadsheet has the absolute location of the PDF's and TIF files (the TIFs which are mistakenly named are commingled with the PDF's). The spreadsheet is saved in the same folder on the server as the PDF's/TIFs.
A few things:

(1) Are all of the entries in the spreadsheet named as TIF files? Or are they all named as PDF files? Or are there some of both?

(2) If there are some of both, should the files with PDF names simply be left unchanged? If not, what do you want to do with them?

(3) Please show one example of an entry in the spreadsheet (just copy/paste it here).

Thanks, Joe
Avatar of satzm

ASKER

Joe: In furtherance to your comment: "Before you spend time reworking it, I'd be tempted to have the OP do a Replace All of ".tif" with ".pdf" in his spreadsheet and then run your first set of code as-is (a simple Replace All will work as long as no files have a second ".tif" anywhere in the file name). Regards, Joe"

In the original spreadsheet, it had 13000 lines of both PDF and TIF files shown as image #s (xxxxxxx.TIF or xxxxxx.PDF (TIF or PDF)). Yet, notwithstanding the spreadsheet's reference names, the documents sent for the images were ALL formatted as PDFs. Thus, on an open command from the spreadsheet, the users received constant errors for the 3000 TIF images improperly saved as PDFs. I removed all the PDF calls from the spreadsheet and resaved a new spreadsheet with only TIF references. However, all the TIF files were improperly sent to us formatted as PDF (they are TIF's, they are just improperly named as PDF).

That's why I need to change only the 3000 incorrect images left on the server drive. Both PDF and TIF files are in the same location. The spreadsheet only has TIF references.

S:\legal\imagexxxxxx.pdf should be S:\legal\imagexxxxxx.tif on the
server; the spreadsheet already ONLY has tif image references. Both the spreadsheet for the names of the TIFs and TIF files themselves are in the same folder on the drive. Reminder: the drive contains both PDF and TIFs and only the TIFs need changing based on the spreadsheet file names.

The absolute reference for the files are in Column C and the original names of the files are in column D, i.e. C2: S:\legal\imagexxxxxx.pdf  D2: imagexxxxxx.pdf.

I put an entry on column C as"=CONCATENATE("S:\legal\", C2) and carried that through my list of 13000 entries which properly gave me the absolute locations of all the pdf/tiff's listed on the spreadsheet. I then deleted the PDF references and only the TIF references remain on the spreadsheet. Now, I need to rename the improper pdf files on the drive from that list to open them from the sheet. I hope this better explains my dilemma. Thank you experts massively for your assistance.
For starters, I'm going to suggest that you copy all files in S:\legal\ to a backup folder (perhaps S:\legal_save\) just in case something goes awry...that way, restoring S:\legal\ to its original state will be easy.

Now, to make sure there are no misunderstandings...you said:

"The spreadsheet only has TIF references."

BUT, if I'm understanding you right, the filenames in the spreadsheet say PDF, even though they are truly TIF files. So, for example, cell C2 has something like:

S:\legal\imagexxxxxx.pdf

AND

the server has a file called

S:\legal\imagexxxxxx.pdf

BUT

that file should really have the name

S:\legal\imagexxxxxx.tif

If all of these interpretations are correct, kaufmed's (brilliant!) script is what you want (the first one). It sounds as if row 1 is a header, since you mentioned C2 as containing the file name. So you want to change

Set startCell = Range("A1")

to

Set startCell = Range("C2")

or to

Set startCell = Range("C1")

if I'm wrong about row 1 being a header.

Also, make sure you change 3000 in

Set endCell = startCell.Offset(3000, 0)

to the correct number of files in the spreadsheet.

Also, you'll need to save the spreadsheet to a macro-enabled format (.xlsm). Then follow kaufmed's excellent instructions in his first post.

Before unleashing it on the thousands of files in S:\legal\, I suggest you try it on a few test files in a test folder. I did that here and kaufmed's code worked perfectly, but it's worth testing in your environment just to be sure. Regards, Joe
Avatar of satzm

ASKER

Just one correction to your astute observations: Spreadsheet has ONLY tiff's now, the pdf references are deleted. The Files on the server are both PDF and TIFs.
> Spreadsheet has ONLY tiff's now

Yes, that's what I thought, but the names of those files in the spreadsheet say PDF, not TIF. In other words,

(1) the spreadsheet says, S:\legal\imagexxxxxx.pdf

(2) the server has, S:\legal\imagexxxxxx.pdf

(3) but the file is really a TIF, so you want the file on the server to be renamed from S:\legal\imagexxxxxx.pdf to S:\legal\imagexxxxxx.tif

Right?
If, on the other hand, you're saying that the spreadsheet has, S:\legal\imagexxxxxx.tif (not S:\legal\imagexxxxxx.pdf), then kaufmed's code won't work as-is. But that's where my earlier suggestion comes in...instead of changing kaufmed's code, just do a Replace All in your spreadsheet of ".tif" with ".pdf" and then kaufmed's code will work as-is. Regards, Joe
I'd be tempted to have the OP do a Replace All of ".tif" with ".pdf"
Heheh. I was thinking the same thing when I left work this afternoon  = )
Avatar of satzm

ASKER

wow: This is funny :) No, there are NO pdf's showing in the spreadsheet. There are no calls to PDFs on the server. There are no columns on the spreadsheet showing pdfs, there are no rows that have pdfs... all pdfs are caput... gone. Poof. No more pdfs on the spreadsheet. Just tif references. :) lol. Tif references on spreadsheet ONLY (3000 left on that spreadsheet) in the same folder as 12000 files that are pdf and tiff's that are ALL called pdfs. (that was my poor attempt at humor...sorry).
OK, so as was mentioned:  Do a Find/Replace on your spreadsheet, and change ".tif" to ".pdf". That will set the spreadsheet to have the data that the code above is expecting. Then run the code as I instructed above. This should rename the extensions of the affected files. If you need to maintain the original spreadsheet, then just Find/Replace back to the original ".tif".
This was why I asked you to copy/paste one of the cell's contents here...to make things clear. The confusion is that some of the language made it seem that the cells contained

S:\legal\imagexxxxxx.pdf

not

S:\legal\imagexxxxxx.tif

But, great, now that we're clear, you're good to go. The only possible glitch with the Find/Replace All is if, as I mentioned earlier, a file name has a second ".tif" in it...extremely unlikely, but possible. In other words, if you have a file in the spreadsheet called

S:\legal\imagexxxxxx.tif.tif

or

S:\legal\tifimagexxxxxx.tif

or

S:\legal\TIFF_imagexxxxxx.tif

or

S:\legal\tiff.imagexxxxxx.tif

or

S:\legal\imagexxxxxx.tif.yyyy.tif

or

S:\legal\imagexxxxxx.tiff.tif

or anything with a second ".tif", you'll need to handle them separately. Otherwise, kaufmed's code and instructions are perfect! Regards, Joe
Avatar of satzm

ASKER

I'll try to make this a lot less confusing:

1. An image of the file structure; 2. An image of the XLSM(Spreadsheet) & 3 the code as I understand it so far when hitting Alt-f11. On alt-f5, so far, no changes. I deleted all the extra rows of entries (3138 additional rows) and all the extra PDF images on the server. This is done for demo purposes only. If we can get this, please remember there are 12000 files in the actual folder and 3139 TIF's listed on the spreadsheet. This is merely a backup created to try to make this work.
File-Structure.JPG
XLSM-image-Spreadsheet.JPG
VBA..JPG
Avatar of satzm

ASKER

Regretfully, the code runs, but it does not change the suffix of the PDF images on the server. No errors... it just does not do anything I can see.
The file name in A2 needs to end in ".pdf", not ".tif". That's the purpose of the Replace All in the big spreadsheet...to change all of the ".tif" to ".pdf" in the spreadsheet...then kaufmed's code works. So, for your test, in A2, replace ".tif" in the file name with ".pdf". Then it will work.
Hmmm....  I'll ask the obvious:  Do you have write permissions on that location? I just tested the logic against a network share, and I'm not seeing the same result.

User generated image
kaufmed,
I think the answer is even more obvious. :)
Avatar of satzm

ASKER

Joe: I think I see the confusion for this: we are NOT attempting to change the spreadsheet. The sheet is fine as is, i.e. PDF or TIF. The issue is I must change the server's DOCS to either PDF or TIF depending on the spreadsheet. There are 12000 DOCS of which 3139 need to be renamed per the spreadsheet to xxxxxxx.tif instead of xxxxxx.pdf which are located on the drive.

Kaufmed: Yes, I have full permissions on that shared folder, I also tried it locally and the same thing happened.
Avatar of satzm

ASKER

Please note on the pictures uploaded, Cell A2 on the spreadsheet is actually: =HYPERLINK(CONCATENATE("S:\Structured Settlement\Court Research Data\Harris County 2010-2013\backup\",B2)) and not Cell B2 which is 44313893.pdf. Would this make any difference?
Yes, yes! If you change the names in the spreadsheet to be ".pdf" then the code WILL change the name on server! It's simply the way the code works. When it finds the file name

S:\legal\imagexxxxxx.pdf

in the spreadsheet, it will then look for that file on the server and change its name to

S:\legal\imagexxxxxx.tif
I think it will be simpler if I just change the code  ; )

Give me a few minutes.
Sorry...I didn't notice the concatenate. The code works fine on hyperlinks. So all you need to do is, in your test spreadsheet, change B2 from

44313893.tif

to

44313893.pdf

And the code will work...it will change the name of the file in the file system.
Try this modified version of the code:

Sub RunMe()
    Set startCell = Range("A2")
    Set endCell = startCell.Offset(2, 0)
    
    For Each cell In Range(startCell, endCell)
        originalFilename = Replace(cell.Value, ".tif", ".pdf")

        If Not IsEmpty(originalFilename) And Len(originalFilename) > 0 Then
            Set fso = CreateObject("Scripting.FileSystemObject")
            
            If fso.FileExists(originalFilename) Then
                Set file = fso.GetFile(originalFilename)
    
                file.Name = Replace(file.Name, ".pdf", ".tif")
            End If
        End If
    Next

End Sub

Open in new window


(Line 6 is modified.)
Avatar of satzm

ASKER

Okay, just to check vba, I moved the code, the spreadsheet and the PDF file to local (same folder). Changed spreadsheet to PDF and ran the code; no go. It did not change the file name on the local drive from PDF to TIF. Any other ideas?
Avatar of satzm

ASKER

I even changed the entire spreadsheet so there is only 1 column, 1 row, A1 with the file name, xxxxxx.pdf to no avail. The matching PDF image on the local drive xxxxxx.pdf was not renamed to xxxxxx.tif. Halllp :( And by the way all, thank you soo much for the help thus far.
> Changed spreadsheet to PDF

Please post a screenshot of the spreadsheet, like the last one you posted.
> I even changed the entire spreadsheet so there is only 1 column, 1 row, A1 with the file name, xxxxxx.pdf to no avail.

Did you hit F5 within the block of code?
Avatar of satzm

ASKER

Code on alt-f11, ran code by alt-f5 in the code module.
Avatar of satzm

ASKER

Code:
Sub RunMe()
    Set startCell = Range("a1")
    Set endCell = startCell.Offset(2, 0)
   
    For Each cell In Range(startCell, endCell)
        originalFilename = cell.Value

        If Not IsEmpty(originalFilename) And Len(originalFilename) > 0 Then
            Set fso = CreateObject("Scripting.FileSystemObject")
           
            If fso.FileExists(originalFilename) Then
                Set file = fso.GetFile(originalFilename)
   
                file.Name = Replace(file.Name, ".pdf", ".tif")
            End If
        End If
    Next

End Sub
Put the fully qualified file name in A1 (e.g., c:\temp\44313893.pdf). Then it will work.
Avatar of satzm

ASKER

No joy on the renaming of the file. See attached picture.
Filestructure-spreadsheet-code.JPG
I haven't tried the new code. Just for fun, try the
original:
Sub RunMe()
    Set startCell = Range("A1")
    Set endCell = startCell.Offset(2, 0)

    For Each cell In Range(startCell, endCell)
        originalFilename = cell.Value

        If Not IsEmpty(originalFilename) And Len(originalFilename) > 0 Then
            Set fso = CreateObject("Scripting.FileSystemObject")

            If fso.FileExists(originalFilename) Then
                Set file = fso.GetFile(originalFilename)

                file.Name = Replace(file.Name, ".pdf", ".tif")
            End If
        End If
    Next

End Sub

Open in new window

I'll be stunned if it doesn't work.
Avatar of satzm

ASKER

I'd like to unstun you... but no joy. I changed the spreadsheet to show the file name as PDF; code as Sub RunMe()
    Set startCell = Range("A1")
    Set endCell = startCell.Offset(2, 0)

    For Each cell In Range(startCell, endCell)
        originalFilename = cell.Value

        If Not IsEmpty(originalFilename) And Len(originalFilename) > 0 Then
            Set fso = CreateObject("Scripting.FileSystemObject")

            If fso.FileExists(originalFilename) Then
                Set file = fso.GetFile(originalFilename)

                file.Name = Replace(file.Name, ".pdf", ".tif")
            End If
        End If
    Next

End Sub

No joy in changing the file name and then I also tried changing the spreadsheet back to a TIF reference. Again, no change on the saved file. I am just not seeing any pdf file renamed to a tif file.
I tested kaufmed's new code...works a charm here! Changed all of my spreadsheet's cells to have ".tif" in the file names. The new code then renames the file names in the file system from ".pdf" to ".tif" perfectly! Even works on a hyperlinked file created by the same concatenate technique that you used.

Great job by kaufmed!
In your last screenshot, if you step over the highlighted line (by using F8), does the program continue to run, or does it error/stop running? By "stop running" I mean that you don't have any lines highlighted any more.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of satzm

ASKER

Wow, you are correct; the original code worked simply awesome. It was indeed the issue of the Caps vs. non-capped PDF in the code. I changed the reference in the spreadsheet (find / replace all PDF to pdf, changed the location(s) to show a2 and 3190 and it worked awesome. Thank you so much experts you guys are fantastic. Joe and Kaufmed, I will split the points as I could not have done it without the code, and could not have done it if Joe had not identified the caps issue. Thanks once again Experts... this is why I joined this site, you guys are awesome!
Avatar of satzm

ASKER

Thanks Experts: Fantastic job!
That's great news! You're welcome...very happy to help...and very glad that it's working. The credit certainly rests squarely with kaufmed's excellent VBA code, something I could not have done (it is beyond my area of expertise). As is often the case, I learned more than I taught in this thread. :)   A big thanks and kudos to kaufmed! Regards, Joe
@joewinograd

I think you put in more work than I did on this thread  ; )

@satzm

Glad it's all sorted out for you now.
Well, I was some perspiration, but you were the inspiration. :)