Mike Caldwell
VBScript test for EOF of a PDF file never exits
I got great help with a VBScript that tests a PDF file for its %%EOF marker (I sometimes receive flawed files and need to check them). Here is the script, which works fine:
Set fso = CreateObject("Scripting.FileSystemObject")
Set objFile = fso.OpenTextFile("c:\PDFTest\test.pdf.jnk", 1)
Dim x
Do Until objFile.AtEndOfStream
    x = objFile.ReadLine
    If objFile.AtEndOfStream Then
        last_line = x
    End If
Loop
If InStr(1, last_line, "%%EOF") = 0 Then
    MsgBox "File Corrupt"
Else
    MsgBox "File OK"
End If
So now I have my usual FTP program that pulls down files in batches of 100 and puts them in a folder, holding them there until tested. Good files go to one folder, bad files to another. The script never exits the Do loop. I put a counter in there with a test to exit if it hit a million loops, and it does hit it. I doubt a PDF has over a million lines, and the same file tests in the blink of an eye with the one-shot code above. Here is the code for doing the test, cut out of a bigger script file:
'****** MOVE ZIPPED PDF FILES FROM SERVER2 TO TEMP LOCATION TO LOG BY ZIP FILE ****************
ON ERROR RESUME NEXT
Const strFinalDest = "C:\PDF FLAT\" ' ABBYY pull bucket
Const strPDFBad = "C:\PDFBad\"
Const strPDFTest = "C:\PDFTest"
Set objFSO = CreateObject("Scripting.fileSystemObject")
set objFolder = objfso.GetFolder(strPDFTest)
For each objFile in objFolder.Files
    Set objFile = objfso.OpenTextFile(strPDFTest & objFile.Name,1)
    Do Until objFile.AtEndOfStream
        x = objFile.ReadLine
        If objFile.atEndOfStream Then
            last_line = x
        Else
        End If
    Loop
    if InStr(1, last_line, "%%EOF")=0 then
        Call objFile.Move(strPDFBad)
        msgbox "Bad File in holding folder"
        wscript.Quit
    Else
        Call objFile.Move(strFinalDest)
        msgbox "Good file moved to PDF Flat"
    end if
Next
Add a backslash at the end of the strPDFTest value.
'****** MOVE ZIPPED PDF FILES FROM SERVER2 TO TEMP LOCATION TO LOG BY ZIP FILE ****************
ON ERROR RESUME NEXT
Const strFinalDest = "C:\PDF FLAT\" ' ABBYY pull bucket
Const strPDFBad = "C:\PDFBad\"
Const strPDFTest = "C:\PDFTest\" 'add a backslash at the end here
Set objFSO = CreateObject("Scripting.fileSystemObject")
Set objFolder = objfso.GetFolder(strPDFTest)
For Each objFile IN objFolder.Files
    Set objFile = objfso.OpenTextFile(strPDFTest & objFile.Name,1)
    Do Until objFile.AtEndOfStream
        last_line = objFile.ReadLine 'no need to save into x first.
    Loop
    If InStr(1, last_line, "%%EOF") = 0 Then
        Call objFile.Move(strPDFBad)
        MsgBox "Bad File in holding folder"
        wscript.Quit
    Else
        Call objFile.Move(strFinalDest)
        MsgBox "Good file moved to PDF Flat"
    End If
    Set objFile = Nothing
Next
Set objFolder = Nothing
Set objFSO = Nothing
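As a side note on the backslash fix: a path built by plain string concatenation depends on the folder constant ending in "\". A minimal sketch of one alternative is the FileSystemObject's BuildPath method, which inserts the separator only when it is missing. The file name test.pdf is just an illustrative value here:

```vbscript
Option Explicit
Dim objFSO
Set objFSO = CreateObject("Scripting.FileSystemObject")

' BuildPath adds a backslash between the two parts only if one is needed,
' so both calls produce the same full path.
WScript.Echo objFSO.BuildPath("C:\PDFTest", "test.pdf")
WScript.Echo objFSO.BuildPath("C:\PDFTest\", "test.pdf")
```

Inside a For Each loop over a folder's Files collection, the File object's Path property is another way to get the full path without concatenating anything.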
ASKER
hielo, agree the slash is needed. But it has not changed anything.
yo_bee, just trying to get it to go through once; if once works, multiples should. And no, I do not get any messages pop up; seems to never exit the DO loop.
ASKER
Also, I have the "QUIT" so that when I find a bad file I will know the file name and the ZIP file it was batched down in. The script would normally delete the file from the server, so I want to preserve whatever is there so that I can determine whether the bad files are coming from the server that way or whether I am somehow causing it locally. If I don't get any at the time of download and unzip for a few days, then I will put the same trap into the script that queues files up in 25-file batches for my four channels of OCRing. What started this all was the OCR program barfing on bad PDFs, so that channel would just stop.
ASKER
BTW: I am running this partial script now against the folder strPDFTest. There are 25 PDFs there, all known good. I never complete the loop, so none have been moved out. If I can get the good ones to move on, then I can put some known bad files in there and mix them up. This is a totally private system that I own from end to end, so I can create whatever conditions I need.
ASKER
So I took out the ON ERROR RESUME NEXT, and now I am getting the error message "Object does not support this property or method" for this line item: objFile.Name
ASKER CERTIFIED SOLUTION
(The text of this solution is available only to Experts Exchange members and is not reproduced here.)
ASKER
Runs without errors. Pops up the message "Good file moved to PDF Flat", except nothing actually moves. So it keeps looping around, and the files test good, but not moved.
ASKER
I added a message to show the file name, and it is stepping through the file list. So now what is missing is the move.
ASKER
With ON ERROR RESUME NEXT removed, the objFile.Move gets the "Object doesn't support this property or method" message.
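That error message is consistent with how the FileSystemObject types differ: a File object (what For Each over a folder's Files collection yields) has Name, Path and Move, while the TextStream returned by OpenTextFile has ReadLine, AtEndOfStream and Close but no Name or Move. If one variable is reused for both, whichever object it currently holds decides which calls work. A minimal sketch follows, using the script's own file only so that it runs anywhere; the names objItem and objStream are illustrative and not taken from the thread:

```vbscript
Option Explicit
Dim objFSO, objItem, objStream
Set objFSO = CreateObject("Scripting.FileSystemObject")

' A File object describes the file on disk (Name, Path, Move, ...).
Set objItem = objFSO.GetFile(WScript.ScriptFullName)

' A TextStream reads the contents (ReadLine, AtEndOfStream, Close).
Set objStream = objFSO.OpenTextFile(objItem.Path, 1)   ' 1 = ForReading

' Print the runtime type of each object next to a property it supports.
WScript.Echo TypeName(objItem) & ": " & objItem.Name
WScript.Echo TypeName(objStream) & ": first line is " & objStream.ReadLine
objStream.Close
```

Keeping the two objects in separate variables means the File object is still available for its name or for a move after the stream has been read.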
Try using the MoveFile() method of the FileSystemObject (objFSO):
http://www.devguru.com/technologies/vbscript/14073
ASKER
hielo, it seems to work but now I get "Permission denied." I am the only user of this machine, and I have other scripts that move files between folders with no complaints from Windows.
If I am not mistaken, you replaced:
Call objFile.Move(strPDFBad)
with:
Call objFSO.MoveFile( "...","...")
If so, then immediately after the loop, close the file before you attempt to move it:
...
Do Until objFile.AtEndOfStream
last_line = objFile.ReadLine 'no need to save into x first.
Loop
objFile.Close
...
Also, see if the owner of the script is also the owner of the folders by using the Get-Acl command: http://blogs.technet.com/b/heyscriptingguy/archive/2008/04/15/how-can-i-use-windows-powershell-to-determine-the-owner-of-a-file.aspx
Lastly, if the problem persists, see if adding your user account to the administrators group takes care of the problem -- at least for the time being to get the program logic above working correctly. Afterwards you can go back and address the permission problems.
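The close-then-move order described above can be sketched in isolation as follows. This is only an illustration under assumptions: the source file name test.pdf is hypothetical, the folders are the ones used in the thread, and the existence checks are added here so the sketch exits cleanly when they are missing:

```vbscript
Option Explicit
Const ForReading = 1
Const strSource = "C:\PDFTest\test.pdf"   ' hypothetical file name
Const strDestFolder = "C:\PDF FLAT\"      ' trailing backslash: existing folder

Dim objFSO, objStream, strLast
Set objFSO = CreateObject("Scripting.FileSystemObject")

If objFSO.FileExists(strSource) And objFSO.FolderExists(strDestFolder) Then
    Set objStream = objFSO.OpenTextFile(strSource, ForReading)
    Do Until objStream.AtEndOfStream
        strLast = objStream.ReadLine
    Loop
    objStream.Close                           ' release the file handle first
    objFSO.MoveFile strSource, strDestFolder  ' then move the closed file
Else
    WScript.Echo "Source file or destination folder not found."
End If
```

MoveFile treats a destination that ends in a backslash as an existing folder, and it raises an error if a file with the same name is already there.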
ASKER
Here is my implementation:
'****** MOVE ZIPPED PDF FILES FROM SERVER2 TO TEMP LOCATION TO LOG BY ZIP FILE ****************
' ON ERROR RESUME NEXT
Const strFinalDest = "C:\PDF FLAT\"
Const strPDFBad = "C:\PDFBad\"
Const strPDFTest = "C:\PDFTest\" 'add a backslash at the end here
Set objFSO = CreateObject("Scripting.fileSystemObject")
Set objFSO2 = CreateObject("Scripting.fileSystemObject")
Set objFolder = objfso.GetFolder(strPDFTest)
If objFolder Is Nothing Then
    MsgBox "Invalid folder"
Else
    Set fc = objFolder.Files
    For Each file IN fc
        Set objFile = objfso.OpenTextFile(strPDFTest & file.Name,1)
        Do Until objFile.AtEndOfStream
            last_line = objFile.ReadLine
        Loop
        If InStr(1, last_line, "%%EOF") = 0 Then
            objFSO.MoveFile strPDFTest & file.Name, strPDFBad
            MsgBox "Bad File in holding folder"
            wscript.Quit
        Else
            objFSO.MoveFile strPDFTest & file.Name, strFinalDest
            MsgBox "Good file moved to PDF Flat"
        End If
        Set objFile = Nothing
    Next
    Set fc = Nothing
    Set objFolder = Nothing
End If
Set objFSO = Nothing
ASKER
hielo, yes!! Works fine now. I'll take this code and drop it into the script that downloads the ZIP files and puts all the PDFs into strPDFTest, then on to strFinalDest. I am also adding an email part that will send an SMS to my phone when I get a bad PDF.
As a courtesy to anyone who may read this, I'm putting my final code here. Thanks a lot.
'****** MOVE ZIPPED PDF FILES FROM SERVER2 TO TEMP LOCATION TO LOG BY ZIP FILE ****************
' ON ERROR RESUME NEXT
Const strFinalDest = "C:\PDF FLAT\"
Const strPDFBad = "C:\PDFBad\"
Const strPDFTest = "C:\PDFTest\"
Set objFSO = CreateObject("Scripting.fileSystemObject")
Set objFSO2 = CreateObject("Scripting.fileSystemObject")
Set objFolder = objfso.GetFolder(strPDFTest)
If objFolder Is Nothing Then
    MsgBox "Invalid folder"
Else
    Set fc = objFolder.Files
    For Each file IN fc
        Set objFile = objfso.OpenTextFile(strPDFTest & file.Name,1)
        Do Until objFile.AtEndOfStream
            last_line = objFile.ReadLine
        Loop
        objFile.Close
        If InStr(1, last_line, "%%EOF") = 0 Then
            objFSO.MoveFile strPDFTest & file.Name, strPDFBad
            MsgBox "Bad File in holding folder"
            wscript.Quit
        Else
            objFSO.MoveFile strPDFTest & file.Name, strFinalDest
            MsgBox "Good file moved to PDF Flat"
        End If
        Set objFile = Nothing
    Next
    Set fc = Nothing
    Set objFolder = Nothing
End If
Set objFSO = Nothing
ASKER
Accepted solution needed a few minor touch ups, so anyone reading this should look at all of the steps.
I recommend adding logging rather than using MsgBox.
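One way the logging suggestion might look is a small helper that appends a timestamped line to a text file instead of showing a dialog. This is a sketch, not something from the thread: the helper name LogLine, the log file name pdfcheck.log and its location in the user's temp folder are all assumptions made for illustration:

```vbscript
Option Explicit
Const ForAppending = 8
Const TemporaryFolder = 2

' Hypothetical helper: append one timestamped line to a log file.
Sub LogLine(strMessage)
    Dim objFSO, objLog, strLogPath
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    ' Assumed location: pdfcheck.log in the current user's temp folder.
    strLogPath = objFSO.BuildPath(objFSO.GetSpecialFolder(TemporaryFolder).Path, "pdfcheck.log")
    ' The third argument (True) creates the file if it does not exist yet.
    Set objLog = objFSO.OpenTextFile(strLogPath, ForAppending, True)
    objLog.WriteLine Now & "  " & strMessage
    objLog.Close
End Sub

LogLine "Good file moved to PDF Flat: example.pdf"
```

A call like this could stand in for each MsgBox line, so an unattended batch run does not wait for someone to click OK.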
ASKER
Actually I want it to stop and send me an SMS, both to me and my programmer / partner / son. That way we can capture the ZIP file with the bad PDF and see if the PDF was bad in the ZIP file. I'll keep moving the no-EOF trap downstream in the work flow until I can determine exactly where it is happening.
Also, why do you have a Quit command in there only for a corrupt file?