Mike Caldwell asked:

VBScript test for EOF of a PDF file never exits

I got great help with a VBScript that tests for the EOF marker of a PDF file (I sometimes receive flawed files and need to weed them out).  Here is the script, which works fine:

Set fso=CreateObject("Scripting.FileSystemObject")
Set objFile=fso.OpenTextFile("c:\PDFTest\test.pdf.jnk",1)

dim x
    Do Until objFile.AtEndOfStream
      x = objFile.ReadLine
      If objFile.atEndOfStream Then
         last_line = x
      End If
    Loop

if InStr(1, last_line, "%%EOF")=0 then
      msgbox "File Corrupt"
Else
      msgbox "File OK"
end if

My usual FTP program pulls down files in batches of 100 and puts them in a folder, where they are held until tested.  Good files go to one folder, bad files to another.  The problem is that the batch version never exits the DO loop.  I put a counter in there with a test to exit if it hit a million loops, and it does hit that limit.  I doubt a PDF has over a million lines, and the same file tests in the blink of an eye with the one-shot code above.  Here is the code for doing the test, cut out of a bigger script file:

'******  	MOVE ZIPPED PDF FILES FROM SERVER2 TO TEMP LOCATION TO LOG BY ZIP FILE ****************	
ON ERROR RESUME NEXT

Const strFinalDest = "C:\PDF FLAT\"			' ABBYY pull bucket
Const strPDFBad = "C:\PDFBad\"
Const strPDFTest = "C:\PDFTest"

Set objFSO = CreateObject("Scripting.fileSystemObject")
	set objFolder = objfso.GetFolder(strPDFTest)
	For each objFile in objFolder.Files
	Set objFile = objfso.OpenTextFile(strPDFTest & objFile.Name,1)
    Do Until objFile.AtEndOfStream
      x = objFile.ReadLine
      If objFile.atEndOfStream Then
         last_line = x
	Else
      End If
    Loop

	if InStr(1, last_line, "%%EOF")=0 then
		Call objFile.Move(strPDFBad)
		msgbox "Bad File in holding folder"
		wscript.Quit
	Else
		Call objFile.Move(strFinalDest)
		msgbox "Good file moved to PDF Flat"
	end if
	Next

yo_bee:

Do you get even a single MsgBox from the second script?

Also, why do you have a Quit command in there only for a corrupt file?

hielo:

Add a backslash at the end of the strPDFTest value.

'******  	MOVE ZIPPED PDF FILES FROM SERVER2 TO TEMP LOCATION TO LOG BY ZIP FILE ****************	
ON ERROR RESUME NEXT

Const strFinalDest = "C:\PDF FLAT\"			' ABBYY pull bucket
Const strPDFBad = "C:\PDFBad\"
Const strPDFTest = "C:\PDFTest\"   'add a backslash at the end here

Set objFSO = CreateObject("Scripting.fileSystemObject")
Set objFolder = objfso.GetFolder(strPDFTest)

For Each objFile IN objFolder.Files
	Set objFile = objfso.OpenTextFile(strPDFTest & objFile.Name,1)
	Do Until objFile.AtEndOfStream
		last_line = objFile.ReadLine 'no need to save into x first.
	Loop

	If InStr(1, last_line, "%%EOF") = 0 Then
		Call objFile.Move(strPDFBad)
		MsgBox "Bad File in holding folder"
		wscript.Quit
	Else
		Call objFile.Move(strFinalDest)
		MsgBox "Good file moved to PDF Flat"
	End If
	Set objFile = Nothing
Next
Set objFolder = Nothing
Set objFSO = Nothing

Mike Caldwell (ASKER):

hielo, agree the slash is needed.  But it has not changed anything.

yo_bee, I am just trying to get it to go through once; if once works, multiples should.  And no, no messages pop up; it seems to never exit the DO loop.
Also, I have the "QUIT" so that when I find a bad file I will know the file name and the ZIP file it was batched down in.  The script would normally delete the file from the server, so I want to preserve whatever is there so that I can determine if the bad files are coming from the server that way or if I am somehow causing it locally.  If I don't get any at the time of download and unzip for a few days, then I will put the same trap into the script that queues files up in 25 file batches for my four channels of OCRing.  What started this all was the OCR program barfing on bad PDFs, so that channel would just stop.
BTW: I am running this partial script now from the folder strPDFTest.  There are 25 PDFs there, all known good.  I never complete the loop, so none have been moved out.  If I can get the good ones to move on, then I can put some known bad files in there and mix them up.  This is a totally private system, that I own from end to end, so I can make whatever conditions I need.
So I took out the ON ERROR RESUME NEXT, and now I am getting the error message "Object does not support this property or method" for this line item:    objFile.Name
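
One plausible reason the loop never exits: with ON ERROR RESUME NEXT in effect, a failed OpenTextFile leaves objFile as the File object from the For Each, so the AtEndOfStream test raises an error on every pass and the error is skipped instead of ending the loop. A minimal sketch of narrower error handling follows, assuming the only call worth trapping is the open itself. ReadLastLine is a hypothetical helper name, not something from the scripts above.

```vbscript
Option Explicit
Const ForReading = 1

' Hypothetical helper: returns the last line of the file, or Empty if it cannot be opened.
Function ReadLastLine(objFSO, strPath)
    Dim objStream, strLine
    ReadLastLine = Empty

    On Error Resume Next                 ' trap only the open call
    Set objStream = objFSO.OpenTextFile(strPath, ForReading)
    If Err.Number <> 0 Then
        WScript.Echo "Cannot open " & strPath & ": " & Err.Description
        Err.Clear
        On Error GoTo 0
        Exit Function
    End If
    On Error GoTo 0                      ' from here on, errors stop the script again

    strLine = ""
    Do Until objStream.AtEndOfStream
        strLine = objStream.ReadLine
    Loop
    objStream.Close
    ReadLastLine = strLine
End Function

Dim objFSO, varLast
Set objFSO = CreateObject("Scripting.FileSystemObject")
varLast = ReadLastLine(objFSO, "c:\PDFTest\test.pdf.jnk")   ' path from the one-shot script
If IsEmpty(varLast) Then
    WScript.Echo "File could not be read"
ElseIf InStr(1, varLast, "%%EOF") = 0 Then
    WScript.Echo "File Corrupt"
Else
    WScript.Echo "File OK"
End If
```

Because the stream lives in its own variable (objStream), a failed open can no longer be mistaken for a readable file.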
hielo (ASKER CERTIFIED SOLUTION):

[This solution is only available to Experts Exchange members and is not reproduced here.]

Mike Caldwell (ASKER):

Runs without errors.  It pops up the message "Good file moved to PDF Flat", except nothing actually moves.  So it keeps looping around, and the files test good but are not moved.
I added a message to show the file name, and it is stepping through the file list.  So now what is missing is the move.
With ON ERROR RESUME NEXT removed, the objFile.Move gets the "Object doesn't support this property or method" message.

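
The "Object doesn't support this property or method" message is consistent with objFile having been reassigned from a File object to the TextStream returned by OpenTextFile; a TextStream has ReadLine and Close but no Name or Move. A small sketch to see the two object types side by side, assuming the C:\PDFTest folder from the thread exists and holds at least one file (objItem and objStream are illustrative names):

```vbscript
Option Explicit
Const ForReading = 1

Dim objFSO, objItem, objStream
Set objFSO = CreateObject("Scripting.FileSystemObject")

For Each objItem In objFSO.GetFolder("C:\PDFTest").Files
    ' objItem is a File object: it has .Name, .Path and .Move
    WScript.Echo "Loop variable is a " & TypeName(objItem)

    ' Keep the stream in a separate variable so objItem is not overwritten
    Set objStream = objFSO.OpenTextFile(objItem.Path, ForReading)
    ' objStream is a TextStream: it has .ReadLine and .Close, but no .Move
    WScript.Echo "OpenTextFile returned a " & TypeName(objStream)
    objStream.Close

    Exit For    ' one file is enough for the demonstration
Next
```

Keeping the two in separate variables means the File object is still available for its Name or Path after the stream has been closed.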
hielo:

Try using the MoveFile() method of the FileSystemObject (objFSO):
http://www.devguru.com/technologies/vbscript/14073

Mike Caldwell (ASKER):

hielo, it seems to work, but now I get "Permission denied."  I am the only user of this machine, and I have other scripts that move files between folders with no complaints from Windows.

hielo:

If I am not mistaken, you replaced:
Call objFile.Move(strPDFBad)

with:
Call objFSO.MoveFile( "...","...")

If so, then immediately after the loop, close the file before you attempt to move it:
...
		Do Until objFile.AtEndOfStream
			last_line = objFile.ReadLine 'no need to save into x first.
		Loop
		objFile.Close
...

Also, check whether the account running the script also owns the folders, using the PowerShell Get-Acl cmdlet:
http://blogs.technet.com/b/heyscriptingguy/archive/2008/04/15/how-can-i-use-windows-powershell-to-determine-the-owner-of-a-file.aspx

Lastly, if the problem persists, see if adding your user account to the administrators group takes care of the problem -- at least for the time being to get the program logic above working correctly.  Afterwards you can go back and address the permission problems.
Here is my implementation:
'******  	MOVE ZIPPED PDF FILES FROM SERVER2 TO TEMP LOCATION TO LOG BY ZIP FILE ****************	
'   ON ERROR RESUME NEXT

Const strFinalDest = "C:\PDF FLAT\"		
Const strPDFBad = "C:\PDFBad\"
Const strPDFTest = "C:\PDFTest\"   'add a backslash at the end here

Set objFSO = CreateObject("Scripting.fileSystemObject")
Set objFolder = objfso.GetFolder(strPDFTest)
If objFolder Is Nothing Then
	MsgBox "Invalid folder"
Else
	Set fc = objFolder.Files
	For Each file IN fc
		Set objFile = objfso.OpenTextFile(strPDFTest & file.Name,1)
		Do Until objFile.AtEndOfStream
			last_line = objFile.ReadLine 
		Loop
		objFile.Close	' release the file handle so MoveFile is not denied
		If InStr(1, last_line, "%%EOF") = 0 Then
			objFSO.MoveFile strPDFTest & File.Name, strPDFBad
			MsgBox "Bad File in holding folder"
			wscript.Quit
		Else
			objFSO.MoveFile strPDFTest & File.Name , strFinalDest
			MsgBox "Good file moved to PDF Flat"
		End If
		Set objFile = Nothing
	Next

	Set fc = Nothing
	Set objFolder = Nothing
End If

Set objFSO = Nothing

Mike Caldwell (ASKER):

hielo, yes!!  Works fine now.  I'll take this code and drop it into the script that downloads the ZIP files and puts all the PDFs into strPDFTest, then on to strFinalDest.  I am also adding an email part that will send an SMS to my phone when I get a bad PDF.

As a courtesy to anyone who may read this, I'm putting my final code here.  Thanks a lot.

'******  	MOVE ZIPPED PDF FILES FROM SERVER2 TO TEMP LOCATION TO LOG BY ZIP FILE ****************	
'   ON ERROR RESUME NEXT

Const strFinalDest = "C:\PDF FLAT\"		
Const strPDFBad = "C:\PDFBad\"
Const strPDFTest = "C:\PDFTest\"  
Set objFSO = CreateObject("Scripting.fileSystemObject")
Set objFolder = objfso.GetFolder(strPDFTest)
If objFolder Is Nothing Then
	MsgBox "Invalid folder"
Else
	Set fc = objFolder.Files

	For Each file IN fc
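		Set objFile = objfso.OpenTextFile(strPDFTest & file.Name,1)
		last_line = ""	' reset so an empty file cannot inherit the previous file's last line
		Do Until objFile.AtEndOfStream
			last_line = objFile.ReadLine 
		Loop
		objFile.Close	' close before moving, otherwise MoveFile reports "Permission denied"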

		If InStr(1, last_line, "%%EOF") = 0 Then
			objFSO.MoveFile strPDFTest & File.Name, strPDFBad
			MsgBox "Bad File in holding folder"
			wscript.Quit
		Else
			objFSO.MoveFile strPDFTest & File.Name , strFinalDest
			MsgBox "Good file moved to PDF Flat"
		End If
		Set objFile = Nothing
	Next

	Set fc = Nothing
	Set objFolder = Nothing
End If

Set objFSO = Nothing
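
One limitation of the last-line test above: if a file ever ends with a blank line after the marker, last_line comes back empty and a good file is flagged as bad. A possible variation, swapped in here as a different technique from the thread's line-by-line read, is to look for %%EOF anywhere in the tail of the file. This is only a sketch under stated assumptions: TailHasEOF is a hypothetical name, the 1024-character window is an arbitrary choice, and it assumes the file is read as single-byte text so that File.Size matches the character count.

```vbscript
Option Explicit

' Hypothetical helper: True if "%%EOF" appears within the last TAIL_CHARS characters.
Function TailHasEOF(objFSO, strPath)
    Const ForReading = 1
    Const TAIL_CHARS = 1024          ' assumed window size, adjust as needed
    Dim objStream, lngSize, strTail

    TailHasEOF = False
    lngSize = objFSO.GetFile(strPath).Size

    Set objStream = objFSO.OpenTextFile(strPath, ForReading)
    If lngSize > TAIL_CHARS Then
        objStream.Skip lngSize - TAIL_CHARS   ' jump past everything but the tail
    End If
    If Not objStream.AtEndOfStream Then       ' ReadAll raises an error on an empty stream
        strTail = objStream.ReadAll
        TailHasEOF = (InStr(1, strTail, "%%EOF") > 0)
    End If
    objStream.Close                           ' close before any later MoveFile
End Function

Dim objFSO
Set objFSO = CreateObject("Scripting.FileSystemObject")
If TailHasEOF(objFSO, "c:\PDFTest\test.pdf.jnk") Then   ' path from the one-shot script
    WScript.Echo "File OK"
Else
    WScript.Echo "File Corrupt"
End If
```

This is looser than the original check, since it accepts the marker anywhere near the end rather than only on the final line, so whether it suits the workflow depends on how strict the OCR stage is.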

Accepted solution needed a few minor touch ups, so anyone reading this should look at all of the steps.
I recommend adding logging rather than using MsgBox.
Mike Caldwell (ASKER):

Actually I want it to stop and send an SMS to both me and my programmer / partner / son.  That way we can capture the ZIP file with the bad PDF and see if the PDF was bad in the ZIP file.  I'll keep moving the no-EOF trap downstream in the workflow until I can determine exactly where it is happening.
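
For anyone who does want the logging suggested above, it can sit alongside the stop-and-notify behavior: write a line to a log, then quit on a bad file. A minimal sketch, assuming a hypothetical log path C:\PDFLogs\pdfcheck.log in a folder that already exists. The log is kept outside C:\PDFTest so the test loop does not pick it up as a file to check. LogLine is an illustrative name.

```vbscript
Option Explicit
Const ForAppending = 8
Const LOG_PATH = "C:\PDFLogs\pdfcheck.log"   ' hypothetical location, folder must exist

' Appends one timestamped line to the log, creating the file on first use.
Sub LogLine(strMessage)
    Dim objFSO, objLog
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objLog = objFSO.OpenTextFile(LOG_PATH, ForAppending, True)
    objLog.WriteLine Now & vbTab & strMessage
    objLog.Close
End Sub

' Example calls; the file names are placeholders
LogLine "OK" & vbTab & "good.pdf moved to C:\PDF FLAT\"
LogLine "BAD" & vbTab & "bad.pdf moved to C:\PDFBad\"
```

In the final script, the LogLine call for a bad file would go just before wscript.Quit, so the file name is recorded even when nobody is at the machine to read a MsgBox.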