Link to home
Start Free TrialLog in
Avatar of APS NZ
APS NZFlag for New Zealand

asked on

Looking for a quicker way to check files

We have several folders containing 3000+ files each.  They contain scanned pdfs.  Some of our staff have been scanning in blank pages to get around the system, and I have just finished going through them all by thumbnailing them via CFPDF and marking the offending files.  I have all the filenames (good and bad) in a table.

I now need to keep up this task, looking at all new files added to the folders each week.  I thought of the dateLastModified of CFDirectory, but some of the files will have old dates on them.  Is there a quicker/better way (instead of going through a CFDirectory listing and querying the table for each file to see if it has already been processed)?
Avatar of _agx_
_agx_
Flag of United States of America image

Unfortunately, I don't think there's a simple method for doing this.

Edit: I agree that querying file by file is not the way to go. What I've done in the past is to generate a list of file names and save it to a .txt file. Then use db tools (like sql server's BULK INSERT) to import the file into a temporary table. Finally, use a query to compare the file names and find out what's missing.  It sounds more complicated than it really is.  I think I posted an example of this somewhere.  I'll see if I can find it.
Avatar of APS NZ

ASKER

Thanks agx - a sample would be great
Quick questions -
1) I forgot, are you on windows or *nix?
2) Do you have cfexecute access on your server? While you could generate the file list with cfdirectory, o/s commands like LS or DIR are faster.
3) Which db - SQL Server or MySQL?
Avatar of APS NZ

ASKER

We are on windows and using SQL server 2008 r2
Yes, I have access to cfexecute
ASKER CERTIFIED SOLUTION
Avatar of _agx_
_agx_
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
BTW, the above is a quick and dirty example to demonstrate the concept. In real code, I'd put this logic in a stored procedure and pass in the file name as a parameter:
Avatar of APS NZ

ASKER

WOW! Thanks agx - I will test and come back soon
Avatar of APS NZ

ASKER

I am getting the following error on line 30

Error Executing Database Query.  
[Macromedia][SQLServer JDBC Driver][SQLServer]Unknown object type '#TempImport' used in a CREATE, DROP, or ALTER statement.  

<!--- query to find file names that do NOT exist in the existing db table --->
29 :             SELECT TOP 100 tmp.*
30 :             FROM   ##TempImport tmp LEFT JOIN CF_Temp#Sfx#.dbo.CheckPPTbl main
31 :                               ON main.FileName = tmp.FileName
32 :             WHERE  main.FileName IS NULL;
(Edit) Oh wait... I pasted the wrong version. My bad.  It didn't seem to like the UUID file name, so I used this instead

<cfset saveToFile = ExpandPath("./fileNames.txt")>
Avatar of APS NZ

ASKER

Thanks very much agx - it works now, and I have made it a stored procedure.
Cool.  You probably figured this out already :) but .. for the archives, to use dynamic file names you need to build a SQL string and execute it via exec or sp_executesql. See the BOL for an example
Avatar of APS NZ

ASKER

Thanks agx - I looked at that BOL page while I was getting things to work.