We help IT Professionals succeed at work.

Looking for a quicker way to check files

APS NZ
APS NZ asked
on
194 Views
Last Modified: 2014-11-11
We have several folders containing 3000+ files each.  They contain scanned pdfs.  Some of our staff have been scanning in blank pages to get around the system, and I have just finished going through them all by thumbnailing them via CFPDF and marking the offending files.  I have all the filenames (good and bad) in a table.

I now need to keep up this task, looking at all new files added to the folders each week.  I thought of the dateLastModified of CFDirectory, but some of the files will have old dates on them.  Is there a quicker/better way (instead of going through a CFDirectory listing and querying the table for each file to see if it has already been processed)?
Comment
Watch Question

CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
Unfortunately, I don't think there's a simple method for doing this.

Edit: I agree that querying file by file is not the way to go. What I've done in the past is to generate a list of file names and save it to a .txt file. Then use db tools (like sql server's BULK INSERT) to import the file into a temporary table. Finally, use a query to compare the file names and find out what's missing.  It sounds more complicated than it really is.  I think I posted an example of this somewhere.  I'll see if I can find it.
APS NZIT Team

Author

Commented:
Thanks agx - a sample would be great
CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
Quick questions -
1) I forgot, are you on windows or *nix?
2) Do you have cfexecute access on your server? While you could generate the file list with cfdirectory, o/s commands like LS or DIR are faster.
3) Which db - SQL Server or MySQL?
APS NZIT Team

Author

Commented:
We are on windows and using SQL server 2008 r2
Yes, I have access to cfexecute
CERTIFIED EXPERT
Most Valuable Expert 2015
Commented:
This problem has been solved!
(Unlock this solution with a 7-day Free Trial)
UNLOCK SOLUTION
CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
BTW, the above is a quick and dirty example to demonstrate the concept. In real code, I'd put this logic in a stored procedure and pass in the file name as a parameter:
APS NZIT Team

Author

Commented:
WOW! Thanks agx - I will test and come back soon
APS NZIT Team

Author

Commented:
I am getting the following error on line 30

Error Executing Database Query.  
[Macromedia][SQLServer JDBC Driver][SQLServer]Unknown object type '#TempImport' used in a CREATE, DROP, or ALTER statement.  

<!--- query to find file names that do NOT exist in the existing db table --->
29 :             SELECT TOP 100 tmp.*
30 :             FROM   ##TempImport tmp LEFT JOIN CF_Temp#Sfx#.dbo.CheckPPTbl main
31 :                               ON main.FileName = tmp.FileName
32 :             WHERE  main.FileName IS NULL;
CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
(Edit) Oh wait... I pasted the wrong version. My bad.  It didn't seem to like the UUID file name, so I used this instead

<cfset saveToFile = ExpandPath("./fileNames.txt")>
APS NZIT Team

Author

Commented:
Thanks very much agx - it works now, and I have made it a stored procedure.
CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
Cool.  You probably figured this out already :) but .. for the archives, to use dynamic file names you need to build a SQL string and execute it via exec or sp_executesql. See the BOL for an example
APS NZIT Team

Author

Commented:
Thanks agx - I looked at that BOL page while I was getting things to work.