Link to home
Start Free TrialLog in
Avatar of dominica526
dominica526

asked on

What is the best way to find a needle in a haystack?

Hello Experts,

I have a folder that is filled with files. At least 3 gigs of files to be precise. I am looking for just one file and I use For Each filecollection to wade through this huge folder in my VB application. It takes about 7 minutes to complete the process of finding the file. The file that I am looking for is among about five files that were created today, or I should say the current day. This process is done everyday. Is there a way that I can just sort through the current days files and not have to go through the big pile of files? I cannot clean-out the other files so Im looking for a better way to find a needle in a haystack. Once my app find a file that matches my filter I use FSO to give me the last date modified then I compare it to the current system date, works just fine but my app takes all the CPU and just seems inefficient.  Any ideas would be greatly appreciated.    

Thanks Experts
Avatar of [ fanpages ]
[ fanpages ]

Hi,

You need to sort the list of files in descending date order and as soon as the date of the file being checked is before the current date, then you know not to check any more files (and exit from your loop).

Can you pre-sort your filecollection before processing?

If not, maybe you can use a Shell() function to execute a "DIR /a-d /o-d" command & redirect the output to a file, then read the contents of this file.

e.g.

Shell("DIR /a-d /b /od > c:\filelist.txt")

Then read each line of c:\filelist.txt (skipping the 'header' information about the current folder), checking the prefix on each line of the date of each file.  When the date is less than the current date, stop reading the file.  Until that point, extract the filename from the same line of the file & process accordingly.

BFN,

fp.
Avatar of dominica526

ASKER

Hello BFN,

I have no idea how to pre-sort my filecollection. Could you expound on that idea?
BFN,

I can run the dir from the command line, but when I run it with the shell in VB I get a file not found error.
Hi,

Re: Sorting filecollection

Please explain how your filecollection is created, then I can best advise on how to sort it.  Provide the relevant section of code would help.

Re: Shell

Try this..
SHELL("cmd /c DIR c:\*.* /a-d /o-d >> c:\filelist.txt")

Then review the content of the file "c:\filelist.txt".

We can then read this file searching for dates in the first 10 characters of each line, and if found, check that the date is not before today's date.

Shall I continue with this suggestion?


B)ye F)or N)ow,

f)an p)ages
Do you have the filename?
If so, you can use the FileExists method of the FileSystemObject and check the date.

if fso.fileexists(filename) then ...
This part of the code takes several minutes to run, it gets me the correct file but it is slow....


Set Folder = fso.GetFolder("r:\ENXDDPC\reports")
Set FileCollection = Folder.FILES
'Convert date
Dim TheDate As Date
Dim mymon
Dim myday
Dim mymonHEX As String
Dim mydayHEX As String
Dim myyear As Date
mymon = Month(Now)
myday = Day(Now)

Select Case mymon
        Case 10
        mymonHEX = "A" 'October
        Case 11
        mymonHEX = "B" 'November
        Case 12
        mymonHEX = "C" 'December
    Case Else
        mymonHEX = mymon
    End Select

Select Case myday
        Case 1
        mydayHEX = "01" 'October
        Case 2
        mydayHEX = "02" 'October
        Case 3
        mydayHEX = "03" 'October
        Case 4
        mydayHEX = "04" 'October
        Case 5
        mydayHEX = "05" 'October
        Case 6
        mydayHEX = "06" 'October
        Case 7
        mydayHEX = "07" 'October
        Case 8
        mydayHEX = "08" 'October
        Case 9
        mydayHEX = "09" 'October
        Case Else
        mydayHEX = myday
    End Select

FileGo = ""
Filt4 = "[BLBLST]\d\d." & mymonHEX & mydayHEX

For Each file In FileCollection
spoot4 = file
Set f = fso.GetFile(spoot4)
Filetime = f.DateLastModified
FileTimeClean = Left(Filetime, 10)
FileTimeCleanClean = Replace(FileTimeClean, "/", "")
currentime = Replace(Date, "/", "")
If currentime = FileTimeCleanClean Then
gogo4 = TestRE(Filt4, spoot4, True)
If gogo4 = True Then
FileTimeClean = Right(Filetime, 8)
FileTimeCleanClean = Replace(FileTimeClean, ":", "")
currentime = Replace(Time, ":", "")
If FileTimeCleanClean < NOONcutoff And FileTimeCleanClean > NOONStart Then
clnspoot4 = Right(spoot4, 12)
RunNum2 = Mid(clnspoot4, 7, 2)
LatestRpt = clnspoot4
Call Fil(Og, T & " " & "The Final FILE" & " " & LatestRpt)
Exit For
End If
End If
End If
Next

Sorry... you'll have to explain what you are trying to do.

Why is every day in October, for instance?  Or are you just trying to prefix all one-digit numbers with a "0"?


Perhaps if you detail what you are trying to achieve with your code, then we can look at better approaches.  Please don't forget to mention the naming convention that your filenames take.

Thanks.

BFN,

fp.
Try setting up an Access database with a table that contains a record id, a filename field and a filepath field.  Write a maintenance program that loads new filenames into it overnight and let that run under the control of your system scheduler.

Then change your VB app to query by filename, then build the exact path to the file by concatenating the filepath and filename fields.  Voila!
Oopps! Sorry about the October comments that confused the issue. The files named like this: BLBLST01.413 BLBLST is static where the 01 dynamically states the file generation. 413 represents the current date and yes I need to place a zero for single digit days because the last two digits belong to the current day and a single digit is held by current month requiring me to use A, B and C to represent Oct, Nov and Dec. Now that&#8217;s out of the way on to my for each collection may look a bit messy but I assure you that it works.  

It is a simple loop through the folder \reports except that I stop and take a look at each file to get the date modified. Then I must clean-up the date so that it looks like this: 04132005  and can be compared to currentime. I just realized that I could clean-up my For Each loop by running:

 currentime = Replace(Date, "/", "")

Just once before the For Each starts. Anyhoo, I&#8217;m looking for the latest generation of the BLBLST report files in a mammoth folder filled with old BLBLST report files. My method does in fact work. I find the latest BLBLST file everyday. I&#8217;m just looking for ways to do is smarter. And by the way, I cannot guess what the generation will be it depends on the users. Most times its one or two generations like BLBLST02.413 but sometimes it can be BLBLST29.413 so my criteria is simple BLBLST and the current latest date and time the unknown is just the generation.
Hi again,

My head hurts now.

Let's go back to what we were discussing first.

I never actually gave you any code that returned a sorted list of files.

Please try this code below, noting that it presently returns (to the Immediate "Debug" window) all files in the "c:\windows" folder, for the current date.

Note that the FileSystemObject is used first to query all the files, then a new (ADO) Recordset is created & each file is stored as a new record.  The Recordset is then sorted into descending 'DateCreated' order, and returned to the calling process where each record can be read sequentially until the 'DateCreated' is less than the current date.

If you amend the functionality of the sample "Test" sub-routine to match your requirements you will find this approach much faster than what you have currently.

' Begin code...

Option Explicit
Public Function objGet_Sorted_File_List(ByVal strFolder As String) As Object

' ----------------------------------------------------------------------------------------------
' Experts Exchange Question:
' https://www.experts-exchange.com/questions/21385046/What-is-the-best-way-to-find-a-needle-in-a-haystack.html
'
' Copyright:    (c) 2005 Clearlogic Concepts (UK) Limited
' Author:       "fanpages"      [http://NigelLee.info]
' Dated:        14 April 2005
' ----------------------------------------------------------------------------------------------

  Dim objFile                                           As Object
  Dim objFSO                                            As Object
  Dim objFSO_Folder                                     As Object
  Dim objFSO_Folder_Files                               As Object
  Dim objRecordset                                      As Object
 
  On Error GoTo Err_objGet_Sorted_File_List
 
  Const adDate                                          As Integer = 7
  Const adVarChar                                       As Integer = 200
 
  Set objRecordset = CreateObject("ADODB.Recordset")
 
  If Not (objRecordset Is Nothing) Then
     Set objFSO = CreateObject("Scripting.FileSystemObject")
  End If
 
  If Not (objFSO Is Nothing) Then
     Set objFSO_Folder = objFSO.GetFolder(strFolder)
  End If
 
  If Not (objFSO_Folder Is Nothing) Then
     Set objFSO_Folder_Files = objFSO_Folder.Files
  End If
 
  If Not (objFSO_Folder_Files Is Nothing) Then
     objRecordset.Fields.Append "Name", adVarChar, 255
     objRecordset.Fields.Append "DateCreated", adDate
     
     objRecordset.Open
     
     For Each objFile In objFSO_Folder_Files
         objRecordset.AddNew
         objRecordset("Name") = objFile.Name
         objRecordset("DateCreated") = objFile.DateCreated
         objRecordset.Update
     Next objFile
     
     objRecordset.Sort = "DateCreated DESC"
     objRecordset.MoveFirst
     
     Set objGet_Sorted_File_List = objRecordset
  End If
 
Exit_objGet_Sorted_File_List:

  On Error Resume Next
 
  Set objFile = Nothing
  Set objFSO_Folder_Files = Nothing
  Set objFSO_Folder = Nothing
  Set objFSO = Nothing
  Set objRecordset = Nothing
 
  Exit Function
 
Err_objGet_Sorted_File_List:

  On Error Resume Next
 
  Set objGet_Sorted_File_List = Nothing
 
  Resume Exit_objGet_Sorted_File_List
 
End Function
Public Sub Test()

' The FileSystemObject is used first to query all the files in the specified folder, "c:\windows".
' A new (ADO) Recordset is created & each file is stored as a new record
' The Recordset is then sorted into descending 'DateCreated' order, and returned to the calling process
' Each record is read sequentially until the 'DateCreated' is less than the current date

  Dim blnWend                                           As Boolean
  Dim objFiles                                          As Object
 
  On Error Resume Next
 
  Set objFiles = objGet_Sorted_File_List("c:\windows")
 
  objFiles.MoveLast
  objFiles.MoveFirst
 
  blnWend = (objFiles.EOF)
 
  While Not (blnWend)
 
      If Int(objFiles("DateCreated")) = Int(Now()) Then
         Debug.Print objFiles("Name"), objFiles("DateCreated")
         
         objFiles.MoveNext
         blnWend = (objFiles.EOF)
      Else
         blnWend = True
      End If
     
  Wend
 
  Set objFiles = Nothing
 
End Sub

' ...End code


BFN,

fp.


Fanpages,

Thanks for the code to try, I will get right on it. ADO is new to me but I am excited to try it. Question! Once this code is run how do I get to the database or table to extract my file. Currently I am using the For Each collection loop and I resolve my criteria on each and every loop until I find the file. With your solution will I then go through the ADO records or can I get my target while its running? I will step through it and perhaps find my own answer.  

Hi,

Please try this subroutine, "Test()", above.

This line creates an ADO recordset of files (in descending DateCreated order):

Set objFiles = objGet_Sorted_File_List("c:\windows")

There is then a While...Wend loop that reads through each record in the recordset.  Within the loop is an IF statement to test if the DateCreated value of the present record is equal to the current date, and if it is, then the Filename & DateCreated values are displayed in the VBA "Immediate" (i.e. Debug) window.  You would replace this code with whatever you need to do to the individual file(s).

Otherwise, the DateCreated value must be less than the current date & hence the While...Wend loop need not progress reading through the records as all the other records will have DateCreated values before the current date also.

I guess to be absolutely sure you should test for DateCreated values greater than the current date, just in case of rogue files that may have been created on machines with incorrect system date settings or, indeed, simply if you create a file, then set the system date back a day on your own machine.  However, I suggest you start with the "Test()" routine & amend it to suit your needs.

BFN,

fp.
I seem to be having a problem with this line,

Set objFiles = objGet_Sorted_File_List("c:\windows")

When I step past this line and go to

Public Function objGet_Sorted_File_List(ByVal strFolder As String) As Object

strFolder is a null! makes no sense to me why this would not get picked up. Am I missing a reference? I added scripting but I don't think this is the problem?


ASKER CERTIFIED SOLUTION
Avatar of [ fanpages ]
[ fanpages ]

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Hi,

Glad you got the issue resolved.  Was it complicated?

Thanks for the points/grading.

BFN,

fp.