Solved

Fastest way to search 5000 Excel Files in a directory

Posted on 2002-07-29
8
146 Views
Last Modified: 2012-05-04
I have 5000 Excel files in a directory.  I need to go through all of them and search for some text and change a cell if the text is found.  I'm looking for an alternative to my current method of Automation, where I open, search and close each file.  Openning and closing each file takes to long.
0
Comment
Question by:jbauer22
8 Comments
 
LVL 1

Expert Comment

by:ophirg
Comment Utility
Hi.

I have something that might work, but only in some cases.

Try refering each cell in some sheet like this:

='full path[filename.xls]sheet name'!A1

The excel appears to be doing a much better job in openning files.

Ophir.
0
 
LVL 17

Accepted Solution

by:
inthedark earned 150 total points
Comment Utility
You must have a reason for not using a database?

But if you quikly find all of the files that contain the text, you can then use your current method to change the file. This method should take just a few seconds, but will increas if loads of files have to be changed.

If the files are not encrypted you could find the text this way:

redim fls$(0)

dim fl as long
fim flc as long

' read all file names
fl=-1
D$=Dir(YourFolder + "\*.XLS")
Do While Len(D$)
   fl=fl+1
   if fl>ubound(fls) Then
     redim preserve fls(fl+1000)
   end if
   fls(fl)=d$
   D$ = Dir() ' ** warning this can fail
Loop

if fl<0 then
  msgbox "No files found"
  exit sub ?
end if

redim preserve fls(fl)

For FLc = 0 to fl
   d$=fls(flc)
   DT$ = ReadFile(D$)
   If Instr(ucase$(D$),SearchText)>0 then
     CallChangeThisFile YourFolder + "\" + D$
   End if
Next

etc.



Public Function ReadFile(filename As String) As String

Dim wlfn As Long

wlfn = FreeFile
On Error Resume Next

If Len(Dir(filename)) > 0 Then
    If Err.Number Then
        ReadFile = "Invalid resource file path: " + filename
        Exit Function
    End If
   
    Open filename For Input Shared As #wlfn
    ReadFile = Input(LOF(wlfn), wlfn)
    Close wlfn
Else
    ReadFile = "ERROR"
End If

End Function
0
 
LVL 17

Expert Comment

by:inthedark
Comment Utility
woops....

I used d$, should have been dt$
 but you get the drift of fast search

For FLc = 0 to fl
  d$=fls(flc)
  DT$ = ReadFile(D$)
  If Instr(ucase$(DT$),SearchText)>0 then ' corrected    CallChangeThisFile YourFolder + "\" + D$
  End if
Next

0
 
LVL 49

Expert Comment

by:Ryan Chong
Comment Utility
The faster way to search files in folders should be using API, take a look for the example below:

http://www.mvps.org/vbnet/code/fileapi/recursivefiles_minimal.htm

By the way, the time to Open, Read then Close the Excel files may not able to be faster as it's handle by Excel Object Library.
0
How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

 
LVL 4

Expert Comment

by:mcoop
Comment Utility
Are you looking for cells with specific text, or filenames with specific text...

For data wihin cells...

The excel 'find' function is very fast when called from VB - via automation.    On the other hand you do have to open every file....  Although 5000 files isn;t that many - I'd say around an hour to open and process on a fast machine... (of course saving the file after the change will double this time - or more)

I suppose if someone has some code that understands the raw Excel file format, you may have some luck there - but will likely have problems when each new version of Excel is released.

0
 
LVL 17

Expert Comment

by:inthedark
Comment Utility
There is another way which would only take a few seconds to run. But would work better is most of the files did not change to often.

1) Load each Excel document, then save as a CSV file - this sould only take a split second.

2) Scan the csv file and extract all of the words and save then in a database. You would need 3 tables: documents, words and word by document index.  You would save each word in a table so it could have a numeric key to enable very fast searching. Give each document a numeric key too.  If the scan finds the same document already exists, revise the previous words. Then Save the date last changed in the table.

3) Create a service which checks the date/time last changed, recent changes can then be update in the database.

Each word would be given a numeric key. As the word and document index key would only be 8 bytes you could search  millions of words in a split second.  It would only take a few minutes to create the code.

Just a thought......
0
 
LVL 4

Expert Comment

by:mcoop
Comment Utility
On deeper thought... work on a long term solution...
Migrate to a database.
0
 
LVL 17

Expert Comment

by:inthedark
Comment Utility
I'd go with mcoop's suggestion....but bosses can sometimes only see one colour.
0

Featured Post

How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

Join & Write a Comment

Article by: Martin
Here are a few simple, working, games that you can use as-is or as the basis for your own games. Tic-Tac-Toe This is one of the simplest of all games.   The game allows for a choice of who goes first and keeps track of the number of wins for…
You can of course define an array to hold data that is of a particular type like an array of Strings to hold customer names or an array of Doubles to hold customer sales, but what do you do if you want to coordinate that data? This article describes…
As developers, we are not limited to the functions provided by the VBA language. In addition, we can call the functions that are part of the Windows operating system. These functions are part of the Windows API (Application Programming Interface). U…
This lesson covers basic error handling code in Microsoft Excel using VBA. This is the first lesson in a 3-part series that uses code to loop through an Excel spreadsheet in VBA and then fix errors, taking advantage of error handling code. This l…

763 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

6 Experts available now in Live!

Get 1:1 Help Now