sheld24

InStr Search Question!

Hey,
My application produces multiple text files in a data dir, and I'd like to scan all of these files for multiple search words and return just the file names to a list box, where they can then be clicked and opened. I was wondering whether InStr could be made to search for all of the search words and ignore the files that do not contain ALL of them? Basically: 1) open each file in the data dir, 2) scan for all the search words, 3) if all the words exist, add the file and path to a list box.

   
sheld24

ASKER

Edited text of question.
Hi,

Why don't you use InStr for each word that is entered?
You can count the words in the query and keep a hit counter for each file searched; the files whose counter equals the number of query words match your query.

The problem, of course, is that you have to go through each file as many times as there are words in the query.
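
For illustration, here is a minimal sketch of that counter idea. It is written in Python rather than VB purely to show the logic, and the function name `files_containing_all` and the `*.txt` pattern are assumptions of mine, not anything from the question. The `in` test plays the role of InStr (a case-insensitive substring search), and each file is read from disk only once, although the in-memory text is still scanned once per word.

```python
from pathlib import Path


def files_containing_all(data_dir, words, pattern="*.txt"):
    """Return the paths of files in data_dir that contain every search word.

    Matching is a case-insensitive substring test, similar in spirit to
    calling InStr once per word. The name and the default pattern are
    hypothetical choices for this sketch.
    """
    wanted = [w.lower() for w in words if w]
    matches = []
    for path in sorted(Path(data_dir).glob(pattern)):
        if not path.is_file():
            continue
        # Read the file once; every word is then tested against the same text.
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        # Per-file counter: one hit for each query word found in the text.
        hits = sum(1 for w in wanted if w in text)
        if hits == len(wanted):
            matches.append(str(path))
    return matches
```

The returned list is what would be loaded into the list box. In VB the same shape would presumably be a loop over the directory with one InStr call per word against a string holding the file contents.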

Regards
ASKER CERTIFIED SOLUTION
mark2150

This solution is only available to members.
sheld24

ASKER

This is really a stinker! I suppose then there is no better way with VB?
There are always better ways.

I assume you're trying to build a database of references in files. There has been a *LOT* of research done on this topic.

Your approach, while direct and easy to code, is essentially "brute force" and will result in the longest execution times.

Typically a scanner doesn't repeatedly scan the entire file looking for key words, rather it runs thru the file once and builds a table of *all* words found (the smarter versions eliminate "noise" words like "the", "and", "to", "or", etc.) This results in a single pass thru the data instead of multiple passes.
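
A rough sketch of that single pass, again in Python only to show the idea. The noise-word list, the regular expression used to split words, and the name `word_table` are all assumptions for the example; a real scanner would tune each of them.

```python
import re
from collections import Counter

# Illustrative noise-word list; a real one would be much longer.
NOISE_WORDS = {"the", "and", "to", "or", "a", "of", "in", "is"}

# Assumed definition of a "word": a run of letters, digits, or apostrophes.
WORD_RE = re.compile(r"[a-z0-9']+")


def word_table(text):
    """Build a table of word -> occurrence count in one pass over the text."""
    counts = Counter()
    for word in WORD_RE.findall(text.lower()):
        if word not in NOISE_WORDS:
            counts[word] += 1
    return counts
```

Once the table exists, checking a file for any number of query words is a handful of table lookups rather than another scan of the file. Note that this matches whole words, which is stricter than the substring match InStr gives you.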

The indexers will generally have a database of word frequencies and local occurrences. This is frequently tokenized so only a short pointer is stored instead of the full file name. You'll have to do a double index lookup to get the real file name, but the data set size will be manageable.
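
To make the tokenized index concrete, here is a small in-memory sketch in Python. The class name `WordIndex`, its methods, and the noise-word list are hypothetical; a real indexer would keep these tables in a database, as described above. Each file gets a short integer ID, the word table stores only IDs and counts, and a query does the double lookup: word to IDs, then IDs to file names.

```python
import re
from collections import defaultdict

NOISE_WORDS = {"the", "and", "to", "or", "a", "of", "in", "is"}  # illustrative
WORD_RE = re.compile(r"[a-z0-9']+")  # assumed definition of a word


class WordIndex:
    def __init__(self):
        self.file_names = []               # file ID -> full file name
        self.postings = defaultdict(dict)  # word -> {file ID: count}

    def add_file(self, name, text):
        """Index one file in a single pass and return its short file ID."""
        file_id = len(self.file_names)
        self.file_names.append(name)
        for word in WORD_RE.findall(text.lower()):
            if word not in NOISE_WORDS:
                counts = self.postings[word]
                counts[file_id] = counts.get(file_id, 0) + 1
        return file_id

    def files_with_all(self, words):
        """Return the names of files that contain every non-noise query word."""
        terms = [w.lower() for w in words if w.lower() not in NOISE_WORDS]
        if not terms:
            return []
        # First lookup: word -> set of file IDs; keep IDs common to all words.
        id_sets = [set(self.postings.get(t, ())) for t in terms]
        common = set.intersection(*id_sets)
        # Second lookup: file ID -> real file name.
        return [self.file_names[i] for i in sorted(common)]
```

A usage example under the same assumptions:

```python
index = WordIndex()
index.add_file("data/a.txt", "red green blue")
index.add_file("data/b.txt", "red blue")
matches = index.files_with_all(["red", "green"])  # only data/a.txt has both
```

The files are scanned once, when they are indexed; after that every query touches only the tables.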

Remember: An ounce of algorithm optimization is worth a *pound* of hardware!

M