Anyone who has worked with database applications for any length of time has developed forms for doing complex filters. These usually involve comboboxes, multi-select listboxes, and the occasional checkbox. These controls are great for filtering on fixed length fields with specific values, but when it comes to filtering text strings, or memo fields, we (I) have generally settled for allowing users to type a couple of keywords into a textbox. We then build criteria strings which parse these key words and concatenate them together with a series of OR statements, resulting in a criteria which looks something like:
[Field1] Like “*Keyword1*” OR [Field1] Like “*Keyword2*” OR [Field1] like “*Keyword3*”
Unfortunately, this is not very flexible, and when we get the results, we realize that the criteria is too broad, or not broad enough, so we play around with it a little, and eventually settle on a criteria string which is decidedly not what we had hoped for. Well, not anymore!
I recently had a requirement to provide a client with the ability to search three different text fields for a variety of keywords, including being able to group them (keyWord1 AND keyWord2) and to exclude words (AND NOT keyWord3) from the result set as. Some of the more advanced web search engines can to this, and I’m sure they use a similar technique. Basically, I created a function which will accept a string of keywords and an array of fields to which the keywords should be applied. It will parse those words (including parenthesis and + and – operators to indicate And or NOT) and build a complex criteria string.
Lets assume that you need to search a database of magazine articles, and want to identify all of the articles that include the term “Health” and either “diabetes” or “cancer”, but which do not include the term “colon”. With this function, you can enter the following string in the “Search for” textbox on a form.
Health +(diabetes cancer) –colon
Then, with the click of a command button, set the filter property of the form to the return value of my function.
Private Sub cmd_Filter_Click
Me.Filter = fnTextFilter(Me.txt_SearchFor, true, "memoField")
Me.FilterOn = (Len(me.Filter & “”) > 0)
When run, with this input string, my function returns a criteria string that looks like the following:
([Field1] Like “*Health*”) AND ([Field1] Like “*diabetes*” OR [Field1] Like “*cancer*”) AND NOT ([Field1] Like “*colon*”)
How it works:
The function is actually quite simple, it accepts a string of keywords, parenthesis, and symbols (+ or -), parses that string into an array, and then processes the individual elements of the array, building the criteria string one element at a time. Words that are preceded by a plus sign (+) are be interpreted as “AND [FieldName] LIKE”; words preceded by a minus sign (-) are be interpreted as “AND NOT [FieldName] LIKE”; and words preceded by a space or a left parenthesis will be interpreted as “OR [FieldName] LIKE”. The challenge is ensuring that when you start stringing Boolean operators together, you understand what you are asking for. For example:
diabetes cancer –colon
will be interpreted by my function as:
([Field1] Like “*diabetes*”) OR ([Field1] Like “*cancer*”) AND NOT ([Field1] LIKE “*colon*”)
But because of the precedence of AND over OR in Boolean logic, this will actually be interpreted by SQL or JET as:
([Field1] Like “*diabetes*”) OR ([Field1] Like “*cancer*” AND NOT [Field1] LIKE “*colon*”)
So if what you really want is articles that contain the term diabetes or cancer, but not colon, then you need to enter the keywords with parentheses, as follows:
(diabetes cancer) –colon
The examples I have shown up until this point use the following syntax to identify those records where the keyword is embedded in the text being searched.
[Field1] Like “*keyWord1*”
However, because of the all inclusive asterisk (*) as the prefix and suffix of the keywords, this can result in unexpected results . With this syntax, if you are searching for the word "and", you will also select records that contain the words “stand”, “band”, “brand”, "demand", …
In order to prevent this, I included a WholeWord parameter in the function. This parameter allows you to identify whether you want only those instances where the keyword stands alone within the text. If the WholeWord parameter is set to True, then the syntax of the query will look like:
[Field1] Like “*[ .!,;?(]keyword1[ .!,;?)]*”
There are probably other prefix and suffix characters that should be included within the brackets, but these are the only ones that came to mind when I wrote the function. If you think of others, please post a comment to this article.
Occassionally, you will want to search for a sequence of two or more words: "colon cancer". To do this, you could enter "colon +cancer" as your search string, but this would only ensure that both words are found in the record, not that they are sequential. So, the way to approach this problem is to embed wildcards (? or *) within your search string. If you want the words to be sequential, as in "colon cancer", then enter a search string that separates the two words with a question mark: "colon?cancer". If you just want to ensure that "colon" precedes "cancer" (I'm sure there is a good reason for this, but cannot think of one at the moment), then use: "colon*cancer" as your search string.
The other truth I hit upon while developing this function is that in the past, if I had multiple text/memo fields that I wanted to check for the same values, I would most likely have written:
[Field1] Like “*keyWord1*” OR [Field2] Like “*keyWord1*” OR [Field3] Like “*keyWord1*”
But as I was writing this function, I realized that if the fields I want to check are all text or memo fields, I can string them together like the following, and save a lot of work.
([Field1] & " " & [Field2] & " " & [Field3]) Like “*keyWord1*”
When I came to this brilliant conclusion, I modified the function to accept an array of fields, rather than just a single field. The code will then concatenate each of these fields together and apply the criteria to the combined set of fields.
The attached file contains a table that contains a single memo field composed of strings of randomly selected words strung together to create text strings of 20-50 words. The text in each record makes absolutely no sense, but it will provide you with a way to see this function in action. The file also contains a form (frm_Filter_Memos) which contains:
1. A textbox for defining your search criteria
2. A combo box which lists all of the words that are used, and the number of records that contain that word; selecting a word from this combo box will add it to the search string. I would not normally put this on a form, but did so to allow you to see how many occurrences of a particular word you should find when you use that word in the filter.
3. Command buttons for setting and clearing your filter
4. A textbox which displays the records (filtered or unfiltered)
5. Several labels in the footer which indicates how many records are in the unfiltered and filtered recordsets.
6. Another textbox in the form footer with shows you the criteria string that is being implemented when you click the Filter button.
I hope you find this article useful. In my next article, I’ll take this process one step further and show you how to add rich text formatting based on your keywords, making it significantly easier to find the keywords in your form or report.