• Status: Solved
• Priority: Medium
• Security: Public
• Views: 373

pattern matching

I need help explaining a piece of code from my book; it has some notes, but not enough for me to completely understand it.  I'd like a more detailed description of what each line means.  The code is as follows:

BOOL MatchPattern(const char str[], const char pat[])

{
BOOL res;

//Validating the pointers
if((str == NULL && pat != NULL) ||
(str != NULL && pat == NULL) )
return FALSE;
else if(str == NULL && pat == NULL)
return TRUE;

if((str != NULL &&  *str == '\0' && pat != NULL && *pat != '\0') ||
(str != NULL &&  *str != '\0' && pat != NULL && *pat == '\0') )
return FALSE;
switch(*pat)
{
case '*':        //check the next one with zero advance or one advance
res = MatchPattern(str, pat + 1) || MatchPattern(str + 1, pat + 1);
if(res == FALSE)
res = MatchPattern(str + 1, pat); //check the next one with * as pattern
break;
case '.':       //check the next one advance only
res = MatchPattern(str + 1, pat + 1);
break;
case '\0': //end of pattern
if(*str == '\0')
res = TRUE;
else
res = TRUE;
break;
default: //assuming both the pattern is made out of alphbets and so is the string to be matched
if(*str == *pat)
res = MatchPattern(str + 1, pat + 1);
else
res = FALSE;
}

return res;
}
0
yumaslim
2 Solutions

Commented:
OK, here we go :

>> BOOL MatchPattern(const char str[], const char pat[]) {
This function will match a string str to a pattern pat, and return true or false if it matches or not respectively.

>>         BOOL res;
initialize the result

>>         //Validating the pointers
>>         if((str == NULL && pat != NULL) ||
>>                 (str != NULL && pat == NULL) )
>>                 return FALSE;
>>         else if(str == NULL && pat == NULL)
>>                 return TRUE;
checks the input if it really contains values. However, since the input is an array of char rather than a pointer to char, this is a bit strange !

>>         if((str != NULL &&  *str == '\0' && pat != NULL && *pat != '\0') ||
>>                 (str != NULL &&  *str != '\0' && pat != NULL && *pat == '\0') )
>>                 return FALSE;
If one of both str and pat are empty strings, then return false ... the pattern didn't match !

>> switch(*pat) {
read the first character in the pattern, and depending on its value jump to the corresponding case :

>>         case '*':        //check the next one with zero advance or one advance
>>                                 res = MatchPattern(str, pat + 1) || MatchPattern(str + 1, pat + 1);
>>                                 if(res == FALSE)
>>                                         res = MatchPattern(str + 1, pat); //check the next one with * as pattern
>>                 break;
If it's a '*', then call the function recursively for the following three situations :
1) if MatchPattern(str, pat + 1) matches, then the * was an empty string in str
2) if MatchPattern(str + 1, pat + 1) matches, then the * was the first character in str
3) if MatchPattern(str + 1, pat) matches, then the * was the first character as well as following characters in str
4) else the result is false

>>         case '.':       //check the next one advance only
>>                                 res = MatchPattern(str + 1, pat + 1);
>>                 break;
If it's a '.', just move to the next character by calling MatchPattern(str + 1, pat + 1) recursively

>>         case '\0': //end of pattern
>>                                 if(*str == '\0')
>>                                         res = TRUE;
>>                                 else
>>                                         res = TRUE;
>>                 break;
If it's a '\0', then the pattern ends, so the result is always true, whether the str has ended or not

>>         default: //assuming both the pattern is made out of alphbets and so is the string to be matched
>>                         if(*str == *pat)
>>                                 res = MatchPattern(str + 1, pat + 1);
>>                         else
>>                                 res = FALSE;
For all other characters that can be found in the pattern, the character in the pattern needs to be the same as the character in the string. If that's not the case, false is returned ... if it IS the case, we move to the next character by calling MatchPattern(str + 1, pat + 1) recursively

>>         }
>>         return res;
>> }
The result is returned.

Does this explain what the function does to you ? If not, feel free to ask specific questions, and we'll be glad to answer them.
0

Commented:
My advice is to throw out the book.  This is a poor example of C programming.  You do not want to learn to write code like this.

0

Author Commented:
Does this actually have anything to do with matching the pattern?

>>         case '\0': //end of pattern
>>                                 if(*str == '\0')
>>                                         res = TRUE;
>>                                 else
>>                                         res = TRUE;
>>                 break;
0

Commented:
Hi yumaslim,

Here's a similar piece of code I found elsewhere on EE and tinkered with to get it to my personal standards. It may be easier to understand.

// Simple Wild card matching from Chef-Hunter at EE.
bool WildCmp(const char* search, const char* in, char * match)
{
// Record the match character.
*match = *search;
// End of search string?
if(*search == '\0')
{
// Good if both finished?
return (*in == '\0');
}
// End of in string but not search string?
if(*in == '\0')
{
// If search was a '*'
if(*search == '*')
{
// Yes! The rest must match.
return WildCmp(search+1,in,match);
}
// Not a '*, not complete match.
return false;
}
// Check for actual match.
if(toupper(*search) == toupper(*in) )
{
return WildCmp(search+1, in+1, match+1);
}
// Check for '?' match.
if(*search == '?')
{
return WildCmp(search+1, in+1, match+1);
}
// Check for '*'
if(*search == '*')
{
return WildCmp(search+1, in, match+1)    // Match nothing.
|| WildCmp(search+1, in+1, match+1)    // Or the next character.
|| WildCmp(search, in+1, match+1);    // Or the one after.
}
// Failed.
return false;
}

I added one feature to the system which is to record what characters matched each wild part of the pattern. It also uses '?' instead of '.' for 'any single character'. I hope its easier to understand. As brett said, the code you posted isnt very nice. :-(

Paul
0

Commented:
>> Does this actually have anything to do with matching the pattern?
Yes, it does : It signifies the end of the pattern matching algorithm, and that the complete string was matched correctly !
0

Commented:
Oh yes, and I've got to agree with brettmjohnson and PaulCaswell about the quality of that code ... Not only is it not very easy to understand, it's also not very efficient at what it does ... recursion eg. is not really the best option here, as it can go very deep (for every character in the string !!!), and exhaust the stack space if you're unlucky. For a 256 length string (which is not very long), you already have a 256 depth recursion (maybe even more in case there's *'s in the pattern) ... that's a lot !!
0

Featured Post

Tackle projects and never again get stuck behind a technical roadblock.