a string matching function

I need to make a string matching function to see is a given filename is
in a given filespec.
something like:

function(Filename, FileSpc: string): boolean;

the filespc must support wildcards

HOW DO I DO THIS?

Thanks
              LR
LVL 1
lfrodriguesAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

geobulCommented:
Hi,

Try this one:

function FileNameComp(Filename, FileSpc: string): boolean;
var
  MyMask : TMask;
begin
  MyMask := TMask.Create(FileSpc);
  result := MyMask.Matches(Filename);
  MyMask.Free;
end;

Regards, Geo
0
LischkeCommented:
Lookup TMask in the help file. This class is located in unit Masks.pas (not Mask.pas!). This should be enough for most cases.

If you need a much better matching function (case sensitiviy, character classes, faster...) then tell me.

Ciao, Mike
0
TheNeilCommented:
Or you could do this

FUNCTION FileNameComp(filename, filespc : STRING): BOOLEAN;
BEGIN
  Result := Pos(UpperCase(ExtractFileExt(Filename)), UpperCase(FileSpc)) <> 0;
END;

The Neil
0
Upgrade your Question Security!

Your question, your audience. Choose who sees your identity—and your question—with question security.

geobulCommented:
TMask is in masks unit, so:

uses ...., Masks;

Regards, Geo
0
lfrodriguesAuthor Commented:
Nop, taht's not waht I need, Lischke what do you have??
0
geobulCommented:
What do you actually need ? What's not good enough in my suggestion?
0
LischkeCommented:
Here's a very good (flexible and fast) extended file name matching routine:


{written by Madshi, finetuned by topkapi)

test whether a string matches a specific pattern

supports standard wildcards like "*" (for a series of random chars), "?" stands for 1 random char, but [a,b,c] can be only "a", "b" or "c" and [a..z] can be any letter in between "a" and "z". [a..c, e..z] is the lowercase
alphabet without the letter "d", etc...

it is possible to do something like "abc[5:a..z, 'A..Z]"
to find a string with "abc" and 5 more letters in the set [a..z, A..Z]

all is case sensitive if specified, otherwise not casesensitive

unit StringMatch;

{-------------------------------------------------------------------------------

-------------------------------------------------------------------------------}

interface

uses
  SysUtils;

{------------------------------------03.09.99 09:56:19-------------------------------------------
  Fast Character Lookup Table
-------------------------------------------------------------------------------}

type
  TCharSet = set of char;

const
  lowCharTable : array [#0..#$FF] of char = (
  #$00,#$01,#$02,#$03,#$04,#$05,#$06,#$07,#$08,#$09,#$0A,#$0B,#$0C,#$0D,#$0E,#$0F,
   #$10,#$11,#$12,#$13,#$14,#$15,#$16,#$17,#$18,#$19,#$1A,#$1B,#$1C,#$1D,#$1E,#$1F,
  #$20,#$21,#$22,#$23,#$24,#$25,#$26,#$27,#$28,#$29,#$2A,#$2B,#$2C,#$2D,#$2E,#$2F,
   #$30,#$31,#$32,#$33,#$34,#$35,#$36,#$37,#$38,#$39,#$3A,#$3B,#$3C,#$3D,#$3E,#$3F,
  #$40,#$61,#$62,#$63,#$64,#$65,#$66,#$67,#$68,#$69,#$6A,#$6B,#$6C,#$6D,#$6E,#$6F,
   #$70,#$71,#$72,#$73,#$74,#$75,#$76,#$77,#$78,#$79,#$7A,#$5B,#$5C,#$5D,#$5E,#$5F,
  #$60,#$61,#$62,#$63,#$64,#$65,#$66,#$67,#$68,#$69,#$6A,#$6B,#$6C,#$6D,#$6E,#$6F,
   #$70,#$71,#$72,#$73,#$74,#$75,#$76,#$77,#$78,#$79,#$7A,#$7B,#$7C,#$7D,#$7E,#$7F,
  #$80,#$81,#$82,#$83,#$84,#$85,#$86,#$87,#$88,#$89,#$9A,#$8B,#$9C,#$8D,#$9E,#$8F,
   #$90,#$91,#$92,#$93,#$94,#$95,#$96,#$97,#$98,#$99,#$9A,#$9B,#$9C,#$9D,#$9E,#$FF,
  #$A0,#$A1,#$A2,#$A3,#$A4,#$A5,#$A6,#$A7,#$A8,#$A9,#$AA,#$AB,#$AC,#$AD,#$AE,#$AF,
   #$B0,#$B1,#$B2,#$B3,#$B4,#$B5,#$B6,#$B7,#$B8,#$B9,#$BA,#$BB,#$BC,#$BD,#$BE,#$BF,
  #$E0,#$E1,#$E2,#$E3,#$E4,#$E5,#$E6,#$E7,#$E8,#$E9,#$EA,#$EB,#$EC,#$ED,#$EE,#$EF,
   #$F0,#$F1,#$F2,#$F3,#$F4,#$F5,#$F6,#$D7,#$F8,#$F9,#$FA,#$FB,#$FC,#$FD,#$FE,#$DF,
  #$E0,#$E1,#$E2,#$E3,#$E4,#$E5,#$E6,#$E7,#$E8,#$E9,#$EA,#$EB,#$EC,#$ED,#$EE,#$EF,
   #$F0,#$F1,#$F2,#$F3,#$F4,#$F5,#$F6,#$F7,#$F8,#$F9,#$FA,#$FB,#$FC,#$FD,#$FE,#$FF
  ); // lowCharTable

{-------------------------------------------------------------------------------
  Constant Chars, don't change unless you want to alter syntax!
-------------------------------------------------------------------------------}
  CSpecialCharBegin   = '[';
  CSpecialCharEnd     = ']';
  CInverseMatchChar   = '!';
  CCharSeperatorChar  = ',';
  CRangeSeperatorChar = '.';
  CStringEndChar      =  #0;
  CWildCardMultiple   = '*';
  CWildCardSingle     = '?';
  CListStartChar      = ':';

  CControlChars : TCharSet = [
    CStringEndChar,
    CSpecialCharEnd,
    CListStartChar,
    CCharSeperatorChar,
    CRangeSeperatorChar
  ];



{-------------------------------------------------------------------------------
  function declarations
-------------------------------------------------------------------------------}
function ParseSpecialString(CaseInsensitive: boolean; var CursorMask: PChar;
                            var CharSet: TCharSet; var InverseMatch: boolean;
                            var SpecialCharCnt: integer) : boolean;

function CaseSensitiveMatch  (SearchString, MaskString: string) : boolean;
function CaseInSensitiveMatch(SearchString, MaskString: string) : boolean;



implementation

{-------------------------------------------------------------------------------
  parses the [...] bit of the MaskString
-------------------------------------------------------------------------------}
function ParseSpecialString(CaseInsensitive: boolean; var CursorMask: PChar;
                            var CharSet: TCharSet; var InverseMatch: boolean;
                            var SpecialCharCnt: integer) : boolean;
var
  CurChar     : PChar;
  HelpString  : string;
  CntChar     : char;     // char counter variable
begin
  Result:=False;
  CharSet := [];
  SpecialCharCnt := 1;   // inizialize return values to default value
  Inc(CursorMask);       // skip CSpecialCharBegin char

  // match or InverseMatch?
  InverseMatch := CursorMask ^= CInverseMatchChar;

  if InverseMatch then
    Inc(CursorMask);

  while True do
  begin
    // remember current char
    CurChar := CursorMask;

    // search first control char
    while not (CursorMask^ in CControlChars) do
      Inc(CursorMask);

    case CursorMask^ of
      // CSpecialCharEnd char missing!!
      CStringEndChar :
        Exit;

      // length value found
      CListStartChar :
        begin
          SetLength(HelpString, CursorMask - CurChar);
          Move(CurChar^, pointer(HelpString)^, CursorMask - CurChar);
          SpecialCharCnt := StrToIntDef(HelpString, 0);

          // if length is 0 or no integer value -> error!!
          if SpecialCharCnt = 0 then
            Exit;
        end;

      // ",x," single char, must be 1 byte long
      CCharSeperatorChar, CSpecialCharEnd :
        if CursorMask - CurChar = 1 then
        begin
          if CaseInsensitive then
            // fill char set
            Include(CharSet, lowCharTable[CurChar^])
          else
            Include(CharSet, CurChar^);

          if CursorMask^ = CSpecialCharEnd then
          // we are ready with parsing...
          begin
            Inc(CursorMask);
            Result := True;
            Exit;
          end;
        end
        else
          Exit;

      CRangeSeperatorChar :
        if CursorMask - CurChar = 1 then
        begin
          // ",x..y," multiple chars, both x and y must be 1 byte long
          Inc(CursorMask);

          // skip all CRangeSeperatorChar's
          while CursorMask ^= CRangeSeperatorChar do
            Inc(CursorMask);

          // y must not be a control char
          if CursorMask^ in CControlChars then
            Exit;

          // ",y..x," is not valid
          if CursorMask^ < CurChar^ then
            Exit;

          // fill char set
          for CntChar := CurChar^ to CursorMask^ do
          if CaseInsensitive then
            Include(CharSet, lowCharTable[CntChar])
          else
            Include(CharSet, CntChar);

          Inc(CursorMask);

          if CursorMask^ = CSpecialCharEnd then
          // we are ready with parsing...
          begin
            Inc(CursorMask);
            Result := True;
            Exit;
          end;

          // otherwise a CCharSeperatorChar MUST follow...
          if CursorMask^ <> CCharSeperatorChar then
            Exit;
        end
        else
          Exit;

    end; // case

    Inc(CursorMask);

  end; // while
end;

{-------------------------------------------------------------------------------

-------------------------------------------------------------------------------}
function CaseSensitiveMatch(SearchString, MaskString: string) : boolean;
var
  CursorString,
  CursorMask,
  MemoryString,
  MemoryMask      : PChar;

  // found CWIldCardMultiple in MaskString ?
  MaskDontCare    : boolean;

  // for special purposes...
  CharSet         : TCharSet;

  // special or not special, that's here question here...  :-)
  Special         : boolean;

  SpecialCharCnt,
  Cnt             : integer;

begin
  CursorMask := PChar(MaskString);

  // MaskString = CWildCardMultiple ?
  if (CursorMask^ = CWildCardMultiple) and ((CursorMask + 1) ^= CStringEndChar) then
  begin
    Result := True;
    Exit
  end;

  Result := False;

  CursorString := PChar(SearchString);
  MemoryString := CursorString;
  MemoryMask := CursorMask;
  MaskDontCare := False;

  while (CursorMask^ <> CStringEndChar) or (CursorString^ <> CStringEndChar) do
  begin
    // if MaskString is empty before string is empty -> no match
    if CursorMask ^= CStringEndChar then
      Exit;

    // if string is empty before MaskString is empty -> match only if rest of MaskString '*****'...
    if CursorString ^= CStringEndChar then
    begin
      while CursorMask^ <> CStringEndChar do
      begin
        if CursorMask^ <> CWildCardMultiple then
          Exit;
        Inc(CursorMask);
      end;
      Break;
    end;

    case CursorMask^ of

      // CWildCardMultiple found -> match if CWildCardMultiple is last char of MaskString
      CWildCardMultiple:
        if (CursorMask+1)^<>CStringEndChar then
        begin
          // else continue testing...
          MaskDontCare := True;
          // memoryMask   -> first char after CWildCardMultiple
          Inc(CursorMask);
          // memoryString -> currentPos + 1
          MemoryString:=CursorString+1; MemoryMask:=CursorMask;
        end else
          Break;

      // CWildCardSingle simply means, we can skip one char in both string and MaskString
      CWildCardSingle:
        begin
          Inc(CursorString);
          Inc(CursorMask);
        end;

      // ooops... now it gets more difficult...  :-)
      CSpecialCharBegin:
        begin
          // wrong special character syntax !!
          if not ParseSpecialString(False, CursorMask, CharSet, Special, SpecialCharCnt) then
            Exit;

          for Cnt := 1 to SpecialCharCnt do
            if (CursorString^ in CharSet) = Special then
            begin
              // current char does match ?
              if MaskDontCare then
              begin
                // it does not, but we have already found a CWildCardMultiple some time ago...
                // continue with memoryMask / memoryString; Inc(memoryString)
                CursorMask := MemoryMask;
                CursorString := MemoryString;
                Inc(MemoryString);

                // Break the special loop...
                Break;
              end else
                // it does not match, and we had no CWildCardMultiple yet, so -> no match
                Exit;
            end else
              // current char matches, so move string cursors + 1
          Inc(CursorString);
        end;
      else if CursorMask^ <> CursorString^ then
      begin
        // current char does match ?
        if MaskDontCare then
        begin
          // it does not, but we have already found a CWildCardMultiple some time ago...
          // continue with memoryMask / memoryString; Inc(memoryString)
          CursorMask := MemoryMask;
          CursorString := MemoryString;
          Inc(MemoryString);
        end else
          // it does not match, and we had no CWildCardMultiple yet, so -> no match
          Exit;
      end else begin
        // current char matches, so move both cursors + 1
        Inc(CursorString);
        Inc(CursorMask);
      end;

    end;  // case
  end; // while

  Result := True;
end;

{-------------------------------------------------------------------------------

-------------------------------------------------------------------------------}
function CaseInSensitiveMatch(SearchString, MaskString: string) : boolean;
var
  CursorString,
  CursorMask,
  MemoryString,
  MemoryMask      : PChar;

  // found CWIldCardMultiple in MaskString ?
  MaskDontCare    : boolean;

  // for special purposes...
  CharSet         : TCharSet;

  // special or not special, that's the question here... :-)
  Special         : boolean;

  SpecialCharCnt,
  Cnt             : integer;

begin
  CursorMask := PChar(MaskString);

  if (CursorMask^ = CWildCardMultiple) and ((CursorMask + 1)^ = CStringEndChar) then
  begin
    // MaskString = CWildCardMultiple ?
    Result:=True;
    Exit
  end;

  Result := False;

  CursorString := PChar(SearchString);
  MemoryString := CursorString;
  MemoryMask := CursorMask;
  MaskDontCare := False;

  while (CursorMask^ <> CStringEndChar) or (CursorString^ <> CStringEndChar) do
  begin

    // if MaskString is empty before string is empty -> no match
    if CursorMask^ = CStringEndChar then
      Exit;

    // if string is empty before MaskString is empty -> match only if rest of MaskString '*****'...
    if CursorString^ = CStringEndChar then
    begin
      while CursorMask^ <> CStringEndChar do
      begin
        if CursorMask^ <> CWildCardMultiple then
          Exit;
        Inc(CursorMask);
      end;
      Break;
    end;

    case CursorMask^ of

      CWildCardMultiple :

        // CWildCardMultiple found -> match if CWildCardMultiple is last char of MaskString
        if (CursorMask + 1)^ <> CStringEndChar then
        begin
          // else continue testing...
          MaskDontCare := True;
          // memoryMask   -> first char after CWildCardMultiple
          Inc(CursorMask);
          // memoryString -> currentPos + 1
          MemoryString := CursorString + 1;
          MemoryMask := CursorMask;
        end else
          Break;


      // CWildCardSingle simply means, we can skip one char in both string and MaskString
      CWildCardSingle :
      begin
        Inc(CursorString);
        Inc(CursorMask);
      end;


      // ooops... now it gets more difficult...  :-)
      CSpecialCharBegin :
      begin
        // wrong special character syntax !!
        if not ParseSpecialString(True, CursorMask, CharSet, Special, SpecialCharCnt) then
          Exit;

        for Cnt := 1 to SpecialCharCnt do
          if (lowCharTable[CursorString^] in CharSet) = Special then
          begin
            // current char does match ?
            if MaskDontCare then
            begin
              // it does not, but we have already found a CWildCardMultiple some time ago...
              // continue with memoryMask / memoryString; Inc(memoryString)
              CursorMask := MemoryMask;
              CursorString := MemoryString;
              Inc(MemoryString);
              // Break the special loop...
              Break;
            end else
              // it does not match, and we had no CWildCardMultiple yet, so -> no match
              Exit;
          end else
            // current char matches, so move string cursors + 1
            Inc(CursorString);
          end;
        else if lowCharTable[CursorMask^] <> lowCharTable[CursorString^] then
        begin
          // current char does match ?
          if MaskDontCare then
          begin
            // it does not, but we have already found a CWildCardMultiple some time ago...
            // continue with memoryMask / memoryString; Inc(memoryString)
            CursorMask := MemoryMask;
            CursorString := MemoryString;
            Inc(MemoryString);
          end else
            // it does not match, and we had no CWildCardMultiple yet, so -> no match
            Exit;
        end else begin
          // current char matches, so move both cursors + 1
          Inc(CursorString);
          Inc(CursorMask);
      end;
    end;  // case
  end;  // while

  Result:=True;
end;

end.

Ciao, Mike
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
lfrodriguesAuthor Commented:
geobul your idea wasn't as fast as i need. What i need here is speed, so...
0
TheNeilCommented:
What is wrong with just doing a string comparison on the file extension?

The Neil
0
LischkeCommented:
:-) A lot is wrong. We are talking here about pattern matching which means to check a string whether it match a pattern like:

  abc??.*

Therefor you can't use the naive string comparison.

Ciao, Mike
0
TheNeilCommented:
Oops, forgive my general lack of intelligence

The Neil
0
geobulCommented:
What does it mean - "wasn't as fast as i need" ?

It takes approximately 0.025 milliseconds on my 333 MHz, 64 MB Win95 pc with D3 C/S.
0
lfrodriguesAuthor Commented:
That inprise function has a bug when the filespc is *.* and the file name has to extention it returns false.
And bisides that Lischke told me that he had a faster ways to do that.
0
LischkeCommented:
Yes and I gave you already the code four days ago. Have you tried it out?

geobul, you shouldn't only see one search but many. The more runs you need the more does it matter how long the comparison takes (not counted the flexibility of my other solution).

Ciao, Mike
0
lfrodriguesAuthor Commented:
As i said geobul your rotine wasn't that fast for what i need,
iven Lischke's source migth not be as fast as i need but i will try to optimesi it
in basm.

Speed:
------
geobul's

var
   c, n1, n2: TLargeInteger;
   a:tmask;
begin
     QueryPerformanceFrequency(c);
     QueryPerformanceCounter(n1);

     a := tmask.create('*.??t');
     a.Matches('Filename.txt');
     a.Free;

     QueryPerformanceCounter(n2);
     ShowMessage(format('Seconds: %.18f', [(n2 - n1)/ c]));
end;

TIME:
-----
0.000103085871368947
0.000106438257429726
0.000104762064399336


---------------------------------------
Lischke's

var
   c, n1, n2: TLargeInteger;
   a:tmask;
begin
     QueryPerformanceFrequency(c);
     QueryPerformanceCounter(n1);

     CaseInSensitiveMatch('Filename.txt','*.??t');

     QueryPerformanceCounter(n2);
     ShowMessage(format('Seconds: %.18f', [(n2 - n1)/ c]));
end;

TIME:
-----
0.000014247640758310
0.000013409544243115
0.000014247640758310



I believe that the previous values explain why i rejected your suggestion.
I tested both sources on my progs test machine:
P200 32MB and win98 release2 with delphi 4 client/sever updatepack 3

That is it.

Thanks Lischke.
0
LischkeCommented:
You are welcome :-)

Ciao, Mike
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Delphi

From novice to tech pro — start learning today.