Searching a string?

I have a set of strings (actually PChars), each approximately 1100 characters long, and I wish to search them for a substring of about 20 characters.
What is the most efficient way of doing this while allowing for maybe one or two mismatches within the substring?

Many thanks,
John
John Culkin Asked:

d003303 Commented:
Yo,
have you tried the StrPos function? It is pretty fast.
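For reference, a minimal sketch of an exact search with StrPos from SysUtils. The sample text and variable names are made up for illustration; StrPos takes the string to search first and the substring second, and returns nil when there is no match.

```pascal
program StrPosSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Haystack, Hit: PChar;
begin
  // Illustrative data only; in the real case this would be one of the
  // 1100-character PChars.
  Haystack := 'who the hell are those man saying hello';
  // StrPos(Str, SubStr) returns a pointer into Str, or nil if SubStr
  // does not occur in it.
  Hit := StrPos(Haystack, 'hello');
  if Hit <> nil then
    WriteLn('found: ', Hit)
  else
    WriteLn('not found');
end.
```

This only handles exact matches, so it covers the M = 0 case and nothing more.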

Slash/d003303
julio011597 Commented:
StrPos() works fine for exact matches.
To handle mismatches you need some extra code.

Say you wish to allow for M=2 mismatches.

Basically you have to compare chars one at a time, check for matching chars within the given maximum distance M, and keep considering a possible string match for as long as the sum of distances does not exceed M. If you reach the end of the matching string, then you have a string match. You must also be prepared to follow more than one possibility at a time: you actually need to handle up to M+1 parallel checks.
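As a rough illustration of the simplest form of this idea, here is a sketch that slides the substring along the string and counts mismatching characters at each position, giving up on a position as soon as the count exceeds M. It is not the parallel-check algorithm described above: it assumes a mismatch means a substituted character only (no inserted or missing characters), and ApproxPos, Haystack, Needle and MaxMismatch are hypothetical names.

```pascal
program ApproxSearchSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper, not from the thread: returns a pointer to the first
// position in Haystack where Needle matches with at most MaxMismatch
// substituted characters, or nil if there is no such position.
function ApproxPos(Haystack, Needle: PChar; MaxMismatch: Integer): PChar;
var
  HayLen, NeedleLen, Start, I, Bad: Integer;
begin
  Result := nil;
  HayLen := StrLen(Haystack);
  NeedleLen := StrLen(Needle);
  if (NeedleLen = 0) or (NeedleLen > HayLen) or (MaxMismatch < 0) then
    Exit;
  for Start := 0 to HayLen - NeedleLen do
  begin
    Bad := 0;
    for I := 0 to NeedleLen - 1 do
      if Haystack[Start + I] <> Needle[I] then
      begin
        Inc(Bad);
        if Bad > MaxMismatch then
          Break; // this alignment cannot match any more, try the next one
      end;
    if Bad <= MaxMismatch then
    begin
      Result := Haystack + Start;
      Exit;
    end;
  end;
end;

var
  P: PChar;
begin
  P := ApproxPos('who the hell are those man saying hello', 'mello', 1);
  if P <> nil then
    WriteLn(P)
  else
    WriteLn('no match');
end.
```

With a 20-character substring and an 1100-character string the worst case is roughly 1100 x 20 comparisons per string, and the early Break usually cuts that down a lot, so this may already be fast enough before trying anything cleverer.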

If this sounds like what you're looking for, I can provide the needed code in a couple of days. If this is the case, please also tell me whether you'd like a function accepting M as a parameter, or you just need M to be fixed and equal to... 2?

Regards.
julio011597 Commented:
Another way, simpler but maybe slower (?), is to take your matching string, generate all the possible derived matching strings - those at distance <= M - and test each of them with StrPos(). Maybe this is what d003303 meant?
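To get a feel for how many derived strings that would be, here is a rough sketch that counts them. It assumes that distance means substituted characters only, and VariantCount and its parameters are made-up names. The count is the sum over k = 0..M of C(L, k) * (A - 1)^k, where L is the substring length and A is the number of possible character values.

```pascal
program VariantCountSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: number of distinct strings that differ from a
// pattern of PatLen characters in at most MaxMismatch positions, when each
// position can hold one of AlphabetSize character values.
function VariantCount(PatLen, MaxMismatch, AlphabetSize: Integer): Int64;
var
  K: Integer;
  Choose, Power: Int64;
begin
  Result := 0;
  Choose := 1; // C(PatLen, 0)
  Power := 1;  // (AlphabetSize - 1)^0
  for K := 0 to MaxMismatch do
  begin
    Result := Result + Choose * Power;
    // advance to C(PatLen, K + 1) and (AlphabetSize - 1)^(K + 1)
    Choose := Choose * (PatLen - K) div (K + 1);
    Power := Power * (AlphabetSize - 1);
  end;
end;

begin
  WriteLn(VariantCount(20, 2, 256));
  WriteLn(VariantCount(20, 2, 26));
end.
```

For a 20-character substring with M = 2 this gives 12,359,851 strings if all 256 character values are possible, and 119,251 with a 26-letter alphabet, each needing its own StrPos() call per string. That suggests this approach is only practical for very small alphabets or M = 1.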

If you ask for an answer, I'll first try to work out which approach is faster, and will answer accordingly.

d003303, feel free to consider the question still open in the meantime :)

d003303 Commented:
Hi julio,
right, that's what I thought. But if M or the length of the substring grows, this solution will be no fun anymore. Best would be your algorithm that takes the mismatch count as a parameter; otherwise, when the requirements of the app change, you'll have to re-code everything.

Slash/d003303
ZifNab Commented:
Hi John,

A while ago, somebody already answered a question like this, so it may be worth browsing through the answered questions...

An article in the March '98 issue of Delphi Informant magazine can also be very handy...

Regards, ZiF.
inter Commented:
Hi there, here is a routine I wrote and tested as much as I could. It works as follows: given a substring, a string, and a maximum number of mismatches, it returns nil if no match is found; otherwise it returns a pointer to the position in the string where the match starts:
e.g.
StrCopy(SubStr,  'mello');
StrCopy(Str, 'who the hell are those man saying hello');
PartialPos(SubStr, Str, 1) returns pointer to 'hello' in Str
PartialPos(SubStr, Str, 2) returns pointer to 'hell ...' in Str
NOTE: DO NOT MODIFY THE CONTENT OF THE RETURN VALUE. IT JUST POINTS INTO YOUR Str, SO COPY IT IF YOU WANT TO MODIFY IT!

function PartialPos(S, D: PChar; M: Integer): PChar;

  // Counts the mismatching characters between S and the text at D;
  // characters of S left over when D ends count as mismatches too.
  function ComputeMismatch(S, D: PChar): Integer;
  begin
    Result := 0;
    while (S^ <> #0) and (D^ <> #0) do
    begin
      if S^ <> D^ then Inc(Result);
      Inc(S); Inc(D);
    end;
    // we reached the end of D but there are still chars left in S
    if S^ <> #0 then Inc(Result, StrLen(S));
  end;

var
  P: PChar;
  Done: Boolean;
  L: Integer;
begin
  Result := nil; Done := False; L := StrLen(S);
  // give up at once if S is longer than D or shorter than M
  if (StrLen(S) > StrLen(D)) or (StrLen(S) < M) then Exit;
  // no mismatches allowed: a plain exact search will do
  if M = 0 then begin Result := StrPos(D, S); Exit; end;
  while not Done and (S^ <> #0) do
  begin
    // try every occurrence in D of the current first char of S
    P := StrScan(D, S^);
    while P <> nil do
    begin
      if ComputeMismatch(S, P) <= M then
      begin
        Done := True;
        // step back over the leading chars of S skipped so far
        Result := P - (L - StrLen(S));
        if Result < D then Result := D;
        Break;
      end;
      P := StrScan(P + 1, S^); // look for the next occurrence
    end;
    if not Done then
    begin
      // count the current first char of S as a mismatch and drop it
      Inc(S); Dec(M);
      if M < 0 then Done := True; // no mismatches left to spend
    end;
  end;
end;

Regards,
Igor
John Culkin (Author) Commented:
Thanks for so many replies.
This routine does everything I need.

Many Thanks


John