Solved

Compare 2 strings

Posted on 1997-09-24
20
178 Views
Last Modified: 2010-04-04
I have 2 strings (100 chars max).
The first line is the reference.
The second is introduced to the keyboard by a student.

How to count *all*  the errors in the second line?

Ex :
1=The first line is the reference.
==================================
2=Thhe first lineis the reference. (2 errors)      
2=Thehehe first line is the reference.  (4 errors)
2=te first line is the eference. (3 errors)


0
Comment
Question by:jpdupont
  • 9
  • 6
  • 2
  • +2
20 Comments
 
LVL 1

Expert Comment

by:anilms
ID: 1346217
Here is your answer. Go on extracting the substrings from the first and the second strings and compare them. I assume that there are two edit boxes and a button for starting the comparison process. I have done a simple comparison between the extracted strings. You can make it much more complicated and add counters to count errors. To achieve this, I have first developed a function called extract which I later use on the ButtonClick event.

function extract(var instr:string):string;
{Extracts the first substring in a string and removes the extract from the original string}
var blankpos :integer;
begin
     blankpos := pos(' ',instr); {Blank space separates two substrings}
     if blankpos > 0 then
     begin
          extract := copy(instr, 1, blankpos-1);
          instr := copy(instr, blankpos+1, length(instr)-blankpos);
     end else extract := instr;
end;


procedure TForm1.Button1Click(Sender: TObject);
var string1,string2:string; {String1:Original String, String2- Typed in String}
    string1e,string2e:string; {String1e,2e : Sub String Extract from String1,2}
begin
     string1:=Edit1.Text;
     string2:=Edit2.Text;
     repeat
           string1e := extract(string1);
           string2e := extract(string2);
           if string1e<>string2e then
           { Do your match processing}
           showmessage(string1e+'<>'string2e);
     until string1e =  string1 {What has been extracted is equal to what is left in the string}
end;

0
 

Author Comment

by:jpdupont
ID: 1346218
Hi!
Thank you for your proposal ...

but the problem is well more complex!

try :
1=Go on extracting the substrings
2=Goon extracting the substrings (1 error)
(Here your function give me 5 errors)
2=Go on estrating the substrings (2 errors)
(here your function give me 1 error)
2=on extracting the substrings
(The student don't type the first word : your function say ALL is wrong !)

I have already somme code :
I have already analyzed the problem.
I can send it by email :
That can put you on the way.

Regards.
Jean-Pol




0
 
LVL 1

Expert Comment

by:anilms
ID: 1346219
This answer was to help you, not exactly to solve your problem or earn me points. We do develop a lot of parsers and tokenisers. We first break down the whole text into tokens and positions and then we parse/group/verify/(whatever is needed to be done) with the tokens. This happens to be the core code for extracting tokens - whether it is Pascal or C. Hope you enjoyed it.
0
 
LVL 8

Expert Comment

by:ZifNab
ID: 1346220
1. First extract the words like animls says.
2. Then count the words -> gives already some sort of errors.
3. Then analyse each word by checking each character.

function CheckWord(Reference, check : string):integer;
var LReference, LCheck, i, WordError : integer;

begin
 LReference := Length(Reference);
 LCheck     := Length(Check);
 WordError  := ABS(LReference-LCheck);
 For i := 1 To LReference do begin
  if Reference[i] <> Check[i] then begin
   Inc(WordError);
 end;
 Result := WordError;
end;

4. The beauty of this method is that you actually can count
   different errors!  
5. Haven't checked the code, so if it doesn't work, let me know
   and I'll put everything together.

Have fun,
c.u. ZifNab;
0
 

Author Comment

by:jpdupont
ID: 1346221
Thank you for your proposal.
Unfortunately, as I explained, the problem is more complex.  Your function gives me no precise result.

With your function, try :

1=papa
2=paa ( 1 error [paPa])  

Your function give me 3 errors !!!

My analysis of the problem is the following:
1) I loop the 2 lines until the first error.  
2) The error can be:
- a letter lacks
- a letter to much
- a bad letter

3) Tests
- I copy the line 2 in 3 temporary strings
- I test each these 3 strings by correcting them 3 different manners according to the possible error type.  I seek in the 3 strings where is found the next error.  The largest result should give me the correct error type.
That functions correctly *sometimes* !
Would you like to see my code ?

Regards.
JP
0
 
LVL 8

Expert Comment

by:ZifNab
ID: 1346222
Sure, let me see your code.

Tom.Deprez@uz.kuleuven.ac.be
0
 

Author Comment

by:jpdupont
ID: 1346223
I increase points to 140.

If what I ask is too difficult, an acceptable solution would be
to verify word by word.  

!!!
- The student can have forgotten a space between 2 words.
- The student can have forgotten a word.

Ex.

1=An acceptable solution would be to verify word by word.  

2=Anacceptable solution would be to verify word by word.  (=1 error ->Space between the 2 first words - all the others are correct)
2=An aceptable solution would be to verify word by word.   (=1 error ->acCeptable - easy : all the words are in the correct position)
2=An acceptable solution would to verify word by word.  (=1error ->BE)
2=An acceptable solution would beto verify word by wrd.  (=2 errors -> space between BE and TO, wOrd)
0
 

Author Comment

by:jpdupont
ID: 1346224

I increase the point to 150
0
 

Author Comment

by:jpdupont
ID: 1346225
Adjusted points to 150
0
 
LVL 3

Accepted Solution

by:
mirek071497 earned 200 total points
ID: 1346226
Try this example :

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,
  StdCtrls;

type
  TForm1 = class(TForm)
    Edit1: TEdit;
    Edit2: TEdit;
    Button1: TButton;
    Label1: TLabel;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.DFM}

function GetWord( st : string; nr : integer ) : string;
var i, lk : integer;
begin
  i := 1;
  lk := 1;
  result := '';
  while st[1]=' ' do System.Delete( st, 1, 1 );
  while (i<=length(st)) do
  begin
    if st[i]=' ' then
      inc(lk)
    else
      if lk=nr then
        result := result+st[i];
    inc(i);
  end;
end;

function WordCount(s:string):integer;
var I : Word;
begin
  Result := 0;
  I := 1;
  while I <= Length(s) do begin
    while (I<=Length(s)) and (S[I]=' ') do
      Inc(I);
    if I<=Length(s) then
      Inc(Result);
    while (I<=Length(s)) and not(S[I]=' ') do
      Inc(I);
  end;
end;

function GetErrCount( s1,s2 : string ) : integer;
var e1,e2,e3 : integer;
begin
  result := 0;
  if Length(s1)=0 then
    Result := Length(s2)
  else if Length(s2)=0 then
    Result := Length(s1)
  else
    if s1[1]=s2[1] then
      begin
        System.Delete( s1,1,1);
        System.Delete( s2,1,1);
        result := result+GetErrCount(s1,s2);
      end
    else
      begin
        e1 := GetErrCount( Copy(s1,2,length(s1)-1),Copy(s2,2,length(s2)-1) );
        e2 := GetErrCount( s1,Copy(s2,2,length(s2)-1) );
        e3 := GetErrCount( Copy(s1,2,length(s1)-1),s2 );
        if e1<e2 then
          if e1<e3 then
            result := result+e1+1
          else
            result := result+e3+1
        else
          if e2<e3 then
            result := result+e2+1
          else
            result := result+e3;
      end;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  s1, s2 : string;
  i, w, err : integer;
begin
  err := 0;
  W := WordCount(Edit1.Text);
  for i:=1 to w do
  begin
    s1 := GetWord(Edit1.Text,i);
    s2 := GetWord(Edit2.Text,i);
    Err := Err + GetErrCount( s1,s2 );
  end;
  Label1.Caption := IntToStr( Err );
end;

end.

0
Enabling OSINT in Activity Based Intelligence

Activity based intelligence (ABI) requires access to all available sources of data. Recorded Future allows analysts to observe structured data on the open, deep, and dark web.

 
LVL 3

Expert Comment

by:mirek071497
ID: 1346227
sorry one mistake

this is corrected :

function GetErrCount( s1,s2 : string ) : integer;
var e1,e2,e3 : integer;
begin
  result := 0;
  if Length(s1)=0 then
    Result := Length(s2)
  else if Length(s2)=0 then
    Result := Length(s1)
  else
    if s1[1]=s2[1] then
      begin
        System.Delete( s1,1,1);
        System.Delete( s2,1,1);
        result := result+GetErrCount(s1,s2);
      end
    else
      begin
        e1 := GetErrCount( Copy(s1,2,length(s1)-1),Copy(s2,2,length(s2)-1) );
        e2 := GetErrCount( s1,Copy(s2,2,length(s2)-1) );
        e3 := GetErrCount( Copy(s1,2,length(s1)-1),s2 );
        if e1<e2 then
          if e1<e3 then
            result := result+e1+1
          else
            result := result+e3+1
        else
          if e2<e3 then
            result := result+e2+1
          else
            result := result+e3+1;
      end;
end;
0
 

Author Comment

by:jpdupont
ID: 1346228
Hi Mirek,

Thanks for your help.
Unfortunately, as I explained, the problem is complex. Your unit gives me no precise result.

try :
1=Papa va à Virton
2=papa vaà Virton (2 errors. Your code give me 14 errors)

(Try with the others examples (1=... 2=... in the question history)
Some tests give me a 202 error.

One error may be :
  1. User has forgotten the letter (or a space between 2 words !!!)
  2. User has written a letter too much
  3. User has written a wrong letter

JP
0
 
LVL 3

Expert Comment

by:mirek071497
ID: 1346229
Yes, but this is good start point.
Try :

  s1 := UpperCase( Edit1.Text );
  s2 := UpperCase( Edit2.Text );
  err := err + GetErrCount( s1,s2 );

This can be in some errors but the idea i think is good. One problem is in the Stack Use. This is recursive Proc, so i was try first devide strings to smaller.

0
 

Expert Comment

by:krisz
ID: 1346230
This won't work in this way. You can't break the string into
words because the error can be an extra space or a missing space
too.

I'll look into this, but I think there will be no solution to
this problem.
0
 
LVL 3

Expert Comment

by:mirek071497
ID: 1346231
krish - my proc works fine without breaking to single words (my last comment) but this is very very slow so i try now to improve this.
0
 

Author Comment

by:jpdupont
ID: 1346232
Adjusted points to 180
0
 

Author Comment

by:jpdupont
ID: 1346233
Adjusted points to 200
0
 
LVL 3

Expert Comment

by:mirek071497
ID: 1346234
Hi
I just found improvement of this function.

This is more fastest proc for counting errors :

function CountErrors( St1, St2 : string ) : integer;
const MaxWidth = 128; { max width of strings }
var   ETable : array[1..MaxWidth,1..MaxWidth] of integer;
      L1,l2  : integer;

  function Err( p1,p2 : integer ) : integer;
  var e1,e2,e3 : integer;
  begin
    result  := ETable[p1,p2];
    if result>=0 then exit else result := 0;
    while (L1>=p1) and (L2>=p2) and (st1[p1]=st2[p2]) do
    begin
      inc(p1);
      inc(p2);
    end;
    if (L1>=p1) and (L2>=p2) then
      begin
        e1 := Err( p1+1,p2+1 ); { icorrect char }
        if e1=0 then
          result := e1+1
        else
          begin
            e2 := Err( p1, p2+1  ); { added char }
            if e2=0 then
              result := e2+1
            else
              begin
                e3 := Err( p1+1, p2  ); { mising char }
                if e1<=e2 then
                  if e1<=e3 then result := e1+1 else result := e3+1
                else
                  if e2<=e3 then result := e2+1 else result := e3+1
              end;
          end;
      end
    else if L1=(p1-1) then { s1 ended }
        Result := abs(L2-p2+1)
    else if L2=(p2-1) then { s2 ended }
      Result := abs(L1-p1+1);
    ETable[p1,p2] := result;
  end;

begin
  FillChar( ETable, SizeOf(ErrTable), -1 );
  L1 := Length(st1);
  L2 := Length(st2);
  Result := Err(1,1 );
end;

For finding positions of this errors you must wait some hours.
0
 
LVL 3

Expert Comment

by:mirek071497
ID: 1346235
Hi this is the final version (i think) of function which You need.

You don't write which Delphi you have, however this function work at all Delphi.

function FindErrors( St1, St2 : string; var ErrStr : string ) : integer;
const MaxWidth   = 255; { max width of strings }
type  TErrTable  = array[1..MaxWidth,1..MaxWidth] of byte;
var
  ETable : ^TErrTable;
  L1, L2,
  ErrPos, ErrCount,
  ErrType,TmpPos : Integer;

  function Err( p1,p2 : integer ) : integer;
  var e1,e2,e3 : integer;
  begin
    result  := ETable^[p1,p2];
    if result=255 then
    begin
      result := 0;
      while (L1>=p1) and (L2>=p2) and (st1[p1]=st2[p2]) do
      begin
        inc(p1);
        inc(p2);
      end;
      if (L1>=p1) and (L2>=p2) then
        begin
          e1 := Err( p1+1,p2+1 ); { icorrect char }
          if e1=0 then
            begin ErrType := 1; result := e1+1 end
          else
            begin
              e2 := Err( p1, p2+1  ); { added char }
              if e2=0 then
                begin ErrType := 2; result := e2+1 end
              else
                begin
                  e3 := Err( p1+1, p2  ); { mising char }
                  if e1<=e2 then
                    if e1<=e3 then begin ErrType := 1; result := e1+1 end else begin ErrType := 3; result := e3+1 end
                  else
                    if e2<=e3 then begin ErrType := 2; result := e2+1 end else begin ErrType := 3; result := e3+1 end
                end;
            end;
        end
      else if L1=(p1-1) then { s1 ended }
        begin
          Result := abs(L2-p2+1);
          if Result<>0 then ErrType := 2;
        end
      else if L2=(p2-1) then { s2 ended }
        begin
          Result := abs(L1-p1+1);
          if Result<>0 then ErrType := 3;
        end
    end;
    ETable^[p1,p2] := result;
    if Result<>0 then ErrPos := p1;
  end;

begin
  St1 := Copy(St1,1,MaxWidth);
  St2 := Copy(St2,1,MaxWidth);
  New( ETable );
  ErrStr := '';
  ErrCount := 0;
  TmpPos := 0;
  while (Length(st1)<>0) or (Length(St2)<>0) do
  begin
    ErrPos   := 0;
    ErrType  := 0;
    FillChar( ETable^, SizeOf(TErrTable), 255 );
    if Err(1,1)>0 then
      begin
        Case ErrType of
          1 : begin
                ErrStr := ErrStr+'I'+IntToStr(ErrPos+TmpPos)+'; ';
                st1 := Copy(st1,ErrPos+1, L1 );
                st2 := Copy(st2,ErrPos+1, L2 );
              end;
          2 : begin
                ErrStr := ErrStr+'A'+IntToStr(ErrPos+TmpPos)+'; ';
                st1 := Copy(st1,ErrPos, L1 );
                st2 := Copy(st2,ErrPos+1, L2 );
                Dec(TmpPos);
              end;
          3 : begin
                ErrStr := ErrStr+'M'+IntToStr(ErrPos+TmpPos)+'; ';
                st1 := Copy(st1,ErrPos+1, L1 );
                st2 := Copy(st2,ErrPos, L2 );
              end;
        end;
        TmpPos := TmpPos+ErrPos;
        inc(ErrCount);
      end
    else
      begin
        St1 := '';
        St2 := '';
      end;
  end;
  result := ErrCount;
  Dispose(ETable);
end;

This function count the errors and return information, what and where are the errors. A=added, I=Incorrect, M=missing;

You can use this as in this example :

procedure TForm1.Button1Click(Sender: TObject);
var ErrStr : string;
begin
  Edit3.Text := 'errors = '+ IntToStr( FindErrors( Edit1.Text, Edit2.Text, ErrStr ) );
  Edit4.Text := ErrStr;
end;

for string :
Edit1.Text :='The first line is the reference.'
Edit2.Text :='Thehe frst line isthe rference.'

You get Result=5
and ErrStr:='A4; A4; M6; M18; M24;'

0
 

Author Comment

by:jpdupont
ID: 1346236
Thank you Mirek for this very good work !

0

Featured Post

What Should I Do With This Threat Intelligence?

Are you wondering if you actually need threat intelligence? The answer is yes. We explain the basics for creating useful threat intelligence.

Join & Write a Comment

The uses clause is one of those things that just tends to grow and grow. Most of the time this is in the main form, as it's from this form that all others are called. If you have a big application (including many forms), the uses clause in the in…
Introduction I have seen many questions in this Delphi topic area where queries in threads are needed or suggested. I know bumped into a similar need. This article will address some of the concepts when dealing with a multithreaded delphi database…
This video gives you a great overview about bandwidth monitoring with SNMP and WMI with our network monitoring solution PRTG Network Monitor (https://www.paessler.com/prtg). If you're looking for how to monitor bandwidth using netflow or packet s…
This video shows how to remove a single email address from the Outlook 2010 Auto Suggestion memory. NOTE: For Outlook 2016 and 2013 perform the exact same steps. Open a new email: Click the New email button in Outlook. Start typing the address: …

747 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now