Solved

Compare 2 strings

Posted on 1997-09-24
20
183 Views
Last Modified: 2010-04-04
I have 2 strings (100 chars max).
The first line is the reference.
The second is introduced to the keyboard by a student.

How to count *all*  the errors in the second line?

Ex :
1=The first line is the reference.
==================================
2=Thhe first lineis the reference. (2 errors)      
2=Thehehe first line is the reference.  (4 errors)
2=te first line is the eference. (3 errors)


0
Comment
Question by:jpdupont
  • 9
  • 6
  • 2
  • +2
20 Comments
 
LVL 1

Expert Comment

by:anilms
ID: 1346217
Here is your answer. Go on extracting the substrings from the first and the second strings and compare them. I assume that there are two edit boxes and a button for starting the comparison process. I have done a simple comparison between the extracted strings. You can make it much more complicated and add counters to count errors. To achieve this, I have first developed a function called extract which I later use on the ButtonClick event.

function extract(var instr:string):string;
{Extracts the first substring in a string and removes the extract from the original string}
var blankpos :integer;
begin
     blankpos := pos(' ',instr); {Blank space separates two substrings}
     if blankpos > 0 then
     begin
          extract := copy(instr, 1, blankpos-1);
          instr := copy(instr, blankpos+1, length(instr)-blankpos);
     end else extract := instr;
end;


procedure TForm1.Button1Click(Sender: TObject);
var string1,string2:string; {String1:Original String, String2- Typed in String}
    string1e,string2e:string; {String1e,2e : Sub String Extract from String1,2}
begin
     string1:=Edit1.Text;
     string2:=Edit2.Text;
     repeat
           string1e := extract(string1);
           string2e := extract(string2);
           if string1e<>string2e then
           { Do your match processing}
           showmessage(string1e+'<>'string2e);
     until string1e =  string1 {What has been extracted is equal to what is left in the string}
end;

0
 

Author Comment

by:jpdupont
ID: 1346218
Hi!
Thank you for your proposal ...

but the problem is well more complex!

try :
1=Go on extracting the substrings
2=Goon extracting the substrings (1 error)
(Here your function give me 5 errors)
2=Go on estrating the substrings (2 errors)
(here your function give me 1 error)
2=on extracting the substrings
(The student don't type the first word : your function say ALL is wrong !)

I have already somme code :
I have already analyzed the problem.
I can send it by email :
That can put you on the way.

Regards.
Jean-Pol




0
 
LVL 1

Expert Comment

by:anilms
ID: 1346219
This answer was to help you, not exactly to solve your problem or earn me points. We do develop a lot of parsers and tokenisers. We first break down the whole text into tokens and positions and then we parse/group/verify/(whatever is needed to be done) with the tokens. This happens to be the core code for extracting tokens - whether it is Pascal or C. Hope you enjoyed it.
0
 
LVL 8

Expert Comment

by:ZifNab
ID: 1346220
1. First extract the words like animls says.
2. Then count the words -> gives already some sort of errors.
3. Then analyse each word by checking each character.

function CheckWord(Reference, check : string):integer;
var LReference, LCheck, i, WordError : integer;

begin
 LReference := Length(Reference);
 LCheck     := Length(Check);
 WordError  := ABS(LReference-LCheck);
 For i := 1 To LReference do begin
  if Reference[i] <> Check[i] then begin
   Inc(WordError);
 end;
 Result := WordError;
end;

4. The beauty of this method is that you actually can count
   different errors!  
5. Haven't checked the code, so if it doesn't work, let me know
   and I'll put everything together.

Have fun,
c.u. ZifNab;
0
 

Author Comment

by:jpdupont
ID: 1346221
Thank you for your proposal.
Unfortunately, as I explained, the problem is more complex.  Your function gives me no precise result.

With your function, try :

1=papa
2=paa ( 1 error [paPa])  

Your function give me 3 errors !!!

My analysis of the problem is the following:
1) I loop the 2 lines until the first error.  
2) The error can be:
- a letter lacks
- a letter to much
- a bad letter

3) Tests
- I copy the line 2 in 3 temporary strings
- I test each these 3 strings by correcting them 3 different manners according to the possible error type.  I seek in the 3 strings where is found the next error.  The largest result should give me the correct error type.
That functions correctly *sometimes* !
Would you like to see my code ?

Regards.
JP
0
 
LVL 8

Expert Comment

by:ZifNab
ID: 1346222
Sure, let me see your code.

Tom.Deprez@uz.kuleuven.ac.be
0
 

Author Comment

by:jpdupont
ID: 1346223
I increase points to 140.

If what I ask is too difficult, an acceptable solution would be
to verify word by word.  

!!!
- The student can have forgotten a space between 2 words.
- The student can have forgotten a word.

Ex.

1=An acceptable solution would be to verify word by word.  

2=Anacceptable solution would be to verify word by word.  (=1 error ->Space between the 2 first words - all the others are correct)
2=An aceptable solution would be to verify word by word.   (=1 error ->acCeptable - easy : all the words are in the correct position)
2=An acceptable solution would to verify word by word.  (=1error ->BE)
2=An acceptable solution would beto verify word by wrd.  (=2 errors -> space between BE and TO, wOrd)
0
 

Author Comment

by:jpdupont
ID: 1346224

I increase the point to 150
0
 

Author Comment

by:jpdupont
ID: 1346225
Adjusted points to 150
0
 
LVL 3

Accepted Solution

by:
mirek071497 earned 200 total points
ID: 1346226
Try this example :

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,
  StdCtrls;

type
  TForm1 = class(TForm)
    Edit1: TEdit;
    Edit2: TEdit;
    Button1: TButton;
    Label1: TLabel;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.DFM}

function GetWord( st : string; nr : integer ) : string;
var i, lk : integer;
begin
  i := 1;
  lk := 1;
  result := '';
  while st[1]=' ' do System.Delete( st, 1, 1 );
  while (i<=length(st)) do
  begin
    if st[i]=' ' then
      inc(lk)
    else
      if lk=nr then
        result := result+st[i];
    inc(i);
  end;
end;

function WordCount(s:string):integer;
var I : Word;
begin
  Result := 0;
  I := 1;
  while I <= Length(s) do begin
    while (I<=Length(s)) and (S[I]=' ') do
      Inc(I);
    if I<=Length(s) then
      Inc(Result);
    while (I<=Length(s)) and not(S[I]=' ') do
      Inc(I);
  end;
end;

function GetErrCount( s1,s2 : string ) : integer;
var e1,e2,e3 : integer;
begin
  result := 0;
  if Length(s1)=0 then
    Result := Length(s2)
  else if Length(s2)=0 then
    Result := Length(s1)
  else
    if s1[1]=s2[1] then
      begin
        System.Delete( s1,1,1);
        System.Delete( s2,1,1);
        result := result+GetErrCount(s1,s2);
      end
    else
      begin
        e1 := GetErrCount( Copy(s1,2,length(s1)-1),Copy(s2,2,length(s2)-1) );
        e2 := GetErrCount( s1,Copy(s2,2,length(s2)-1) );
        e3 := GetErrCount( Copy(s1,2,length(s1)-1),s2 );
        if e1<e2 then
          if e1<e3 then
            result := result+e1+1
          else
            result := result+e3+1
        else
          if e2<e3 then
            result := result+e2+1
          else
            result := result+e3;
      end;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  s1, s2 : string;
  i, w, err : integer;
begin
  err := 0;
  W := WordCount(Edit1.Text);
  for i:=1 to w do
  begin
    s1 := GetWord(Edit1.Text,i);
    s2 := GetWord(Edit2.Text,i);
    Err := Err + GetErrCount( s1,s2 );
  end;
  Label1.Caption := IntToStr( Err );
end;

end.

0
Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

 
LVL 3

Expert Comment

by:mirek071497
ID: 1346227
sorry one mistake

this is corrected :

function GetErrCount( s1,s2 : string ) : integer;
var e1,e2,e3 : integer;
begin
  result := 0;
  if Length(s1)=0 then
    Result := Length(s2)
  else if Length(s2)=0 then
    Result := Length(s1)
  else
    if s1[1]=s2[1] then
      begin
        System.Delete( s1,1,1);
        System.Delete( s2,1,1);
        result := result+GetErrCount(s1,s2);
      end
    else
      begin
        e1 := GetErrCount( Copy(s1,2,length(s1)-1),Copy(s2,2,length(s2)-1) );
        e2 := GetErrCount( s1,Copy(s2,2,length(s2)-1) );
        e3 := GetErrCount( Copy(s1,2,length(s1)-1),s2 );
        if e1<e2 then
          if e1<e3 then
            result := result+e1+1
          else
            result := result+e3+1
        else
          if e2<e3 then
            result := result+e2+1
          else
            result := result+e3+1;
      end;
end;
0
 

Author Comment

by:jpdupont
ID: 1346228
Hi Mirek,

Thanks for your help.
Unfortunately, as I explained, the problem is complex. Your unit gives me no precise result.

try :
1=Papa va à Virton
2=papa vaà Virton (2 errors. Your code give me 14 errors)

(Try with the others examples (1=... 2=... in the question history)
Some tests give me a 202 error.

One error may be :
  1. User has forgotten the letter (or a space between 2 words !!!)
  2. User has written a letter too much
  3. User has written a wrong letter

JP
0
 
LVL 3

Expert Comment

by:mirek071497
ID: 1346229
Yes, but this is good start point.
Try :

  s1 := UpperCase( Edit1.Text );
  s2 := UpperCase( Edit2.Text );
  err := err + GetErrCount( s1,s2 );

This can be in some errors but the idea i think is good. One problem is in the Stack Use. This is recursive Proc, so i was try first devide strings to smaller.

0
 

Expert Comment

by:krisz
ID: 1346230
This won't work in this way. You can't break the string into
words because the error can be an extra space or a missing space
too.

I'll look into this, but I think there will be no solution to
this problem.
0
 
LVL 3

Expert Comment

by:mirek071497
ID: 1346231
krish - my proc works fine without breaking to single words (my last comment) but this is very very slow so i try now to improve this.
0
 

Author Comment

by:jpdupont
ID: 1346232
Adjusted points to 180
0
 

Author Comment

by:jpdupont
ID: 1346233
Adjusted points to 200
0
 
LVL 3

Expert Comment

by:mirek071497
ID: 1346234
Hi
I just found improvement of this function.

This is more fastest proc for counting errors :

function CountErrors( St1, St2 : string ) : integer;
const MaxWidth = 128; { max width of strings }
var   ETable : array[1..MaxWidth,1..MaxWidth] of integer;
      L1,l2  : integer;

  function Err( p1,p2 : integer ) : integer;
  var e1,e2,e3 : integer;
  begin
    result  := ETable[p1,p2];
    if result>=0 then exit else result := 0;
    while (L1>=p1) and (L2>=p2) and (st1[p1]=st2[p2]) do
    begin
      inc(p1);
      inc(p2);
    end;
    if (L1>=p1) and (L2>=p2) then
      begin
        e1 := Err( p1+1,p2+1 ); { icorrect char }
        if e1=0 then
          result := e1+1
        else
          begin
            e2 := Err( p1, p2+1  ); { added char }
            if e2=0 then
              result := e2+1
            else
              begin
                e3 := Err( p1+1, p2  ); { mising char }
                if e1<=e2 then
                  if e1<=e3 then result := e1+1 else result := e3+1
                else
                  if e2<=e3 then result := e2+1 else result := e3+1
              end;
          end;
      end
    else if L1=(p1-1) then { s1 ended }
        Result := abs(L2-p2+1)
    else if L2=(p2-1) then { s2 ended }
      Result := abs(L1-p1+1);
    ETable[p1,p2] := result;
  end;

begin
  FillChar( ETable, SizeOf(ErrTable), -1 );
  L1 := Length(st1);
  L2 := Length(st2);
  Result := Err(1,1 );
end;

For finding positions of this errors you must wait some hours.
0
 
LVL 3

Expert Comment

by:mirek071497
ID: 1346235
Hi this is the final version (i think) of function which You need.

You don't write which Delphi you have, however this function work at all Delphi.

function FindErrors( St1, St2 : string; var ErrStr : string ) : integer;
const MaxWidth   = 255; { max width of strings }
type  TErrTable  = array[1..MaxWidth,1..MaxWidth] of byte;
var
  ETable : ^TErrTable;
  L1, L2,
  ErrPos, ErrCount,
  ErrType,TmpPos : Integer;

  function Err( p1,p2 : integer ) : integer;
  var e1,e2,e3 : integer;
  begin
    result  := ETable^[p1,p2];
    if result=255 then
    begin
      result := 0;
      while (L1>=p1) and (L2>=p2) and (st1[p1]=st2[p2]) do
      begin
        inc(p1);
        inc(p2);
      end;
      if (L1>=p1) and (L2>=p2) then
        begin
          e1 := Err( p1+1,p2+1 ); { icorrect char }
          if e1=0 then
            begin ErrType := 1; result := e1+1 end
          else
            begin
              e2 := Err( p1, p2+1  ); { added char }
              if e2=0 then
                begin ErrType := 2; result := e2+1 end
              else
                begin
                  e3 := Err( p1+1, p2  ); { mising char }
                  if e1<=e2 then
                    if e1<=e3 then begin ErrType := 1; result := e1+1 end else begin ErrType := 3; result := e3+1 end
                  else
                    if e2<=e3 then begin ErrType := 2; result := e2+1 end else begin ErrType := 3; result := e3+1 end
                end;
            end;
        end
      else if L1=(p1-1) then { s1 ended }
        begin
          Result := abs(L2-p2+1);
          if Result<>0 then ErrType := 2;
        end
      else if L2=(p2-1) then { s2 ended }
        begin
          Result := abs(L1-p1+1);
          if Result<>0 then ErrType := 3;
        end
    end;
    ETable^[p1,p2] := result;
    if Result<>0 then ErrPos := p1;
  end;

begin
  St1 := Copy(St1,1,MaxWidth);
  St2 := Copy(St2,1,MaxWidth);
  New( ETable );
  ErrStr := '';
  ErrCount := 0;
  TmpPos := 0;
  while (Length(st1)<>0) or (Length(St2)<>0) do
  begin
    ErrPos   := 0;
    ErrType  := 0;
    FillChar( ETable^, SizeOf(TErrTable), 255 );
    if Err(1,1)>0 then
      begin
        Case ErrType of
          1 : begin
                ErrStr := ErrStr+'I'+IntToStr(ErrPos+TmpPos)+'; ';
                st1 := Copy(st1,ErrPos+1, L1 );
                st2 := Copy(st2,ErrPos+1, L2 );
              end;
          2 : begin
                ErrStr := ErrStr+'A'+IntToStr(ErrPos+TmpPos)+'; ';
                st1 := Copy(st1,ErrPos, L1 );
                st2 := Copy(st2,ErrPos+1, L2 );
                Dec(TmpPos);
              end;
          3 : begin
                ErrStr := ErrStr+'M'+IntToStr(ErrPos+TmpPos)+'; ';
                st1 := Copy(st1,ErrPos+1, L1 );
                st2 := Copy(st2,ErrPos, L2 );
              end;
        end;
        TmpPos := TmpPos+ErrPos;
        inc(ErrCount);
      end
    else
      begin
        St1 := '';
        St2 := '';
      end;
  end;
  result := ErrCount;
  Dispose(ETable);
end;

This function count the errors and return information, what and where are the errors. A=added, I=Incorrect, M=missing;

You can use this as in this example :

procedure TForm1.Button1Click(Sender: TObject);
var ErrStr : string;
begin
  Edit3.Text := 'errors = '+ IntToStr( FindErrors( Edit1.Text, Edit2.Text, ErrStr ) );
  Edit4.Text := ErrStr;
end;

for string :
Edit1.Text :='The first line is the reference.'
Edit2.Text :='Thehe frst line isthe rference.'

You get Result=5
and ErrStr:='A4; A4; M6; M18; M24;'

0
 

Author Comment

by:jpdupont
ID: 1346236
Thank you Mirek for this very good work !

0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
PDF library for Delphi 2 104
Mydac connection data base issue 3 135
Delphi procedure  - Unsatisfied forward or external declaration. 2 140
code issue 8 99
Creating an auto free TStringList The TStringList is a basic and frequently used object in Delphi. On many occasions, you may want to create a temporary list, process some items in the list and be done with the list. In such cases, you have to…
Introduction Raise your hands if you were as upset with FireMonkey as I was when I discovered that there was no TListview.  I use TListView in almost all of my applications I've written, and I was not going to compromise by resorting to TStringGrid…
This video shows how to remove a single email address from the Outlook 2010 Auto Suggestion memory. NOTE: For Outlook 2016 and 2013 perform the exact same steps. Open a new email: Click the New email button in Outlook. Start typing the address: …
With Secure Portal Encryption, the recipient is sent a link to their email address directing them to the email laundry delivery page. From there, the recipient will be required to enter a user name and password to enter the page. Once the recipient …

932 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now