We help IT Professionals succeed at work.

We've partnered with Certified Experts, Carl Webster and Richard Faulkner, to bring you two Citrix podcasts. Learn about 2020 trends and get answers to your biggest Citrix questions!Listen Now

x

# Compare 2 strings

jpdupont asked
on
Medium Priority
214 Views
Last Modified: 2010-04-04
I have 2 strings (100 chars max).
The first line is the reference.
The second is introduced to the keyboard by a student.

How to count *all*  the errors in the second line?

Ex :
1=The first line is the reference.
==================================
2=Thhe first lineis the reference. (2 errors)
2=Thehehe first line is the reference.  (4 errors)
2=te first line is the eference. (3 errors)

Comment
Watch Question

## View Solution Only

Commented:
Here is your answer. Go on extracting the substrings from the first and the second strings and compare them. I assume that there are two edit boxes and a button for starting the comparison process. I have done a simple comparison between the extracted strings. You can make it much more complicated and add counters to count errors. To achieve this, I have first developed a function called extract which I later use on the ButtonClick event.

function extract(var instr:string):string;
{Extracts the first substring in a string and removes the extract from the original string}
var blankpos :integer;
begin
blankpos := pos(' ',instr); {Blank space separates two substrings}
if blankpos > 0 then
begin
extract := copy(instr, 1, blankpos-1);
instr := copy(instr, blankpos+1, length(instr)-blankpos);
end else extract := instr;
end;

procedure TForm1.Button1Click(Sender: TObject);
var string1,string2:string; {String1:Original String, String2- Typed in String}
string1e,string2e:string; {String1e,2e : Sub String Extract from String1,2}
begin
string1:=Edit1.Text;
string2:=Edit2.Text;
repeat
string1e := extract(string1);
string2e := extract(string2);
if string1e<>string2e then
{ Do your match processing}
showmessage(string1e+'<>'string2e);
until string1e =  string1 {What has been extracted is equal to what is left in the string}
end;

Commented:
Hi!
Thank you for your proposal ...

but the problem is well more complex!

try :
1=Go on extracting the substrings
2=Goon extracting the substrings (1 error)
(Here your function give me 5 errors)
2=Go on estrating the substrings (2 errors)
(here your function give me 1 error)
2=on extracting the substrings
(The student don't type the first word : your function say ALL is wrong !)

I have already somme code :
I have already analyzed the problem.
I can send it by email :
That can put you on the way.

Regards.
Jean-Pol

Commented:
This answer was to help you, not exactly to solve your problem or earn me points. We do develop a lot of parsers and tokenisers. We first break down the whole text into tokens and positions and then we parse/group/verify/(whatever is needed to be done) with the tokens. This happens to be the core code for extracting tokens - whether it is Pascal or C. Hope you enjoyed it.

Commented:
1. First extract the words like animls says.
2. Then count the words -> gives already some sort of errors.
3. Then analyse each word by checking each character.

function CheckWord(Reference, check : string):integer;
var LReference, LCheck, i, WordError : integer;

begin
LReference := Length(Reference);
LCheck     := Length(Check);
WordError  := ABS(LReference-LCheck);
For i := 1 To LReference do begin
if Reference[i] <> Check[i] then begin
Inc(WordError);
end;
Result := WordError;
end;

4. The beauty of this method is that you actually can count
different errors!
5. Haven't checked the code, so if it doesn't work, let me know
and I'll put everything together.

Have fun,
c.u. ZifNab;

Commented:
Thank you for your proposal.
Unfortunately, as I explained, the problem is more complex.  Your function gives me no precise result.

With your function, try :

1=papa
2=paa ( 1 error [paPa])

Your function give me 3 errors !!!

My analysis of the problem is the following:
1) I loop the 2 lines until the first error.
2) The error can be:
- a letter lacks
- a letter to much
- a bad letter

3) Tests
- I copy the line 2 in 3 temporary strings
- I test each these 3 strings by correcting them 3 different manners according to the possible error type.  I seek in the 3 strings where is found the next error.  The largest result should give me the correct error type.
That functions correctly *sometimes* !
Would you like to see my code ?

Regards.
JP

Commented:
Sure, let me see your code.

Tom.Deprez@uz.kuleuven.ac.be

Commented:
I increase points to 140.

If what I ask is too difficult, an acceptable solution would be
to verify word by word.

!!!
- The student can have forgotten a space between 2 words.
- The student can have forgotten a word.

Ex.

1=An acceptable solution would be to verify word by word.

2=Anacceptable solution would be to verify word by word.  (=1 error ->Space between the 2 first words - all the others are correct)
2=An aceptable solution would be to verify word by word.   (=1 error ->acCeptable - easy : all the words are in the correct position)
2=An acceptable solution would to verify word by word.  (=1error ->BE)
2=An acceptable solution would beto verify word by wrd.  (=2 errors -> space between BE and TO, wOrd)

Commented:

I increase the point to 150

Commented:
Adjusted points to 150
Commented:
Try this example :

unit Unit1;

interface

uses
Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,
StdCtrls;

type
TForm1 = class(TForm)
Edit1: TEdit;
Edit2: TEdit;
Button1: TButton;
Label1: TLabel;
procedure Button1Click(Sender: TObject);
private
{ Private declarations }
public
{ Public declarations }
end;

var
Form1: TForm1;

implementation

{\$R *.DFM}

function GetWord( st : string; nr : integer ) : string;
var i, lk : integer;
begin
i := 1;
lk := 1;
result := '';
while st[1]=' ' do System.Delete( st, 1, 1 );
while (i<=length(st)) do
begin
if st[i]=' ' then
inc(lk)
else
if lk=nr then
result := result+st[i];
inc(i);
end;
end;

function WordCount(s:string):integer;
var I : Word;
begin
Result := 0;
I := 1;
while I <= Length(s) do begin
while (I<=Length(s)) and (S[I]=' ') do
Inc(I);
if I<=Length(s) then
Inc(Result);
while (I<=Length(s)) and not(S[I]=' ') do
Inc(I);
end;
end;

function GetErrCount( s1,s2 : string ) : integer;
var e1,e2,e3 : integer;
begin
result := 0;
if Length(s1)=0 then
Result := Length(s2)
else if Length(s2)=0 then
Result := Length(s1)
else
if s1[1]=s2[1] then
begin
System.Delete( s1,1,1);
System.Delete( s2,1,1);
result := result+GetErrCount(s1,s2);
end
else
begin
e1 := GetErrCount( Copy(s1,2,length(s1)-1),Copy(s2,2,length(s2)-1) );
e2 := GetErrCount( s1,Copy(s2,2,length(s2)-1) );
e3 := GetErrCount( Copy(s1,2,length(s1)-1),s2 );
if e1<e2 then
if e1<e3 then
result := result+e1+1
else
result := result+e3+1
else
if e2<e3 then
result := result+e2+1
else
result := result+e3;
end;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
s1, s2 : string;
i, w, err : integer;
begin
err := 0;
W := WordCount(Edit1.Text);
for i:=1 to w do
begin
s1 := GetWord(Edit1.Text,i);
s2 := GetWord(Edit2.Text,i);
Err := Err + GetErrCount( s1,s2 );
end;
Label1.Caption := IntToStr( Err );
end;

end.

Not the solution you were looking for? Getting a personalized solution is easy.

Commented:
sorry one mistake

this is corrected :

function GetErrCount( s1,s2 : string ) : integer;
var e1,e2,e3 : integer;
begin
result := 0;
if Length(s1)=0 then
Result := Length(s2)
else if Length(s2)=0 then
Result := Length(s1)
else
if s1[1]=s2[1] then
begin
System.Delete( s1,1,1);
System.Delete( s2,1,1);
result := result+GetErrCount(s1,s2);
end
else
begin
e1 := GetErrCount( Copy(s1,2,length(s1)-1),Copy(s2,2,length(s2)-1) );
e2 := GetErrCount( s1,Copy(s2,2,length(s2)-1) );
e3 := GetErrCount( Copy(s1,2,length(s1)-1),s2 );
if e1<e2 then
if e1<e3 then
result := result+e1+1
else
result := result+e3+1
else
if e2<e3 then
result := result+e2+1
else
result := result+e3+1;
end;
end;

Commented:
Hi Mirek,

Thanks for your help.
Unfortunately, as I explained, the problem is complex. Your unit gives me no precise result.

try :
1=Papa va Ă  Virton
2=papa vaĂ  Virton (2 errors. Your code give me 14 errors)

(Try with the others examples (1=... 2=... in the question history)
Some tests give me a 202 error.

One error may be :
1. User has forgotten the letter (or a space between 2 words !!!)
2. User has written a letter too much
3. User has written a wrong letter

JP

Commented:
Yes, but this is good start point.
Try :

s1 := UpperCase( Edit1.Text );
s2 := UpperCase( Edit2.Text );
err := err + GetErrCount( s1,s2 );

This can be in some errors but the idea i think is good. One problem is in the Stack Use. This is recursive Proc, so i was try first devide strings to smaller.

Commented:
This won't work in this way. You can't break the string into
words because the error can be an extra space or a missing space
too.

I'll look into this, but I think there will be no solution to
this problem.

Commented:
krish - my proc works fine without breaking to single words (my last comment) but this is very very slow so i try now to improve this.

Commented:
Adjusted points to 180

Commented:
Adjusted points to 200

Commented:
Hi
I just found improvement of this function.

This is more fastest proc for counting errors :

function CountErrors( St1, St2 : string ) : integer;
const MaxWidth = 128; { max width of strings }
var   ETable : array[1..MaxWidth,1..MaxWidth] of integer;
L1,l2  : integer;

function Err( p1,p2 : integer ) : integer;
var e1,e2,e3 : integer;
begin
result  := ETable[p1,p2];
if result>=0 then exit else result := 0;
while (L1>=p1) and (L2>=p2) and (st1[p1]=st2[p2]) do
begin
inc(p1);
inc(p2);
end;
if (L1>=p1) and (L2>=p2) then
begin
e1 := Err( p1+1,p2+1 ); { icorrect char }
if e1=0 then
result := e1+1
else
begin
e2 := Err( p1, p2+1  ); { added char }
if e2=0 then
result := e2+1
else
begin
e3 := Err( p1+1, p2  ); { mising char }
if e1<=e2 then
if e1<=e3 then result := e1+1 else result := e3+1
else
if e2<=e3 then result := e2+1 else result := e3+1
end;
end;
end
else if L1=(p1-1) then { s1 ended }
Result := abs(L2-p2+1)
else if L2=(p2-1) then { s2 ended }
Result := abs(L1-p1+1);
ETable[p1,p2] := result;
end;

begin
FillChar( ETable, SizeOf(ErrTable), -1 );
L1 := Length(st1);
L2 := Length(st2);
Result := Err(1,1 );
end;

For finding positions of this errors you must wait some hours.

Commented:
Hi this is the final version (i think) of function which You need.

You don't write which Delphi you have, however this function work at all Delphi.

function FindErrors( St1, St2 : string; var ErrStr : string ) : integer;
const MaxWidth   = 255; { max width of strings }
type  TErrTable  = array[1..MaxWidth,1..MaxWidth] of byte;
var
ETable : ^TErrTable;
L1, L2,
ErrPos, ErrCount,
ErrType,TmpPos : Integer;

function Err( p1,p2 : integer ) : integer;
var e1,e2,e3 : integer;
begin
result  := ETable^[p1,p2];
if result=255 then
begin
result := 0;
while (L1>=p1) and (L2>=p2) and (st1[p1]=st2[p2]) do
begin
inc(p1);
inc(p2);
end;
if (L1>=p1) and (L2>=p2) then
begin
e1 := Err( p1+1,p2+1 ); { icorrect char }
if e1=0 then
begin ErrType := 1; result := e1+1 end
else
begin
e2 := Err( p1, p2+1  ); { added char }
if e2=0 then
begin ErrType := 2; result := e2+1 end
else
begin
e3 := Err( p1+1, p2  ); { mising char }
if e1<=e2 then
if e1<=e3 then begin ErrType := 1; result := e1+1 end else begin ErrType := 3; result := e3+1 end
else
if e2<=e3 then begin ErrType := 2; result := e2+1 end else begin ErrType := 3; result := e3+1 end
end;
end;
end
else if L1=(p1-1) then { s1 ended }
begin
Result := abs(L2-p2+1);
if Result<>0 then ErrType := 2;
end
else if L2=(p2-1) then { s2 ended }
begin
Result := abs(L1-p1+1);
if Result<>0 then ErrType := 3;
end
end;
ETable^[p1,p2] := result;
if Result<>0 then ErrPos := p1;
end;

begin
St1 := Copy(St1,1,MaxWidth);
St2 := Copy(St2,1,MaxWidth);
New( ETable );
ErrStr := '';
ErrCount := 0;
TmpPos := 0;
while (Length(st1)<>0) or (Length(St2)<>0) do
begin
ErrPos   := 0;
ErrType  := 0;
FillChar( ETable^, SizeOf(TErrTable), 255 );
if Err(1,1)>0 then
begin
Case ErrType of
1 : begin
ErrStr := ErrStr+'I'+IntToStr(ErrPos+TmpPos)+'; ';
st1 := Copy(st1,ErrPos+1, L1 );
st2 := Copy(st2,ErrPos+1, L2 );
end;
2 : begin
ErrStr := ErrStr+'A'+IntToStr(ErrPos+TmpPos)+'; ';
st1 := Copy(st1,ErrPos, L1 );
st2 := Copy(st2,ErrPos+1, L2 );
Dec(TmpPos);
end;
3 : begin
ErrStr := ErrStr+'M'+IntToStr(ErrPos+TmpPos)+'; ';
st1 := Copy(st1,ErrPos+1, L1 );
st2 := Copy(st2,ErrPos, L2 );
end;
end;
TmpPos := TmpPos+ErrPos;
inc(ErrCount);
end
else
begin
St1 := '';
St2 := '';
end;
end;
result := ErrCount;
Dispose(ETable);
end;

This function count the errors and return information, what and where are the errors. A=added, I=Incorrect, M=missing;

You can use this as in this example :

procedure TForm1.Button1Click(Sender: TObject);
var ErrStr : string;
begin
Edit3.Text := 'errors = '+ IntToStr( FindErrors( Edit1.Text, Edit2.Text, ErrStr ) );
Edit4.Text := ErrStr;
end;

for string :
Edit1.Text :='The first line is the reference.'
Edit2.Text :='Thehe frst line isthe rference.'

You get Result=5
and ErrStr:='A4; A4; M6; M18; M24;'

Commented:
Thank you Mirek for this very good work !

Access more of Experts Exchange with a free account
##### Thanks for using Experts Exchange.

Create a free account to continue.

Limited access with a free account allows you to:

• View three pieces of content (articles, solutions, posts, and videos)
• Ask the experts questions (counted toward content limit)
• Customize your dashboard and profile

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the