Link to home
Start Free TrialLog in
Avatar of 2266180
2266180Flag for United States of America

asked on

Performance contest :)

I am interested in the speed and memory usage of all solutions from here: https://www.experts-exchange.com/questions/21980074/Remove-spaces-in-sentence.html

testing will be done on the same string of at least 100KB (you choose it randomly)
first 3 good solutions win each 20 points.
any first 3 bugs/real issues (not mentioned already in the above question) will get additional 20 points
any other serios bug/real issue (not awared) will get a separate 20 points

you must do at least 3 runs on each test in order to make sure that there is no interference from antivirus, or otehr applications that might influence the speed output.

posters are asked to also post the method used for calculating the speed and memory usage. I will award 20 points for each method (not for each poster) in order of posting. (so you might want to post the method first :P). There is no problem if we get 4 posters getting the results for all or for a few of the soltuions as it is good to have a comparison. if one poster says it took him 5 minutes to run one of teh solution and anotehr 3 says it only took 2 minutes then I guess the otehr 3 are rigth ;)

please ***DO NOT*** post the string used for testing. we can generate one any time or copy the content of a big enough web page.

let the fun begin :)
Avatar of rmeere
rmeere

lol. nice one. I won't participate cause I will be baised towards my own solution :)
Avatar of 2266180

ASKER

you can participate because you will not cheat (I hope :P ) and since you wil be using the same measurement tehnique and data for all solutions you should be pretty fair from this point of view ;)
lol
well ok, there has to be 1 person or 1 program to compare the sollutions then so we can rule out speed of the compu, hd etc
Avatar of 2266180

ASKER

I know this might look funny to some of you, but if you look at the constructive part you'll notice that in the end this question will hold valuable information regarding performance in delphi.
I am not expecting anything from people that are not used to thinking and/or meaasuring performance but from the people that deal with this on a regular basis.
I hope to gather here the information one needs to have when considering performance.

the idea is not to have 1 program or method, but more. probably it would have been an idea to ask for the cpu speed as well. so lets assume I did that now :)
yeah you're right Limbeck. I nominate ciuly seeing as it's his thread.
that is true. well why not have one program using each  sollutions as a function and use each function in a row (maybe even in a diff. order on the second and third try). thats the only way to compare it imho, you wont be bothered with other programs running, anti-viri soft. mem. issues etc.

i dont have the time right now to build it, sorry.
ASKER CERTIFIED SOLUTION
Avatar of Wim ten Brink
Wim ten Brink
Flag of Netherlands image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Ditto Limbeck
Limbeck
I don't say that absolutele :)
Alex,
You new version gives me 2000ms
Avatar of 2266180

ASKER

well the nice thing about this is that you can also test it yourselves ;)
and of course if anybody else comes up with other, possibly better, solutions.

good one mokule. seems that delete really put a break in there :)
Limbeck
4600ms - but the loop shortened 10 times
Thanks, Mokule. :-) As I said, I replaced the Length() functioncalls which save quite a few calls. The only difference now is that I still call UniqueStringA, which (if it could be removed) still slows things down a bit. Your solution is more like the suggested PChar solution but there's probably something else slowing it down. :-)

Okay, an even fastersolution from me then... :-)

procedure StripMultipleSpacesProc(var Result: string);
var
  I, J: Integer;
  C: PChar;
  Max: Integer;
begin
  I := Pos(#32#32, Result);
  if (I > 0) then
  begin
    UniqueString(Result);
    Max := Length(Result);
    C := PChar(@Result[I]);
    repeat
      Inc(I);
      Inc(C);
      J := Succ(I);
      while (J <= Max) and (Result[J] = #32) do
        Inc(J);
      while (J <= Max) do
      begin
        if (Result[Pred(I)] = #32) then
        begin
          while (J <= Max) and (Result[J] = #32) do
            Inc(J);
        end;
        if (J <= Max) then
        begin
          C^ := Result[J];
          Inc(I);
          Inc(C);
          Inc(J);
        end;
      end;
    until (J > Max);
    SetLength(Result, Pred(I));
  end;
end;

The difference? The UniqueStringA call is gone since I now use a PChar to access the contents of the string. To still make it work okay, I call UniqueString at the beginning of this procedure so I'm not accidently modifying too much. :-)
And yes, it's a procedure, which makes it more interesting since there's no need now to copy the string contents first. :-) It's another speed increase...

Wild guess? I think this one ends within 1750 ms... :-)
Interesting stuff, what about memory usage mokule? And what about my solution?
ok, ta :)
Alex,
No need for this
    UniqueString(Result);
it is forced here
    C := PChar(@Result[I]);

Your current result
1850-1900ms
So is mine at the moment, though I don't change anything in my code.
rmeere,
You want it.
19328ms
- Your code can't be compiled.
- result is not initialized
- result is wrong - no last character
lol. that's BAD.
It is? Well, better safe than sorry. :-) Still only a reduction of 100 ms... :-( I had hoped for more.

Funny to see how slow the solution from rmeere is compared to some of the others. :-) And still there are many inexperienced Delphi developers who would just choose for exactly the same solution. :-)

Now, about memory usage... :-) That's going to be interesting...

My solution is modifying the string in it's own memory space so all I need is the memory for the string plus the memory of the local variables. I use 3 integers and one PChar. These are all 4 bytes so I use 100 KB for the string plus 16 bytes for the remaining stuff. :-) Not much, when you realise that the solution already had this 100 KB in memory.
Only problem could be that because UniqueString is called (implicitly or explicitly) there's a chance that the string is still copied in-memory thus doubling the amount of memory required.

But we're at a point here where more speed can be gained by improving the algorithm used to find the double spaces. I think these could be a bit better. :-)
Which reminds me... In the original question an answer was accepted but what is the performance of that solution? :-)
Avatar of 2266180

ASKER

I would also like to see a method with performance counters ;) (I have the functions at home somewhere, probably google can spit a few as well)
then, if I remember correctly, I've seen a method a year or two back that was working more or less directly with the CPU but unfortunatly I don't remember what it did. if anyone knows what I am talking abuot, pointers and/or info would be nice :)

also, for calculating the memory used, I can only think of getting the memory used by the whole app using api or wmi. any otehr ideas/solutions?
function RDTSC: Int64;
asm
  dw $310F  {BASM doesn't support RDTSC}
end;

That's the function which tells you the number of clock cycles. Call it twice, calculate the difference and you'll know how many clock cycles some task took, no matter at which speed the performance test was run. :-)

For memory usage, you could use GlobalMemoryStatus() to get this information. Unfortunately, it's not that precise. Still, if you calculate differences before and after every solution you would have some good indication. Or use GetHeapStatus() to get the Delphi status of it's internal heap.
Avatar of 2266180

ASKER

THAT is what I was talking about :D thanks a bunch alex ;)

good idea with the getheapstatus. though it requires some research ormaybe testing to see how exact it is.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
And the form, in case anyone wants to try with my interface:

object Form1: TForm1
  Left = 0
  Top = 0
  Caption = 'Form1'
  ClientHeight = 351
  ClientWidth = 723
  Color = clBtnFace
  Font.Charset = DEFAULT_CHARSET
  Font.Color = clWindowText
  Font.Height = -11
  Font.Name = 'Tahoma'
  Font.Style = []
  OldCreateOrder = False
  DesignSize = (
    723
    351)
  PixelsPerInch = 96
  TextHeight = 13
  object Label1: TLabel
    Left = 24
    Top = 195
    Width = 57
    Height = 25
    Anchors = [akLeft, akBottom]
    AutoSize = False
    Caption = 'Text size:'
    Layout = tlCenter
    ExplicitTop = 207
  end
  object Memo1: TMemo
    Left = 0
    Top = 0
    Width = 723
    Height = 189
    Align = alTop
    Anchors = [akLeft, akTop, akRight, akBottom]
    Font.Charset = DEFAULT_CHARSET
    Font.Color = clWindowText
    Font.Height = -11
    Font.Name = 'Courier New'
    Font.Style = []
    Lines.Strings = (
      'Press [Generate text] or paste any text here')
    ParentFont = False
    ScrollBars = ssVertical
    TabOrder = 0
    ExplicitHeight = 201
  end
  object btnGenerateText: TButton
    Left = 288
    Top = 195
    Width = 139
    Height = 25
    Anchors = [akBottom]
    Caption = 'Generate text'
    TabOrder = 1
    OnClick = btnGenerateTextClick
    ExplicitTop = 207
  end
  object btnLetTheTestsBegin: TButton
    Left = 545
    Top = 195
    Width = 152
    Height = 25
    Anchors = [akRight, akBottom]
    Caption = 'Let the tests begin!'
    TabOrder = 2
    OnClick = btnLetTheTestsBeginClick
    ExplicitTop = 207
  end
  object Edit1: TEdit
    Left = 87
    Top = 195
    Width = 121
    Height = 25
    Anchors = [akLeft, akBottom]
    TabOrder = 3
    Text = '100000'
    ExplicitTop = 207
  end
  object Memo2: TMemo
    Left = 0
    Top = 226
    Width = 723
    Height = 125
    Align = alBottom
    Lines.Strings = (
      'Test results')
    TabOrder = 4
    ExplicitTop = 238
  end
end
Nice test, alkisg. Too bad you used some older code of mine, which used the Length() function multiple times. Try using the last code example that I posted and test it again. :-)
...and a comment:

Mokule didn't test for long enough strings. Delete/Insert/+ etc allocate new memory all the time, so on big strings (e.g. 1Mb) they suck. On 10Mb they would even cause memory swapping and would take all day.

That's the main reason I had had implemented a replaceall substitude.
The result may be BIGGER than s, so I kept all the match offsets to use ONLY ONE ALLOCATION.

function StringReplaceAll(const s, oldPattern, newPattern: AnsiString): AnsiString;
var
  offsets: array of integer;
  ofs, lastResultOfs, lastSOfs, numOffsets, i: integer;
begin
  ofs := PosEx(oldPattern, s, 1);                      //returns 0 even for Length(oldPattern) = 0
  if ofs = 0 then
  begin
    Result := s;
    Exit;
  end;
  if Length(oldPattern) = Length(newPattern) then              //both <> 0
  begin
    Result := s;
    repeat                                                    //ofs <> 0 on the first loop
      StrMove(@Result[ofs], @newPattern[1], Length(newPattern));
      ofs := PosEx(oldPattern, s, ofs + Length(oldPattern));
    until ofs = 0;
    Exit;
  end;
//The general case: Length(oldPattern) <> Length(newPattern). Find all offsets first.
  numOffsets := 0;
  repeat                                                      //ofs <> 0 on the first loop
    if numOffsets > High(offsets) then
      SetLength(offsets, 100 + MulDiv(Length(offsets), 150, 100));
    offsets[numOffsets] := ofs;
    Inc(numOffsets);
    ofs := PosEx(oldPattern, s, ofs + Length(oldPattern));
  until ofs = 0;
  SetLength(offsets, numOffsets);
//Build Result length. Length(newPattern)-Length(oldPattern) could be < 0.
  SetLength(Result, Length(s) + numOffsets * (Length(newPattern) - Length(oldPattern)));
//Do the copying.
  lastResultOfs := 1;
  lastSOfs := 1;
  for i := 0 to numOffsets - 1 do
  begin                                                                
    StrMove(@Result[lastResultOfs], @s[lastSOfs], offsets[i] - lastSOfs);
    Inc(lastResultOfs, offsets[i] - lastSOfs);
    StrMove(@Result[lastResultOfs], PAnsiChar(newPattern), Length(newPattern));
    Inc(lastResultOfs, Length(newPattern));
    lastSOfs := offsets[i] + Length(oldPattern);
  end;
  StrMove(@Result[lastResultOfs], @s[lastSOfs],
    Length(Result) - lastResultOfs + 1);
end;
Alex, I used your last procedure, but converted it into a function.
I also did 100 loops and got the averages, to provide more accurate results.

Running with Length(text) = 10000000
AlkisgStripMultipleSpaces completed in 129 milliseconds.
Workshop_AlexStripMultipleSpaces completed in 148 milliseconds.
MokuleStripMultipleSpaces completed in 148 milliseconds.

THE INTERESTING PART:
----------------------------
I changed the parameters of the source string, to have fewer continued spaces:
const
  OneSpaceProbability = 1;
  MoreSpacesProbability = 60;
//combine the above 2 probabilities to find the probability of three or more spaces

Results:

Running with Length(text) = 10000000
AlkigStripMultipleSpaces completed in 83 milliseconds.
Workshop_AlexStripMultipleSpaces completed in 123 milliseconds.
MokuleStripMultipleSpaces completed in 123 milliseconds.

My function uses Move, which uses rep movsb, and that's why it is faster...
Avatar of 2266180

ASKER

even though some solutions are very slow, I would like you all to include ALL solutions in your comparative tests. it is always a good thing to find a comparison between a good solution and a bad solution to see excatly how bad it is ;)
so please also give numbers for the other solutions as well :)

thanks
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
@ciuly: sorry, this is IMPOSSIBLE for such large strings...
It would probably take a day or more...
I tried it with Delphi's functions once, and waited 17 hours...

I'll post a comparison for smaller strings (< 1Mb, maybe even < 100Kb) in a while...
And as an additional suggestion, use RDTSC instead of GetTickCounts to count CPU cycles instead of CPU times. CPU Cycles should be about the same for all computers with similar processors. Thus if you and I are both using an Intel Pentium IV system but yours is 3.2 GHz and mine only 1.8 GHZ then you would get smaller CPU times but we should both get about the same number of cycles. I just have to wait twice as long. :-)

Btw, the use of move would speed things up of course, at assembler level, but it shouldn't have to make that much difference. Actually, in a situation where you have many one-character words between lots of spaces then you might not have much gain there.
Then again, my solution isn't even using the most optimized Delphi code. It should be possible to make it faster, by doing more with PChar() typecasts...
Well done, Russell! On first glance, I'd say yours is the fastest... :-)
Avatar of 2266180

ASKER

alkisg: I did mention using 100KB strings. if the method is still very slow, you can use a string of say 10KB and mention this that the moethod takes 4 mionutes wheres the otehr methods take less thn 100 ms or something like this. the user will understand then that there is a very big difference between the 2 ;)

and of course alexs suggestion of calculating cpu cycles is a very good one, so lets have a shoot of that. or both :)
I included the other expert's functions as well.
All the test were run in a loop (100 times).
The times/cycles are for all the loops (no average).
CPU = Celeron 1.6GHz
Text size = 100.000
OneSpaceProbability = 1; //approximately one space per 100 chars
MoreSpacesProbability = 60; //60% that another space follows
//combine the above 2 probabilities to find the probability of three or more spaces

WARNING: For the functions that use string Deletions/Additions etc, the time grows exponentially with the text size. This is because they alloc memory for each delete/add, and large memory allocations are very time consuming.

1) CPU CYCLES
---------------------------
To measure CPU cycles, I used

function RDTSC: Int64; assembler;
asm
  RDTSC  // result Int64 in EAX and EDX
end;

Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 47238688 CPU cycles.
Workshop_AlexStripMultipleSpaces completed in 131259408 CPU cycles.
MokuleStripMultipleSpaces completed in 116306592 CPU cycles.
RussellStripMultipleSpaces completed in 73143008 CPU cycles.
CiulyStripMultipleSpaces completed in 8345003120 CPU cycles.
LimbeckStripMultipleSpaces completed in 103809248160 CPU cycles.
RmeereSubstituteSpaces completed in 4557413756 CPU cycles.
 ... and had errors!
MahdiparakStripMultipleSpaces completed in 4792759152 CPU cycles.
 ... and had errors!


2) TIMEGETTIME
---------------------------
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 30 milliseconds.
Workshop_AlexStripMultipleSpaces completed in 74 milliseconds.
MokuleStripMultipleSpaces completed in 80 milliseconds.
RussellStripMultipleSpaces completed in 55 milliseconds.
CiulyStripMultipleSpaces completed in 4739 milliseconds.
LimbeckStripMultipleSpaces completed in 62763 milliseconds.
RmeereSubstituteSpaces completed in 3053 milliseconds.
 ... and had errors!
MahdiparakStripMultipleSpaces completed in 3133 milliseconds.
 ... and had errors!
 

THE CODE
---------------------------
unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls, StrUtils, MMSystem;

type
  TForm1 = class(TForm)
    Memo1: TMemo;
    btnGenerateText: TButton;
    btnLetTheTestsBegin: TButton;
    Label1: TLabel;
    Edit1: TEdit;
    Memo2: TMemo;
    procedure btnGenerateTextClick(Sender: TObject);
    procedure btnLetTheTestsBeginClick(Sender: TObject);
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure TForm1.btnGenerateTextClick(Sender: TObject);
const
  OneSpaceProbability = 1;
  MoreSpacesProbability = 60;
//combine the above 2 probabilities to find the probability of three or more spaces
var
  i: Integer;
  lastOneWasSpace: Boolean;
  theText: String;
  textSize: Integer;
begin
  Randomize;
  textSize := StrToInt(Edit1.Text);
  SetLength(theText, textSize);
  i := 1;
  lastOneWasSpace := false;
  while i <= textSize do
  begin
    if i mod 82 = 81 then //insert line breaks every 80 chars.
    begin
      theText[i] := #13;
      Inc(i);
      theText[i] := #10;
    end
    else if lastOneWasSpace then
    begin
      if Random(100) < MoreSpacesProbability then
        theText[i] := ' '
      else
        theText[i] := char(ord('a') + random(26));
    end
    else begin
      if Random(100) < OneSpaceProbability then
        theText[i] := ' '
      else
        theText[i] := char(ord('a') + random(26));
    end;
    lastOneWasSpace := theText[i] = ' ';
    Inc(i);
  end;
  Memo1.Text := theText;
end;

function AlkisgStripMultipleSpaces(s: String): String;
var
  spacesStart, nextWordStart, nextSpacesStart, whereToMove, howManyToMove: Integer;
  finished: Boolean;
begin
  Result := s;
//Working only with Result from now on.
//It would be faster if we were permitted to use s directly.
  spacesStart := Pos('  ', Result);
  whereToMove := spacesStart; //a single space followed by the word should be put in this position
  finished := spacesStart <= 0;
  while not finished do
  begin
    nextWordStart := spacesStart + 2;
{$IFOPT R+}{$DEFINE RestoreRangeChecking}{$ELSE}{$UNDEF RestoreRangeChecking}{$ENDIF}{$R-}
//Range checking is turned off to avoid checking if nextWordStart <= Length(Result)
//It would be much faster if assembly rep scasb was used
    while Result[nextWordStart] = ' ' do //this will also stop on #0 on end of string
      Inc(nextWordStart);
{$IFDEF RestoreRangeChecking}{$R+}{$ENDIF}
    nextSpacesStart := PosEx('  ', Result, nextWordStart);
    finished := nextSpacesStart <= 0;
    if not finished then
    begin
      howManyToMove := nextSpacesStart - nextWordStart + 1;
      Move(Result[nextWordStart - 1], Result[whereToMove], howManyToMove);
      Inc(whereToMove, howManyToMove);
      spacesStart := nextSpacesStart;
    end
    else begin
      howManyToMove := Length(Result) + 1 - nextWordStart + 1;
      Move(Result[nextWordStart - 1], Result[whereToMove], howManyToMove);
      SetLength(Result, whereToMove + howManyToMove - 1);
    end;
  end;
end;

function Workshop_AlexStripMultipleSpaces(s: string): string;
var
  I, J: Integer;
  C: PChar;
  Max: Integer;
begin
  Result := s;
  I := Pos(#32#32, Result);
  if (I > 0) then
  begin
    UniqueString(Result);
    Max := Length(Result);
    C := PChar(@Result[I]);
    repeat
      Inc(I);
      Inc(C);
      J := Succ(I);
      while (J <= Max) and (Result[J] = #32) do
        Inc(J);
      while (J <= Max) do
      begin
        if (Result[Pred(I)] = #32) then
        begin
          while (J <= Max) and (Result[J] = #32) do
            Inc(J);
        end;
        if (J <= Max) then
        begin
          C^ := Result[J];
          Inc(I);
          Inc(C);
          Inc(J);
        end;
      end;
    until (J > Max);
    SetLength(Result, Pred(I));
  end;
end;

function CiulyStripMultipleSpaces(s:String):string;
var
  i:integer;
begin
  i:=1;
  while i<=length(s) do
  begin
    while (i<=length(s)) and (s[i]<>' ') do
      i:=i+1;
    i:=+i+1;
    while (i<=length(s)) and (s[i]=' ') do
      delete(s,i,1);// I don't know excatly how badly this affects performance. it could be calculated.
  end;
  result:=s;
end;

function MokuleStripMultipleSpaces(s:String):string;
var
  j: integer;
  first: boolean;
  p: ^Char;
  len: integer;
begin
  Result := s;
  p := @Result[1];
  first := True;
  len := Length(Result);
  for j := 1 to len do
    begin
    if Result[j] <> ' ' then
      begin
      p^ := Result[j];
      Inc(p);
      first := True;
      end
    else if first then
      begin
      p^ := Result[j];
      Inc(p);
      first := False;
      end
    end;
  j := integer(p)-integer(@Result[1]);
  SetLength(Result,j);
end;

function RussellStripMultipleSpaces(Value: String): String;
var  lpszSource:    PChar;
     lpszDest:      PChar;
     dwLength:      Integer;
     dwIndex:       Integer;
begin

  // Get source length
  dwLength:=Length(Value);

  //Check source length
  if (dwLength > 1) then
  begin
     // Make string safe for casting
     SetLength(result, dwLength);
     // Get pointer to source
     lpszSource:=Pointer(Value);
     // Get pointer to dest
     lpszDest:=Pointer(result);
     // Walk the string
     for dwIndex:=0 to Pred(dwLength) do
     begin
        // Check for double spaces
        if not(Word(Pointer(lpszSource)^) = 8224) then
        begin
           // Copy source
           lpszDest^:=lpszSource^;
           // Increment destination pointer
           Inc(lpszDest);
        end;
        // Increment the source string
        Inc(lpszSource);
     end;
     // Truncate string
     SetLength(result, lpszDest - PChar(Pointer(result)));
  end
  else
     // Straight copy
     result:=Value;
end;

function LimbeckStripMultipleSpaces(s: String): String;
begin
  Result := s;
  while Pos('  ', Result) > 0 do
    Result := StringReplace(Result, '  ', ' ', [rfReplaceAll]);
end;

function MahdiparakStripMultipleSpaces(s: String): String;
var
  i:Integer;
  space:Boolean;
begin
  s := s + ' ';
  for i:=1 to Length(s) do
  begin
    if s[i]<>' ' then
    begin
      Result:=Result+s[i];
      space:=false;
    end
    else
    begin
      if space<>true then
      begin
        Result:=Result+' ';
        space:=true;
      end;
    end;
  end;
end;

function RmeereSubstituteSpaces(sIn: string) : string;
var
  i : Integer;
begin
  i := 1;
  while i < Length(sIn) do
  begin
    if sIn[i] <> ' ' then
    begin
      Result := Result + sIn[i];
      Inc(i);
      Continue;
    end;
    Result := Result + ' ';
    while (sIn[i] = ' ') and (i < Length(sIn)) do
      Inc(i);
  end;
end;

function TimeGetTime: Int64; assembler;
asm
  RDTSC  // result Int64 in EAX and EDX
end;

procedure TForm1.btnLetTheTestsBeginClick(Sender: TObject);
type
  SubstituteSpacesFunction = record
    ssName: string;
    ssFunction: function(s: string) : string;
  end;
const
  functions: array[0..7] of SubstituteSpacesFunction = (
    (ssName: 'AlkisgStripMultipleSpaces'; ssFunction: AlkisgStripMultipleSpaces),
    (ssName: 'Workshop_AlexStripMultipleSpaces'; ssFunction: Workshop_AlexStripMultipleSpaces),
    (ssName: 'MokuleStripMultipleSpaces'; ssFunction: MokuleStripMultipleSpaces),
    (ssName: 'RussellStripMultipleSpaces'; ssFunction: RussellStripMultipleSpaces),
    (ssName: 'CiulyStripMultipleSpaces'; ssFunction: CiulyStripMultipleSpaces),
    (ssName: 'LimbeckStripMultipleSpaces'; ssFunction: LimbeckStripMultipleSpaces),
    (ssName: 'RmeereSubstituteSpaces'; ssFunction: RmeereSubstituteSpaces),
    (ssName: 'MahdiparakStripMultipleSpaces'; ssFunction: MahdiparakStripMultipleSpaces));
var
  theText, compacted, correctStr: String;
  startMS, endMS: Int64;
  f, i: integer;
begin
  theText := Memo1.Text;
  Edit1.Text := IntToStr(Length(theText));
  Memo2.Text := 'Running with Length(text) = ' + Edit1.Text;
  Application.ProcessMessages;

  correctStr := AlkisgStripMultipleSpaces(theText); //:-) OK, its just for comparisons...
  for f := Low(functions) to High(functions) do
  begin
    startMS := TimeGetTime;
    for i := 1 to 100 do
      compacted := functions[f].ssFunction(theText);
    endMs := TimeGetTime;
    Memo2.Lines.Add(functions[f].ssName + ' completed in '
      + IntToStr((endMs - StartMs) div 1) + ' CPU cycles.');
    if compacted <> correctStr then
      Memo2.Lines.Add(' ... and had errors!');
    Application.ProcessMessages;
    Sleep(100); //This is to give Windows some time to adjust paged memory
  end;
end;

end.

Anyone run this on a Pentium 4 class (or above) system? I would, but only have D5 Enterprise and don't have the PosEx function...

The only reason I ask is that I know for fact that the rep(X), scas(X), etc functions are slower (time wise) than performing the same thing using single instructions. (like cmp, mov, etc). It would be interesting to know if the numbers stay the same....

Russell
rep(x) is SLOWER than cmp???
Wow, I've stopped assembly programming 12 years ago, I guess some things DO change!!!
In 80x86, if I remember correctly (it's been 12 years),
rep scasb used 1 cycle
cmp (reg to mem) 1 cycle
jmp 2 cycles
so it was 3 times faster...
I did some disassembly once in Pascal Pos, it used rep scasb.
I haven't looked at PosEx.

Now that you mentioned it, I looked at PosEx source code. It DOESN'T use rep scasb!!!
==> So I guess you are right here (and the timings are validated).

Since PosEx has an MPL licence, I guess it's OK to post it:

* ***** BEGIN LICENSE BLOCK *****
 * Version: MPL 1.1
 *
 * The implementation of function PosEx is subject to the
 * Mozilla Public License Version 1.1 (the "License"); you may
 * not use this file except in compliance with the License.
 * You may obtain a copy of the License at http://www.mozilla.org/MPL/
 *
 * Software distributed under the License is distributed on an "AS IS" basis,
 * WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
 * for the specific language governing rights and limitations under the
 * License.
 *
 * The Original Code is Fastcode
 *
 * The Initial Developer of the Original Code is Fastcode
 *
 * Portions created by the Initial Developer are Copyright (C) 2002-2004
 * the Initial Developer. All Rights Reserved.
 *
 * Contributor(s): Aleksandr Sharahov
 *
 * ***** END LICENSE BLOCK ***** *)
function PosEx(const SubStr, S: string; Offset: Integer = 1): Integer;
asm
       test  eax, eax
       jz    @Nil
       test  edx, edx
       jz    @Nil
       dec   ecx
       jl    @Nil

       push  esi
       push  ebx

       mov   esi, [edx-4]  //Length(Str)
       mov   ebx, [eax-4]  //Length(Substr)
       sub   esi, ecx      //effective length of Str
       add   edx, ecx      //addr of the first char at starting position
       cmp   esi, ebx
       jl    @Past         //jump if EffectiveLength(Str)<Length(Substr)
       test  ebx, ebx
       jle   @Past         //jump if Length(Substr)<=0

       add   esp, -12
       add   ebx, -1       //Length(Substr)-1
       add   esi, edx      //addr of the terminator
       add   edx, ebx      //addr of the last char at starting position
       mov   [esp+8], esi  //save addr of the terminator
       add   eax, ebx      //addr of the last char of Substr
       sub   ecx, edx      //-@Str[Length(Substr)]
       neg   ebx           //-(Length(Substr)-1)
       mov   [esp+4], ecx  //save -@Str[Length(Substr)]
       mov   [esp], ebx    //save -(Length(Substr)-1)
       movzx ecx, byte ptr [eax] //the last char of Substr

@Loop:
       cmp   cl, [edx]
       jz    @Test0
@AfterTest0:
       cmp   cl, [edx+1]
       jz    @TestT
@AfterTestT:
       add   edx, 4
       cmp   edx, [esp+8]
       jb   @Continue
@EndLoop:
       add   edx, -2
       cmp   edx, [esp+8]
       jb    @Loop
@Exit:
       add   esp, 12
@Past:
       pop   ebx
       pop   esi
@Nil:
       xor   eax, eax
       ret
@Continue:
       cmp   cl, [edx-2]
       jz    @Test2
       cmp   cl, [edx-1]
       jnz   @Loop
@Test1:
       add   edx,  1
@Test2:
       add   edx, -2
@Test0:
       add   edx, -1
@TestT:
       mov   esi, [esp]
       test  esi, esi
       jz    @Found
@String:
       movzx ebx, word ptr [esi+eax]
       cmp   bx, word ptr [esi+edx+1]
       jnz   @AfterTestT
       cmp   esi, -2
       jge   @Found
       movzx ebx, word ptr [esi+eax+2]
       cmp   bx, word ptr [esi+edx+3]
       jnz   @AfterTestT
       add   esi, 4
       jl    @String
@Found:
       mov   eax, [esp+4]
       add   edx, 2

       cmp   edx, [esp+8]
       ja    @Exit

       add   esp, 12
       add   eax, edx
       pop   ebx
       pop   esi
end;
Just out of curiosity, on the weekend I'll do my own implementation of PosEx using rep scasb, and I'll time them both and I'll post the results here...
You *HAVE* to be right, Move also doesn't use rep movsb.
Yet, most other system functions (_Insert, _Delete, _Copy) do use rep(X)...

Move is also programmed by "FastCode" and under MPL license, while the others are (c) Borland.
Alkisg,

Thanks for the PosEx post, much appreciated. And regarding the asm topic, I *do* know what I am talking about...
The P4 changed alot of things; one of those was a slowdown (compared to hand rolled code) for operations like:

mov(x)
rep
repne
scas(b)
bt(x) bit testing

Much of borland's code was written years ago, but you should find that newer code does not (or makes little use of) the instructions mentioned above.

Anyways, with the PosEx in place, I was able to setup your testbed, and also included the ms run time as well as the rdtsc time. The only changes were as follows:

// rdtsc not implemented in inline assembler for D5 and below
function TimeGetTime: Int64;
asm
  db $0f,$31
end;

 -- and --

// Some of these were REALLY slow, commented out so I could quickly baseline the first 5
procedure TForm1.btnLetTheTestsBeginClick(Sender: TObject);
type
  SubstituteSpacesFunction = record
    ssName: string;
    ssFunction: function(s: string) : string;
  end;
const
  functions: array[0..4] of SubstituteSpacesFunction = (
    (ssName: 'AlkisgStripMultipleSpaces'; ssFunction: AlkisgStripMultipleSpaces),
    (ssName: 'Workshop_AlexStripMultipleSpaces'; ssFunction: Workshop_AlexStripMultipleSpaces),
    (ssName: 'MokuleStripMultipleSpaces'; ssFunction: MokuleStripMultipleSpaces),
    (ssName: 'RussellStripMultipleSpaces'; ssFunction: RussellStripMultipleSpaces),
    (ssName: 'CiulyStripMultipleSpaces'; ssFunction: CiulyStripMultipleSpaces));
    //
    //(ssName: 'LimbeckStripMultipleSpaces'; ssFunction: LimbeckStripMultipleSpaces),
    //(ssName: 'RmeereSubstituteSpaces'; ssFunction: RmeereSubstituteSpaces),
    //(ssName: 'MahdiparakStripMultipleSpaces'; ssFunction: MahdiparakStripMultipleSpaces));
var
  theText, compacted, correctStr: String;
  startMS, endMS: Int64;
  dwMark:     Cardinal;
  f, i: integer;
begin
  theText := Memo1.Text;
  Edit1.Text := IntToStr(Length(theText));
  Memo2.Text := 'Running with Length(text) = ' + Edit1.Text;
  Application.ProcessMessages;

  correctStr := AlkisgStripMultipleSpaces(theText); //:-) OK, its just for comparisons...
  for f := Low(functions) to High(functions) do
  begin
     dwMark:=MMSystem.timeGetTime;
     startMS := TimeGetTime;
     for i := 1 to 100 do
        compacted := functions[f].ssFunction(theText);
     endMs := TimeGetTime;
     dwMark:=MMSystem.timeGetTime - dwMark;
    Memo2.Lines.Add(functions[f].ssName + ' completed in '
      + IntToStr((endMs - StartMs) div 1) + ' CPU cycles. ('+IntToStr(dwMark)+') ms');
    if compacted <> correctStr then
      Memo2.Lines.Add(' ... and had errors!');
    Application.ProcessMessages;
    Sleep(100); //This is to give Windows some time to adjust paged memory
  end;
end;

end.

On my P4 2.8GHZ / WinXP Pro / 1GB Ram I got the following results:

Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 94923772 CPU cycles. (47) ms
Workshop_AlexStripMultipleSpaces completed in 141730824 CPU cycles. (47) ms
MokuleStripMultipleSpaces completed in 142410388 CPU cycles. (47) ms
RussellStripMultipleSpaces completed in 68480552 CPU cycles. (16) ms
CiulyStripMultipleSpaces completed in 10139402120 CPU cycles. (4000) ms



Maybe Delphi 5 uses an older and slower Move() implementation, which uses repXXX, which as you said are slower on P4.
I include FastCode's Move() below, if you want you may test my function with it.


I tested AlkisgMove() with rep movsd (dword aligned)
vs FastCode's Move(), which mainly uses fild and fistp qword ptr ...

Results (1Mb data / 10000 loops):
AlkisgMove = 28525 ms
Move = 28253 ms

FastCode, well, has (a little!) faster code (maybe it'll be a lot faster on P4). But he uses 8-byte operands (qword ptr ...), which are faster that 4-byte operands (movsd)... I wish movsq existed to test it! :-)

Nevertheless, he also uses cmp/jle in the main loop. I guess instruction pipelining / branch prediction or whatever they call them has improved a lot since 80x86!!!


(* ***** BEGIN LICENSE BLOCK *****
 * Version: MPL 1.1
 *
 * The assembly implementation of function Move is subject to the
 * Mozilla Public License Version 1.1 (the "License"); you may
 * not use this file except in compliance with the License.
 * You may obtain a copy of the License at http://www.mozilla.org/MPL/
 *
 * Software distributed under the License is distributed on an "AS IS" basis,
 * WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
 * for the specific language governing rights and limitations under the
 * License.
 *
 * The Original Code is Fastcode
 *
 * The Initial Developer of the Original Code is Fastcode
 *
 * Portions created by the Initial Developer are Copyright (C) 2002-2004
 * the Initial Developer. All Rights Reserved.
 *
 * Contributor(s): John O'Harrow
 *
 * ***** END LICENSE BLOCK ***** *)
procedure Move(const Source; var Dest; count : Integer);
{$IFDEF PUREPASCAL}
var
  S, D: PChar;
  I: Integer;
begin
  S := PChar(@Source);
  D := PChar(@Dest);
  if S = D then Exit;
  if Cardinal(D) > Cardinal(S) then
    for I := count-1 downto 0 do
      D[I] := S[I]
  else
    for I := 0 to count-1 do
      D[I] := S[I];
end;
{$ELSE}
asm
  cmp     eax, edx
  je      @@Exit {Source = Dest}
  cmp     ecx, 32
  ja      @@LargeMove {Count > 32 or Count < 0}
  sub     ecx, 8
  jg      @@SmallMove
@@TinyMove: {0..8 Byte Move}
  jmp     dword ptr [@@JumpTable+32+ecx*4]
@@SmallMove: {9..32 Byte Move}
  fild    qword ptr [eax+ecx] {Load Last 8}
  fild    qword ptr [eax] {Load First 8}
  cmp     ecx, 8
  jle     @@Small16
  fild    qword ptr [eax+8] {Load Second 8}
  cmp     ecx, 16
  jle     @@Small24
  fild    qword ptr [eax+16] {Load Third 8}
  fistp   qword ptr [edx+16] {Save Third 8}
@@Small24:
  fistp   qword ptr [edx+8] {Save Second 8}
@@Small16:
  fistp   qword ptr [edx] {Save First 8}
  fistp   qword ptr [edx+ecx] {Save Last 8}
@@Exit:
  ret
  nop {4-Byte Align JumpTable}
  nop
@@JumpTable: {4-Byte Aligned}
  dd      @@Exit, @@M01, @@M02, @@M03, @@M04, @@M05, @@M06, @@M07, @@M08
@@LargeForwardMove: {4-Byte Aligned}
  push    edx
  fild    qword ptr [eax] {First 8}
  lea     eax, [eax+ecx-8]
  lea     ecx, [ecx+edx-8]
  fild    qword ptr [eax] {Last 8}
  push    ecx
  neg     ecx
  and     edx, -8 {8-Byte Align Writes}
  lea     ecx, [ecx+edx+8]
  pop     edx
@FwdLoop:
  fild    qword ptr [eax+ecx]
  fistp   qword ptr [edx+ecx]
  add     ecx, 8
  jl      @FwdLoop
  fistp   qword ptr [edx] {Last 8}
  pop     edx
  fistp   qword ptr [edx] {First 8}
  ret
@@LargeMove:
  jng     @@LargeDone {Count < 0}
  cmp     eax, edx
  ja      @@LargeForwardMove
  sub     edx, ecx
  cmp     eax, edx
  lea     edx, [edx+ecx]
  jna     @@LargeForwardMove
  sub     ecx, 8 {Backward Move}
  push    ecx
  fild    qword ptr [eax+ecx] {Last 8}
  fild    qword ptr [eax] {First 8}
  add     ecx, edx
  and     ecx, -8 {8-Byte Align Writes}
  sub     ecx, edx
@BwdLoop:
  fild    qword ptr [eax+ecx]
  fistp   qword ptr [edx+ecx]
  sub     ecx, 8
  jg      @BwdLoop
  pop     ecx
  fistp   qword ptr [edx] {First 8}
  fistp   qword ptr [edx+ecx] {Last 8}
@@LargeDone:
  ret
@@M01:
  movzx   ecx, [eax]
  mov     [edx], cl
  ret
@@M02:
  movzx   ecx, word ptr [eax]
  mov     [edx], cx
  ret
@@M03:
  mov     cx, [eax]
  mov     al, [eax+2]
  mov     [edx], cx
  mov     [edx+2], al
  ret
@@M04:
  mov     ecx, [eax]
  mov     [edx], ecx
  ret
@@M05:
  mov     ecx, [eax]
  mov     al, [eax+4]
  mov     [edx], ecx
  mov     [edx+4], al
  ret
@@M06:
  mov     ecx, [eax]
  mov     ax, [eax+4]
  mov     [edx], ecx
  mov     [edx+4], ax
  ret
@@M07:
  mov     ecx, [eax]
  mov     eax, [eax+3]
  mov     [edx], ecx
  mov     [edx+3], eax
  ret
@@M08:
  fild    qword ptr [eax]
  fistp   qword ptr [edx]
end;
{$ENDIF}
Well, the inline asm code for the move won't compile on D5, but I was able to run the last testbed I compiled on my 2 other systems. The following are the results (bumped the iterations from 100 to 1000):

----

Prescott 930 (3GHZ Dual Core) / Windows Media Center 2005 / 3GB Ram

Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 868390033 CPU cycles. (309) ms
Workshop_AlexStripMultipleSpaces completed in 1663597012 CPU cycles. (593) ms
MokuleStripMultipleSpaces completed in 1237805814 CPU cycles. (441) ms
RussellStripMultipleSpaces completed in 743218154 CPU cycles. (265) ms

----

Pentium III 667 MHZ / Win2K / 128 MB Memory (Network file server)

Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 1179376415 CPU cycles. (1773) ms
Workshop_AlexStripMultipleSpaces completed in 1849779541 CPU cycles. (2774) ms
MokuleStripMultipleSpaces completed in 1530019465 CPU cycles. (2293) ms
RussellStripMultipleSpaces completed in 973187178 CPU cycles. (1462) ms


Russell

Another way to try and determine the differences is this. I made the compiled testbed program (running top 4 functions @ 1000 iterations) available at:

http://users.adelphia.net/~rllibby/downloads/testbed.zip

If you run it, and find that your code runs faster, then it can be concluded that the processor is making the difference. (all 3 systems I ran it on showed similar scaled results). If you run it and find that my function is faster, then we are able to determine that its the Delphi compiler/source version that makes the difference.

Regards,
Russell

(im just viewing this Q for curiousity)
i tested rllibbys program:

P III m 733~1000MHz, XP, 256MB RAM (the running system wasnt idle, around 35 processes were active):

Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 1002606495 CPU cycles. (1006) ms
Workshop_AlexStripMultipleSpaces completed in 1902691726 CPU cycles. (1908) ms
MokuleStripMultipleSpaces completed in 1440543952 CPU cycles. (1444) ms
RussellStripMultipleSpaces completed in 1084997620 CPU cycles. (1088) ms


i have another system: P II 400MHz, 640MB RAM, either 98SE, 2000 AS or XP - tomorrow i will test it there. will also try to run it on the one above again under more "calm" times.

in case my benchmarks are interesting for you: i have also d5 - so i could compile any program while others of you try it with a higher version...
wow this test is quite revealing, nice one ciuly. This goes to show that the fastest coding isn't necessarily the fastest code.

I must say that I'm quite embarressed that my function doesn't work properly and I'll admit that I just typed it in the browser without checking if it compiles. This taught me a lesson :).
Actually, I now start wondering if we should not also take the Delphi version into account for these solutions. So actually, the test tool should be compiled with every version since Delphi 3 (since the original questioneer in the other Q uses D3) to the latest Turbo Delphi version. :-) Now, that would be a real interesting test. :-)

I could imagine that Borland used some slow methods in D5 which they improved in D7 but then had to replace with a slower one in D2005, for example. :-)

Processor & Delphi version make a BIG difference....
which is what I was aiming to illustrate, especially with the the comiled program that is available for download. Perfect example of the same code (delphi version does not come into play) that shows different results (fastest routine) based on the hardware it is run on.

Russell
Russel's testbed results on my Celeron 1.6GHz, 768Ram, XP home

Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 1226723140 CPU cycles. (770) ms
Workshop_AlexStripMultipleSpaces completed in 1758365360 CPU cycles. (1103) ms
MokuleStripMultipleSpaces completed in 1795918512 CPU cycles. (1127) ms
RussellStripMultipleSpaces completed in 823440352 CPU cycles. (517) ms

Could somebody with a P4 also try my testbed?
http://www.alkisg.com/experts/AlkisgTestBed.zip
Compiled with D2006, optimized and with A VERY LOW SPACE PROBABILITY, OneSpaceProbability = 0.001 %.

Each of these algorithms searches and moves data.
Lowering the space probability reports as fastest the algorithm with the fastest searching part (as opposed to the moving part).
My purpose is to give PosEx a more significant role in this test...

My results:
Running with Length(text) = 1000000
AlkisgStripMultipleSpaces completed in 1198784164 CPU cycles (752 ms).
Workshop_AlexStripMultipleSpaces completed in 1820262932 CPU cycles (1142 ms).
MokuleStripMultipleSpaces completed in 1901184948 CPU cycles (1193 ms).
RussellStripMultipleSpaces completed in 2082098812 CPU cycles (1307 ms).
My conclusions so far:
*) D2006 is faster that D5 :-) at least on Move... Congrats, FastCode!
*) Russell's function is faster than mine if many spaces exist, perhaps due to the "setup" cost of calling PosEx.
*) My function is faster that Russell's if only a few spaces exist, perhaps because PosEx can search faster if the next space is e.g. 10Kb after the current one. The "call" instruction cost is not very significant if PosEx is called only a few times for the whole 1Mb data...
*) Alex and Mokule functions are also faster than Russell's if only a few spaces exist. I don't know why's that, I'll look into the generated asm code later...
I ran the test on my first system, P4 2.8 GHZ. With no spaces (cleaned all spaces out of text), your code runs the fastest and mine runs the slowest. Once spaces are introduced into the code, then the results turn the other way:

Results as follows:


With 0 spaces in text:

Running with Length(text) = 1000018
AlkisgStripMultipleSpaces completed in 1206538504 CPU cycles (469 ms).
Workshop_AlexStripMultipleSpaces completed in 1714296236 CPU cycles (672 ms).
MokuleStripMultipleSpaces completed in 1681699584 CPU cycles (656 ms).
RussellStripMultipleSpaces completed in 1762327528 CPU cycles (687 ms).

----

With 10 double-spaces in text:

Running with Length(text) = 1000038
AlkisgStripMultipleSpaces completed in 1297215528 CPU cycles (516 ms).
Workshop_AlexStripMultipleSpaces completed in 1723582236 CPU cycles (672 ms).
MokuleStripMultipleSpaces completed in 1682098092 CPU cycles (657 ms).
RussellStripMultipleSpaces completed in 1455484492 CPU cycles (563 ms).

----

With 20 double-spaces in text:

Running with Length(text) = 1000058
AlkisgStripMultipleSpaces completed in 1287794540 CPU cycles (500 ms).
Workshop_AlexStripMultipleSpaces completed in 1712113628 CPU cycles (672 ms).
MokuleStripMultipleSpaces completed in 1692969756 CPU cycles (656 ms).
RussellStripMultipleSpaces completed in 1254475676 CPU cycles (484 ms).

----

With 30 double-spaces in text:

Running with Length(text) = 1000078
AlkisgStripMultipleSpaces completed in 1271748352 CPU cycles (515 ms).
Workshop_AlexStripMultipleSpaces completed in 1721996184 CPU cycles (672 ms).
MokuleStripMultipleSpaces completed in 1683155012 CPU cycles (656 ms).
RussellStripMultipleSpaces completed in 1034804192 CPU cycles (406 ms).

----

With 40 double-spaces in text:

Running with Length(text) = 1000098
AlkisgStripMultipleSpaces completed in 1287267368 CPU cycles (500 ms).
Workshop_AlexStripMultipleSpaces completed in 1710652708 CPU cycles (672 ms).
MokuleStripMultipleSpaces completed in 1685094780 CPU cycles (656 ms).
RussellStripMultipleSpaces completed in 874739376 CPU cycles (344 ms).

----

With 50 double-spaces in text:

Running with Length(text) = 1000118
AlkisgStripMultipleSpaces completed in 1286180728 CPU cycles (500 ms).
Workshop_AlexStripMultipleSpaces completed in 1704431776 CPU cycles (672 ms).
MokuleStripMultipleSpaces completed in 1686519552 CPU cycles (656 ms).
RussellStripMultipleSpaces completed in 845381592 CPU cycles (328 ms).

----

With 100 double-spaces in text:

Running with Length(text) = 1000218
AlkisgStripMultipleSpaces completed in 1286267352 CPU cycles (516 ms).
Workshop_AlexStripMultipleSpaces completed in 1735722828 CPU cycles (672 ms).
MokuleStripMultipleSpaces completed in 1701265072 CPU cycles (672 ms).
RussellStripMultipleSpaces completed in 845979476 CPU cycles (328 ms).
Tested at work on a Pentium IV dual-core 3 GHz system. XP Pro, using the binary from Alkisq
-------------------------------------------------------------------------------
Running with Length(text) = 1000000
AlkisgStripMultipleSpaces completed in 928273253 CPU cycles (310 ms).
Workshop_AlexStripMultipleSpaces completed in 1863291434 CPU cycles (621 ms).
MokuleStripMultipleSpaces completed in 1646885730 CPU cycles (549 ms).
RussellStripMultipleSpaces completed in 754627754 CPU cycles (251 ms).

Used the generic random text generator for above list.
-------------------------------------------------------------------------------
Running with Length(text) = 765490
AlkisgStripMultipleSpaces completed in 640248863 CPU cycles (213 ms).
Workshop_AlexStripMultipleSpaces completed in 1087663965 CPU cycles (363 ms).
MokuleStripMultipleSpaces completed in 1149836924 CPU cycles (383 ms).
RussellStripMultipleSpaces completed in 543709995 CPU cycles (180 ms).

The input file was an XML file with almost no spaces... :-)
-------------------------------------------------------------------------------
Another test, same system:Running with Length(text) = 802320
AlkisgStripMultipleSpaces completed in 1732407810 CPU cycles (577 ms).
Workshop_AlexStripMultipleSpaces completed in 1960128135 CPU cycles (653 ms).
MokuleStripMultipleSpaces completed in 1835230950 CPU cycles (611 ms).
RussellStripMultipleSpaces completed in 743925525 CPU cycles (247 ms).

This time I uses the SysUtils unit from Delphi 6 as text. Twice copied to the textfield. :-)
-------------------------------------------------------------------------------
Then even funnier results:
Running with Length(text) = 12207616
AlkisgStripMultipleSpaces completed in 2763672397 CPU cycles (921 ms).
Workshop_AlexStripMultipleSpaces completed in 2788594297 CPU cycles (928 ms).
MokuleStripMultipleSpaces completed in 17644017907 CPU cycles (5881 ms).
RussellStripMultipleSpaces completed in 9792352192 CPU cycles (3264 ms).

The difference? The input text had absolutely NO double spaces. Not even normal spaces... :-)
-------------------------------------------------------------------------------
And another funny result:
Running with Length(text) = 21725186
AlkisgStripMultipleSpaces completed in 20222208420 CPU cycles (6740 ms).
Workshop_AlexStripMultipleSpaces completed in 23716101810 CPU cycles (7905 ms).
MokuleStripMultipleSpaces completed in 29768851282 CPU cycles (9923 ms).
RussellStripMultipleSpaces completed in 11165073089 CPU cycles (3722 ms).

This only only had spaces in it... :-)
-------------------------------------------------------------------------------
The dual-core has not much effect on this test, if you ask me. One thing I do notice which is that Alkisq and I have the fastest solutions when the original string has no dual spaces. And Russells version is the fastest when it's only spaces. :-) Russells solution is quite fast in general, as you can see from all these tests. Apparantly it only slows down when it has lots of spaces to process. If that could be improved too then WOW! :-)
Then again, this all for just a function which replaces two spaces. It would be more interesting to do the same for a faster StringReplace solution, where you can specify additional parameters for what must be changed and to what it should be changed. :-)

I'll test it again when I'm back home on my own system. It's a Pentium III at 866 MHz so it should be reasonable interesting for comparison. Maybe the amount of clock cycles will be very different on my system at home.
Hey, Russell...

Do you notice what I noticed? My solution seems to be at a constant speed. :-) In your test results I see: "Workshop_AlexStripMultipleSpaces completed in 1704431776 CPU cycles (672 ms)." everywhere... :-)

Thus mine has the most constant speed. :-P

Btw, my code might improve in speed if you just remove the Pos() function from it and just start looping from the first character. But then again, it's not that big an issue anyways... :-)

Alex, I'm with you for the StringReplace thing. Then we would have something useful to experiment on!
Start a question!

I would, but I'm not very proud of my English :-(
As you wish. :-)

https://www.experts-exchange.com/questions/21982965/Competition-Build-the-fastest-StringReplace-function.html for the StringReplace competition. :-)
https://www.experts-exchange.com/questions/21982971/Competition-Build-the-fastest-StringReplace-function-with-Delphi-for-NET.html for exactly the same competition but with a difference. In this one, I want a .NET solution instead of a WIN32 solution. :-)

This also makes it fun to see if .NET really performs about as well as WIN32 or that it sucks because it's way too slow... :-)
Avatar of 2266180

ASKER

I've been out of town for the past 2 days.

interesting posts. lot of good information :)

mainly the scope of this thread is comming to it's shape. I'll wait to se if anyone has anything else to add to this and then I'll try to make a short resume and accept answers ;)
Well, the one thing that I fins funny is the fact that while my solution isn't the fastest one, it happens to be the one that is most constant in speed. :-) Which happens to be an interesting side effect. :-)
Avatar of 2266180

ASKER

true :)

I was thinking about a few stuff last night and remembered that there isn't yet a top for the memory consumption :)

one other solution I thought about is to use an external application that monitors process memory and since the test applications used here are only doing one operations we can use the peek value found by teh monitoring software to see how much memory was used since the peek value will actually happen during the actual string operation.

so who's got time to throw in some code on this and alexs idea? :)
I can post something using WMI in one of my breaks later on.

This modified version should prove to be a good "all around" performer; regardless if the data contains double spaces or not (or is all spaces). As to the whole memory question, thats kind of a moot point. Worst case scenario the algorithms will allocate a string of a length equal to the source string. Now without making the source string "by reference", the best you could do is allocate: {Length of source} - {Double space run count} * ({Double space run length} - 1). That would of course require a full prescan of the data, and in most cases would result in a higher cost time wise, albeit slightly smaller memory footprint.

Russell

--

function RussellStripMultipleSpaces(Value: String): String;
var  lpszSource:    PChar;
     lpszDest:      PChar;
     dwLength:      Integer;
     dwMove:        Integer;
begin

  // Get the source length
  dwLength:=Length(Value);

  // If length is less than two then there is nothing to do
  if (dwLength < 2) then
     // Return source string
     result:=Value
  else
  begin
     // Set source pointer
     lpszSource:=Pointer(Value);
     // Set starting scan location
     dwMove:=0;
     // Scan for first set of double spaces
     while (dwMove < dwLength) do
     begin
        // Check for space
        if (lpszSource[dwMove] = #32) then
        begin
           // Check for following space
           if (lpszSource[Succ(dwMove)] = #32) then break;
        end;
        // Push next
        Inc(dwMove);
     end;
     // Check result of double space scan
     if (dwMove = dwLength) then
        // No double spaces in string, so just return the source string
        result:=Value
     else
     begin
        // Allocate result string
        SetLength(result, dwLength);
        // Set dest pointer
        lpszDest:=Pointer(result);
        // Process source string
        while (dwLength > dwMove) do
        begin
           // Check for space
           if (lpszSource[dwMove] = #32) then
           begin
              // Update move
              Inc(dwMove);
              // Check next byte
              if (lpszSource[dwMove] = #32) then
              begin
                 // Move the data
                 Move(lpszSource^, lpszDest^, dwMove);
                 // Update destination pointer
                 Inc(lpszDest, dwMove);
                 // Increment move counter
                 Inc(dwMove);
                 // Increment source pointer
                 Inc(lpszSource, dwMove);
                 // Decrement the length counter
                 Dec(dwLength, dwMove);
                 // Skip following spaces
                 while (lpszSource^ = #32) do
                 begin
                    // Increment source
                    Inc(lpszSource);
                    // Decrement length
                    Dec(dwLength);
                 end;
                 // Reset move count
                 dwMove:=(-1);
              end;
           end;
           // Update move counter
           Inc(dwMove);
        end;
        // Check final move
        if (dwLength > 0) then
        begin
           // Final data move
           Move(lpszSource^, lpszDest^, dwMove);
           // Update dest
           Inc(lpszDest, dwMove);
        end;
        // Truncate the result
        SetLength(result, lpszDest - Pointer(result));
     end
  end;

end;
Avatar of 2266180

ASKER

russel, you are of course correct, but lets not forget that the main reason of this thread is learing.
and to provide some examples of how to measure different types of performance issues (machine cycles, times and of course memory allocation and who knows what other apsects that cannot be covered in one thread :) ).

probably this isn't exactly the best issue to work on the memory allocation performance. I don't recall seeing that kind of issues for quite some time. maybe I'll just make something up :D

until then, I started on that app I was talking above. maybe I'll find time tomorrow evening to finish it.
Russel, I was ready to post a solution when I noticed your post. Quite fast I'd say, but the memory is not a moot issue. If the string is 2GB your proc might not work on some computyers.

As I was saying my solution I just wrote is a bit similar to your, it goes the same way, except: it does not havve memory troubles, it uses one call to SetLength and one call to Move. Plus it's shorter and I hope faster :-)

Cheers

program Project2;

{$APPTYPE CONSOLE}

procedure CutSpaces(var strInput: String);
var
  ptr, spacesFound, lastpos, workLen, deleted, inLen: Integer;
begin
  writeln( ' input: "'+strInput+'"');
  inLen := Length(strInput);         // save original length
  if inLen < 2 then Exit;            // nothing to do? out of here
  deleted := 0;                      // initialize count of deleted spaces
  workLen := inLen;                  // initialize working length
  ptr := 1;                          // start from first char in string
  repeat                             // start the global loop
    // skip non-space stuff
    while (ptr < workLen) and (strInput[ptr] <> ' ') do
      Inc(ptr);
    // initialize spaces counter correct
    if strInput[ptr] = ' ' then
      spacesFound := 1
    else
      spacesFound := 0;
    lastpos := ptr;
    while (ptr < workLen) and (strInput[ptr] = ' ') do
    begin
      Inc(ptr);                      // move to next char
      if strInput[ptr] = ' ' then    // if this is a space
        Inc(spacesFound);            // increase the counter
    end;
    if spacesFound > 1 then          // if we detected more than one space
    begin
      // slide to the left exact numbers of chars left
      Move(strInput[ptr], strInput[ptr-(spacesFound-1)], workLen-ptr+1);
      Inc(deleted, spacesFound-1);   // update deleted chars
      Dec(workLen, spacesFound-1);   // update our working length
      //Dec(ptr, spacesFound-1);       // safe go back one char to avoid missing some spaces
    end;
    ptr := lastpos+1;
  until (spacesFound = 0)            // stop if no more spaces
     or (ptr > workLen);             // or current position past working length
  SetLength(strInput, inLen-deleted); // one time set length of output string
  writeln( 'output: "'+strInput+'"  deleted ', deleted );
end;

var
  S: String;
begin
  S := 'The       cat     sat   on   the         mat';
  CutSpaces( S );
  S := '  The    cat      sat   on   the         mat';
  CutSpaces( S );
  S := '  The     cat       sat   on   the         mat  ';
  CutSpaces( S );
  S := 'The cat sat on the mat';
  CutSpaces( S );
  S := '  The cat sat on the mat';
  CutSpaces( S );
  S := '  The cat sat on the mat   ';
  CutSpaces( S );
  S := 'The cat sat on the mat  ';
  CutSpaces( S );
  readln;
end.

Read my comments fully:

>>  Now without making the source string "by reference",

Given the usage of const for the source string in all the sample functions (I assumed that was a requirement), then the point is moot. Now if a var string is allowed as the parameter, then thats a whole different story. If I did my function as (var Value) it would be hard to touch.


Russell


Oh I read your comments fully before posting, except that I did not feel any constraints. When I first saw this problem, I looked in fast-forward mode to a few solutions and I was kind of worried :-) why nobody used anything as simple as "walk and count", except for your code. That's why probably I did not feel the type of parameter was a requirement. Plus I recently had a job interview with a "string" problem in it. They wanted me to write a "ReverseString" method with much care about memory consumption. Another reason why it was so hard in my mind.

Anyway, I am curios as of how fast my method is amongst non-ASM solutions (I do not know any assembler) and I love pascal-ish solutions.

ljee

Even when passed as var, it can still be a moot point when dealing with the copy-on-write semantics of delphi strings. If B = A (both string vars), and you pass B to the procedure by pointer and modify B using code that forces UniqueString to get called, then B is allocated as a whole new string. (The assignment of B:=A; does no memory allocation in and of itself.)

As far as speed goes, well.... I'll post the informal example that I used and let you be the judge. I will say that on my system, one ran at 391ms and the other at a full 18 seconds.

Russell


---

program Project1;

{$APPTYPE CONSOLE}
{$ASSERTIONS ON}

uses
  Windows,
  SysUtils;

procedure CutSpaces(var strInput: String);
var
  ptr, spacesFound, lastpos, workLen, deleted, inLen: Integer;
begin
  inLen := Length(strInput);         // save original length
  if inLen < 2 then Exit;            // nothing to do? out of here
  deleted := 0;                      // initialize count of deleted spaces
  workLen := inLen;                  // initialize working length
  ptr := 1;                          // start from first char in string
  repeat                             // start the global loop
    // skip non-space stuff
    while (ptr < workLen) and (strInput[ptr] <> ' ') do
      Inc(ptr);
    // initialize spaces counter correct
    if strInput[ptr] = ' ' then
      spacesFound := 1
    else
      spacesFound := 0;
    lastpos := ptr;
    while (ptr < workLen) and (strInput[ptr] = ' ') do
    begin
      Inc(ptr);                      // move to next char
      if strInput[ptr] = ' ' then    // if this is a space
        Inc(spacesFound);            // increase the counter
    end;
    if spacesFound > 1 then          // if we detected more than one space
    begin
      // slide to the left exact numbers of chars left
      Move(strInput[ptr], strInput[ptr-(spacesFound-1)], workLen-ptr+1);
      Inc(deleted, spacesFound-1);   // update deleted chars
      Dec(workLen, spacesFound-1);   // update our working length
      //Dec(ptr, spacesFound-1);       // safe go back one char to avoid missing some spaces
    end;
    ptr := lastpos+1;
  until (spacesFound = 0)            // stop if no more spaces
     or (ptr > workLen);             // or current position past working length
  SetLength(strInput, inLen-deleted); // one time set length of output string
end;

procedure RussellStripMultipleSpaces(var Value: String);
var  lpszSource:    PChar;
     lpszDest:      PChar;
     dwLength:      Integer;
     dwMove:        Integer;
begin

  // Get the source length
  dwLength:=Length(Value);

  // If length is less than two then there is nothing to do
  if (dwLength > 1) then
  begin
     // Set source pointer
     lpszSource:=Pointer(Value);
     // Set starting scan location
     dwMove:=0;
     // Scan for first set of double spaces
     while (dwMove < dwLength) do
     begin
        // Check for space
        if (lpszSource[dwMove] = #32) then
        begin
           // Check for following space
           if (lpszSource[Succ(dwMove)] = #32) then break;
        end;
        // Push next
        Inc(dwMove);
     end;
     // Check result of double space scan
     if (dwMove < dwLength) then
     begin
        // Make unique
        UniqueString(Value);
        // Reset source and dest pointer
        lpszSource:=Pointer(Value);
        lpszDest:=Pointer(Value);
        // Process the first portion (no move required)
        Inc(dwMove);
        // Update destination pointer
        Inc(lpszDest, dwMove);
        // Increment move counter
        Inc(dwMove);
        // Increment source pointer
        Inc(lpszSource, dwMove);
        // Decrement the length counter
        Dec(dwLength, dwMove);
        // Skip following spaces
        while (lpszSource^ = #32) do
        begin
           // Increment source
           Inc(lpszSource);
           // Decrement length
           Dec(dwLength);
        end;
        // Reset move count
        dwMove:=0;
        // Process source string
        while (dwLength > dwMove) do
        begin
           // Check for space
           if (lpszSource[dwMove] = #32) then
           begin
              // Update move
              Inc(dwMove);
              // Check next byte
              if (lpszSource[dwMove] = #32) then
              begin
                 // Move the data
                 Move(lpszSource^, lpszDest^, dwMove);
                 // Update destination pointer
                 Inc(lpszDest, dwMove);
                 // Increment move counter
                 Inc(dwMove);
                 // Increment source pointer
                 Inc(lpszSource, dwMove);
                 // Decrement the length counter
                 Dec(dwLength, dwMove);
                 // Skip following spaces
                 while (lpszSource^ = #32) do
                 begin
                    // Increment source
                    Inc(lpszSource);
                    // Decrement length
                    Dec(dwLength);
                 end;
                 // Reset move count
                 dwMove:=(-1);
              end;
           end;
           // Update move counter
           Inc(dwMove);
        end;
        // Check final move
        if (dwLength > 0) then
        begin
           // Final data move
           Move(lpszSource^, lpszDest^, dwMove);
           // Update dest
           Inc(lpszDest, dwMove);
        end;
        // Truncate the result
        SetLength(Value, lpszDest - Pointer(Value));
     end
  end;

end;

function TheText: String;
const
  OneSpaceProbability = 1;
  MoreSpacesProbability = 60;
var
  i: Integer;
  lastOneWasSpace: Boolean;
  textSize: Integer;
begin

  Randomize;
  textSize := 100000;
  SetLength(result, textSize);
  i := 1;
  lastOneWasSpace := false;
  while i <= textSize do
  begin
    if i mod 82 = 81 then //insert line breaks every 80 chars.
    begin
      result[i] := #13;
      Inc(i);
      result[i] := #10;
    end
    else if lastOneWasSpace then
    begin
      if Random(100) < MoreSpacesProbability then
        result[i] := ' '
      else
        result[i] := char(ord('a') + random(26));
    end
    else begin
      if Random(100) < OneSpaceProbability then
        result[i] := ' '
      else
        result[i] := char(ord('a') + random(26));
    end;
    lastOneWasSpace := result[i] = ' ';
    Inc(i);
  end;

end;

var
  A:          String;
  B:          String;
  dwMark:     LongWord;
  dwIndex:    Integer;

begin

  A:=TheText + '  ';

  dwMark:=GetTickCount;
  for dwIndex:=1 to 1000 do
  begin
     B:=A;
     RussellStripMultipleSpaces( B );
     Assert(Length(B) < Length(A), 'Assertion failure');
  end;
  dwMark:=GetTickCount - dwMark;
  WriteLn(dwMark, 'ms for 1000 iterations of RussellStripMultipleSpaces');

  dwMark:=GetTickCount;
  for dwIndex:=1 to 1000 do
  begin
     B:=A;
     CutSpaces( B );
     Assert(Length(B) < Length(A), 'Assertion failure');
  end;
  dwMark:=GetTickCount - dwMark;
  WriteLn(dwMark, 'ms for 1000 iterations of CutSpaces');

  ReadLn;

end.



thanks for posting all this. looks very weird, I tested it and you're right. I do not understand why my code is so slow, any idea? I was really thinking we practically kind of doing same thing but just slightly different. This *slightly* though is the difference between life and death. I'll allocate more time to study your code, maybe I'll see why is faster :-)

But once again, I believe your code uses twice memory as needed because it does not operate directly on the input string instead it moves it to the second one. Speed is speed, grant you that, but why didn't you write it like that? Is that the cause? I am moving *much* more bytes than your code, perhaps??

ljee
>> But once again, I believe your code uses twice memory as needed because it does not operate directly on the input string instead it moves it to the second one.

That is where you would be wrong. The code I provided operates on a pointer to the input the string. I call UniqueString once to ensure that the copy on write is done, and I don't end up modifying both B and A (in the test example). If the string is already unique (ref count is checked), then no allocation is done.

But by using pointers, I control the single call to UniqueString, vs your code which implicitly calls UniqueString (generated by the compiler) for every call to the Move(...) procedure in your code. Bottom line is that my code uses no more memory than yours, but operates roughly 45x faster.

Russell
aaaa .... ok man! nice to learn something today! I did not know about UniqueString-Move thinggie. I won't take your word for it and I'll check all this stuff one of these days cuz it's so ... wonderful

(and stop telling me how fast your code was .... it gets to my nerves :-))

ljee
Well, must say that my solution uses almost no extra memory. :-) If you scroll up (a lot!) then you'll see that I first make an uniqeu copy of the string to serve as a result and then start moving characters within that string and at the end I shorten the string. It's not really optimized but hey, the memory I need for this solution equals the size of this string plus the size of the local values and that's basically all. :-)

Btw, while you could try to keep track of memory usage, this tends to be a bit useless exercise since in a Windows application there might be plenty of other things that will claim additional memory. For example, you might have a virusscanner running and this virusscanner might work by injecting some code in every running executable to make sure it behaves nicely. But that also means the virusscanner will sometimes eat up additional memory. And there might be other reasons why an application takes up more than expected.
To keep the memory consumption as low as possible for a valid memory test you should just write a console application and avoid the use of all the VCL classes. (You can still use SysUtils and ActiveX units, though.) Furthermore, to calculate the memory usage you will have to walk through the heap of Delphi, adding up all the memory blocks it uses. This can be done and apparantly isn't even that difficult. (Just wished I knew where I'd stored the sample code I used to have for this.)

Russell's code was fast, wasn't it? :-P
Keeping track of memory is never useless, especially when you have to run a server continuously. What other apps are doing  is their own business, if they eat to much or leak ... they just go on my blacklist and get uninstalled, deleted, erased, phooo :-), I certainly don't want that to happen with my apps on other people's computers.

I made some minor changes to my code, basically to use pointers in the Move call. It became avery little bit faster, or maybe not at all. I think the "Move" statement is slow and it has to be written with care. Basically the difference between Russell's code and mine is that he moves just small part of the source string to the target string, while my code moves always the whole right side of the string after the spaces found. I wasn't thinking how slow Move can be, but this is another lesson.

(And yeah, it was damn fast :-))

I can't wait for the StringReplace replacement guys
Avatar of 2266180

ASKER

guess it's time to close this :)

I re-read the thread (huh, that took a while) and there is a lot of usefull information and people really put their minds to this. so I am now thinking to modify the reward sheet by increasing the points to 500 and splitting more or less equally among the active participants (I hope I won't skip anybody :D).
I'll accept alexes solution as it was the first one to post :)

thanks for the input everybody, *GREAT* job. I'll maybe think of a subject for the memory usage problem. we'll see.

(for anyone wondering, limbecks solution is in the other/"original" question)
Yep. It was a *GREAT* job we all did here. :-) Very educational for everyone who is digging deeper into Delphi to find ways to improve the speed a bit more.

One thing we've all learned from this all: to speed up string handling, it's better if you just look at it as an array of character and try to avoid moving around blocks of data. :-)