2266180
asked on
Performance contest :)
I am interested in the speed and memory usage of all solutions from here: https://www.experts-exchange.com/questions/21980074/Remove-spaces-in-sentence.html
testing will be done on the same string of at least 100KB (you choose it randomly)
first 3 good solutions win each 20 points.
any first 3 bugs/real issues (not mentioned already in the above question) will get additional 20 points
any other serios bug/real issue (not awared) will get a separate 20 points
you must do at least 3 runs on each test in order to make sure that there is no interference from antivirus, or otehr applications that might influence the speed output.
posters are asked to also post the method used for calculating the speed and memory usage. I will award 20 points for each method (not for each poster) in order of posting. (so you might want to post the method first :P). There is no problem if we get 4 posters getting the results for all or for a few of the soltuions as it is good to have a comparison. if one poster says it took him 5 minutes to run one of teh solution and anotehr 3 says it only took 2 minutes then I guess the otehr 3 are rigth ;)
please ***DO NOT*** post the string used for testing. we can generate one any time or copy the content of a big enough web page.
let the fun begin :)
testing will be done on the same string of at least 100KB (you choose it randomly)
first 3 good solutions win each 20 points.
any first 3 bugs/real issues (not mentioned already in the above question) will get additional 20 points
any other serios bug/real issue (not awared) will get a separate 20 points
you must do at least 3 runs on each test in order to make sure that there is no interference from antivirus, or otehr applications that might influence the speed output.
posters are asked to also post the method used for calculating the speed and memory usage. I will award 20 points for each method (not for each poster) in order of posting. (so you might want to post the method first :P). There is no problem if we get 4 posters getting the results for all or for a few of the soltuions as it is good to have a comparison. if one poster says it took him 5 minutes to run one of teh solution and anotehr 3 says it only took 2 minutes then I guess the otehr 3 are rigth ;)
please ***DO NOT*** post the string used for testing. we can generate one any time or copy the content of a big enough web page.
let the fun begin :)
lol. nice one. I won't participate cause I will be baised towards my own solution :)
ASKER
you can participate because you will not cheat (I hope :P ) and since you wil be using the same measurement tehnique and data for all solutions you should be pretty fair from this point of view ;)
lol
well ok, there has to be 1 person or 1 program to compare the sollutions then so we can rule out speed of the compu, hd etc
ASKER
I know this might look funny to some of you, but if you look at the constructive part you'll notice that in the end this question will hold valuable information regarding performance in delphi.
I am not expecting anything from people that are not used to thinking and/or meaasuring performance but from the people that deal with this on a regular basis.
I hope to gather here the information one needs to have when considering performance.
the idea is not to have 1 program or method, but more. probably it would have been an idea to ask for the cpu speed as well. so lets assume I did that now :)
I am not expecting anything from people that are not used to thinking and/or meaasuring performance but from the people that deal with this on a regular basis.
I hope to gather here the information one needs to have when considering performance.
the idea is not to have 1 program or method, but more. probably it would have been an idea to ask for the cpu speed as well. so lets assume I did that now :)
yeah you're right Limbeck. I nominate ciuly seeing as it's his thread.
that is true. well why not have one program using each sollutions as a function and use each function in a row (maybe even in a diff. order on the second and third try). thats the only way to compare it imho, you wont be bothered with other programs running, anti-viri soft. mem. issues etc.
i dont have the time right now to build it, sorry.
i dont have the time right now to build it, sorry.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Ditto Limbeck
Limbeck
I don't say that absolutele :)
Alex,
You new version gives me 2000ms
I don't say that absolutele :)
Alex,
You new version gives me 2000ms
ASKER
well the nice thing about this is that you can also test it yourselves ;)
and of course if anybody else comes up with other, possibly better, solutions.
good one mokule. seems that delete really put a break in there :)
and of course if anybody else comes up with other, possibly better, solutions.
good one mokule. seems that delete really put a break in there :)
Limbeck
4600ms - but the loop shortened 10 times
4600ms - but the loop shortened 10 times
Thanks, Mokule. :-) As I said, I replaced the Length() functioncalls which save quite a few calls. The only difference now is that I still call UniqueStringA, which (if it could be removed) still slows things down a bit. Your solution is more like the suggested PChar solution but there's probably something else slowing it down. :-)
Okay, an even fastersolution from me then... :-)
procedure StripMultipleSpacesProc(va r Result: string);
var
I, J: Integer;
C: PChar;
Max: Integer;
begin
I := Pos(#32#32, Result);
if (I > 0) then
begin
UniqueString(Result);
Max := Length(Result);
C := PChar(@Result[I]);
repeat
Inc(I);
Inc(C);
J := Succ(I);
while (J <= Max) and (Result[J] = #32) do
Inc(J);
while (J <= Max) do
begin
if (Result[Pred(I)] = #32) then
begin
while (J <= Max) and (Result[J] = #32) do
Inc(J);
end;
if (J <= Max) then
begin
C^ := Result[J];
Inc(I);
Inc(C);
Inc(J);
end;
end;
until (J > Max);
SetLength(Result, Pred(I));
end;
end;
The difference? The UniqueStringA call is gone since I now use a PChar to access the contents of the string. To still make it work okay, I call UniqueString at the beginning of this procedure so I'm not accidently modifying too much. :-)
And yes, it's a procedure, which makes it more interesting since there's no need now to copy the string contents first. :-) It's another speed increase...
Wild guess? I think this one ends within 1750 ms... :-)
Okay, an even fastersolution from me then... :-)
procedure StripMultipleSpacesProc(va
var
I, J: Integer;
C: PChar;
Max: Integer;
begin
I := Pos(#32#32, Result);
if (I > 0) then
begin
UniqueString(Result);
Max := Length(Result);
C := PChar(@Result[I]);
repeat
Inc(I);
Inc(C);
J := Succ(I);
while (J <= Max) and (Result[J] = #32) do
Inc(J);
while (J <= Max) do
begin
if (Result[Pred(I)] = #32) then
begin
while (J <= Max) and (Result[J] = #32) do
Inc(J);
end;
if (J <= Max) then
begin
C^ := Result[J];
Inc(I);
Inc(C);
Inc(J);
end;
end;
until (J > Max);
SetLength(Result, Pred(I));
end;
end;
The difference? The UniqueStringA call is gone since I now use a PChar to access the contents of the string. To still make it work okay, I call UniqueString at the beginning of this procedure so I'm not accidently modifying too much. :-)
And yes, it's a procedure, which makes it more interesting since there's no need now to copy the string contents first. :-) It's another speed increase...
Wild guess? I think this one ends within 1750 ms... :-)
Interesting stuff, what about memory usage mokule? And what about my solution?
ok, ta :)
Alex,
No need for this
UniqueString(Result);
it is forced here
C := PChar(@Result[I]);
Your current result
1850-1900ms
So is mine at the moment, though I don't change anything in my code.
No need for this
UniqueString(Result);
it is forced here
C := PChar(@Result[I]);
Your current result
1850-1900ms
So is mine at the moment, though I don't change anything in my code.
rmeere,
You want it.
19328ms
- Your code can't be compiled.
- result is not initialized
- result is wrong - no last character
You want it.
19328ms
- Your code can't be compiled.
- result is not initialized
- result is wrong - no last character
lol. that's BAD.
It is? Well, better safe than sorry. :-) Still only a reduction of 100 ms... :-( I had hoped for more.
Funny to see how slow the solution from rmeere is compared to some of the others. :-) And still there are many inexperienced Delphi developers who would just choose for exactly the same solution. :-)
Now, about memory usage... :-) That's going to be interesting...
My solution is modifying the string in it's own memory space so all I need is the memory for the string plus the memory of the local variables. I use 3 integers and one PChar. These are all 4 bytes so I use 100 KB for the string plus 16 bytes for the remaining stuff. :-) Not much, when you realise that the solution already had this 100 KB in memory.
Only problem could be that because UniqueString is called (implicitly or explicitly) there's a chance that the string is still copied in-memory thus doubling the amount of memory required.
But we're at a point here where more speed can be gained by improving the algorithm used to find the double spaces. I think these could be a bit better. :-)
Funny to see how slow the solution from rmeere is compared to some of the others. :-) And still there are many inexperienced Delphi developers who would just choose for exactly the same solution. :-)
Now, about memory usage... :-) That's going to be interesting...
My solution is modifying the string in it's own memory space so all I need is the memory for the string plus the memory of the local variables. I use 3 integers and one PChar. These are all 4 bytes so I use 100 KB for the string plus 16 bytes for the remaining stuff. :-) Not much, when you realise that the solution already had this 100 KB in memory.
Only problem could be that because UniqueString is called (implicitly or explicitly) there's a chance that the string is still copied in-memory thus doubling the amount of memory required.
But we're at a point here where more speed can be gained by improving the algorithm used to find the double spaces. I think these could be a bit better. :-)
Which reminds me... In the original question an answer was accepted but what is the performance of that solution? :-)
ASKER
I would also like to see a method with performance counters ;) (I have the functions at home somewhere, probably google can spit a few as well)
then, if I remember correctly, I've seen a method a year or two back that was working more or less directly with the CPU but unfortunatly I don't remember what it did. if anyone knows what I am talking abuot, pointers and/or info would be nice :)
also, for calculating the memory used, I can only think of getting the memory used by the whole app using api or wmi. any otehr ideas/solutions?
then, if I remember correctly, I've seen a method a year or two back that was working more or less directly with the CPU but unfortunatly I don't remember what it did. if anyone knows what I am talking abuot, pointers and/or info would be nice :)
also, for calculating the memory used, I can only think of getting the memory used by the whole app using api or wmi. any otehr ideas/solutions?
function RDTSC: Int64;
asm
dw $310F {BASM doesn't support RDTSC}
end;
That's the function which tells you the number of clock cycles. Call it twice, calculate the difference and you'll know how many clock cycles some task took, no matter at which speed the performance test was run. :-)
For memory usage, you could use GlobalMemoryStatus() to get this information. Unfortunately, it's not that precise. Still, if you calculate differences before and after every solution you would have some good indication. Or use GetHeapStatus() to get the Delphi status of it's internal heap.
asm
dw $310F {BASM doesn't support RDTSC}
end;
That's the function which tells you the number of clock cycles. Call it twice, calculate the difference and you'll know how many clock cycles some task took, no matter at which speed the performance test was run. :-)
For memory usage, you could use GlobalMemoryStatus() to get this information. Unfortunately, it's not that precise. Still, if you calculate differences before and after every solution you would have some good indication. Or use GetHeapStatus() to get the Delphi status of it's internal heap.
ASKER
THAT is what I was talking about :D thanks a bunch alex ;)
good idea with the getheapstatus. though it requires some research ormaybe testing to see how exact it is.
good idea with the getheapstatus. though it requires some research ormaybe testing to see how exact it is.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
And the form, in case anyone wants to try with my interface:
object Form1: TForm1
Left = 0
Top = 0
Caption = 'Form1'
ClientHeight = 351
ClientWidth = 723
Color = clBtnFace
Font.Charset = DEFAULT_CHARSET
Font.Color = clWindowText
Font.Height = -11
Font.Name = 'Tahoma'
Font.Style = []
OldCreateOrder = False
DesignSize = (
723
351)
PixelsPerInch = 96
TextHeight = 13
object Label1: TLabel
Left = 24
Top = 195
Width = 57
Height = 25
Anchors = [akLeft, akBottom]
AutoSize = False
Caption = 'Text size:'
Layout = tlCenter
ExplicitTop = 207
end
object Memo1: TMemo
Left = 0
Top = 0
Width = 723
Height = 189
Align = alTop
Anchors = [akLeft, akTop, akRight, akBottom]
Font.Charset = DEFAULT_CHARSET
Font.Color = clWindowText
Font.Height = -11
Font.Name = 'Courier New'
Font.Style = []
Lines.Strings = (
'Press [Generate text] or paste any text here')
ParentFont = False
ScrollBars = ssVertical
TabOrder = 0
ExplicitHeight = 201
end
object btnGenerateText: TButton
Left = 288
Top = 195
Width = 139
Height = 25
Anchors = [akBottom]
Caption = 'Generate text'
TabOrder = 1
OnClick = btnGenerateTextClick
ExplicitTop = 207
end
object btnLetTheTestsBegin: TButton
Left = 545
Top = 195
Width = 152
Height = 25
Anchors = [akRight, akBottom]
Caption = 'Let the tests begin!'
TabOrder = 2
OnClick = btnLetTheTestsBeginClick
ExplicitTop = 207
end
object Edit1: TEdit
Left = 87
Top = 195
Width = 121
Height = 25
Anchors = [akLeft, akBottom]
TabOrder = 3
Text = '100000'
ExplicitTop = 207
end
object Memo2: TMemo
Left = 0
Top = 226
Width = 723
Height = 125
Align = alBottom
Lines.Strings = (
'Test results')
TabOrder = 4
ExplicitTop = 238
end
end
object Form1: TForm1
Left = 0
Top = 0
Caption = 'Form1'
ClientHeight = 351
ClientWidth = 723
Color = clBtnFace
Font.Charset = DEFAULT_CHARSET
Font.Color = clWindowText
Font.Height = -11
Font.Name = 'Tahoma'
Font.Style = []
OldCreateOrder = False
DesignSize = (
723
351)
PixelsPerInch = 96
TextHeight = 13
object Label1: TLabel
Left = 24
Top = 195
Width = 57
Height = 25
Anchors = [akLeft, akBottom]
AutoSize = False
Caption = 'Text size:'
Layout = tlCenter
ExplicitTop = 207
end
object Memo1: TMemo
Left = 0
Top = 0
Width = 723
Height = 189
Align = alTop
Anchors = [akLeft, akTop, akRight, akBottom]
Font.Charset = DEFAULT_CHARSET
Font.Color = clWindowText
Font.Height = -11
Font.Name = 'Courier New'
Font.Style = []
Lines.Strings = (
'Press [Generate text] or paste any text here')
ParentFont = False
ScrollBars = ssVertical
TabOrder = 0
ExplicitHeight = 201
end
object btnGenerateText: TButton
Left = 288
Top = 195
Width = 139
Height = 25
Anchors = [akBottom]
Caption = 'Generate text'
TabOrder = 1
OnClick = btnGenerateTextClick
ExplicitTop = 207
end
object btnLetTheTestsBegin: TButton
Left = 545
Top = 195
Width = 152
Height = 25
Anchors = [akRight, akBottom]
Caption = 'Let the tests begin!'
TabOrder = 2
OnClick = btnLetTheTestsBeginClick
ExplicitTop = 207
end
object Edit1: TEdit
Left = 87
Top = 195
Width = 121
Height = 25
Anchors = [akLeft, akBottom]
TabOrder = 3
Text = '100000'
ExplicitTop = 207
end
object Memo2: TMemo
Left = 0
Top = 226
Width = 723
Height = 125
Align = alBottom
Lines.Strings = (
'Test results')
TabOrder = 4
ExplicitTop = 238
end
end
Nice test, alkisg. Too bad you used some older code of mine, which used the Length() function multiple times. Try using the last code example that I posted and test it again. :-)
...and a comment:
Mokule didn't test for long enough strings. Delete/Insert/+ etc allocate new memory all the time, so on big strings (e.g. 1Mb) they suck. On 10Mb they would even cause memory swapping and would take all day.
That's the main reason I had had implemented a replaceall substitude.
The result may be BIGGER than s, so I kept all the match offsets to use ONLY ONE ALLOCATION.
function StringReplaceAll(const s, oldPattern, newPattern: AnsiString): AnsiString;
var
offsets: array of integer;
ofs, lastResultOfs, lastSOfs, numOffsets, i: integer;
begin
ofs := PosEx(oldPattern, s, 1); //returns 0 even for Length(oldPattern) = 0
if ofs = 0 then
begin
Result := s;
Exit;
end;
if Length(oldPattern) = Length(newPattern) then //both <> 0
begin
Result := s;
repeat //ofs <> 0 on the first loop
StrMove(@Result[ofs], @newPattern[1], Length(newPattern));
ofs := PosEx(oldPattern, s, ofs + Length(oldPattern));
until ofs = 0;
Exit;
end;
//The general case: Length(oldPattern) <> Length(newPattern). Find all offsets first.
numOffsets := 0;
repeat //ofs <> 0 on the first loop
if numOffsets > High(offsets) then
SetLength(offsets, 100 + MulDiv(Length(offsets), 150, 100));
offsets[numOffsets] := ofs;
Inc(numOffsets);
ofs := PosEx(oldPattern, s, ofs + Length(oldPattern));
until ofs = 0;
SetLength(offsets, numOffsets);
//Build Result length. Length(newPattern)-Length( oldPattern ) could be < 0.
SetLength(Result, Length(s) + numOffsets * (Length(newPattern) - Length(oldPattern)));
//Do the copying.
lastResultOfs := 1;
lastSOfs := 1;
for i := 0 to numOffsets - 1 do
begin
StrMove(@Result[lastResult Ofs], @s[lastSOfs], offsets[i] - lastSOfs);
Inc(lastResultOfs, offsets[i] - lastSOfs);
StrMove(@Result[lastResult Ofs], PAnsiChar(newPattern), Length(newPattern));
Inc(lastResultOfs, Length(newPattern));
lastSOfs := offsets[i] + Length(oldPattern);
end;
StrMove(@Result[lastResult Ofs], @s[lastSOfs],
Length(Result) - lastResultOfs + 1);
end;
Mokule didn't test for long enough strings. Delete/Insert/+ etc allocate new memory all the time, so on big strings (e.g. 1Mb) they suck. On 10Mb they would even cause memory swapping and would take all day.
That's the main reason I had had implemented a replaceall substitude.
The result may be BIGGER than s, so I kept all the match offsets to use ONLY ONE ALLOCATION.
function StringReplaceAll(const s, oldPattern, newPattern: AnsiString): AnsiString;
var
offsets: array of integer;
ofs, lastResultOfs, lastSOfs, numOffsets, i: integer;
begin
ofs := PosEx(oldPattern, s, 1); //returns 0 even for Length(oldPattern) = 0
if ofs = 0 then
begin
Result := s;
Exit;
end;
if Length(oldPattern) = Length(newPattern) then //both <> 0
begin
Result := s;
repeat //ofs <> 0 on the first loop
StrMove(@Result[ofs], @newPattern[1], Length(newPattern));
ofs := PosEx(oldPattern, s, ofs + Length(oldPattern));
until ofs = 0;
Exit;
end;
//The general case: Length(oldPattern) <> Length(newPattern). Find all offsets first.
numOffsets := 0;
repeat //ofs <> 0 on the first loop
if numOffsets > High(offsets) then
SetLength(offsets, 100 + MulDiv(Length(offsets), 150, 100));
offsets[numOffsets] := ofs;
Inc(numOffsets);
ofs := PosEx(oldPattern, s, ofs + Length(oldPattern));
until ofs = 0;
SetLength(offsets, numOffsets);
//Build Result length. Length(newPattern)-Length(
SetLength(Result, Length(s) + numOffsets * (Length(newPattern) - Length(oldPattern)));
//Do the copying.
lastResultOfs := 1;
lastSOfs := 1;
for i := 0 to numOffsets - 1 do
begin
StrMove(@Result[lastResult
Inc(lastResultOfs, offsets[i] - lastSOfs);
StrMove(@Result[lastResult
Inc(lastResultOfs, Length(newPattern));
lastSOfs := offsets[i] + Length(oldPattern);
end;
StrMove(@Result[lastResult
Length(Result) - lastResultOfs + 1);
end;
Alex, I used your last procedure, but converted it into a function.
I also did 100 loops and got the averages, to provide more accurate results.
Running with Length(text) = 10000000
AlkisgStripMultipleSpaces completed in 129 milliseconds.
Workshop_AlexStripMultiple Spaces completed in 148 milliseconds.
MokuleStripMultipleSpaces completed in 148 milliseconds.
THE INTERESTING PART:
-------------------------- --
I changed the parameters of the source string, to have fewer continued spaces:
const
OneSpaceProbability = 1;
MoreSpacesProbability = 60;
//combine the above 2 probabilities to find the probability of three or more spaces
Results:
Running with Length(text) = 10000000
AlkigStripMultipleSpaces completed in 83 milliseconds.
Workshop_AlexStripMultiple Spaces completed in 123 milliseconds.
MokuleStripMultipleSpaces completed in 123 milliseconds.
My function uses Move, which uses rep movsb, and that's why it is faster...
I also did 100 loops and got the averages, to provide more accurate results.
Running with Length(text) = 10000000
AlkisgStripMultipleSpaces completed in 129 milliseconds.
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 148 milliseconds.
THE INTERESTING PART:
--------------------------
I changed the parameters of the source string, to have fewer continued spaces:
const
OneSpaceProbability = 1;
MoreSpacesProbability = 60;
//combine the above 2 probabilities to find the probability of three or more spaces
Results:
Running with Length(text) = 10000000
AlkigStripMultipleSpaces completed in 83 milliseconds.
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 123 milliseconds.
My function uses Move, which uses rep movsb, and that's why it is faster...
ASKER
even though some solutions are very slow, I would like you all to include ALL solutions in your comparative tests. it is always a good thing to find a comparison between a good solution and a bad solution to see excatly how bad it is ;)
so please also give numbers for the other solutions as well :)
thanks
so please also give numbers for the other solutions as well :)
thanks
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
@ciuly: sorry, this is IMPOSSIBLE for such large strings...
It would probably take a day or more...
I tried it with Delphi's functions once, and waited 17 hours...
I'll post a comparison for smaller strings (< 1Mb, maybe even < 100Kb) in a while...
It would probably take a day or more...
I tried it with Delphi's functions once, and waited 17 hours...
I'll post a comparison for smaller strings (< 1Mb, maybe even < 100Kb) in a while...
And as an additional suggestion, use RDTSC instead of GetTickCounts to count CPU cycles instead of CPU times. CPU Cycles should be about the same for all computers with similar processors. Thus if you and I are both using an Intel Pentium IV system but yours is 3.2 GHz and mine only 1.8 GHZ then you would get smaller CPU times but we should both get about the same number of cycles. I just have to wait twice as long. :-)
Btw, the use of move would speed things up of course, at assembler level, but it shouldn't have to make that much difference. Actually, in a situation where you have many one-character words between lots of spaces then you might not have much gain there.
Then again, my solution isn't even using the most optimized Delphi code. It should be possible to make it faster, by doing more with PChar() typecasts...
Btw, the use of move would speed things up of course, at assembler level, but it shouldn't have to make that much difference. Actually, in a situation where you have many one-character words between lots of spaces then you might not have much gain there.
Then again, my solution isn't even using the most optimized Delphi code. It should be possible to make it faster, by doing more with PChar() typecasts...
Well done, Russell! On first glance, I'd say yours is the fastest... :-)
ASKER
alkisg: I did mention using 100KB strings. if the method is still very slow, you can use a string of say 10KB and mention this that the moethod takes 4 mionutes wheres the otehr methods take less thn 100 ms or something like this. the user will understand then that there is a very big difference between the 2 ;)
and of course alexs suggestion of calculating cpu cycles is a very good one, so lets have a shoot of that. or both :)
and of course alexs suggestion of calculating cpu cycles is a very good one, so lets have a shoot of that. or both :)
I included the other expert's functions as well.
All the test were run in a loop (100 times).
The times/cycles are for all the loops (no average).
CPU = Celeron 1.6GHz
Text size = 100.000
OneSpaceProbability = 1; //approximately one space per 100 chars
MoreSpacesProbability = 60; //60% that another space follows
//combine the above 2 probabilities to find the probability of three or more spaces
WARNING: For the functions that use string Deletions/Additions etc, the time grows exponentially with the text size. This is because they alloc memory for each delete/add, and large memory allocations are very time consuming.
1) CPU CYCLES
-------------------------- -
To measure CPU cycles, I used
function RDTSC: Int64; assembler;
asm
RDTSC // result Int64 in EAX and EDX
end;
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 47238688 CPU cycles.
Workshop_AlexStripMultiple Spaces completed in 131259408 CPU cycles.
MokuleStripMultipleSpaces completed in 116306592 CPU cycles.
RussellStripMultipleSpaces completed in 73143008 CPU cycles.
CiulyStripMultipleSpaces completed in 8345003120 CPU cycles.
LimbeckStripMultipleSpaces completed in 103809248160 CPU cycles.
RmeereSubstituteSpaces completed in 4557413756 CPU cycles.
... and had errors!
MahdiparakStripMultipleSpa ces completed in 4792759152 CPU cycles.
... and had errors!
2) TIMEGETTIME
-------------------------- -
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 30 milliseconds.
Workshop_AlexStripMultiple Spaces completed in 74 milliseconds.
MokuleStripMultipleSpaces completed in 80 milliseconds.
RussellStripMultipleSpaces completed in 55 milliseconds.
CiulyStripMultipleSpaces completed in 4739 milliseconds.
LimbeckStripMultipleSpaces completed in 62763 milliseconds.
RmeereSubstituteSpaces completed in 3053 milliseconds.
... and had errors!
MahdiparakStripMultipleSpa ces completed in 3133 milliseconds.
... and had errors!
THE CODE
-------------------------- -
unit Unit1;
interface
uses
Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
Dialogs, StdCtrls, StrUtils, MMSystem;
type
TForm1 = class(TForm)
Memo1: TMemo;
btnGenerateText: TButton;
btnLetTheTestsBegin: TButton;
Label1: TLabel;
Edit1: TEdit;
Memo2: TMemo;
procedure btnGenerateTextClick(Sende r: TObject);
procedure btnLetTheTestsBeginClick(S ender: TObject);
end;
var
Form1: TForm1;
implementation
{$R *.dfm}
procedure TForm1.btnGenerateTextClic k(Sender: TObject);
const
OneSpaceProbability = 1;
MoreSpacesProbability = 60;
//combine the above 2 probabilities to find the probability of three or more spaces
var
i: Integer;
lastOneWasSpace: Boolean;
theText: String;
textSize: Integer;
begin
Randomize;
textSize := StrToInt(Edit1.Text);
SetLength(theText, textSize);
i := 1;
lastOneWasSpace := false;
while i <= textSize do
begin
if i mod 82 = 81 then //insert line breaks every 80 chars.
begin
theText[i] := #13;
Inc(i);
theText[i] := #10;
end
else if lastOneWasSpace then
begin
if Random(100) < MoreSpacesProbability then
theText[i] := ' '
else
theText[i] := char(ord('a') + random(26));
end
else begin
if Random(100) < OneSpaceProbability then
theText[i] := ' '
else
theText[i] := char(ord('a') + random(26));
end;
lastOneWasSpace := theText[i] = ' ';
Inc(i);
end;
Memo1.Text := theText;
end;
function AlkisgStripMultipleSpaces( s: String): String;
var
spacesStart, nextWordStart, nextSpacesStart, whereToMove, howManyToMove: Integer;
finished: Boolean;
begin
Result := s;
//Working only with Result from now on.
//It would be faster if we were permitted to use s directly.
spacesStart := Pos(' ', Result);
whereToMove := spacesStart; //a single space followed by the word should be put in this position
finished := spacesStart <= 0;
while not finished do
begin
nextWordStart := spacesStart + 2;
{$IFOPT R+}{$DEFINE RestoreRangeChecking}{$ELS E}{$UNDEF RestoreRangeChecking}{$END IF}{$R-}
//Range checking is turned off to avoid checking if nextWordStart <= Length(Result)
//It would be much faster if assembly rep scasb was used
while Result[nextWordStart] = ' ' do //this will also stop on #0 on end of string
Inc(nextWordStart);
{$IFDEF RestoreRangeChecking}{$R+} {$ENDIF}
nextSpacesStart := PosEx(' ', Result, nextWordStart);
finished := nextSpacesStart <= 0;
if not finished then
begin
howManyToMove := nextSpacesStart - nextWordStart + 1;
Move(Result[nextWordStart - 1], Result[whereToMove], howManyToMove);
Inc(whereToMove, howManyToMove);
spacesStart := nextSpacesStart;
end
else begin
howManyToMove := Length(Result) + 1 - nextWordStart + 1;
Move(Result[nextWordStart - 1], Result[whereToMove], howManyToMove);
SetLength(Result, whereToMove + howManyToMove - 1);
end;
end;
end;
function Workshop_AlexStripMultiple Spaces(s: string): string;
var
I, J: Integer;
C: PChar;
Max: Integer;
begin
Result := s;
I := Pos(#32#32, Result);
if (I > 0) then
begin
UniqueString(Result);
Max := Length(Result);
C := PChar(@Result[I]);
repeat
Inc(I);
Inc(C);
J := Succ(I);
while (J <= Max) and (Result[J] = #32) do
Inc(J);
while (J <= Max) do
begin
if (Result[Pred(I)] = #32) then
begin
while (J <= Max) and (Result[J] = #32) do
Inc(J);
end;
if (J <= Max) then
begin
C^ := Result[J];
Inc(I);
Inc(C);
Inc(J);
end;
end;
until (J > Max);
SetLength(Result, Pred(I));
end;
end;
function CiulyStripMultipleSpaces(s :String):s tring;
var
i:integer;
begin
i:=1;
while i<=length(s) do
begin
while (i<=length(s)) and (s[i]<>' ') do
i:=i+1;
i:=+i+1;
while (i<=length(s)) and (s[i]=' ') do
delete(s,i,1);// I don't know excatly how badly this affects performance. it could be calculated.
end;
result:=s;
end;
function MokuleStripMultipleSpaces( s:String): string;
var
j: integer;
first: boolean;
p: ^Char;
len: integer;
begin
Result := s;
p := @Result[1];
first := True;
len := Length(Result);
for j := 1 to len do
begin
if Result[j] <> ' ' then
begin
p^ := Result[j];
Inc(p);
first := True;
end
else if first then
begin
p^ := Result[j];
Inc(p);
first := False;
end
end;
j := integer(p)-integer(@Result [1]);
SetLength(Result,j);
end;
function RussellStripMultipleSpaces (Value: String): String;
var lpszSource: PChar;
lpszDest: PChar;
dwLength: Integer;
dwIndex: Integer;
begin
// Get source length
dwLength:=Length(Value);
//Check source length
if (dwLength > 1) then
begin
// Make string safe for casting
SetLength(result, dwLength);
// Get pointer to source
lpszSource:=Pointer(Value) ;
// Get pointer to dest
lpszDest:=Pointer(result);
// Walk the string
for dwIndex:=0 to Pred(dwLength) do
begin
// Check for double spaces
if not(Word(Pointer(lpszSourc e)^) = 8224) then
begin
// Copy source
lpszDest^:=lpszSource^;
// Increment destination pointer
Inc(lpszDest);
end;
// Increment the source string
Inc(lpszSource);
end;
// Truncate string
SetLength(result, lpszDest - PChar(Pointer(result)));
end
else
// Straight copy
result:=Value;
end;
function LimbeckStripMultipleSpaces (s: String): String;
begin
Result := s;
while Pos(' ', Result) > 0 do
Result := StringReplace(Result, ' ', ' ', [rfReplaceAll]);
end;
function MahdiparakStripMultipleSpa ces(s: String): String;
var
i:Integer;
space:Boolean;
begin
s := s + ' ';
for i:=1 to Length(s) do
begin
if s[i]<>' ' then
begin
Result:=Result+s[i];
space:=false;
end
else
begin
if space<>true then
begin
Result:=Result+' ';
space:=true;
end;
end;
end;
end;
function RmeereSubstituteSpaces(sIn : string) : string;
var
i : Integer;
begin
i := 1;
while i < Length(sIn) do
begin
if sIn[i] <> ' ' then
begin
Result := Result + sIn[i];
Inc(i);
Continue;
end;
Result := Result + ' ';
while (sIn[i] = ' ') and (i < Length(sIn)) do
Inc(i);
end;
end;
function TimeGetTime: Int64; assembler;
asm
RDTSC // result Int64 in EAX and EDX
end;
procedure TForm1.btnLetTheTestsBegin Click(Send er: TObject);
type
SubstituteSpacesFunction = record
ssName: string;
ssFunction: function(s: string) : string;
end;
const
functions: array[0..7] of SubstituteSpacesFunction = (
(ssName: 'AlkisgStripMultipleSpaces '; ssFunction: AlkisgStripMultipleSpaces) ,
(ssName: 'Workshop_AlexStripMultipl eSpaces'; ssFunction: Workshop_AlexStripMultiple Spaces),
(ssName: 'MokuleStripMultipleSpaces '; ssFunction: MokuleStripMultipleSpaces) ,
(ssName: 'RussellStripMultipleSpace s'; ssFunction: RussellStripMultipleSpaces ),
(ssName: 'CiulyStripMultipleSpaces' ; ssFunction: CiulyStripMultipleSpaces),
(ssName: 'LimbeckStripMultipleSpace s'; ssFunction: LimbeckStripMultipleSpaces ),
(ssName: 'RmeereSubstituteSpaces'; ssFunction: RmeereSubstituteSpaces),
(ssName: 'MahdiparakStripMultipleSp aces'; ssFunction: MahdiparakStripMultipleSpa ces));
var
theText, compacted, correctStr: String;
startMS, endMS: Int64;
f, i: integer;
begin
theText := Memo1.Text;
Edit1.Text := IntToStr(Length(theText));
Memo2.Text := 'Running with Length(text) = ' + Edit1.Text;
Application.ProcessMessage s;
correctStr := AlkisgStripMultipleSpaces( theText); //:-) OK, its just for comparisons...
for f := Low(functions) to High(functions) do
begin
startMS := TimeGetTime;
for i := 1 to 100 do
compacted := functions[f].ssFunction(th eText);
endMs := TimeGetTime;
Memo2.Lines.Add(functions[ f].ssName + ' completed in '
+ IntToStr((endMs - StartMs) div 1) + ' CPU cycles.');
if compacted <> correctStr then
Memo2.Lines.Add(' ... and had errors!');
Application.ProcessMessage s;
Sleep(100); //This is to give Windows some time to adjust paged memory
end;
end;
end.
All the test were run in a loop (100 times).
The times/cycles are for all the loops (no average).
CPU = Celeron 1.6GHz
Text size = 100.000
OneSpaceProbability = 1; //approximately one space per 100 chars
MoreSpacesProbability = 60; //60% that another space follows
//combine the above 2 probabilities to find the probability of three or more spaces
WARNING: For the functions that use string Deletions/Additions etc, the time grows exponentially with the text size. This is because they alloc memory for each delete/add, and large memory allocations are very time consuming.
1) CPU CYCLES
--------------------------
To measure CPU cycles, I used
function RDTSC: Int64; assembler;
asm
RDTSC // result Int64 in EAX and EDX
end;
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 47238688 CPU cycles.
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 116306592 CPU cycles.
RussellStripMultipleSpaces
CiulyStripMultipleSpaces completed in 8345003120 CPU cycles.
LimbeckStripMultipleSpaces
RmeereSubstituteSpaces completed in 4557413756 CPU cycles.
... and had errors!
MahdiparakStripMultipleSpa
... and had errors!
2) TIMEGETTIME
--------------------------
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 30 milliseconds.
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 80 milliseconds.
RussellStripMultipleSpaces
CiulyStripMultipleSpaces completed in 4739 milliseconds.
LimbeckStripMultipleSpaces
RmeereSubstituteSpaces completed in 3053 milliseconds.
... and had errors!
MahdiparakStripMultipleSpa
... and had errors!
THE CODE
--------------------------
unit Unit1;
interface
uses
Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
Dialogs, StdCtrls, StrUtils, MMSystem;
type
TForm1 = class(TForm)
Memo1: TMemo;
btnGenerateText: TButton;
btnLetTheTestsBegin: TButton;
Label1: TLabel;
Edit1: TEdit;
Memo2: TMemo;
procedure btnGenerateTextClick(Sende
procedure btnLetTheTestsBeginClick(S
end;
var
Form1: TForm1;
implementation
{$R *.dfm}
procedure TForm1.btnGenerateTextClic
const
OneSpaceProbability = 1;
MoreSpacesProbability = 60;
//combine the above 2 probabilities to find the probability of three or more spaces
var
i: Integer;
lastOneWasSpace: Boolean;
theText: String;
textSize: Integer;
begin
Randomize;
textSize := StrToInt(Edit1.Text);
SetLength(theText, textSize);
i := 1;
lastOneWasSpace := false;
while i <= textSize do
begin
if i mod 82 = 81 then //insert line breaks every 80 chars.
begin
theText[i] := #13;
Inc(i);
theText[i] := #10;
end
else if lastOneWasSpace then
begin
if Random(100) < MoreSpacesProbability then
theText[i] := ' '
else
theText[i] := char(ord('a') + random(26));
end
else begin
if Random(100) < OneSpaceProbability then
theText[i] := ' '
else
theText[i] := char(ord('a') + random(26));
end;
lastOneWasSpace := theText[i] = ' ';
Inc(i);
end;
Memo1.Text := theText;
end;
function AlkisgStripMultipleSpaces(
var
spacesStart, nextWordStart, nextSpacesStart, whereToMove, howManyToMove: Integer;
finished: Boolean;
begin
Result := s;
//Working only with Result from now on.
//It would be faster if we were permitted to use s directly.
spacesStart := Pos(' ', Result);
whereToMove := spacesStart; //a single space followed by the word should be put in this position
finished := spacesStart <= 0;
while not finished do
begin
nextWordStart := spacesStart + 2;
{$IFOPT R+}{$DEFINE RestoreRangeChecking}{$ELS
//Range checking is turned off to avoid checking if nextWordStart <= Length(Result)
//It would be much faster if assembly rep scasb was used
while Result[nextWordStart] = ' ' do //this will also stop on #0 on end of string
Inc(nextWordStart);
{$IFDEF RestoreRangeChecking}{$R+}
nextSpacesStart := PosEx(' ', Result, nextWordStart);
finished := nextSpacesStart <= 0;
if not finished then
begin
howManyToMove := nextSpacesStart - nextWordStart + 1;
Move(Result[nextWordStart - 1], Result[whereToMove], howManyToMove);
Inc(whereToMove, howManyToMove);
spacesStart := nextSpacesStart;
end
else begin
howManyToMove := Length(Result) + 1 - nextWordStart + 1;
Move(Result[nextWordStart - 1], Result[whereToMove], howManyToMove);
SetLength(Result, whereToMove + howManyToMove - 1);
end;
end;
end;
function Workshop_AlexStripMultiple
var
I, J: Integer;
C: PChar;
Max: Integer;
begin
Result := s;
I := Pos(#32#32, Result);
if (I > 0) then
begin
UniqueString(Result);
Max := Length(Result);
C := PChar(@Result[I]);
repeat
Inc(I);
Inc(C);
J := Succ(I);
while (J <= Max) and (Result[J] = #32) do
Inc(J);
while (J <= Max) do
begin
if (Result[Pred(I)] = #32) then
begin
while (J <= Max) and (Result[J] = #32) do
Inc(J);
end;
if (J <= Max) then
begin
C^ := Result[J];
Inc(I);
Inc(C);
Inc(J);
end;
end;
until (J > Max);
SetLength(Result, Pred(I));
end;
end;
function CiulyStripMultipleSpaces(s
var
i:integer;
begin
i:=1;
while i<=length(s) do
begin
while (i<=length(s)) and (s[i]<>' ') do
i:=i+1;
i:=+i+1;
while (i<=length(s)) and (s[i]=' ') do
delete(s,i,1);// I don't know excatly how badly this affects performance. it could be calculated.
end;
result:=s;
end;
function MokuleStripMultipleSpaces(
var
j: integer;
first: boolean;
p: ^Char;
len: integer;
begin
Result := s;
p := @Result[1];
first := True;
len := Length(Result);
for j := 1 to len do
begin
if Result[j] <> ' ' then
begin
p^ := Result[j];
Inc(p);
first := True;
end
else if first then
begin
p^ := Result[j];
Inc(p);
first := False;
end
end;
j := integer(p)-integer(@Result
SetLength(Result,j);
end;
function RussellStripMultipleSpaces
var lpszSource: PChar;
lpszDest: PChar;
dwLength: Integer;
dwIndex: Integer;
begin
// Get source length
dwLength:=Length(Value);
//Check source length
if (dwLength > 1) then
begin
// Make string safe for casting
SetLength(result, dwLength);
// Get pointer to source
lpszSource:=Pointer(Value)
// Get pointer to dest
lpszDest:=Pointer(result);
// Walk the string
for dwIndex:=0 to Pred(dwLength) do
begin
// Check for double spaces
if not(Word(Pointer(lpszSourc
begin
// Copy source
lpszDest^:=lpszSource^;
// Increment destination pointer
Inc(lpszDest);
end;
// Increment the source string
Inc(lpszSource);
end;
// Truncate string
SetLength(result, lpszDest - PChar(Pointer(result)));
end
else
// Straight copy
result:=Value;
end;
function LimbeckStripMultipleSpaces
begin
Result := s;
while Pos(' ', Result) > 0 do
Result := StringReplace(Result, ' ', ' ', [rfReplaceAll]);
end;
function MahdiparakStripMultipleSpa
var
i:Integer;
space:Boolean;
begin
s := s + ' ';
for i:=1 to Length(s) do
begin
if s[i]<>' ' then
begin
Result:=Result+s[i];
space:=false;
end
else
begin
if space<>true then
begin
Result:=Result+' ';
space:=true;
end;
end;
end;
end;
function RmeereSubstituteSpaces(sIn
var
i : Integer;
begin
i := 1;
while i < Length(sIn) do
begin
if sIn[i] <> ' ' then
begin
Result := Result + sIn[i];
Inc(i);
Continue;
end;
Result := Result + ' ';
while (sIn[i] = ' ') and (i < Length(sIn)) do
Inc(i);
end;
end;
function TimeGetTime: Int64; assembler;
asm
RDTSC // result Int64 in EAX and EDX
end;
procedure TForm1.btnLetTheTestsBegin
type
SubstituteSpacesFunction = record
ssName: string;
ssFunction: function(s: string) : string;
end;
const
functions: array[0..7] of SubstituteSpacesFunction = (
(ssName: 'AlkisgStripMultipleSpaces
(ssName: 'Workshop_AlexStripMultipl
(ssName: 'MokuleStripMultipleSpaces
(ssName: 'RussellStripMultipleSpace
(ssName: 'CiulyStripMultipleSpaces'
(ssName: 'LimbeckStripMultipleSpace
(ssName: 'RmeereSubstituteSpaces'; ssFunction: RmeereSubstituteSpaces),
(ssName: 'MahdiparakStripMultipleSp
var
theText, compacted, correctStr: String;
startMS, endMS: Int64;
f, i: integer;
begin
theText := Memo1.Text;
Edit1.Text := IntToStr(Length(theText));
Memo2.Text := 'Running with Length(text) = ' + Edit1.Text;
Application.ProcessMessage
correctStr := AlkisgStripMultipleSpaces(
for f := Low(functions) to High(functions) do
begin
startMS := TimeGetTime;
for i := 1 to 100 do
compacted := functions[f].ssFunction(th
endMs := TimeGetTime;
Memo2.Lines.Add(functions[
+ IntToStr((endMs - StartMs) div 1) + ' CPU cycles.');
if compacted <> correctStr then
Memo2.Lines.Add(' ... and had errors!');
Application.ProcessMessage
Sleep(100); //This is to give Windows some time to adjust paged memory
end;
end;
end.
Anyone run this on a Pentium 4 class (or above) system? I would, but only have D5 Enterprise and don't have the PosEx function...
The only reason I ask is that I know for fact that the rep(X), scas(X), etc functions are slower (time wise) than performing the same thing using single instructions. (like cmp, mov, etc). It would be interesting to know if the numbers stay the same....
Russell
rep(x) is SLOWER than cmp???
Wow, I've stopped assembly programming 12 years ago, I guess some things DO change!!!
In 80x86, if I remember correctly (it's been 12 years),
rep scasb used 1 cycle
cmp (reg to mem) 1 cycle
jmp 2 cycles
so it was 3 times faster...
I did some disassembly once in Pascal Pos, it used rep scasb.
I haven't looked at PosEx.
Now that you mentioned it, I looked at PosEx source code. It DOESN'T use rep scasb!!!
==> So I guess you are right here (and the timings are validated).
Since PosEx has an MPL licence, I guess it's OK to post it:
* ***** BEGIN LICENSE BLOCK *****
* Version: MPL 1.1
*
* The implementation of function PosEx is subject to the
* Mozilla Public License Version 1.1 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at http://www.mozilla.org/MPL/
*
* Software distributed under the License is distributed on an "AS IS" basis,
* WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
* for the specific language governing rights and limitations under the
* License.
*
* The Original Code is Fastcode
*
* The Initial Developer of the Original Code is Fastcode
*
* Portions created by the Initial Developer are Copyright (C) 2002-2004
* the Initial Developer. All Rights Reserved.
*
* Contributor(s): Aleksandr Sharahov
*
* ***** END LICENSE BLOCK ***** *)
function PosEx(const SubStr, S: string; Offset: Integer = 1): Integer;
asm
test eax, eax
jz @Nil
test edx, edx
jz @Nil
dec ecx
jl @Nil
push esi
push ebx
mov esi, [edx-4] //Length(Str)
mov ebx, [eax-4] //Length(Substr)
sub esi, ecx //effective length of Str
add edx, ecx //addr of the first char at starting position
cmp esi, ebx
jl @Past //jump if EffectiveLength(Str)<Lengt h(Substr)
test ebx, ebx
jle @Past //jump if Length(Substr)<=0
add esp, -12
add ebx, -1 //Length(Substr)-1
add esi, edx //addr of the terminator
add edx, ebx //addr of the last char at starting position
mov [esp+8], esi //save addr of the terminator
add eax, ebx //addr of the last char of Substr
sub ecx, edx //-@Str[Length(Substr)]
neg ebx //-(Length(Substr)-1)
mov [esp+4], ecx //save -@Str[Length(Substr)]
mov [esp], ebx //save -(Length(Substr)-1)
movzx ecx, byte ptr [eax] //the last char of Substr
@Loop:
cmp cl, [edx]
jz @Test0
@AfterTest0:
cmp cl, [edx+1]
jz @TestT
@AfterTestT:
add edx, 4
cmp edx, [esp+8]
jb @Continue
@EndLoop:
add edx, -2
cmp edx, [esp+8]
jb @Loop
@Exit:
add esp, 12
@Past:
pop ebx
pop esi
@Nil:
xor eax, eax
ret
@Continue:
cmp cl, [edx-2]
jz @Test2
cmp cl, [edx-1]
jnz @Loop
@Test1:
add edx, 1
@Test2:
add edx, -2
@Test0:
add edx, -1
@TestT:
mov esi, [esp]
test esi, esi
jz @Found
@String:
movzx ebx, word ptr [esi+eax]
cmp bx, word ptr [esi+edx+1]
jnz @AfterTestT
cmp esi, -2
jge @Found
movzx ebx, word ptr [esi+eax+2]
cmp bx, word ptr [esi+edx+3]
jnz @AfterTestT
add esi, 4
jl @String
@Found:
mov eax, [esp+4]
add edx, 2
cmp edx, [esp+8]
ja @Exit
add esp, 12
add eax, edx
pop ebx
pop esi
end;
Wow, I've stopped assembly programming 12 years ago, I guess some things DO change!!!
In 80x86, if I remember correctly (it's been 12 years),
rep scasb used 1 cycle
cmp (reg to mem) 1 cycle
jmp 2 cycles
so it was 3 times faster...
I did some disassembly once in Pascal Pos, it used rep scasb.
I haven't looked at PosEx.
Now that you mentioned it, I looked at PosEx source code. It DOESN'T use rep scasb!!!
==> So I guess you are right here (and the timings are validated).
Since PosEx has an MPL licence, I guess it's OK to post it:
* ***** BEGIN LICENSE BLOCK *****
* Version: MPL 1.1
*
* The implementation of function PosEx is subject to the
* Mozilla Public License Version 1.1 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at http://www.mozilla.org/MPL/
*
* Software distributed under the License is distributed on an "AS IS" basis,
* WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
* for the specific language governing rights and limitations under the
* License.
*
* The Original Code is Fastcode
*
* The Initial Developer of the Original Code is Fastcode
*
* Portions created by the Initial Developer are Copyright (C) 2002-2004
* the Initial Developer. All Rights Reserved.
*
* Contributor(s): Aleksandr Sharahov
*
* ***** END LICENSE BLOCK ***** *)
function PosEx(const SubStr, S: string; Offset: Integer = 1): Integer;
asm
test eax, eax
jz @Nil
test edx, edx
jz @Nil
dec ecx
jl @Nil
push esi
push ebx
mov esi, [edx-4] //Length(Str)
mov ebx, [eax-4] //Length(Substr)
sub esi, ecx //effective length of Str
add edx, ecx //addr of the first char at starting position
cmp esi, ebx
jl @Past //jump if EffectiveLength(Str)<Lengt
test ebx, ebx
jle @Past //jump if Length(Substr)<=0
add esp, -12
add ebx, -1 //Length(Substr)-1
add esi, edx //addr of the terminator
add edx, ebx //addr of the last char at starting position
mov [esp+8], esi //save addr of the terminator
add eax, ebx //addr of the last char of Substr
sub ecx, edx //-@Str[Length(Substr)]
neg ebx //-(Length(Substr)-1)
mov [esp+4], ecx //save -@Str[Length(Substr)]
mov [esp], ebx //save -(Length(Substr)-1)
movzx ecx, byte ptr [eax] //the last char of Substr
@Loop:
cmp cl, [edx]
jz @Test0
@AfterTest0:
cmp cl, [edx+1]
jz @TestT
@AfterTestT:
add edx, 4
cmp edx, [esp+8]
jb @Continue
@EndLoop:
add edx, -2
cmp edx, [esp+8]
jb @Loop
@Exit:
add esp, 12
@Past:
pop ebx
pop esi
@Nil:
xor eax, eax
ret
@Continue:
cmp cl, [edx-2]
jz @Test2
cmp cl, [edx-1]
jnz @Loop
@Test1:
add edx, 1
@Test2:
add edx, -2
@Test0:
add edx, -1
@TestT:
mov esi, [esp]
test esi, esi
jz @Found
@String:
movzx ebx, word ptr [esi+eax]
cmp bx, word ptr [esi+edx+1]
jnz @AfterTestT
cmp esi, -2
jge @Found
movzx ebx, word ptr [esi+eax+2]
cmp bx, word ptr [esi+edx+3]
jnz @AfterTestT
add esi, 4
jl @String
@Found:
mov eax, [esp+4]
add edx, 2
cmp edx, [esp+8]
ja @Exit
add esp, 12
add eax, edx
pop ebx
pop esi
end;
Just out of curiosity, on the weekend I'll do my own implementation of PosEx using rep scasb, and I'll time them both and I'll post the results here...
You *HAVE* to be right, Move also doesn't use rep movsb.
Yet, most other system functions (_Insert, _Delete, _Copy) do use rep(X)...
Move is also programmed by "FastCode" and under MPL license, while the others are (c) Borland.
Yet, most other system functions (_Insert, _Delete, _Copy) do use rep(X)...
Move is also programmed by "FastCode" and under MPL license, while the others are (c) Borland.
Alkisg,
Thanks for the PosEx post, much appreciated. And regarding the asm topic, I *do* know what I am talking about...
The P4 changed alot of things; one of those was a slowdown (compared to hand rolled code) for operations like:
mov(x)
rep
repne
scas(b)
bt(x) bit testing
Much of borland's code was written years ago, but you should find that newer code does not (or makes little use of) the instructions mentioned above.
Anyways, with the PosEx in place, I was able to setup your testbed, and also included the ms run time as well as the rdtsc time. The only changes were as follows:
// rdtsc not implemented in inline assembler for D5 and below
function TimeGetTime: Int64;
asm
db $0f,$31
end;
-- and --
// Some of these were REALLY slow, commented out so I could quickly baseline the first 5
procedure TForm1.btnLetTheTestsBegin Click(Send er: TObject);
type
SubstituteSpacesFunction = record
ssName: string;
ssFunction: function(s: string) : string;
end;
const
functions: array[0..4] of SubstituteSpacesFunction = (
(ssName: 'AlkisgStripMultipleSpaces '; ssFunction: AlkisgStripMultipleSpaces) ,
(ssName: 'Workshop_AlexStripMultipl eSpaces'; ssFunction: Workshop_AlexStripMultiple Spaces),
(ssName: 'MokuleStripMultipleSpaces '; ssFunction: MokuleStripMultipleSpaces) ,
(ssName: 'RussellStripMultipleSpace s'; ssFunction: RussellStripMultipleSpaces ),
(ssName: 'CiulyStripMultipleSpaces' ; ssFunction: CiulyStripMultipleSpaces)) ;
//
//(ssName: 'LimbeckStripMultipleSpace s'; ssFunction: LimbeckStripMultipleSpaces ),
//(ssName: 'RmeereSubstituteSpaces'; ssFunction: RmeereSubstituteSpaces),
//(ssName: 'MahdiparakStripMultipleSp aces'; ssFunction: MahdiparakStripMultipleSpa ces));
var
theText, compacted, correctStr: String;
startMS, endMS: Int64;
dwMark: Cardinal;
f, i: integer;
begin
theText := Memo1.Text;
Edit1.Text := IntToStr(Length(theText));
Memo2.Text := 'Running with Length(text) = ' + Edit1.Text;
Application.ProcessMessage s;
correctStr := AlkisgStripMultipleSpaces( theText); //:-) OK, its just for comparisons...
for f := Low(functions) to High(functions) do
begin
dwMark:=MMSystem.timeGetTi me;
startMS := TimeGetTime;
for i := 1 to 100 do
compacted := functions[f].ssFunction(th eText);
endMs := TimeGetTime;
dwMark:=MMSystem.timeGetTi me - dwMark;
Memo2.Lines.Add(functions[ f].ssName + ' completed in '
+ IntToStr((endMs - StartMs) div 1) + ' CPU cycles. ('+IntToStr(dwMark)+') ms');
if compacted <> correctStr then
Memo2.Lines.Add(' ... and had errors!');
Application.ProcessMessage s;
Sleep(100); //This is to give Windows some time to adjust paged memory
end;
end;
end.
On my P4 2.8GHZ / WinXP Pro / 1GB Ram I got the following results:
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 94923772 CPU cycles. (47) ms
Workshop_AlexStripMultiple Spaces completed in 141730824 CPU cycles. (47) ms
MokuleStripMultipleSpaces completed in 142410388 CPU cycles. (47) ms
RussellStripMultipleSpaces completed in 68480552 CPU cycles. (16) ms
CiulyStripMultipleSpaces completed in 10139402120 CPU cycles. (4000) ms
Thanks for the PosEx post, much appreciated. And regarding the asm topic, I *do* know what I am talking about...
The P4 changed alot of things; one of those was a slowdown (compared to hand rolled code) for operations like:
mov(x)
rep
repne
scas(b)
bt(x) bit testing
Much of borland's code was written years ago, but you should find that newer code does not (or makes little use of) the instructions mentioned above.
Anyways, with the PosEx in place, I was able to setup your testbed, and also included the ms run time as well as the rdtsc time. The only changes were as follows:
// rdtsc not implemented in inline assembler for D5 and below
function TimeGetTime: Int64;
asm
db $0f,$31
end;
-- and --
// Some of these were REALLY slow, commented out so I could quickly baseline the first 5
procedure TForm1.btnLetTheTestsBegin
type
SubstituteSpacesFunction = record
ssName: string;
ssFunction: function(s: string) : string;
end;
const
functions: array[0..4] of SubstituteSpacesFunction = (
(ssName: 'AlkisgStripMultipleSpaces
(ssName: 'Workshop_AlexStripMultipl
(ssName: 'MokuleStripMultipleSpaces
(ssName: 'RussellStripMultipleSpace
(ssName: 'CiulyStripMultipleSpaces'
//
//(ssName: 'LimbeckStripMultipleSpace
//(ssName: 'RmeereSubstituteSpaces'; ssFunction: RmeereSubstituteSpaces),
//(ssName: 'MahdiparakStripMultipleSp
var
theText, compacted, correctStr: String;
startMS, endMS: Int64;
dwMark: Cardinal;
f, i: integer;
begin
theText := Memo1.Text;
Edit1.Text := IntToStr(Length(theText));
Memo2.Text := 'Running with Length(text) = ' + Edit1.Text;
Application.ProcessMessage
correctStr := AlkisgStripMultipleSpaces(
for f := Low(functions) to High(functions) do
begin
dwMark:=MMSystem.timeGetTi
startMS := TimeGetTime;
for i := 1 to 100 do
compacted := functions[f].ssFunction(th
endMs := TimeGetTime;
dwMark:=MMSystem.timeGetTi
Memo2.Lines.Add(functions[
+ IntToStr((endMs - StartMs) div 1) + ' CPU cycles. ('+IntToStr(dwMark)+') ms');
if compacted <> correctStr then
Memo2.Lines.Add(' ... and had errors!');
Application.ProcessMessage
Sleep(100); //This is to give Windows some time to adjust paged memory
end;
end;
end.
On my P4 2.8GHZ / WinXP Pro / 1GB Ram I got the following results:
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 94923772 CPU cycles. (47) ms
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 142410388 CPU cycles. (47) ms
RussellStripMultipleSpaces
CiulyStripMultipleSpaces completed in 10139402120 CPU cycles. (4000) ms
Maybe Delphi 5 uses an older and slower Move() implementation, which uses repXXX, which as you said are slower on P4.
I include FastCode's Move() below, if you want you may test my function with it.
I tested AlkisgMove() with rep movsd (dword aligned)
vs FastCode's Move(), which mainly uses fild and fistp qword ptr ...
Results (1Mb data / 10000 loops):
AlkisgMove = 28525 ms
Move = 28253 ms
FastCode, well, has (a little!) faster code (maybe it'll be a lot faster on P4). But he uses 8-byte operands (qword ptr ...), which are faster that 4-byte operands (movsd)... I wish movsq existed to test it! :-)
Nevertheless, he also uses cmp/jle in the main loop. I guess instruction pipelining / branch prediction or whatever they call them has improved a lot since 80x86!!!
(* ***** BEGIN LICENSE BLOCK *****
* Version: MPL 1.1
*
* The assembly implementation of function Move is subject to the
* Mozilla Public License Version 1.1 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at http://www.mozilla.org/MPL/
*
* Software distributed under the License is distributed on an "AS IS" basis,
* WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
* for the specific language governing rights and limitations under the
* License.
*
* The Original Code is Fastcode
*
* The Initial Developer of the Original Code is Fastcode
*
* Portions created by the Initial Developer are Copyright (C) 2002-2004
* the Initial Developer. All Rights Reserved.
*
* Contributor(s): John O'Harrow
*
* ***** END LICENSE BLOCK ***** *)
procedure Move(const Source; var Dest; count : Integer);
{$IFDEF PUREPASCAL}
var
S, D: PChar;
I: Integer;
begin
S := PChar(@Source);
D := PChar(@Dest);
if S = D then Exit;
if Cardinal(D) > Cardinal(S) then
for I := count-1 downto 0 do
D[I] := S[I]
else
for I := 0 to count-1 do
D[I] := S[I];
end;
{$ELSE}
asm
cmp eax, edx
je @@Exit {Source = Dest}
cmp ecx, 32
ja @@LargeMove {Count > 32 or Count < 0}
sub ecx, 8
jg @@SmallMove
@@TinyMove: {0..8 Byte Move}
jmp dword ptr [@@JumpTable+32+ecx*4]
@@SmallMove: {9..32 Byte Move}
fild qword ptr [eax+ecx] {Load Last 8}
fild qword ptr [eax] {Load First 8}
cmp ecx, 8
jle @@Small16
fild qword ptr [eax+8] {Load Second 8}
cmp ecx, 16
jle @@Small24
fild qword ptr [eax+16] {Load Third 8}
fistp qword ptr [edx+16] {Save Third 8}
@@Small24:
fistp qword ptr [edx+8] {Save Second 8}
@@Small16:
fistp qword ptr [edx] {Save First 8}
fistp qword ptr [edx+ecx] {Save Last 8}
@@Exit:
ret
nop {4-Byte Align JumpTable}
nop
@@JumpTable: {4-Byte Aligned}
dd @@Exit, @@M01, @@M02, @@M03, @@M04, @@M05, @@M06, @@M07, @@M08
@@LargeForwardMove: {4-Byte Aligned}
push edx
fild qword ptr [eax] {First 8}
lea eax, [eax+ecx-8]
lea ecx, [ecx+edx-8]
fild qword ptr [eax] {Last 8}
push ecx
neg ecx
and edx, -8 {8-Byte Align Writes}
lea ecx, [ecx+edx+8]
pop edx
@FwdLoop:
fild qword ptr [eax+ecx]
fistp qword ptr [edx+ecx]
add ecx, 8
jl @FwdLoop
fistp qword ptr [edx] {Last 8}
pop edx
fistp qword ptr [edx] {First 8}
ret
@@LargeMove:
jng @@LargeDone {Count < 0}
cmp eax, edx
ja @@LargeForwardMove
sub edx, ecx
cmp eax, edx
lea edx, [edx+ecx]
jna @@LargeForwardMove
sub ecx, 8 {Backward Move}
push ecx
fild qword ptr [eax+ecx] {Last 8}
fild qword ptr [eax] {First 8}
add ecx, edx
and ecx, -8 {8-Byte Align Writes}
sub ecx, edx
@BwdLoop:
fild qword ptr [eax+ecx]
fistp qword ptr [edx+ecx]
sub ecx, 8
jg @BwdLoop
pop ecx
fistp qword ptr [edx] {First 8}
fistp qword ptr [edx+ecx] {Last 8}
@@LargeDone:
ret
@@M01:
movzx ecx, [eax]
mov [edx], cl
ret
@@M02:
movzx ecx, word ptr [eax]
mov [edx], cx
ret
@@M03:
mov cx, [eax]
mov al, [eax+2]
mov [edx], cx
mov [edx+2], al
ret
@@M04:
mov ecx, [eax]
mov [edx], ecx
ret
@@M05:
mov ecx, [eax]
mov al, [eax+4]
mov [edx], ecx
mov [edx+4], al
ret
@@M06:
mov ecx, [eax]
mov ax, [eax+4]
mov [edx], ecx
mov [edx+4], ax
ret
@@M07:
mov ecx, [eax]
mov eax, [eax+3]
mov [edx], ecx
mov [edx+3], eax
ret
@@M08:
fild qword ptr [eax]
fistp qword ptr [edx]
end;
{$ENDIF}
I include FastCode's Move() below, if you want you may test my function with it.
I tested AlkisgMove() with rep movsd (dword aligned)
vs FastCode's Move(), which mainly uses fild and fistp qword ptr ...
Results (1Mb data / 10000 loops):
AlkisgMove = 28525 ms
Move = 28253 ms
FastCode, well, has (a little!) faster code (maybe it'll be a lot faster on P4). But he uses 8-byte operands (qword ptr ...), which are faster that 4-byte operands (movsd)... I wish movsq existed to test it! :-)
Nevertheless, he also uses cmp/jle in the main loop. I guess instruction pipelining / branch prediction or whatever they call them has improved a lot since 80x86!!!
(* ***** BEGIN LICENSE BLOCK *****
* Version: MPL 1.1
*
* The assembly implementation of function Move is subject to the
* Mozilla Public License Version 1.1 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at http://www.mozilla.org/MPL/
*
* Software distributed under the License is distributed on an "AS IS" basis,
* WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
* for the specific language governing rights and limitations under the
* License.
*
* The Original Code is Fastcode
*
* The Initial Developer of the Original Code is Fastcode
*
* Portions created by the Initial Developer are Copyright (C) 2002-2004
* the Initial Developer. All Rights Reserved.
*
* Contributor(s): John O'Harrow
*
* ***** END LICENSE BLOCK ***** *)
procedure Move(const Source; var Dest; count : Integer);
{$IFDEF PUREPASCAL}
var
S, D: PChar;
I: Integer;
begin
S := PChar(@Source);
D := PChar(@Dest);
if S = D then Exit;
if Cardinal(D) > Cardinal(S) then
for I := count-1 downto 0 do
D[I] := S[I]
else
for I := 0 to count-1 do
D[I] := S[I];
end;
{$ELSE}
asm
cmp eax, edx
je @@Exit {Source = Dest}
cmp ecx, 32
ja @@LargeMove {Count > 32 or Count < 0}
sub ecx, 8
jg @@SmallMove
@@TinyMove: {0..8 Byte Move}
jmp dword ptr [@@JumpTable+32+ecx*4]
@@SmallMove: {9..32 Byte Move}
fild qword ptr [eax+ecx] {Load Last 8}
fild qword ptr [eax] {Load First 8}
cmp ecx, 8
jle @@Small16
fild qword ptr [eax+8] {Load Second 8}
cmp ecx, 16
jle @@Small24
fild qword ptr [eax+16] {Load Third 8}
fistp qword ptr [edx+16] {Save Third 8}
@@Small24:
fistp qword ptr [edx+8] {Save Second 8}
@@Small16:
fistp qword ptr [edx] {Save First 8}
fistp qword ptr [edx+ecx] {Save Last 8}
@@Exit:
ret
nop {4-Byte Align JumpTable}
nop
@@JumpTable: {4-Byte Aligned}
dd @@Exit, @@M01, @@M02, @@M03, @@M04, @@M05, @@M06, @@M07, @@M08
@@LargeForwardMove: {4-Byte Aligned}
push edx
fild qword ptr [eax] {First 8}
lea eax, [eax+ecx-8]
lea ecx, [ecx+edx-8]
fild qword ptr [eax] {Last 8}
push ecx
neg ecx
and edx, -8 {8-Byte Align Writes}
lea ecx, [ecx+edx+8]
pop edx
@FwdLoop:
fild qword ptr [eax+ecx]
fistp qword ptr [edx+ecx]
add ecx, 8
jl @FwdLoop
fistp qword ptr [edx] {Last 8}
pop edx
fistp qword ptr [edx] {First 8}
ret
@@LargeMove:
jng @@LargeDone {Count < 0}
cmp eax, edx
ja @@LargeForwardMove
sub edx, ecx
cmp eax, edx
lea edx, [edx+ecx]
jna @@LargeForwardMove
sub ecx, 8 {Backward Move}
push ecx
fild qword ptr [eax+ecx] {Last 8}
fild qword ptr [eax] {First 8}
add ecx, edx
and ecx, -8 {8-Byte Align Writes}
sub ecx, edx
@BwdLoop:
fild qword ptr [eax+ecx]
fistp qword ptr [edx+ecx]
sub ecx, 8
jg @BwdLoop
pop ecx
fistp qword ptr [edx] {First 8}
fistp qword ptr [edx+ecx] {Last 8}
@@LargeDone:
ret
@@M01:
movzx ecx, [eax]
mov [edx], cl
ret
@@M02:
movzx ecx, word ptr [eax]
mov [edx], cx
ret
@@M03:
mov cx, [eax]
mov al, [eax+2]
mov [edx], cx
mov [edx+2], al
ret
@@M04:
mov ecx, [eax]
mov [edx], ecx
ret
@@M05:
mov ecx, [eax]
mov al, [eax+4]
mov [edx], ecx
mov [edx+4], al
ret
@@M06:
mov ecx, [eax]
mov ax, [eax+4]
mov [edx], ecx
mov [edx+4], ax
ret
@@M07:
mov ecx, [eax]
mov eax, [eax+3]
mov [edx], ecx
mov [edx+3], eax
ret
@@M08:
fild qword ptr [eax]
fistp qword ptr [edx]
end;
{$ENDIF}
Well, the inline asm code for the move won't compile on D5, but I was able to run the last testbed I compiled on my 2 other systems. The following are the results (bumped the iterations from 100 to 1000):
----
Prescott 930 (3GHZ Dual Core) / Windows Media Center 2005 / 3GB Ram
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 868390033 CPU cycles. (309) ms
Workshop_AlexStripMultiple Spaces completed in 1663597012 CPU cycles. (593) ms
MokuleStripMultipleSpaces completed in 1237805814 CPU cycles. (441) ms
RussellStripMultipleSpaces completed in 743218154 CPU cycles. (265) ms
----
Pentium III 667 MHZ / Win2K / 128 MB Memory (Network file server)
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 1179376415 CPU cycles. (1773) ms
Workshop_AlexStripMultiple Spaces completed in 1849779541 CPU cycles. (2774) ms
MokuleStripMultipleSpaces completed in 1530019465 CPU cycles. (2293) ms
RussellStripMultipleSpaces completed in 973187178 CPU cycles. (1462) ms
Russell
----
Prescott 930 (3GHZ Dual Core) / Windows Media Center 2005 / 3GB Ram
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 868390033 CPU cycles. (309) ms
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1237805814 CPU cycles. (441) ms
RussellStripMultipleSpaces
----
Pentium III 667 MHZ / Win2K / 128 MB Memory (Network file server)
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 1179376415 CPU cycles. (1773) ms
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1530019465 CPU cycles. (2293) ms
RussellStripMultipleSpaces
Russell
Another way to try and determine the differences is this. I made the compiled testbed program (running top 4 functions @ 1000 iterations) available at:
http://users.adelphia.net/~rllibby/downloads/testbed.zip
If you run it, and find that your code runs faster, then it can be concluded that the processor is making the difference. (all 3 systems I ran it on showed similar scaled results). If you run it and find that my function is faster, then we are able to determine that its the Delphi compiler/source version that makes the difference.
Regards,
Russell
http://users.adelphia.net/~rllibby/downloads/testbed.zip
If you run it, and find that your code runs faster, then it can be concluded that the processor is making the difference. (all 3 systems I ran it on showed similar scaled results). If you run it and find that my function is faster, then we are able to determine that its the Delphi compiler/source version that makes the difference.
Regards,
Russell
(im just viewing this Q for curiousity)
i tested rllibbys program:
P III m 733~1000MHz, XP, 256MB RAM (the running system wasnt idle, around 35 processes were active):
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 1002606495 CPU cycles. (1006) ms
Workshop_AlexStripMultiple Spaces completed in 1902691726 CPU cycles. (1908) ms
MokuleStripMultipleSpaces completed in 1440543952 CPU cycles. (1444) ms
RussellStripMultipleSpaces completed in 1084997620 CPU cycles. (1088) ms
i have another system: P II 400MHz, 640MB RAM, either 98SE, 2000 AS or XP - tomorrow i will test it there. will also try to run it on the one above again under more "calm" times.
in case my benchmarks are interesting for you: i have also d5 - so i could compile any program while others of you try it with a higher version...
i tested rllibbys program:
P III m 733~1000MHz, XP, 256MB RAM (the running system wasnt idle, around 35 processes were active):
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 1002606495 CPU cycles. (1006) ms
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1440543952 CPU cycles. (1444) ms
RussellStripMultipleSpaces
i have another system: P II 400MHz, 640MB RAM, either 98SE, 2000 AS or XP - tomorrow i will test it there. will also try to run it on the one above again under more "calm" times.
in case my benchmarks are interesting for you: i have also d5 - so i could compile any program while others of you try it with a higher version...
wow this test is quite revealing, nice one ciuly. This goes to show that the fastest coding isn't necessarily the fastest code.
I must say that I'm quite embarressed that my function doesn't work properly and I'll admit that I just typed it in the browser without checking if it compiles. This taught me a lesson :).
I must say that I'm quite embarressed that my function doesn't work properly and I'll admit that I just typed it in the browser without checking if it compiles. This taught me a lesson :).
Actually, I now start wondering if we should not also take the Delphi version into account for these solutions. So actually, the test tool should be compiled with every version since Delphi 3 (since the original questioneer in the other Q uses D3) to the latest Turbo Delphi version. :-) Now, that would be a real interesting test. :-)
I could imagine that Borland used some slow methods in D5 which they improved in D7 but then had to replace with a slower one in D2005, for example. :-)
I could imagine that Borland used some slow methods in D5 which they improved in D7 but then had to replace with a slower one in D2005, for example. :-)
Processor & Delphi version make a BIG difference....
which is what I was aiming to illustrate, especially with the the comiled program that is available for download. Perfect example of the same code (delphi version does not come into play) that shows different results (fastest routine) based on the hardware it is run on.
Russell
Russel's testbed results on my Celeron 1.6GHz, 768Ram, XP home
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 1226723140 CPU cycles. (770) ms
Workshop_AlexStripMultiple Spaces completed in 1758365360 CPU cycles. (1103) ms
MokuleStripMultipleSpaces completed in 1795918512 CPU cycles. (1127) ms
RussellStripMultipleSpaces completed in 823440352 CPU cycles. (517) ms
Could somebody with a P4 also try my testbed?
http://www.alkisg.com/experts/AlkisgTestBed.zip
Compiled with D2006, optimized and with A VERY LOW SPACE PROBABILITY, OneSpaceProbability = 0.001 %.
Each of these algorithms searches and moves data.
Lowering the space probability reports as fastest the algorithm with the fastest searching part (as opposed to the moving part).
My purpose is to give PosEx a more significant role in this test...
My results:
Running with Length(text) = 1000000
AlkisgStripMultipleSpaces completed in 1198784164 CPU cycles (752 ms).
Workshop_AlexStripMultiple Spaces completed in 1820262932 CPU cycles (1142 ms).
MokuleStripMultipleSpaces completed in 1901184948 CPU cycles (1193 ms).
RussellStripMultipleSpaces completed in 2082098812 CPU cycles (1307 ms).
Running with Length(text) = 100000
AlkisgStripMultipleSpaces completed in 1226723140 CPU cycles. (770) ms
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1795918512 CPU cycles. (1127) ms
RussellStripMultipleSpaces
Could somebody with a P4 also try my testbed?
http://www.alkisg.com/experts/AlkisgTestBed.zip
Compiled with D2006, optimized and with A VERY LOW SPACE PROBABILITY, OneSpaceProbability = 0.001 %.
Each of these algorithms searches and moves data.
Lowering the space probability reports as fastest the algorithm with the fastest searching part (as opposed to the moving part).
My purpose is to give PosEx a more significant role in this test...
My results:
Running with Length(text) = 1000000
AlkisgStripMultipleSpaces completed in 1198784164 CPU cycles (752 ms).
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1901184948 CPU cycles (1193 ms).
RussellStripMultipleSpaces
My conclusions so far:
*) D2006 is faster that D5 :-) at least on Move... Congrats, FastCode!
*) Russell's function is faster than mine if many spaces exist, perhaps due to the "setup" cost of calling PosEx.
*) My function is faster that Russell's if only a few spaces exist, perhaps because PosEx can search faster if the next space is e.g. 10Kb after the current one. The "call" instruction cost is not very significant if PosEx is called only a few times for the whole 1Mb data...
*) Alex and Mokule functions are also faster than Russell's if only a few spaces exist. I don't know why's that, I'll look into the generated asm code later...
*) D2006 is faster that D5 :-) at least on Move... Congrats, FastCode!
*) Russell's function is faster than mine if many spaces exist, perhaps due to the "setup" cost of calling PosEx.
*) My function is faster that Russell's if only a few spaces exist, perhaps because PosEx can search faster if the next space is e.g. 10Kb after the current one. The "call" instruction cost is not very significant if PosEx is called only a few times for the whole 1Mb data...
*) Alex and Mokule functions are also faster than Russell's if only a few spaces exist. I don't know why's that, I'll look into the generated asm code later...
I ran the test on my first system, P4 2.8 GHZ. With no spaces (cleaned all spaces out of text), your code runs the fastest and mine runs the slowest. Once spaces are introduced into the code, then the results turn the other way:
Results as follows:
With 0 spaces in text:
Running with Length(text) = 1000018
AlkisgStripMultipleSpaces completed in 1206538504 CPU cycles (469 ms).
Workshop_AlexStripMultiple Spaces completed in 1714296236 CPU cycles (672 ms).
MokuleStripMultipleSpaces completed in 1681699584 CPU cycles (656 ms).
RussellStripMultipleSpaces completed in 1762327528 CPU cycles (687 ms).
----
With 10 double-spaces in text:
Running with Length(text) = 1000038
AlkisgStripMultipleSpaces completed in 1297215528 CPU cycles (516 ms).
Workshop_AlexStripMultiple Spaces completed in 1723582236 CPU cycles (672 ms).
MokuleStripMultipleSpaces completed in 1682098092 CPU cycles (657 ms).
RussellStripMultipleSpaces completed in 1455484492 CPU cycles (563 ms).
----
With 20 double-spaces in text:
Running with Length(text) = 1000058
AlkisgStripMultipleSpaces completed in 1287794540 CPU cycles (500 ms).
Workshop_AlexStripMultiple Spaces completed in 1712113628 CPU cycles (672 ms).
MokuleStripMultipleSpaces completed in 1692969756 CPU cycles (656 ms).
RussellStripMultipleSpaces completed in 1254475676 CPU cycles (484 ms).
----
With 30 double-spaces in text:
Running with Length(text) = 1000078
AlkisgStripMultipleSpaces completed in 1271748352 CPU cycles (515 ms).
Workshop_AlexStripMultiple Spaces completed in 1721996184 CPU cycles (672 ms).
MokuleStripMultipleSpaces completed in 1683155012 CPU cycles (656 ms).
RussellStripMultipleSpaces completed in 1034804192 CPU cycles (406 ms).
----
With 40 double-spaces in text:
Running with Length(text) = 1000098
AlkisgStripMultipleSpaces completed in 1287267368 CPU cycles (500 ms).
Workshop_AlexStripMultiple Spaces completed in 1710652708 CPU cycles (672 ms).
MokuleStripMultipleSpaces completed in 1685094780 CPU cycles (656 ms).
RussellStripMultipleSpaces completed in 874739376 CPU cycles (344 ms).
----
With 50 double-spaces in text:
Running with Length(text) = 1000118
AlkisgStripMultipleSpaces completed in 1286180728 CPU cycles (500 ms).
Workshop_AlexStripMultiple Spaces completed in 1704431776 CPU cycles (672 ms).
MokuleStripMultipleSpaces completed in 1686519552 CPU cycles (656 ms).
RussellStripMultipleSpaces completed in 845381592 CPU cycles (328 ms).
----
With 100 double-spaces in text:
Running with Length(text) = 1000218
AlkisgStripMultipleSpaces completed in 1286267352 CPU cycles (516 ms).
Workshop_AlexStripMultiple Spaces completed in 1735722828 CPU cycles (672 ms).
MokuleStripMultipleSpaces completed in 1701265072 CPU cycles (672 ms).
RussellStripMultipleSpaces completed in 845979476 CPU cycles (328 ms).
Results as follows:
With 0 spaces in text:
Running with Length(text) = 1000018
AlkisgStripMultipleSpaces completed in 1206538504 CPU cycles (469 ms).
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1681699584 CPU cycles (656 ms).
RussellStripMultipleSpaces
----
With 10 double-spaces in text:
Running with Length(text) = 1000038
AlkisgStripMultipleSpaces completed in 1297215528 CPU cycles (516 ms).
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1682098092 CPU cycles (657 ms).
RussellStripMultipleSpaces
----
With 20 double-spaces in text:
Running with Length(text) = 1000058
AlkisgStripMultipleSpaces completed in 1287794540 CPU cycles (500 ms).
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1692969756 CPU cycles (656 ms).
RussellStripMultipleSpaces
----
With 30 double-spaces in text:
Running with Length(text) = 1000078
AlkisgStripMultipleSpaces completed in 1271748352 CPU cycles (515 ms).
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1683155012 CPU cycles (656 ms).
RussellStripMultipleSpaces
----
With 40 double-spaces in text:
Running with Length(text) = 1000098
AlkisgStripMultipleSpaces completed in 1287267368 CPU cycles (500 ms).
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1685094780 CPU cycles (656 ms).
RussellStripMultipleSpaces
----
With 50 double-spaces in text:
Running with Length(text) = 1000118
AlkisgStripMultipleSpaces completed in 1286180728 CPU cycles (500 ms).
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1686519552 CPU cycles (656 ms).
RussellStripMultipleSpaces
----
With 100 double-spaces in text:
Running with Length(text) = 1000218
AlkisgStripMultipleSpaces completed in 1286267352 CPU cycles (516 ms).
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1701265072 CPU cycles (672 ms).
RussellStripMultipleSpaces
Tested at work on a Pentium IV dual-core 3 GHz system. XP Pro, using the binary from Alkisq
-------------------------- ---------- ---------- ---------- ---------- ---------- ---
Running with Length(text) = 1000000
AlkisgStripMultipleSpaces completed in 928273253 CPU cycles (310 ms).
Workshop_AlexStripMultiple Spaces completed in 1863291434 CPU cycles (621 ms).
MokuleStripMultipleSpaces completed in 1646885730 CPU cycles (549 ms).
RussellStripMultipleSpaces completed in 754627754 CPU cycles (251 ms).
Used the generic random text generator for above list.
-------------------------- ---------- ---------- ---------- ---------- ---------- ---
Running with Length(text) = 765490
AlkisgStripMultipleSpaces completed in 640248863 CPU cycles (213 ms).
Workshop_AlexStripMultiple Spaces completed in 1087663965 CPU cycles (363 ms).
MokuleStripMultipleSpaces completed in 1149836924 CPU cycles (383 ms).
RussellStripMultipleSpaces completed in 543709995 CPU cycles (180 ms).
The input file was an XML file with almost no spaces... :-)
-------------------------- ---------- ---------- ---------- ---------- ---------- ---
Another test, same system:Running with Length(text) = 802320
AlkisgStripMultipleSpaces completed in 1732407810 CPU cycles (577 ms).
Workshop_AlexStripMultiple Spaces completed in 1960128135 CPU cycles (653 ms).
MokuleStripMultipleSpaces completed in 1835230950 CPU cycles (611 ms).
RussellStripMultipleSpaces completed in 743925525 CPU cycles (247 ms).
This time I uses the SysUtils unit from Delphi 6 as text. Twice copied to the textfield. :-)
-------------------------- ---------- ---------- ---------- ---------- ---------- ---
Then even funnier results:
Running with Length(text) = 12207616
AlkisgStripMultipleSpaces completed in 2763672397 CPU cycles (921 ms).
Workshop_AlexStripMultiple Spaces completed in 2788594297 CPU cycles (928 ms).
MokuleStripMultipleSpaces completed in 17644017907 CPU cycles (5881 ms).
RussellStripMultipleSpaces completed in 9792352192 CPU cycles (3264 ms).
The difference? The input text had absolutely NO double spaces. Not even normal spaces... :-)
-------------------------- ---------- ---------- ---------- ---------- ---------- ---
And another funny result:
Running with Length(text) = 21725186
AlkisgStripMultipleSpaces completed in 20222208420 CPU cycles (6740 ms).
Workshop_AlexStripMultiple Spaces completed in 23716101810 CPU cycles (7905 ms).
MokuleStripMultipleSpaces completed in 29768851282 CPU cycles (9923 ms).
RussellStripMultipleSpaces completed in 11165073089 CPU cycles (3722 ms).
This only only had spaces in it... :-)
-------------------------- ---------- ---------- ---------- ---------- ---------- ---
The dual-core has not much effect on this test, if you ask me. One thing I do notice which is that Alkisq and I have the fastest solutions when the original string has no dual spaces. And Russells version is the fastest when it's only spaces. :-) Russells solution is quite fast in general, as you can see from all these tests. Apparantly it only slows down when it has lots of spaces to process. If that could be improved too then WOW! :-)
Then again, this all for just a function which replaces two spaces. It would be more interesting to do the same for a faster StringReplace solution, where you can specify additional parameters for what must be changed and to what it should be changed. :-)
I'll test it again when I'm back home on my own system. It's a Pentium III at 866 MHz so it should be reasonable interesting for comparison. Maybe the amount of clock cycles will be very different on my system at home.
--------------------------
Running with Length(text) = 1000000
AlkisgStripMultipleSpaces completed in 928273253 CPU cycles (310 ms).
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1646885730 CPU cycles (549 ms).
RussellStripMultipleSpaces
Used the generic random text generator for above list.
--------------------------
Running with Length(text) = 765490
AlkisgStripMultipleSpaces completed in 640248863 CPU cycles (213 ms).
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1149836924 CPU cycles (383 ms).
RussellStripMultipleSpaces
The input file was an XML file with almost no spaces... :-)
--------------------------
Another test, same system:Running with Length(text) = 802320
AlkisgStripMultipleSpaces completed in 1732407810 CPU cycles (577 ms).
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 1835230950 CPU cycles (611 ms).
RussellStripMultipleSpaces
This time I uses the SysUtils unit from Delphi 6 as text. Twice copied to the textfield. :-)
--------------------------
Then even funnier results:
Running with Length(text) = 12207616
AlkisgStripMultipleSpaces completed in 2763672397 CPU cycles (921 ms).
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 17644017907 CPU cycles (5881 ms).
RussellStripMultipleSpaces
The difference? The input text had absolutely NO double spaces. Not even normal spaces... :-)
--------------------------
And another funny result:
Running with Length(text) = 21725186
AlkisgStripMultipleSpaces completed in 20222208420 CPU cycles (6740 ms).
Workshop_AlexStripMultiple
MokuleStripMultipleSpaces completed in 29768851282 CPU cycles (9923 ms).
RussellStripMultipleSpaces
This only only had spaces in it... :-)
--------------------------
The dual-core has not much effect on this test, if you ask me. One thing I do notice which is that Alkisq and I have the fastest solutions when the original string has no dual spaces. And Russells version is the fastest when it's only spaces. :-) Russells solution is quite fast in general, as you can see from all these tests. Apparantly it only slows down when it has lots of spaces to process. If that could be improved too then WOW! :-)
Then again, this all for just a function which replaces two spaces. It would be more interesting to do the same for a faster StringReplace solution, where you can specify additional parameters for what must be changed and to what it should be changed. :-)
I'll test it again when I'm back home on my own system. It's a Pentium III at 866 MHz so it should be reasonable interesting for comparison. Maybe the amount of clock cycles will be very different on my system at home.
Hey, Russell...
Do you notice what I noticed? My solution seems to be at a constant speed. :-) In your test results I see: "Workshop_AlexStripMultipl eSpaces completed in 1704431776 CPU cycles (672 ms)." everywhere... :-)
Thus mine has the most constant speed. :-P
Btw, my code might improve in speed if you just remove the Pos() function from it and just start looping from the first character. But then again, it's not that big an issue anyways... :-)
Do you notice what I noticed? My solution seems to be at a constant speed. :-) In your test results I see: "Workshop_AlexStripMultipl
Thus mine has the most constant speed. :-P
Btw, my code might improve in speed if you just remove the Pos() function from it and just start looping from the first character. But then again, it's not that big an issue anyways... :-)
Alex, I'm with you for the StringReplace thing. Then we would have something useful to experiment on!
Start a question!
I would, but I'm not very proud of my English :-(
Start a question!
I would, but I'm not very proud of my English :-(
As you wish. :-)
https://www.experts-exchange.com/questions/21982965/Competition-Build-the-fastest-StringReplace-function.html for the StringReplace competition. :-)
https://www.experts-exchange.com/questions/21982971/Competition-Build-the-fastest-StringReplace-function-with-Delphi-for-NET.html for exactly the same competition but with a difference. In this one, I want a .NET solution instead of a WIN32 solution. :-)
This also makes it fun to see if .NET really performs about as well as WIN32 or that it sucks because it's way too slow... :-)
https://www.experts-exchange.com/questions/21982965/Competition-Build-the-fastest-StringReplace-function.html for the StringReplace competition. :-)
https://www.experts-exchange.com/questions/21982971/Competition-Build-the-fastest-StringReplace-function-with-Delphi-for-NET.html for exactly the same competition but with a difference. In this one, I want a .NET solution instead of a WIN32 solution. :-)
This also makes it fun to see if .NET really performs about as well as WIN32 or that it sucks because it's way too slow... :-)
ASKER
I've been out of town for the past 2 days.
interesting posts. lot of good information :)
mainly the scope of this thread is comming to it's shape. I'll wait to se if anyone has anything else to add to this and then I'll try to make a short resume and accept answers ;)
interesting posts. lot of good information :)
mainly the scope of this thread is comming to it's shape. I'll wait to se if anyone has anything else to add to this and then I'll try to make a short resume and accept answers ;)
Well, the one thing that I fins funny is the fact that while my solution isn't the fastest one, it happens to be the one that is most constant in speed. :-) Which happens to be an interesting side effect. :-)
ASKER
true :)
I was thinking about a few stuff last night and remembered that there isn't yet a top for the memory consumption :)
one other solution I thought about is to use an external application that monitors process memory and since the test applications used here are only doing one operations we can use the peek value found by teh monitoring software to see how much memory was used since the peek value will actually happen during the actual string operation.
so who's got time to throw in some code on this and alexs idea? :)
I can post something using WMI in one of my breaks later on.
I was thinking about a few stuff last night and remembered that there isn't yet a top for the memory consumption :)
one other solution I thought about is to use an external application that monitors process memory and since the test applications used here are only doing one operations we can use the peek value found by teh monitoring software to see how much memory was used since the peek value will actually happen during the actual string operation.
so who's got time to throw in some code on this and alexs idea? :)
I can post something using WMI in one of my breaks later on.
This modified version should prove to be a good "all around" performer; regardless if the data contains double spaces or not (or is all spaces). As to the whole memory question, thats kind of a moot point. Worst case scenario the algorithms will allocate a string of a length equal to the source string. Now without making the source string "by reference", the best you could do is allocate: {Length of source} - {Double space run count} * ({Double space run length} - 1). That would of course require a full prescan of the data, and in most cases would result in a higher cost time wise, albeit slightly smaller memory footprint.
Russell
--
function RussellStripMultipleSpaces
var lpszSource: PChar;
lpszDest: PChar;
dwLength: Integer;
dwMove: Integer;
begin
// Get the source length
dwLength:=Length(Value);
// If length is less than two then there is nothing to do
if (dwLength < 2) then
// Return source string
result:=Value
else
begin
// Set source pointer
lpszSource:=Pointer(Value)
// Set starting scan location
dwMove:=0;
// Scan for first set of double spaces
while (dwMove < dwLength) do
begin
// Check for space
if (lpszSource[dwMove] = #32) then
begin
// Check for following space
if (lpszSource[Succ(dwMove)] = #32) then break;
end;
// Push next
Inc(dwMove);
end;
// Check result of double space scan
if (dwMove = dwLength) then
// No double spaces in string, so just return the source string
result:=Value
else
begin
// Allocate result string
SetLength(result, dwLength);
// Set dest pointer
lpszDest:=Pointer(result);
// Process source string
while (dwLength > dwMove) do
begin
// Check for space
if (lpszSource[dwMove] = #32) then
begin
// Update move
Inc(dwMove);
// Check next byte
if (lpszSource[dwMove] = #32) then
begin
// Move the data
Move(lpszSource^, lpszDest^, dwMove);
// Update destination pointer
Inc(lpszDest, dwMove);
// Increment move counter
Inc(dwMove);
// Increment source pointer
Inc(lpszSource, dwMove);
// Decrement the length counter
Dec(dwLength, dwMove);
// Skip following spaces
while (lpszSource^ = #32) do
begin
// Increment source
Inc(lpszSource);
// Decrement length
Dec(dwLength);
end;
// Reset move count
dwMove:=(-1);
end;
end;
// Update move counter
Inc(dwMove);
end;
// Check final move
if (dwLength > 0) then
begin
// Final data move
Move(lpszSource^, lpszDest^, dwMove);
// Update dest
Inc(lpszDest, dwMove);
end;
// Truncate the result
SetLength(result, lpszDest - Pointer(result));
end
end;
end;
ASKER
russel, you are of course correct, but lets not forget that the main reason of this thread is learing.
and to provide some examples of how to measure different types of performance issues (machine cycles, times and of course memory allocation and who knows what other apsects that cannot be covered in one thread :) ).
probably this isn't exactly the best issue to work on the memory allocation performance. I don't recall seeing that kind of issues for quite some time. maybe I'll just make something up :D
until then, I started on that app I was talking above. maybe I'll find time tomorrow evening to finish it.
and to provide some examples of how to measure different types of performance issues (machine cycles, times and of course memory allocation and who knows what other apsects that cannot be covered in one thread :) ).
probably this isn't exactly the best issue to work on the memory allocation performance. I don't recall seeing that kind of issues for quite some time. maybe I'll just make something up :D
until then, I started on that app I was talking above. maybe I'll find time tomorrow evening to finish it.
Russel, I was ready to post a solution when I noticed your post. Quite fast I'd say, but the memory is not a moot issue. If the string is 2GB your proc might not work on some computyers.
As I was saying my solution I just wrote is a bit similar to your, it goes the same way, except: it does not havve memory troubles, it uses one call to SetLength and one call to Move. Plus it's shorter and I hope faster :-)
Cheers
program Project2;
{$APPTYPE CONSOLE}
procedure CutSpaces(var strInput: String);
var
ptr, spacesFound, lastpos, workLen, deleted, inLen: Integer;
begin
writeln( ' input: "'+strInput+'"');
inLen := Length(strInput); // save original length
if inLen < 2 then Exit; // nothing to do? out of here
deleted := 0; // initialize count of deleted spaces
workLen := inLen; // initialize working length
ptr := 1; // start from first char in string
repeat // start the global loop
// skip non-space stuff
while (ptr < workLen) and (strInput[ptr] <> ' ') do
Inc(ptr);
// initialize spaces counter correct
if strInput[ptr] = ' ' then
spacesFound := 1
else
spacesFound := 0;
lastpos := ptr;
while (ptr < workLen) and (strInput[ptr] = ' ') do
begin
Inc(ptr); // move to next char
if strInput[ptr] = ' ' then // if this is a space
Inc(spacesFound); // increase the counter
end;
if spacesFound > 1 then // if we detected more than one space
begin
// slide to the left exact numbers of chars left
Move(strInput[ptr], strInput[ptr-(spacesFound- 1)], workLen-ptr+1);
Inc(deleted, spacesFound-1); // update deleted chars
Dec(workLen, spacesFound-1); // update our working length
//Dec(ptr, spacesFound-1); // safe go back one char to avoid missing some spaces
end;
ptr := lastpos+1;
until (spacesFound = 0) // stop if no more spaces
or (ptr > workLen); // or current position past working length
SetLength(strInput, inLen-deleted); // one time set length of output string
writeln( 'output: "'+strInput+'" deleted ', deleted );
end;
var
S: String;
begin
S := 'The cat sat on the mat';
CutSpaces( S );
S := ' The cat sat on the mat';
CutSpaces( S );
S := ' The cat sat on the mat ';
CutSpaces( S );
S := 'The cat sat on the mat';
CutSpaces( S );
S := ' The cat sat on the mat';
CutSpaces( S );
S := ' The cat sat on the mat ';
CutSpaces( S );
S := 'The cat sat on the mat ';
CutSpaces( S );
readln;
end.
As I was saying my solution I just wrote is a bit similar to your, it goes the same way, except: it does not havve memory troubles, it uses one call to SetLength and one call to Move. Plus it's shorter and I hope faster :-)
Cheers
program Project2;
{$APPTYPE CONSOLE}
procedure CutSpaces(var strInput: String);
var
ptr, spacesFound, lastpos, workLen, deleted, inLen: Integer;
begin
writeln( ' input: "'+strInput+'"');
inLen := Length(strInput); // save original length
if inLen < 2 then Exit; // nothing to do? out of here
deleted := 0; // initialize count of deleted spaces
workLen := inLen; // initialize working length
ptr := 1; // start from first char in string
repeat // start the global loop
// skip non-space stuff
while (ptr < workLen) and (strInput[ptr] <> ' ') do
Inc(ptr);
// initialize spaces counter correct
if strInput[ptr] = ' ' then
spacesFound := 1
else
spacesFound := 0;
lastpos := ptr;
while (ptr < workLen) and (strInput[ptr] = ' ') do
begin
Inc(ptr); // move to next char
if strInput[ptr] = ' ' then // if this is a space
Inc(spacesFound); // increase the counter
end;
if spacesFound > 1 then // if we detected more than one space
begin
// slide to the left exact numbers of chars left
Move(strInput[ptr], strInput[ptr-(spacesFound-
Inc(deleted, spacesFound-1); // update deleted chars
Dec(workLen, spacesFound-1); // update our working length
//Dec(ptr, spacesFound-1); // safe go back one char to avoid missing some spaces
end;
ptr := lastpos+1;
until (spacesFound = 0) // stop if no more spaces
or (ptr > workLen); // or current position past working length
SetLength(strInput, inLen-deleted); // one time set length of output string
writeln( 'output: "'+strInput+'" deleted ', deleted );
end;
var
S: String;
begin
S := 'The cat sat on the mat';
CutSpaces( S );
S := ' The cat sat on the mat';
CutSpaces( S );
S := ' The cat sat on the mat ';
CutSpaces( S );
S := 'The cat sat on the mat';
CutSpaces( S );
S := ' The cat sat on the mat';
CutSpaces( S );
S := ' The cat sat on the mat ';
CutSpaces( S );
S := 'The cat sat on the mat ';
CutSpaces( S );
readln;
end.
Read my comments fully:
>> Now without making the source string "by reference",
Given the usage of const for the source string in all the sample functions (I assumed that was a requirement), then the point is moot. Now if a var string is allowed as the parameter, then thats a whole different story. If I did my function as (var Value) it would be hard to touch.
Russell
Oh I read your comments fully before posting, except that I did not feel any constraints. When I first saw this problem, I looked in fast-forward mode to a few solutions and I was kind of worried :-) why nobody used anything as simple as "walk and count", except for your code. That's why probably I did not feel the type of parameter was a requirement. Plus I recently had a job interview with a "string" problem in it. They wanted me to write a "ReverseString" method with much care about memory consumption. Another reason why it was so hard in my mind.
Anyway, I am curios as of how fast my method is amongst non-ASM solutions (I do not know any assembler) and I love pascal-ish solutions.
ljee
Anyway, I am curios as of how fast my method is amongst non-ASM solutions (I do not know any assembler) and I love pascal-ish solutions.
ljee
Even when passed as var, it can still be a moot point when dealing with the copy-on-write semantics of delphi strings. If B = A (both string vars), and you pass B to the procedure by pointer and modify B using code that forces UniqueString to get called, then B is allocated as a whole new string. (The assignment of B:=A; does no memory allocation in and of itself.)
As far as speed goes, well.... I'll post the informal example that I used and let you be the judge. I will say that on my system, one ran at 391ms and the other at a full 18 seconds.
Russell
---
program Project1;
{$APPTYPE CONSOLE}
{$ASSERTIONS ON}
uses
Windows,
SysUtils;
procedure CutSpaces(var strInput: String);
var
ptr, spacesFound, lastpos, workLen, deleted, inLen: Integer;
begin
inLen := Length(strInput); // save original length
if inLen < 2 then Exit; // nothing to do? out of here
deleted := 0; // initialize count of deleted spaces
workLen := inLen; // initialize working length
ptr := 1; // start from first char in string
repeat // start the global loop
// skip non-space stuff
while (ptr < workLen) and (strInput[ptr] <> ' ') do
Inc(ptr);
// initialize spaces counter correct
if strInput[ptr] = ' ' then
spacesFound := 1
else
spacesFound := 0;
lastpos := ptr;
while (ptr < workLen) and (strInput[ptr] = ' ') do
begin
Inc(ptr); // move to next char
if strInput[ptr] = ' ' then // if this is a space
Inc(spacesFound); // increase the counter
end;
if spacesFound > 1 then // if we detected more than one space
begin
// slide to the left exact numbers of chars left
Move(strInput[ptr], strInput[ptr-(spacesFound-
Inc(deleted, spacesFound-1); // update deleted chars
Dec(workLen, spacesFound-1); // update our working length
//Dec(ptr, spacesFound-1); // safe go back one char to avoid missing some spaces
end;
ptr := lastpos+1;
until (spacesFound = 0) // stop if no more spaces
or (ptr > workLen); // or current position past working length
SetLength(strInput, inLen-deleted); // one time set length of output string
end;
procedure RussellStripMultipleSpaces
var lpszSource: PChar;
lpszDest: PChar;
dwLength: Integer;
dwMove: Integer;
begin
// Get the source length
dwLength:=Length(Value);
// If length is less than two then there is nothing to do
if (dwLength > 1) then
begin
// Set source pointer
lpszSource:=Pointer(Value)
// Set starting scan location
dwMove:=0;
// Scan for first set of double spaces
while (dwMove < dwLength) do
begin
// Check for space
if (lpszSource[dwMove] = #32) then
begin
// Check for following space
if (lpszSource[Succ(dwMove)] = #32) then break;
end;
// Push next
Inc(dwMove);
end;
// Check result of double space scan
if (dwMove < dwLength) then
begin
// Make unique
UniqueString(Value);
// Reset source and dest pointer
lpszSource:=Pointer(Value)
lpszDest:=Pointer(Value);
// Process the first portion (no move required)
Inc(dwMove);
// Update destination pointer
Inc(lpszDest, dwMove);
// Increment move counter
Inc(dwMove);
// Increment source pointer
Inc(lpszSource, dwMove);
// Decrement the length counter
Dec(dwLength, dwMove);
// Skip following spaces
while (lpszSource^ = #32) do
begin
// Increment source
Inc(lpszSource);
// Decrement length
Dec(dwLength);
end;
// Reset move count
dwMove:=0;
// Process source string
while (dwLength > dwMove) do
begin
// Check for space
if (lpszSource[dwMove] = #32) then
begin
// Update move
Inc(dwMove);
// Check next byte
if (lpszSource[dwMove] = #32) then
begin
// Move the data
Move(lpszSource^, lpszDest^, dwMove);
// Update destination pointer
Inc(lpszDest, dwMove);
// Increment move counter
Inc(dwMove);
// Increment source pointer
Inc(lpszSource, dwMove);
// Decrement the length counter
Dec(dwLength, dwMove);
// Skip following spaces
while (lpszSource^ = #32) do
begin
// Increment source
Inc(lpszSource);
// Decrement length
Dec(dwLength);
end;
// Reset move count
dwMove:=(-1);
end;
end;
// Update move counter
Inc(dwMove);
end;
// Check final move
if (dwLength > 0) then
begin
// Final data move
Move(lpszSource^, lpszDest^, dwMove);
// Update dest
Inc(lpszDest, dwMove);
end;
// Truncate the result
SetLength(Value, lpszDest - Pointer(Value));
end
end;
end;
function TheText: String;
const
OneSpaceProbability = 1;
MoreSpacesProbability = 60;
var
i: Integer;
lastOneWasSpace: Boolean;
textSize: Integer;
begin
Randomize;
textSize := 100000;
SetLength(result, textSize);
i := 1;
lastOneWasSpace := false;
while i <= textSize do
begin
if i mod 82 = 81 then //insert line breaks every 80 chars.
begin
result[i] := #13;
Inc(i);
result[i] := #10;
end
else if lastOneWasSpace then
begin
if Random(100) < MoreSpacesProbability then
result[i] := ' '
else
result[i] := char(ord('a') + random(26));
end
else begin
if Random(100) < OneSpaceProbability then
result[i] := ' '
else
result[i] := char(ord('a') + random(26));
end;
lastOneWasSpace := result[i] = ' ';
Inc(i);
end;
end;
var
A: String;
B: String;
dwMark: LongWord;
dwIndex: Integer;
begin
A:=TheText + ' ';
dwMark:=GetTickCount;
for dwIndex:=1 to 1000 do
begin
B:=A;
RussellStripMultipleSpaces
Assert(Length(B) < Length(A), 'Assertion failure');
end;
dwMark:=GetTickCount - dwMark;
WriteLn(dwMark, 'ms for 1000 iterations of RussellStripMultipleSpaces
dwMark:=GetTickCount;
for dwIndex:=1 to 1000 do
begin
B:=A;
CutSpaces( B );
Assert(Length(B) < Length(A), 'Assertion failure');
end;
dwMark:=GetTickCount - dwMark;
WriteLn(dwMark, 'ms for 1000 iterations of CutSpaces');
ReadLn;
end.
thanks for posting all this. looks very weird, I tested it and you're right. I do not understand why my code is so slow, any idea? I was really thinking we practically kind of doing same thing but just slightly different. This *slightly* though is the difference between life and death. I'll allocate more time to study your code, maybe I'll see why is faster :-)
But once again, I believe your code uses twice memory as needed because it does not operate directly on the input string instead it moves it to the second one. Speed is speed, grant you that, but why didn't you write it like that? Is that the cause? I am moving *much* more bytes than your code, perhaps??
ljee
But once again, I believe your code uses twice memory as needed because it does not operate directly on the input string instead it moves it to the second one. Speed is speed, grant you that, but why didn't you write it like that? Is that the cause? I am moving *much* more bytes than your code, perhaps??
ljee
>> But once again, I believe your code uses twice memory as needed because it does not operate directly on the input string instead it moves it to the second one.
That is where you would be wrong. The code I provided operates on a pointer to the input the string. I call UniqueString once to ensure that the copy on write is done, and I don't end up modifying both B and A (in the test example). If the string is already unique (ref count is checked), then no allocation is done.
But by using pointers, I control the single call to UniqueString, vs your code which implicitly calls UniqueString (generated by the compiler) for every call to the Move(...) procedure in your code. Bottom line is that my code uses no more memory than yours, but operates roughly 45x faster.
Russell
That is where you would be wrong. The code I provided operates on a pointer to the input the string. I call UniqueString once to ensure that the copy on write is done, and I don't end up modifying both B and A (in the test example). If the string is already unique (ref count is checked), then no allocation is done.
But by using pointers, I control the single call to UniqueString, vs your code which implicitly calls UniqueString (generated by the compiler) for every call to the Move(...) procedure in your code. Bottom line is that my code uses no more memory than yours, but operates roughly 45x faster.
Russell
aaaa .... ok man! nice to learn something today! I did not know about UniqueString-Move thinggie. I won't take your word for it and I'll check all this stuff one of these days cuz it's so ... wonderful
(and stop telling me how fast your code was .... it gets to my nerves :-))
ljee
(and stop telling me how fast your code was .... it gets to my nerves :-))
ljee
Well, must say that my solution uses almost no extra memory. :-) If you scroll up (a lot!) then you'll see that I first make an uniqeu copy of the string to serve as a result and then start moving characters within that string and at the end I shorten the string. It's not really optimized but hey, the memory I need for this solution equals the size of this string plus the size of the local values and that's basically all. :-)
Btw, while you could try to keep track of memory usage, this tends to be a bit useless exercise since in a Windows application there might be plenty of other things that will claim additional memory. For example, you might have a virusscanner running and this virusscanner might work by injecting some code in every running executable to make sure it behaves nicely. But that also means the virusscanner will sometimes eat up additional memory. And there might be other reasons why an application takes up more than expected.
To keep the memory consumption as low as possible for a valid memory test you should just write a console application and avoid the use of all the VCL classes. (You can still use SysUtils and ActiveX units, though.) Furthermore, to calculate the memory usage you will have to walk through the heap of Delphi, adding up all the memory blocks it uses. This can be done and apparantly isn't even that difficult. (Just wished I knew where I'd stored the sample code I used to have for this.)
Russell's code was fast, wasn't it? :-P
Btw, while you could try to keep track of memory usage, this tends to be a bit useless exercise since in a Windows application there might be plenty of other things that will claim additional memory. For example, you might have a virusscanner running and this virusscanner might work by injecting some code in every running executable to make sure it behaves nicely. But that also means the virusscanner will sometimes eat up additional memory. And there might be other reasons why an application takes up more than expected.
To keep the memory consumption as low as possible for a valid memory test you should just write a console application and avoid the use of all the VCL classes. (You can still use SysUtils and ActiveX units, though.) Furthermore, to calculate the memory usage you will have to walk through the heap of Delphi, adding up all the memory blocks it uses. This can be done and apparantly isn't even that difficult. (Just wished I knew where I'd stored the sample code I used to have for this.)
Russell's code was fast, wasn't it? :-P
Keeping track of memory is never useless, especially when you have to run a server continuously. What other apps are doing is their own business, if they eat to much or leak ... they just go on my blacklist and get uninstalled, deleted, erased, phooo :-), I certainly don't want that to happen with my apps on other people's computers.
I made some minor changes to my code, basically to use pointers in the Move call. It became avery little bit faster, or maybe not at all. I think the "Move" statement is slow and it has to be written with care. Basically the difference between Russell's code and mine is that he moves just small part of the source string to the target string, while my code moves always the whole right side of the string after the spaces found. I wasn't thinking how slow Move can be, but this is another lesson.
(And yeah, it was damn fast :-))
I can't wait for the StringReplace replacement guys
I made some minor changes to my code, basically to use pointers in the Move call. It became avery little bit faster, or maybe not at all. I think the "Move" statement is slow and it has to be written with care. Basically the difference between Russell's code and mine is that he moves just small part of the source string to the target string, while my code moves always the whole right side of the string after the spaces found. I wasn't thinking how slow Move can be, but this is another lesson.
(And yeah, it was damn fast :-))
I can't wait for the StringReplace replacement guys
ASKER
guess it's time to close this :)
I re-read the thread (huh, that took a while) and there is a lot of usefull information and people really put their minds to this. so I am now thinking to modify the reward sheet by increasing the points to 500 and splitting more or less equally among the active participants (I hope I won't skip anybody :D).
I'll accept alexes solution as it was the first one to post :)
thanks for the input everybody, *GREAT* job. I'll maybe think of a subject for the memory usage problem. we'll see.
(for anyone wondering, limbecks solution is in the other/"original" question)
I re-read the thread (huh, that took a while) and there is a lot of usefull information and people really put their minds to this. so I am now thinking to modify the reward sheet by increasing the points to 500 and splitting more or less equally among the active participants (I hope I won't skip anybody :D).
I'll accept alexes solution as it was the first one to post :)
thanks for the input everybody, *GREAT* job. I'll maybe think of a subject for the memory usage problem. we'll see.
(for anyone wondering, limbecks solution is in the other/"original" question)
Yep. It was a *GREAT* job we all did here. :-) Very educational for everyone who is digging deeper into Delphi to find ways to improve the speed a bit more.
One thing we've all learned from this all: to speed up string handling, it's better if you just look at it as an array of character and try to avoid moving around blocks of data. :-)
One thing we've all learned from this all: to speed up string handling, it's better if you just look at it as an array of character and try to avoid moving around blocks of data. :-)