SPEED CHALLENGE - SORTING

This is, i hope, to be similar to WorkShop_Alex's challenge. The aim is to write the fastest sort possible with the following procedure:

sort(arraylength,arraymaxnum:integer); where arraylength is the length of the array and the arraymaxnum the largest integer possible in the array and the population of the array using:

 Item := VarArrayCreate([0, arraylength], varInteger);
//POPULATE ARRAY
Randomize;
For X:=0 to arraylength do
  begin
    item[x]:=random(arraymaxnum);
  end;

You must initiate the sort like so:

QueryPerformanceFrequency(freq); {Get counts/second}
QueryPerformanceCounter(start); {Get string count}
sort(10, 100) //values can be changed

Your sort must have showresult; at the end to get the details

The following unit has my testing procedure (we can configure thbis if it doesn't work properley) and my first sort algorithm for you guys to try and beat ;-). Currently my sort speed is approx 350 ticks.

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs;

type
  TForm1 = class(TForm)
    procedure FormCreate(Sender: TObject);
  private
    procedure sort(arraylength,arraymaxnum:integer);
    procedure showresult;
  public
  start,stop1,freq:int64;
  TotalTicks,NumASec: int64;
  Ticksfor1,Timefor1: real;
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure tform1.showresult;
begin
  QueryPerformanceCounter(Stop1); {Get end of loop2 count}
 TotalTicks:= (stop1-start);
  Timefor1:= TotalTicks / freq;
  NumASec:= Trunc(1{second} / Timefor1);
  If freq>0 then
     showmessage('Sort took := '+
     inttostr(TotalTicks)+
     #13+'      Ticks for 1 Call:= '+FloatToStrF(Ticksfor1,ffNumber, 6,3)+
     #13+'      Ticks per second:= '+inttostr(freq)+
     #13+'      Time in micro seconds (millionths) for 1 call:= '+FloatToStrF(1e6*Timefor1,ffNumber, 4,8))
  else showmessage('No hardware timer available');
end;

procedure tform1.sort(arraylength, arraymaxnum:integer);
var Item:variant ;
X,y,temp:Integer;
result:string;
begin
 Item := VarArrayCreate([0, arraylength], varInteger);
//POPULATE ARRAY
Randomize;
For X:=0 to arraylength do
  begin
    item[x]:=random(arraymaxnum);
  end;

//SORT
Y:=-1;
X:=arraylength+1  ;
repeat
  dec(x) ;
   y:=-1;
   repeat
    inc(y) ;
       if item[y]>item[x] then
        begin
          temp:=item[y];
            item[y]:=item[x];
          item[x]:=temp;
        end;
   until y>=x ;
until X<=0;
//End SORT

showresult;

For X:=0 to arraylength do
 begin
  result:=result + ' ' + inttostr(item[x]);
 end;
showmessage(result);
end;

procedure TForm1.FormCreate(Sender: TObject);
begin
QueryPerformanceFrequency(freq); {Get counts/second}
QueryPerformanceCounter(start); {Get string count}
sort(10, 100)

end;

end.

Regards,

Hypoviax
LVL 5
HypoviaxAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

HypoviaxAuthor Commented:
I'll just add we should test a few ranges:

1. 0-10  
2. 0-100
3. 0-1000
4. 0-10000

My results with max integer of 100 (probably grossley slow ;-) ):

1. 345
2. 7642
3. 646902
4. 127752446

Hypoviax
0
Wim ten BrinkSelf-employed developerCommented:
Hmmm. Sorting arrays the fastest way is easily done by just using a QuickSort routine, and a simple example of this routine is found in the Classes unit, for sorting stringlists. However, this Delphi source contains a recursive version of the QuickSort routine, while it would even be faster if you use a non-recursive version. And of course the QuickSort routine in the Classes unit does a callback to a compare routine, thus allowing anything to be compared. But the additional function call would slow it down again.

I also want to note that you could measure the time it it takes to finish the procedure but this speed depends heavily on the system used to run the test. The Pentium IV 2.4 GHz at work would be a lot faster at this than my Pentium III 866 MHz at home. But if you would compare processor ticks then the clockspeed would not matter much, since it will take about the same amount of ticks on every processor. The only difference is that one processor executes ticks faster than the other... :-)

It's fun though that you use a variant for the dynamic array. It would be slightly faster by using an array of integers instead. (Less pointer-handling.)

I hope my hints provided enough information to get a real fast solution here. :-) But I'm just saying that the fastest sorting solutions are already found. Just not always implemented. (And be honest, if you only have 20 records to sort, then it doesn't matter much to save one-tenth of a millisecond by optimizing the sorting routine. Actually, with a low amount of records (1-10) the quicksort might actually be a bit slower than other solutions...
0
Russell LibbySoftware Engineer, Advisory Commented:

Hypoviax,

I would agree with most of the comments made by Workshop_Alex, as the timing above is a little too machine dependant. Regarding the sorting aspect, given the fact that we have the highest number in the array, and provided that the highest number is relatively small (eg, less than 32768), then a radix type sort would actually be faster than a quicksort. Anyways, here are some examples of sorting routines, and also of variant array locking which will also help decrease the time required to sort the array.

Regards,
Russell

// Used for data handling
type
  PIntArray      =  ^TIntArray;
  TIntArray      =  Array [0..Pred(MaxInt shr 2)] of Integer;

// QuickSort - An optimal "all around" sort routine. Performs well under most, if not all
// conditions.
procedure QuickSort(Data: PIntArray; L, R: Integer);
var  i, j:       Integer;
     p, t:       Integer;
begin

  repeat
     i:=l;
     j:=r;
     p:=Data^[(l + r) shr 1];
     repeat
        while ((Data^[i]-p) < 0) do Inc(i);
        while ((Data^[j]-p) > 0) do Dec(j);
        if (i <= j) then
        begin
           t:=Data^[i];
           Data^[i]:=Data^[j];
           Data^[j]:=t;
           Inc(i);
           Dec(j);
        end;
     until (i > j);
     if (l < j) then QuickSort(Data, l, j);
     l:=i;
  until (i >= r);

end;

// Linear type sort - Not much faster than a bubble sort, but is fine for sorting small
// amounts of data (requires less memory and uses no recursion)
procedure LinearSort(Data: PIntArray; ArrayLength: Integer);
var  dwCount:       Integer;
     dwIndex:       Integer;
     dwTemp:        Integer;
     dwPos:         Integer;
begin

  for dwCount:=0 to Pred(ArrayLength) do
  begin
     dwPos:=dwCount;
     for dwIndex:=Succ(dwCount) to Pred(ArrayLength) do
     begin
        if (Data^[dwIndex] < Data^[dwPos]) then dwPos:=dwIndex;
     end;
     if (dwPos > dwCount) then
     begin
        dwTemp:=Data^[dwCount];
        Data^[dwCount]:=Data^[dwPos];
        Data^[dwPos]:=dwTemp;
     end;
  end;

end;

// Radix type sort - excellent when the highest number in the array is known and is  
// relatively small
procedure RadixSort(Data: PIntArray; ArrayLength, ArrayMaxNum: Integer);
var  lpTemp:     Array [0..32767] of Integer;
     dwIndex:    Integer;
     dwCount:    Integer;
     dwTemp:     Integer;
begin

  // Check ArrayMaxNum
  if (ArrayMaxNum < 32768) then
  begin
     // Clear the temp array
     FillChar(lpTemp, SizeOf(lpTemp), 0);
     // Count frequency
     for dwIndex:=0 to Pred(ArrayLength) do Inc(lpTemp[Data^[dwIndex]]);
     // Sort the array
     dwIndex:=0;
     for dwTemp:=0 to ArrayMaxNum do
     begin
        // Get frequency count for slot
        dwCount:=lpTemp[dwTemp];
        while (dwCount > 0) do
        begin
           Data^[dwIndex]:=dwTemp;
           Inc(dwIndex);
           Dec(dwCount);
        end;
        if (dwIndex = ArrayLength) then break;
     end;
  end;

end;

// Generic sort procedure that attempts to pick the most optimal sorting routine
procedure Sort(ArrayLength, ArrayMaxNum: Integer);
var  vArray:     Variant;
     lpArray:    PIntArray;
     dwIndex:    Integer;
begin

  // Create variant array
  vArray:=VarArrayCreate([0, Pred(ArrayLength)], varInteger);

  // Lock that bad boy ;-)
  lpArray:=VarArrayLock(vArray);

  // Resource protection
  try
     // Seed random number generator
     Randomize;
     // Fill the array
     for dwIndex:=0 to Pred(ArrayLength) do lpArray^[dwIndex]:=Random(ArrayMaxNum);
     // Determine sort routine to use
     if (ArrayLength < 20) then
        LinearSort(lpArray, ArrayLength)
     else if (ArrayMaxNum < 32768) then
        // Radix sort
        RadixSort(lpArray, ArrayLength, ArrayMaxNum)
     else
        // Quick sort
        QuickSort(lpArray, 0, Pred(ArrayLength));
  finally
     // Unlock the variant array
     VarArrayUnlock(vArray);
  end;

end;

0
Cloud Class® Course: Certified Penetration Testing

This CPTE Certified Penetration Testing Engineer course covers everything you need to know about becoming a Certified Penetration Testing Engineer. Career Path: Professional roles include Ethical Hackers, Security Consultants, System Administrators, and Chief Security Officers.

HypoviaxAuthor Commented:
Rlibby,

Your results as on my machine (AMD AthlonXP 1500)

0-10 : 211
0-100 : 856
0-1000 : 933
0-10000 : 1573
0-100000 : 8171

Very good, much much faster than my selection sort

Any improvements (what about the merge sort...) i'd like to see. I'll provide my next sort soon to see how it compares.

Regards,

Hypoviax
0
gwalkeriqCommented:
Try looking here if you want a generic high-speed qsort for Delphi arrays, I know it can sort a million integers in less than 1 second. It is not the fastest sort since it uses a function call for each comparison, but it is generic, just like the qsort you find in C libraries.

http://www.javacode.net/vb/scripts/ShowCode.asp?txtCodeId=1185&lngWId=7
0
HypoviaxAuthor Commented:
I'm working on a new algorithm for a sort, it would be interesting to see, even if they arn't super fast, new sorting based algorithms based on the parameters i set. Can anyone beat rllibby? I'll test any submission and post the results

Best regards,

Hypoviax

0
Russell LibbySoftware Engineer, Advisory Commented:

There is a a trade off for for being generic...

The routines mentioned above, while generic (which allows for sorting of all kinds of data), is about 3-4 times slower than the quicksort I already coded above which targets an integer array, and about 6-7 times slower than the radix/counting sort. note: radix/counting only applies if the the maxnumber falls within a certain range, so is a more specific implementation than quicksort.

What should be gathered from all of this is that there is no one "silver bullet" for sorting, period....

Knowledge of the data, and its size must be used to determine the best sorting approach for the current task at hand. For example, an array of 15 items can be sorted faster with an insertion sort/linear sort than it can with a quicksort, due to the overhead and recursion involved with quicksort. Now, if you start talking about 1 million items, then the story changes. This is due to the BIG "O" notation that is used to describe each of these algorithms.

eg:

Quicksort = O (n log2n)                       // Logarithmic
Bubble sort / selection sort = O (n2)     // Quadratic

We can see from above that as the number of items increases, the selection sort will keep getting exponentially worse. The numbers show this as well:

                    Selection Sort            Auto Fit   (could not tell from comments what max
                                                                   number was, so hard to tell if qsort or
                                                                   radix was used)
------------------------------------------------------------------------------
1. 0-10           345                          211     (Linear used)  
2. 0-100         7642                        856     (qsort or radix)
3. 0-1000       646902                     933     (qsort or radix)
4. 0-10000     127752446                1573   (qsort or radix)
5. 0-100000   not reported               8171   (qsort or radix)

Also, if the data falls into a certain limited range, etc, then we can optimize the sorting even further to use radix type / counting sorts which improves performance even more, **IF*** the range is relatively small. Again, this has to do with the Big "O" notation associated with each. Radix/Counting types sorts are linear, whereas quicksort and mergesort are logarithmic, and linear will overtake logarithmic after a certain point.


Regards,
Russell


0
HypoviaxAuthor Commented:
You are right Russel,

Merge and Quicksorts are N X log2 N algorithms whereas bubble, selection sorts  are n2 algorithms. Still waiting for someone to beat you Russel ;-)

Regards,

Hypoviax
0
Wim ten BrinkSelf-employed developerCommented:
Well, Russel... You could speed up that QuickSort routine by changing it from a recursive function to a non-recursive one. Too bad I'm running a bit out of time else I'd write it... :-P

There are two minor issues, of course. First of all we can and will have duplicate values in out list. Furthermore, the original question is about sorting a variant array full variants. However, since we have a known maximum value, we might even take a different approach of sorting the data... Here's the alternative version:

function RDTSC: Int64;
asm
  dw $310F  {BASM doesn't support RDTSC}
end;

procedure SortingStuff;
const
  arraylength = 100000;
  arraymaxnum = 1000;
var
  Items: Variant;
  List: array of Integer;
  I: Integer;
  Idx: Integer;
  Ticks: Int64;
begin
  Items := VarArrayCreate( [ 0, arraylength ], varInteger );
  //POPULATE ARRAY
  Randomize;
  for I := 0 to arraylength do begin
    Items[ I ] := random( arraymaxnum );
  end;
  // Create our list.
  Ticks := RDTSC;
  SetLength( List, arraymaxnum );
  for I := Low( List ) to High( List ) do
    List[ I ] := 0;
  for I := 0 to arraylength do begin
    Idx := Items[ I ];
    List[ Idx ] := List[ Idx ] + 1;
  end;
  Ticks := RDTSC - Ticks;
  WriteLn( 'This solution took ', Ticks, ' processor ticks.' );
  // Done sorting... Display...
  for I := Low( List ) to High( List ) do begin
    for Idx := 1 to List[ I ] do
      WriteLn( I );
  end;
end;

Yep... This sorting version looks soooooooooooooooooooo simple and is probably quite fast too. Never bothered to time it, though... It took 23657308 processor ticks on my system. This for about one hundred thousand values. Sometimes, sorting can be defeated by other interesting techniques...

But what if the arraymaxnum would be a lot larger? Well, integers are 4 bytes. The dynamic array would be 4 bytes x arraymaxnum so even with a million possible values, it would not really run out of memory. I've changed the value 1.000 into 1.000.000 and saw my solution become about 5 times slower. That isn't a lot, really, but it might be enough to justify a sorting routine...
0
Wim ten BrinkSelf-employed developerCommented:
Btw. If you time my solution using QueryPerformanceFrequency/QueryPerformanceCounter then you'll get 0.0 ticks! :-) This is because my solution is just too fast to be measured in ticks per second. I need around 23.7 million ticks on a system that runs at 2.4 billion ticks per second. It takes about 11.4 ticks per millisecond to sort a list of 100,000 records with a maximum value of 100,000... So, basically if you want to just sort a list of integers then my method defeats all other options... ;-)
0
Russell LibbySoftware Engineer, Advisory Commented:

Workshop_Alex,

> Well, Russel(l)... You could speed up that QuickSort routine by changing it from a recursive function to a non-recursive one. Too bad I'm running a bit out of time else I'd write it... :-P

Iv'e already done it, and it does not speed the function up by any great amount. In truth, it shaved about 15 ms off a 1 million count sort. Not worth the hassle in my opinion, but I have the code that does it for those that may be interested. I also have code for mergesort, sedgesort (qsort+insort), as well as a few others. Anyways, what I have attempted to demonstrate via code are some common examples of linear, logarthimic, and quadratic sorting functions.

> So, basically if you want to just sort a list of integers then my method defeats all other options... ;-)
 
If you take a look at the code I already provided (RadixSort) then you will see that its essentially the same as what you have provided above. It uses a bucketing array to count the number of items in the range of 0..MaxArrayNum, then takes those numbers and places them (in order) back in to the original array. And this (linear) method WILL defeat other solutions... to a point. Due to its linear nature, as the maxarraynum increases, so does the processing time. At a MaxArrayNum of about 1 million or so, the quicksort function becomes slightly faster than the radix/counting sort.

Now....
If you compare your method against my previous implementation (in the same context), what you will ALSO find is that your code is about 4-5 times SLOWER than what I provided.

Regards,
Russell


procedure SortingStuff2;
const
  arraylength    = 1000000;
  arraymaxnum    = 1000;
var
  lpTemp:     Array [0..32767] of Integer;
  vArray:     Variant;
  lpArray:    PIntArray;
  dwIndex:    Integer;
  dwCount:    Integer;
  dwTemp:     Integer;
  st, et:     LongWord;
  Ticks:      Int64;
begin

  // Create variant array
  vArray:=VarArrayCreate([0, Pred(ArrayLength)], varInteger);

  // Lock that bad boy ;-)
  lpArray:=VarArrayLock(vArray);

  // Resource protection
  try
     // Seed random number generator
     Randomize;
     // Fill the array
     for dwIndex:=0 to Pred(ArrayLength) do lpArray^[dwIndex]:=Random(ArrayMaxNum);
     // Mark time
     st:=GetTickCount;
     Ticks := RDTSC;
     // Clear the temp array
     for dwIndex:=0 to 32767 do lpTemp[dwIndex]:=0;
     // Count frequency
     for dwIndex:=0 to Pred(ArrayLength) do Inc(lpTemp[lpArray^[dwIndex]]);
     // Sort the array
     dwIndex:=0;
     for dwTemp:=0 to ArrayMaxNum do
     begin
        // Get frequency count for slot
        dwCount:=lpTemp[dwTemp];
        while (dwCount > 0) do
        begin
           lpArray^[dwIndex]:=dwTemp;
           Inc(dwIndex);
           Dec(dwCount);
        end;
        if (dwIndex = ArrayLength) then break;
     end;
  finally
     // Unlock the variant array
     VarArrayUnlock(vArray);
  end;

  // Mark time
  Ticks := RDTSC - Ticks;
  et:=GetTickCount-st;

  ShowMessage(Format('This solution took %d processor ticks (%d ms).', [Ticks, et]));

end;

------------------

procedure SortingStuff1;
const
  arraylength = 1000000;
  arraymaxnum = 1000;
var
  Items: Variant;
  List: array of Integer;
  I: Integer;
  Idx: Integer;
  st, et:     LongWord;
  Ticks:      Int64;
begin

  Items := VarArrayCreate( [ 0, arraylength ], varInteger );
  //POPULATE ARRAY
  Randomize;
  for I := 0 to arraylength do begin
    Items[ I ] := random( arraymaxnum );
  end;
  // Create our list.

  st:=GetTickCount;
  Ticks := RDTSC;
  SetLength( List, arraymaxnum );
  for I := Low( List ) to High( List ) do
    List[ I ] := 0;
  for I := 0 to arraylength do begin
    Idx := Items[ I ];
    List[ Idx ] := List[ Idx ] + 1;
  end;
  Ticks := RDTSC - Ticks;
  et:=GetTickCount-st;

  ShowMessage(Format('This solution took %d processor ticks (%d ms).', [Ticks, et]));

end;





0
Wim ten BrinkSelf-employed developerCommented:
Well, logical that my solution is slower than yours. I use a dynamic array while you use a static one. I also didn't lock the variant like you did. (Although you should unlock it before you start measuring time and lock it again after you started your timer, to be fair! ;-)

I also wonder why you don't use ZeroMemory(@lpTemp, SizeOf(lpTemp)) to clear your array.

So, okay... You're fast. My solution was just to show the principle. But now you made me mad and I created something that's sometimes faster than yours... :P Check out this unit:

unit untSorting;

interface

uses
  ActiveX;

function RandomArray( MaxLength, MaxValue: Integer ): Variant;
procedure SortingStuffWA( var Items: Variant; MaxLength, MaxValue: Integer );
procedure SortingStuffRussell( var Items: Variant; MaxLength, MaxValue: Integer );
procedure DumpArray( Items: Variant );

implementation

type
  PIntArray = ^TIntArray;
  TIntArray = array[ 0..Pred( MaxInt shr 2 ) ] of Integer;

function RDTSC: Int64;
asm
  dw $310F  {BASM doesn't support RDTSC}
end;

function RandomArray( MaxLength, MaxValue: Integer ): Variant;
var
  I: Integer;
begin
  Result := VarArrayCreate( [ 0, MaxLength ], varInteger );
  RandSeed := 0;
  Randomize;
  for I := VarArrayLowBound( Result, 1 ) to VarArrayHighBound( Result, 1 ) do
    Result[ I ] := random( MaxValue );
end;

procedure DumpArray( Items: Variant );
var
  I: Integer;
begin
  for I := VarArrayLowBound( Items, 1 ) to VarArrayHighBound( Items, 1 ) do begin
    Write( Items[ I ]: 8 );
    if ( ( I mod 10 ) = 9 ) then WriteLn;
  end;
  WriteLn;
end;

procedure SortingStuffWA( var Items: Variant; MaxLength, MaxValue: Integer );
var
  List: array of Integer;
  I, J, K: Integer;
  Ticks: Int64;
  lpArray: PIntArray;
begin
  Ticks := RDTSC;
  lpArray := VarArrayLock( Items );
  SetLength( List, MaxValue );
  for I := Low( List ) to High( List ) do
    List[ I ] := 0;
  for I := 0 to MaxLength do
    inc( List[ lpArray^[ I ] ] );
  K := 0;
  for I := Low( List ) to High( List ) do begin
    for J := 1 to List[ I ] do begin
      lpArray^[ K ] := I;
      Inc( K );
    end;
  end;
  VarArrayUnlock( Items );
  Ticks := RDTSC - Ticks;
  WriteLn( Ticks: 18, ' processor ticks required for Alex''s solution.' );
end;

procedure SortingStuffRussell( var Items: Variant; MaxLength, MaxValue: Integer );
var
  lpTemp: array[ 0..32767 ] of Integer;
  lpArray: PIntArray;
  dwIndex: Integer;
  dwCount: Integer;
  dwTemp: Integer;
  Ticks: Int64;
begin
  // Mark time
  Ticks := RDTSC;
  lpArray := VarArrayLock( Items );
  // Clear the temp array
  for dwIndex := 0 to 32767 do
    lpTemp[ dwIndex ] := 0;
  // Count frequency
  for dwIndex := 0 to MaxLength do
    Inc( lpTemp[ lpArray^[ dwIndex ] ] );
  // Sort the array
  dwIndex := 0;
  for dwTemp := 0 to MaxValue do begin
    // Get frequency count for slot
    dwCount := lpTemp[ dwTemp ];
    while ( dwCount > 0 ) do begin
      lpArray^[ dwIndex ] := dwTemp;
      Inc( dwIndex );
      Dec( dwCount );
    end;
    if ( dwIndex = MaxLength ) then break;
  end;
  // Unlock the variant array
  VarArrayUnlock( Items );
  // Mark time
  Ticks := RDTSC - Ticks;
  WriteLn( Ticks: 18, ' processor ticks required for Russell''s solution.' );
end;

end.

In another unit, I use this code to run both tests:

procedure TestSorts;
var
  Items1, Items2: Variant;
  MaxLength, MaxValue: Integer;
begin
  MaxLength := 1000000;
  MaxValue := 1000;
  Items1 := RandomArray( MaxLength, MaxValue );
  Items2 := RandomArray( MaxLength, MaxValue );
  SortingStuffWA( Items1, MaxLength, MaxValue );
  SortingStuffRussell( Items2, MaxLength, MaxValue );
  DumpArray( Items1 );
  DumpArray( Items2 );
end;

Now, with MaxLength = 1,000,000 and MaxValue = 1,000 your method performs faster. Results:
          27075284 processor ticks required for Alex's solution.
          26375604 processor ticks required for Russell's solution.

With MaxLength = 100,000 and MaxValue = 1,000 then suddenly my method gets faster!
           1969700 processor ticks required for Alex's solution.
           2451068 processor ticks required for Russell's solution.

And with MaxLength = 1,000 and MaxValue = 1,000 then suddenly my method seems extremely fast!
             67336 processor ticks required for Alex's solution.
            232072 processor ticks required for Russell's solution.

But with MaxLength = 10,000,000 and MaxValue = 100 you clearly win again.
         281199180 processor ticks required for Alex's solution.
         230968664 processor ticks required for Russell's solution.

The difference? Well, we both get exactly the same random data to sort. (That's what the RandSeed is for.) When the data is dumped, this is clearly visible. I've compared it, and it's identical every time. (I hope you agree with how I modified your code a bit, though.) The difference is mainly the fact that you're filling a large array, even when you only need to fill a few values. And I happen to use a double for-loop when setting the integer data back in the variant, while you use a for-loop and a while-loop. For-loops are faster, in general. They optimize better.
And of course the additional if-then-break command you use is a bit much.

And guess what? If you change:
  for dwIndex := 0 to 32767 do
into:
  for dwIndex := 0 to MaxValue do
Then your code becomes lightning fast again. But the use of ZeroMemory doesn't...

I think both solutions could use a lot of optimization, though... So, solution 3:

procedure SortingStuffWA2( var Items: Variant; MaxLength, MaxValue: Integer );
var
  I, J, K: Integer;
  Ticks: Int64;
  lpArray: PIntArray;
  lpList: PIntArray;
  ListSize: Integer;
begin
  Ticks := RDTSC;
  lpArray := VarArrayLock( Items );
  ListSize := ( MaxValue + 1 ) * SizeOf( Integer );
  GetMem( lpList, ListSize );
  for I := 0 to MaxValue do
    lpList^[ I ] := 0;
  for I := 0 to MaxLength do
    inc( lpList^[ lpArray^[ I ] ] );
  K := MaxLength;
  for I := MaxValue downto 0 do begin
    for J := 1 to lpList^[ I ] do begin
      lpArray^[ K ] := I;
      Dec( K );
    end;
  end;
  VarArrayUnlock( Items );
  FreeMem( lpList, ListSize );
  Ticks := RDTSC - Ticks;
  WriteLn( Ticks: 18, ' processor ticks required for Alex''s second solution.' );
end;

This solution is about as fast as yours with large lists but rediculously fast with small lists. (About 200 times faster with a list of 10 items, with MaxValue=1000.)

This just shows that the best sorting method depends on the data type and the number of elements, though. Your solution is faster with 10 million records and MaxValue of 10, though... The speed gain is mostly from the initialization of the list[MaxValue] and the moving of the data back to the variant.

There's one other problem, though... My solution with a dynamic array allows quite large values while in your solution, (and my second one) the MaxValue cannot be larger than Pred( MaxInt shr 2 ). (Thus, a MaxValue of 250,000 will fail...)
0
Wim ten BrinkSelf-employed developerCommented:
MaxLength = 10 and MaxValue= 10.
              8532 processor ticks required for Alex's solution.
            189044 processor ticks required for Russell's solution.
              2652 processor ticks required for Alex's second solution.
Wow...

MaxLength = 10,000 and MaxValue= 10,000.
            677996 processor ticks required for Alex's solution.
            651448 processor ticks required for Russell's solution.
            574812 processor ticks required for Alex's second solution.
Okay...

MaxLength = 1,000,000 and MaxValue= 10,000.
          29780776 processor ticks required for Alex's solution.
          28292756 processor ticks required for Russell's solution.
          28211212 processor ticks required for Alex's second solution.
Uhoh...

MaxLength = 10 and MaxValue= 25000.
            603976 processor ticks required for Alex's solution.
            363492 processor ticks required for Russell's solution.
            407044 processor ticks required for Alex's second solution.
@#$%&...

Well, playing with the possible values is fun. This method is costly, though, when the value of the items can be very different. This is because the system needs to allocate more memory for the larger sort-table.
0
Russell LibbySoftware Engineer, Advisory Commented:
To reiterate again what I have already stated:

What should be gathered from all of this is that there is no one "silver bullet" for sorting, period....

Knowledge of the data, and its size must be used to determine the best sorting approach for the current task at hand.


Regards,
Russell
0
aikimarkCommented:
I'm sorry to have arrive late to this discussion.  If not too late for the questioner, I would like to add/suggest the following:

1. copy the data to a tIntList and use the object's .Sort method that might offer some optimization

2a. reconfigure the bytes in the integer data from littleEndian (Windows/Intel format) to bigEndian (everyone else's format) and use a byte radix sort -- bottom up solution.

2b. use a variant of the American Flag sort on each byte (or two byte combinations) of the integer data, modifying this to optimize for maxvalue -- top down solution

3. for the maxvalues shown in the discussion thread, you should probably use a counting sort.  You only need 4*maxvalue additional bytes for the counting array and may only need 2*maxvalue bytes, depending on the number of items you are sorting.  This is an O(N) sort that is very fast.

4. dynamically create an optimized ASM object stream in memory and execute it instead of the busiest part of your Delphi loops.  I read an interesting article about this method in a Delphi Informant article in the late 1990s.  I think the author was a developer with N-sort.
0
HypoviaxAuthor Commented:
This is what i like to see...

...raw competition. I'll test the solutions on my machine to get a controlled result (unless WorkShop_Alex you want to test). I don't mind variations in using the variant. If you need to speed up solutions you can use the array of integers instead.

Hypoviax
0
Russell LibbySoftware Engineer, Advisory Commented:

The competition aspect of this is good, I agree...

But what I have been trying to emphasize is that the performace "testing" at this point is very subjective. For utlra small data sets a quadratic (but tightly written) sort will beat other sorts. For reasonable value ranges, a counting/radix type (linear) sort will beat others. For large amounts of data a logarithmic sort like qsort will outperform the others.

Without a clear and defined testbed, anyone can make ANY of the above sort routines look like the "best", but in fact:

1.) They are all good at certain types of sorting
2.) None of them could be considered perfect for ALL scenarios

Speaking only for myself; I believe it is more important to have the knowledge of "when" to use the different types of sorts, what works well, and when.

----

Regards,
Russell

0
HypoviaxAuthor Commented:
I will use WorkShop_Alex's unit for testing as it seems this is the best way:

Results maxvalue 1000:

0-10 :

Alex Solution1 : 21938
Russel : 260745
Alex Solution2 : 10640

0-100 :

Alex Solution1 : 25513
Russel : 271152
Alex Solution2 : 12382

0-1000 :

Alex Solution1 : 43384
Russel : 291959
Alex Solution2 : 29299

0-100000 :

Alex Solution1 : 1603945
Russel : 1817554
Alex Solution2 :1547393

0-1000000 :

Alex Solution1 : 20439284
Russel : 20438833    
Alex Solution2 :18273701

Hypoviax
0
aikimarkCommented:
BTW...the article is by Jay Cole and appeared in the October 97 issue:
http://www.delphizine.com/features/1997/10/di199710jc_f/di199710jc_f.asp
0
aikimarkCommented:
since a prior version of Nitrosort was open-sourced, you might benefit from some of their techniques
0
HypoviaxAuthor Commented:
Russel,

I agree. But it would be interesting if there was a good allround sort - especially if the size and maxnum is not known. You can use the original method you posted (the combination)  -  if that is faster than your newer solution. Is there a better way to test?

Regards,

Hypoviax
0
HypoviaxAuthor Commented:
Sorry, its a members only article aikimark,

Hypoviax
0
HypoviaxAuthor Commented:
I'll just add as a comparison my simple selection sort results:

0-10 : 58274
0-100 : 3752968
0-1000 :694607400
0-10000 : still waiting....
0-100000 : not even game to try

Regards,

Hypoviax
0
aikimarkCommented:
Even with a million as maxvalue, the counting sort only requires 4MB for its array.

Usually, the most highly optimized Quicksort algorithms revert to a less efficient sort, O(N^2), when the list size is trivial.  I usually think of trivial lists of from 1 to 7 items, although I've seen this definition stretched up to 35 items.

========================
There are potential problems with measurements and comparisons.  You've already identified differences in CPU speeds and processes that finish too quickly to measure.  Also, Windows has a limited granularity in it's time measurements (ticks is a small unit, but still not granular enough to be considered 'atomic'.  The timer units are woefully large at 55ms).

One potential solution is to use an external clock that is accurate in the nano-second range.  We could use an external feed of highly accurate clock times from one of the worldwide public atomic clock feeds.

Another problem is the operating system itself.  Getting accurate timings from a multi-tasking OS can be tricky at best.  Windows mades this even more difficult with limited isolation and prioritization options.  You might want to recompile the application as a DOS application and run it without the baggage and overhead of Windows.  I'm not sure about the options under Linux, but you might be able to get decent isolation and timing data with a Kylix compilation.
0
aikimarkCommented:
Hypoviax,

>Sorry, its a members only article aikimark,

I'm not sure I understand your statement.  Aren't we all EE 'members'?!?
0
Russell LibbySoftware Engineer, Advisory Commented:
Aikimark, the link you provided was for a "members" only section (of which I am one also).

--

Hypoviax,

If the max number is not known, and the count is not known, then quicksort/mergesort/sedge sort is the best way to go. But take for example an array that has 10 items in it. Insertion Sort and Shell sort will beat the other sorts out. When compared with radix/counting types, the disparity grows even larger as the Maxvalue grows, eg:

10 items, max value = 100000.

It has to do with the memory allocation and all that wasted walking for the counting sort. Now, does that make the insertion sort a better algorithm? In this case, yes. If the numbers were reversed, and it was:

100000 items, max value = 10

Then the counting sort would run circles around all other sorts out there. Nothing else could come close. But then again, this is a very specific scenario. Does that make the counting sort the fastest? In this case yes; in other cases, no....

Mix the numbers again, for example:

10000 items, max value = 1000000

And then quicksort/sedgesort becomes faster than the others. I won't ask the question that you know is coming next... ;-)

Anyways, my final point is that it just DEPENDS on what you are dealing with (max value), and how much of it (max length) you are dealing with.

This is my final post, and contains some examples that others may find useful. Best of luck in trying to figure out what the best sort is, because IMHO it just "depends"

Russell

unit IntSort;
////////////////////////////////////////////////////////////////////////////////
//   A collection of sorting routines for integer arrays
////////////////////////////////////////////////////////////////////////////////
interface

////////////////////////////////////////////////////////////////////////////////
//   Include units
////////////////////////////////////////////////////////////////////////////////
uses
  Windows,
  SysUtils,
  ActiveX;

////////////////////////////////////////////////////////////////////////////////
//   Data types
////////////////////////////////////////////////////////////////////////////////
type
  PIntArray         =  ^TIntArray;
  TIntArray         =  Array[0..Pred(MaxInt shr 2)] of Integer;

////////////////////////////////////////////////////////////////////////////////
//   Sorting procedures
////////////////////////////////////////////////////////////////////////////////
procedure  BubbleSort(Data: PIntArray; ArrayLength: Integer);
procedure  ExchangeSort(Data: PIntArray; ArrayLength: Integer);
procedure  InsertionSort(Data: PIntArray; ArrayLength: Integer);
procedure  MergeSort(Data: PIntArray; ArrayLength: Integer);
procedure  SedgeSort(Data: PIntArray; L, R: Integer);
procedure  ShellSort(Data: PIntArray; ArrayLength: Integer);
procedure  RadixSort(Data: PIntArray; ArrayLength, ArrayMaxNum: Integer);
procedure  QuickSort(Data: PIntArray; L, R: Integer); overload;
procedure  QuickSort(Data: PIntArray; ArrayLength: Integer); overload;

////////////////////////////////////////////////////////////////////////////////
//   Generic sorting procedure
////////////////////////////////////////////////////////////////////////////////
procedure  Sort(var Items: Variant; MaxLength, MaxValue: Integer);

implementation

procedure Sort(var Items: Variant; MaxLength, MaxValue: Integer);
var  lpArray:    PIntArray;
     dwIndex:    Integer;
begin

  // Access the array
  SafeArrayAccessData(TVariantArg(Items).parray, Pointer(lpArray));

  // Sort the array based on maxlength and maxvalue
  if (MaxLength < 32) then
     // Insertion sort
     InsertionSort(lpArray, MaxLength)
  else if (MaxLength < 128) then
     ShellSort(lpArray, MaxLength)
  else if (MaxValue < 32768) then
     // Radix sort
     RadixSort(lpArray, MaxLength, MaxValue)
  else
     // Quick sort based
     SedgeSort(lpArray, 0, Pred(MaxLength));

  // Unaccess the array
  SafeArrayUnaccessData(TVariantArg(Items).parray);

end;

procedure BubbleSort(Data: PIntArray; ArrayLength: Integer);
var  i:             Integer;
     j:             Integer;
     t:             Integer;
begin

  for i:=0 to Pred(ArrayLength) do
  begin
     for j:=Succ(i) to Pred(ArrayLength) do
     begin
        if (Data^[i] > Data^[j]) then
        begin
           t:=Data^[i];
           Data^[i]:=Data^[j];
           Data^[j]:=t;
        end;
     end;
  end;

end;

procedure ExchangeSort(Data: PIntArray; ArrayLength: Integer);
var  p:             Integer;
     h:             Integer;
     k:             Integer;
     t:             Integer;
begin

  for p:=0 to ArrayLength-2 do
  begin
     h:=p;
     for k:=Succ(p) to Pred(ArrayLength) do
     begin
        if (Data^[k] < Data^[h]) then h:=k;
     end;
     if (p = h) then Continue;
     t:=Data^[p];
     Data^[p]:=Data^[h];
     Data^[h]:=t;
  end;

end;

procedure InsertionSort(Data: PIntArray; ArrayLength: Integer);
var  i:             Integer;
     j:             Integer;
     t:             Integer;
begin

  for i:=1 to Pred(ArrayLength) do
  begin
     j:=i;
     t:=Data^[j];
     while (j > 0) and (Data^[j-1] > t) do
     begin
        Data^[j]:=Data^[j-1];
        Dec(j);
     end;
     Data[j]:=t;
  end;

end;

procedure LinearSort(Data: PIntArray; ArrayLength: Integer);
var  i:             Integer;
     j:             Integer;
     k:             Integer;
     t:             Integer;
begin

  for i:=0 to Pred(ArrayLength) do
  begin
     j:=i;
     for k:=Succ(i) to Pred(ArrayLength) do
        if (Data^[k] < Data^[j]) then j:=k;
     if (j > i) then
     begin
        t:=Data^[i];
        Data^[i]:=Data^[j];
        Data^[j]:=t;
     end;
  end;

end;

procedure MergeSort(Data: PIntArray; ArrayLength: Integer);
var  lpTemp:        PIntArray;

  procedure SubMergeSort(L, R: Integer);
  var   dwLength:   Integer;
        dwPivot:    Integer;
        dwIndex:    Integer;
        m1, m2:     Integer;
  begin

     if (R > L) then
     begin
        dwLength:=Succ(R-L);
        dwPivot:=(L+R) shr 1;
        SubMergeSort(L, dwPivot);
        SubMergeSort(Succ(dwPivot), R);
        Move(lpTemp^, Data^[L], dwLength * SizeOf(Integer));
        m1:=0;
        m2:=Succ(dwPivot-L);
        for dwIndex:=0 to Pred(dwLength) do
        begin
           if (m2 < dwLength) then
           begin
              if (m1 <= dwPivot-L) then
              begin
                 if (lpTemp^[m1] > lpTemp^[m2]) then
                 begin
                    Data^[dwIndex+L]:=lpTemp^[m2];
                    Inc(m2);
                 end
                 else
                 begin
                    Data^[dwIndex+L]:=lpTemp^[m1];
                    Inc(m1);
                 end;
              end
              else
              begin
                 Data^[dwIndex+L]:=lpTemp^[m2];
                 Inc(m2);
              end
           end
           else
           begin
              Data^[dwIndex+L]:=lpTemp^[m1];
              Inc(m1);
           end;
        end;
     end;

  end;

begin

  if (ArrayLength > 0) then
  begin
     lpTemp:=AllocMem(ArrayLength * SizeOf(Integer));
     try
        SubMergeSort(0, Pred(ArrayLength));
     finally
        FreeMem(lpTemp);
     end;
  end;

end;

procedure RadixSort(Data: PIntArray; ArrayLength, ArrayMaxNum: Integer);
var  lpTemp:     PIntArray;
     dwIndex:    Integer;
     dwCount:    Integer;
     dwTemp:     Integer;
begin

  GetMem(lpTemp, Succ(ArrayMaxNum) * SizeOf(Integer));
  for dwIndex:=0 to ArrayMaxNum do lpTemp^[dwIndex]:=0;
  try
     for dwIndex:=0 to Pred(ArrayLength) do Inc(lpTemp[Data^[dwIndex]]);
     dwIndex:=0;
     for dwTemp:=0 to ArrayMaxNum do
     begin
        for dwCount:=1 to lpTemp[dwTemp] do
        begin
           Data^[dwIndex]:=dwTemp;
           Inc(dwIndex);
        end;
     end;
  finally
     FreeMem(lpTemp);
  end;

end;

procedure ShellSort(Data: PIntArray; ArrayLength: Integer);
var  i:             Integer;
     j:             Integer;
     p:             Integer;
     m:             Integer;
     t:             Integer;
     f:             Boolean;
begin

  p:=ArrayLength;
  while (p > 1) do
  begin
     p:=p div 2;
     m:=ArrayLength-p;
     repeat
        f:=False;
        for j:=0 to Pred(m) do
        begin
           i:=j+p;
           if (Data^[j] > Data^[i]) then
           begin
              t:=Data^[i];
              Data^[i]:=Data^[j];
              Data^[j]:=t;
              f:=True;
           end;
        end;
     until not(f);
  end;

end;

procedure QuickSort(Data: PIntArray; ArrayLength: Integer);
begin

  if (ArrayLength > 1) then QuickSort(Data, 0, Pred(ArrayLength));

end;

procedure QuickSort(Data: PIntArray; L, R: Integer);
var  i, j:       Integer;
     p, t:       Integer;
begin

  repeat
     i:=L;
     j:=R;
     p:=Data^[(L + R) shr 1];
     repeat
        while (Data^[i] < p) do Inc(i);
        while (Data^[j] > p) do Dec(j);
        if (i <= j) then
        begin
           t:=Data^[i];
           Data^[i]:=Data^[j];
           Data^[j]:=t;
           Inc(i);
           Dec(j);
        end;
     until (i > j);
     if (L < j) then QuickSort(Data, L, j);
     L:=i;
  until (i >= R);

end;

procedure SedgeSort(Data: PIntArray; L, R: Integer);
var  i, j:       Integer;
     p, t:       Integer;
begin

  repeat
     i:=L;
     j:=R;
     p:=Data^[(L + R) shr 1];
     repeat
        while (Data^[i] < p) do Inc(i);
        while (Data^[j] > p) do Dec(j);
        if (i <= j) then
        begin
           t:=Data^[i];
           Data^[i]:=Data^[j];
           Data^[j]:=t;
           Inc(i);
           Dec(j);
        end;
     until (i > j);
     if (L < j) then
     begin
        if ((j-L) > 32) then
           SedgeSort(Data, L, j)
        else
           InsertionSort(@Data^[L], Succ(j-L));
     end;
     L:=i;
  until (i >= R);

end;

end.


0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
aikimarkCommented:
Russell,

Thanks for the clarification.  And, as always, I defer to your superior Delphi coding skills and experience.

One last thing to suggest for a Quicksort optimization is the "middle of three" pivot selection.  In this variation, three items in the list are randomly selected and the middle one is used as the pivot for that round.  You can get a 14% (avg) improvement over the single-pivot versions of Quicksort.

Also, beware of bad, O(N^2), Quicksort performance when there are a large number of duplicate values.  This might occur with a much larger Maxlength value compared to MaxValue.  In this case, you would do well to check the current pivot value with the previous iteration's pivot point value.  If there are lots of duplicate values, a special pass of the items might separate the items into three ranges (<, =, >) instead of two ranges.

One such (three range) algorithm is the terinary sort described by Jon Bentley.  Jon used this algorithm to quickly and efficiently sort string items.  I think it could be used to sort integer data.

========================
For future readers of this discussion thread, you might want to pick up a copy of Rod Stephens book "Ready to Run Delphi 3.0 Algorithms."  There's a good section on sorting and source code on the companion CD.  If you were a reader of Delphi Informant, you might have already read some of the content of this book when reading Rod's published articles.
0
HypoviaxAuthor Commented:
This is my second attempt for a faster solution. It will often outperform when the maxnum is quite high:

procedure Hypoviaxsort( var Items: Variant; MaxLength, MaxValue: Integer );
var i, j : integer;
tempr : Integer;
flag : boolean;
Ticks: Int64;
begin
Ticks := RDTSC;
for i:=maxlength-1 downto 0 do
  begin
    tempr := items[i];
        j := i+1;
        flag := true;
      while (j<=maxvalue) and flag do
        if tempr > items[j] then
            begin
            items[j-1] := items[j];
            j := j+1
            end
        else
        flag := false;
        items[j-1] := tempr
end ;
Ticks := RDTSC - Ticks;
  WriteLn( Ticks: 18, ' processor ticks required for Hypoviax''s second solution.' );


  end;

I will read through everybodys comments soon. I have run out of time today.

Best Regards,

Hypoviax
0
aikimarkCommented:
Hypoviax,

In reading your most recent comment, I will recommend that you should replace the inner loop with a single 'block copy' (Delphi copy function or maybe the memcopy or RtlMoveMemory API calls) that will copy the contiguous block of integers one location, making room for the moved item.  You will still need to do the comparison of items to find the correct position.

http://www.delphibasics.co.uk/RTL.asp?Name=Copy

===============================
For the O(N^2) algorithms, you might have to do many runs and project times for very large maxvalue or maxlength values using statistical extrapolation.

===============================
Missing from my earlier comment are several contributors to performance differences beyond clock speed:
  * CPU cache size and speed
  * memory speed
  * bus speed
  * architecture (Intel vs. AMD vs. transmeta)
  * temperature -- Intel automatically slows down if it gets dangerously warm
0
HypoviaxAuthor Commented:
Russel's results for the last post:

0-10: 3933
0-100:23888
0-1000:16650
0-100000:18579267
0-1000000:20363031

Alex's

0-10 :Alex Solution1 : 21938
         Alex Solution2 : 10640

0-100 :Alex Solution1 : 25513
           Alex Solution2 : 12382

0-1000 :Alex Solution1 : 43384
            Alex Solution2 : 29299

0-100000 :Alex Solution1 : 1603945
               Alex Solution2 :1547393

0-1000000 :Alex Solution1 : 20439284
                Alex Solution2 :18273701

Looks like there are 2 winners!

Regards,

Hypoviax
0
HypoviaxAuthor Commented:
My results for my last sort:

0-10 : 11291
0-100: 92825
0-1000: 585722
0-100000 :71662595
0-1000000: 781265785

A substantial increase in speed for me, but nothing comparable to you guys. Russel's sort seems to be the fastest for 0-10 and 0-1000 and Alex's sort (solution2) seems to be the fastest from 0-100, 0-100000 and 0-1000000.

Unless there are further posts regarding solutions (I'm happy to wait, i have plenty of time now), i will split the points up appropriately. Please tell me when you guys believe there is no more to be posted.

Best Regards,

Hypoviax
0
HypoviaxAuthor Commented:
I might add, when the maxnum is 10000 it looks like my solution is the fastest from 0-10 and 0-100 ;-) (other than that it is not comparable)

regards,

Hypoviax
0
aikimarkCommented:
Hypoviax,

To extend the competitive range of your insertion sort, there are several things to try...
1. After your sorted list becomes non-trivial, perform a binary search for the correct position of the next item to be inserted, rather than the linear search you are doing.  Alternatively, you might also do an interpolative search, which is a bit faster than binary list searches.  This reduces the number of comparisons from O(N/2) down to O(logN) or less.

2a. use a list object for your inserts.  The tradeoff here is the additional space required for the list (8*N for and integer list -- data and pointers) against the ability to directly insert a new object without having to move contiguous array items out of the way (the multi-item move/copy performance issue has already been addressed in an earlier comment).  Using a list is practical when you can preallocate and initialize the list object, eliminating the new item memory allocation operations.  

2b. The most practical implementation would be to simulate a list by allocating a same sized array as a Next-in-List item locator.

2c. The cost of insertions can be eliminated if the target (sorted list) array has enough room to accomodate the inserted items without the need to move an item out of the way (or at least minimize this.

2d. You can also reduce the cost of a linear search new-item insert by selecting the closest (sorted list) end point.

=======================
Russell,

You can improve the performance of your Shellsort with a different gap-distance formula (H values in the following quoted text):
"D. L. Shell, the originator of the algorithm, took H=N%2 as the first value of H and used the formula H=H%2. A sequence which has been shown empirically to do well is

H=...,1093,364,121,40,13,4,1

i.e H=(3**K-1)/2 which can be generated by the formula H=3*H+1 with the initial value H=1. The number of operations required in all is of order O(N**(3/2)) for the worst possible ordering of the original data. In average the operations count goes approximately as O(N**1.25). "
From: http://www.geocities.com/SiliconValley/Garage/3323/aat/a_shel.html
~~~~~~~~~
Notes:
* Knuth's increments: 1,4,13,...,(3^k-1)/2.  
* Hibbard's increments: 1, 3, 7, ..., 2^k-1.
* Sedgewick's increments: 1, 5, 19, 41, 109, ..., with each term having the form of either 9*4k - 9*2k + 1 or 4k - 3*2k + 1

==========================
My Observation:
I find it interesting that some of the best performance boosts for algorithm implementations is to use direct memory pointers in order to avoid data movement.  However, the .Net framework does not fully support pointers in managed code.  So it seems that some of our fastest processes might be slowed by 'progress'.
0
Wim ten BrinkSelf-employed developerCommented:
@Hypoviax, it's your contest, you should do the testing. However, please try to measure the time used with the RDTSC function from my code since it's more exact than those performance counters. RDTSC are just the exact number of processor cycles. Not some value per second...
Also keep in mind that Russel has a flaw in his code. He's clearing too much records in his code, if the MaxValue is too low. If he would only clear the real amount of memory then his solution would be a lot faster.

And you know the size of the list. If you don't know the maximum number then you could always walk just once through the array to find that maximum value. It does slow down things a bit more but it would save on speed if you get a reasonably low MaxValue account since you don't have to allocate that much memory in that case. It might be a consideration to find the minimum and maximum values in the list prior to running the proper sorting routine. That way, you could select the most optimal sorting routine. The counting sort is great with a low maximum value while a QuickSort might be better for larger lists with a high maximum value.

@Russell, you might speed up some of your sorting routines that use recursion simply by removing the recursive calls and use a non-recursive version of those sort routines. A recursive routine is a bit expensive in performance simply because Delphi must do a lot of function calls in those cases.

@Hypoviax, you might make the test even more interesting by using a test list that does NOT contain any duplicate values. (Which means that MaxValue must be larger than MaxLength.) The counting sorts get it's biggest advantage just because there are quite a few duplicates in the file.

Now, something interesting. A counting sort that doesn't know anything about the length or maximum value. It's not the fastest solution but it's an interesting example showing how to do it.

procedure SortingStuffWA3(var Items: Variant);
var
  List: array of Integer;
  I, J, K: Integer;
  Ticks: Int64;
  lpArray: PIntArray;
  MaxLength, MaxValue: Integer;
  Value: Integer;
begin
  Ticks := RDTSC;
  MaxLength := VarArrayHighBound(Items, 1) - VarArrayLowBound(Items, 1);
  MaxValue := -1;
  lpArray := VarArrayLock(Items);
  for I := 0 to MaxLength do begin
    Value := lpArray^[I];
    if (Value > MaxValue) then begin
      SetLength(List, Value + 1);
      while (MaxValue < Value) do begin
        Inc(MaxValue);
        List[MaxValue] := 0;
      end;
    end;
    inc(List[Value]);
  end;
  K := MaxLength;
  for I := High(List) downto Low(List) do begin
    for J := 1 to List[I] do begin
      lpArray^[K] := I;
      Dec(K);
    end;
  end;
  VarArrayUnlock(Items);
  Ticks := RDTSC - Ticks;
  WriteLn(Ticks: 18, ' processor ticks required for Alex''s third solution.');
end;

This method is slow when it's already sorted from low to hugh and fast if the original list is ordered from high to low. The reason for this is because this function just allocates as much memory as needed and adds more if it needs more. The biggest advantage is that this sort method just needs the variant array to work with and nothing more. But this flexibility comes with a price which is a slower speed. The speed of this routine is highly dependant of the data it contains and thus not reliable if you want a fast sorting routine. But it is a fast one if you don't know the maximum value...
0
HypoviaxAuthor Commented:
Workshop_Alex,

 I have been using your RDTSC function when measuring the performance of code for every solution. It seems however, we have lost our other major competitor (he said he has made his last post). So we can either go ahead and try and do one more round with the fact of not knowing max length or num or leave it here. It would be interesting to see performance differences, however.

regards,

Hypoviax
0
aikimarkCommented:
Alex,

What impact does the existence of (Intel) hyperthreading have on the RDTSC measurements?

In fact, would this contest be any more interesting if we were to spawn another thread on the other processor(s), actual or virtual, and sort with parallel processes?

I wonder, with your most recent counting sort example, if the performance figures you see aren't really a result of the use of a variant array instead of a pure integer array.  Please excuse my ignorance if I have misunderstood the code.
0
Wim ten BrinkSelf-employed developerCommented:
Hmmm. Hyperthreading is one thing that troubles me so I actually don't know. The same is true about checking the performance on a multi-processor system. It could be even worse on a dual-hyperthreading-proceesor system. :-) But in general you can specify that your application will only run on one specific processor, thus a processor with hyperthreading could be told to use only one "emulated" processor for the application.
Then again, if all tests are done on a hyperthreading or multi-processor system then the comparisons will still be fair values.

For my latest counting sort I did have to do what the rules told me to do. I had to sort items in a variant array. If I had used a dynamic array of integers, the speed would be better. A static array would perform even faster. But data in a variant array can be locked so you can access it as a large block of data. And that is what is used in most sort examples given above. The performance decrease however comes from the dynamic array that I might have to resize once in a while. But if the first element in the array has the highest value then it will perform at it's top speed.

Furthermore, don't forget that a counting sort doesn't always be the fastest solution. It also has another flaw since it can only sort positive numbers. If you would also want it to sort negative numbers then you'd also have to determine the minimum value. And you'd have to create an array with values from the Minimum value to the Maximum value.

People have been experimenting with finding the best sorting routines even before the computer existed. The problem however is that the sorting speed is highly dependant of the data that is used to sort with.

@Hypoviax, my last version isn't meant to be part of a new speed test. It's only there to show how to solve the problem of an unknown maximum value. And of course the problem you could encounter if you're trying to use it. Say, the data you want to sort is this: 9, 9, 8, 6, 5, 3, 3, 2, 2, 0. Thus, when my latest version tries to process this, it will set up a count list for (9+1) elements, since the first element is used as MaxValue. When it continues to read the data, it will not find any new value greater than the MaxValue so the array is set at the optimum length from the beginning.
But if the data was actually sorted in opposite order: 0, 2, 2, 3, 3, 5, 6, 6, 8, 9 then it will be very slow. Because the first MaxValue is 0, thus it creates an array for (0+1) elements. The next element raises MaxValue to 2 so it needs to add two more elements to the dynamic array. Then another to raise it for MaxValue 3, two more for MaxValue 5, etc, until it finally reaches it's optimal size at the end. All this resizing consumes a lot of time so in such a case it is pretty inefficient. A QuickSort would be more efficient since it would not have to resize any arrays..

Russell is just correct when he says that the sorting method used is highly dependant of the data that you want to sort in the first place. The count sort is actually only useful if you're sorting just numbers and don't have a high maximum value and no negative values. It is therefore quite limited. The QuickSort routine is very popular since it can be used for all kinds of data and it will perform quite fast in general.

If you really want to have a fast sorting solution then you need to know quite a lot of information about the data you want to sort. Preferably the data type, the variance in different values, the amount of data elements and also the number of duplicate values. And to execute a good test of the sorting routines you will need lots and lots of testdata to test each and every solution with. You would, for example, need a list that is already sorted and contains no duplicates. And another list without duplicates that is sorted in reverse order. A list with only one single value. A list with two values that both occur a lot of times. You need small lists, large lists, lists with a low maximum value, lists with a high maximum value. All in all, you need to generate quite a few lists to do a good test with.
Furthermore, you should test every method with all those lists more than once per list. Preferably 5 times, dropping the highest and lowest result and calculating the average over the other three results. Maybe display it all in some graph too. Very complex to do it right. :-)
0
HypoviaxAuthor Commented:
Alex,

I understood, your last comment as not being part of the speed test, it is most interesting however. I will award points after 2-3 days unless people want to continue. I will do a final test on all solutions (including mine and your last one as a comparison only) taking into account your comments; i will do the test 5 times and caculate an average. My machine is an AMD so the HT discussion will not apply to my results

Regards,

Hypoviax
0
aikimarkCommented:
Alex,

I didn't see anywhere that the items had to be sorted "in place" and couldn't be copied to another data structure for sorting and then copied back to the variant array.  It just seemed that some of the algorithms' performance figures might benefit from a different data structure.

Since we are trying to use the optimal algorithm to fit a variety of data, some of the performance profiles (similar to Russell's code) might be used to gauge the appropriateness/desireability of the counting sort.  Since we want to best the Quicksort profile, O(N logN), we need to predict the likely counting sort performance, based on the following factors:
1. the number of additions (=N) during the counting
2. the number of additions (=MaxValue+1) during output
3. the number of comparisons during the output phase (=MaxValue+N)
4. relative cost of comparison and addition to similar 'atomic' operations in the other algorithms.
0
HypoviaxAuthor Commented:
Final Results:

AVERAGES:

TEST: MaxLength of 10 with a max value of 1000

RUSSEL  : 5610
ALEX    : 19483
HYPOVIAX: 16477

WINNER : RUSSEL

TEST: MaxLength of 100 with a max value of : 1000

RUSSEL  : 24450
ALEX    : 23500
HYPOVIAX: 98640

WINNER : ALEX

TEST: MaxLength of 1000 with a max value of : 1000

RUSSEL  : 46500
ALEX    : 31515
HYPOVIAX: 959132

WINNER : ALEX

TEST: MaxLength of 100000 with a max value of : 1000

RUSSEL  : 1522955
ALEX    : 1300142
HYPOVIAX: 110902858

WINNER : ALEX

TEST: MaxLength of 1000000 with a max value of : 1000

RUSSEL  : 38756156
ALEX    : 17860298
HYPOVIAX: 554318985

WINNER : ALEX

From these averages (5 tests for each length change) the winner looks to be Alex. However,

Much the same ranks result when the max value is 100

Regards,

Hypoviax
0
Wim ten BrinkSelf-employed developerCommented:
Personally, I consider Russell the winner because he came with several different and interesting sorting mechanisms. You should actually try them all in this competition and then you might notice that sometimes, other sorting methods will win. I found my copy of "Ready to run Delphi 3.0 Algorithms" again and the book shows some interesting comparisons. The book makes it clear too that the sorting speed is highly dependant of the data it contains.

@aikimark, the best sorting mechanism is the counting sort, since it's a simple O(M+N) system, compared to O(NxLog(N)). And M is the maximum value in the list and N the number of items. Thus, a large N and a small M will be fast. A small N and a large M might actually be very slow. Slower than a Quicksort for the same value of N. But in it's turn, a Quicksort is slow when you have many duplicate values, in which case it might be beaten by a bubblesort. Thus, the best sorting mechanism depends on the data you need to sort. But in general, a QuickSort will do the trick quite nicely.

However, I wished someone would come up with a non-recursive QuickSort routine, which should be even faster... :)
0
aikimarkCommented:
Aren't all Delphi developers "winners"?!?   :-)

@Alex, you deserve some credit for the RDTSC function.  It's pretty slick.
0
HypoviaxAuthor Commented:
I think i will hand the points out now. Thanks greatly for participating in such an interesting competition. I have learnt heaps and hope others have too.
0
HypoviaxAuthor Commented:
If i could only set the points higher i would. Thanks to all participants

Regards,

Hypoviax

p.s will look for the non-recursive Quick sort. If found i'll post it here.
0
Wim ten BrinkSelf-employed developerCommented:
@aikimark, Naaah... The RDTSC is a well-known function by now for timing purposes. It's just easy to forget again. :-) It was introduced with the first Pentium processors but never become part of Delphi. It would be the same as rewarding someone for inventing the lightswitch.

@Hypoviax, I have some good ideas about how to write a non-recursive QuickSort routine. Basically, it's similar to enumerating folders in a non-recursive way. You do this by using a dynamic list. Each list entry should contain the index of the first and last item of the list that needs to be sorted. You add Low() and High() as first values to that list and whenever you would call the recursive Quicksort routine, you would now just add a new entry to the list. All you have to do is walk through the list from the first to the last element and then the list you needed to sort is sorted.
When you would do a recursive sort, you're doing something similar. The only difference is that you don't use a list to keep track of what needs to be sorted but you call the function again with new parameters that end up on the stack. It's quite similar.

In http://www.experts-exchange.com/Programming/Programming_Languages/Delphi/Q_21195960.html I have added a non-recursive routine that enumerates files, folders and subfolders. If I had time, I would write something similar for the QuickSort. It would be faster than the recursive Quicksort since you don't have the overhead of the additional function calls but the final code would look quite complex. Using a list to sort another list. Interesting... :-)
0
HypoviaxAuthor Commented:
Not sure i understand the code but here is a sample of the non-recursive quicksort which may interest you Alex:

http://www.bsdg.org/SWAG/SORTING/0066.PAS.html

Regards,

Hypoviax
0
aikimarkCommented:
Hypoviax,

Thank you for the points...quite unexpected for not having written any Delphi code.  Kibbutz points :-)
0
HypoviaxAuthor Commented:
Well, you did make significant input into the maths of it all, and made worthy suggestions of improvement to Russel, Alex and myself so i thought you should benefit from some of your efforts.

Regards,

Hypoviax
0
Russell LibbySoftware Engineer, Advisory Commented:

Been away for a few days, but glad to see that this question is now closed. ;-)
(thanks for the points)

I also see that there were a few comments directed towards me, and how I could make certain improvements to my code. To be truthful, I would be more inclined to research/implement/etc these "improvements" had there been code that went along with the suggestions. Talking is one thing, doing is another; No offense directed towards anyone...

I will also re-iterate what I previously stated on the recursive functions vs their iterative versions:

-- There is little to be gained by going this route --

for example,

MaxLength = 100,000
MaxValue  = 1,000,000

100 Iterations using SedgeSort (recursive)

RDTSC / ms
6686967768 : 2656ms to complete 100 iterations
6694915724 : 2656ms to complete 100 iterations
6569867360 : 2610ms to complete 100 iterations

100 Iteartions using SedgeSortNR (iterative)

RDTSC / ms
6702981572 : 2656ms to complete 100 iterations
6678711928 : 2641ms to complete 100 iterations
6519984612 : 2594ms to complete 100 iterations

In most cases there is minimal (and sometimes no) performance gain, and the code becomes much more complex. The fact is, a recursive call in Delphi is not as expensive as one would think. A good write up on using recursion, and when to use iterative functions can be found here:

http://66.102.7.104/search?q=cache:w0mycdadNDYJ:www.cs.unc.edu/~weiss/COMP114/BOOK/06Recursion.doc+%2Brecursion%2Biterative%2Bquicksort%2Bspeed%2Bdifference&hl=en

At the end the author makes the following statement, with which I fully agree.

<quote>
So when should one use iteration, and when recursion? The first answer is, avoid using recursion if it doesn't make the code significantly easier to write and understand. In that light, many of our examples, including factorial, selection sort, and finding the largest entry in an array, must be considered convenient pedagogical exercises rather than good code. We classify the code for binary search and quicksort as examples where the gain in clarity and ease of programming justify the small degradation in execution speed much of the time.
<quote>

As will soon be seen, the iterative versions of quick sort /sedge sort are much more complex than their recursive counterparts.

Anyways, here's the code for both those algo's done in iterative fashion. Feel free to modify/improve on them, but please don't suggest that I be the one to do it <g>

Regards to all,
Russell


procedure SedgeSortNR(Data: PIntArray; ArrayLength: Integer);
var  Stack:            Array [0..1023] of Integer;
     dwStack:          Integer;
     dwSplit:          Integer;
     dwSwap:           Integer;
     dwTemp:           Integer;
     dwFirst:          Integer;
     dwLast:           Integer;
     dwHi:             Integer;
     dwLo:             Integer;
begin

  Stack[0]:=0;
  Stack[1]:=Pred(ArrayLength);
  dwStack:=2;

  while (dwStack > 0) do
  begin
     Dec(dwStack, 2);
     dwFirst:=Stack[dwStack];
     dwLast:=Stack[dwStack+1];
     repeat
        dwLo:=dwFirst;
        dwHi:=dwLast;
        dwSplit:=Data^[(dwFirst + dwLast) shr 1];
        while True do
        begin
           while (Data^[dwLo] < dwSplit) do Inc(dwLo);
           while (Data^[dwHi] > dwSplit) do Dec(dwHi);
           if (dwLo > dwHi) then break;
           dwSwap:=Data^[dwLo];
           Data^[dwLo]:=Data^[dwHi];
           Data^[dwHi]:=dwSwap;
           Inc(dwLo);
           Dec(dwHi);
        end;
        if ((dwHi-dwFirst) < (dwLast-dwLo)) then
        begin
           if (dwLo < dwLast) then
           begin
              if ((dwLast-dwLo) < 32) then
              begin
                 for dwSplit:=Succ(dwLo) to dwLast do
                 begin
                    dwSwap:=dwSplit;
                    dwTemp:=Data^[dwSwap];
                    while (dwSwap > dwLo) and (Data^[dwSwap-1] > dwTemp) do
                    begin
                       Data^[dwSwap]:=Data^[dwSwap-1];
                       Dec(dwSwap);
                    end;
                    Data[dwSwap]:=dwTemp;
                 end;
              end
              else
              begin
                 Stack[dwStack]:=dwLo;
                 Stack[dwStack+1]:=dwLast;
                 Inc(dwStack, 2);
              end;
           end;
           dwLast:=dwHi;
        end
        else
        begin
           if (dwFirst < dwHi) then
           begin
              if ((dwHi-dwFirst) < 32) then
              begin
                 for dwSplit:=Succ(dwFirst) to dwHi do
                 begin
                    dwSwap:=dwSplit;
                    dwTemp:=Data^[dwSwap];
                    while (dwSwap > dwFirst) and (Data^[dwSwap-1] > dwTemp) do
                    begin
                       Data^[dwSwap]:=Data^[dwSwap-1];
                       Dec(dwSwap);
                    end;
                    Data[dwSwap]:=dwTemp;
                 end;
              end
              else
              begin
                 Stack[dwStack]:=dwFirst;
                 Stack[dwStack+1]:=dwHi;
                 Inc(dwStack, 2);
              end;
           end;
           dwFirst:=dwLo;
        end;
     until (dwFirst >= dwLast);
  end;

end;

procedure QuickSortNR(Data: PIntArray; ArrayLength: Integer);
var  Stack:            Array [0..1023] of Integer;
     dwStack:          Integer;
     dwSplit:          Integer;
     dwSwap:           Integer;
     dwFirst:          Integer;
     dwLast:           Integer;
     dwHi:             Integer;
     dwLo:             Integer;
begin

  Stack[0]:=0;
  Stack[1]:=Pred(ArrayLength);
  dwStack:=2;

  while (dwStack > 0) do
  begin
     Dec(dwStack, 2);
     dwFirst:=Stack[dwStack];
     dwLast:=Stack[dwStack+1];
     repeat
        dwLo:=dwFirst;
        dwHi:=dwLast;
        dwSplit:=Data^[(dwFirst + dwLast) shr 1];
        repeat
           while (Data^[dwLo] < dwSplit) do Inc(dwLo);
           while (Data^[dwHi] > dwSplit) do Dec(dwHi);
           if (dwLo <= dwHi) then
           begin
              dwSwap:=Data^[dwLo];
              Data^[dwLo]:=Data^[dwHi];
              Data^[dwHi]:=dwSwap;
              Inc(dwLo);
              Dec(dwHi);
           end;
        until (dwLo > dwHi);
        if ((dwHi-dwFirst) < (dwLast-dwLo)) then
        begin
           if (dwLo < dwLast) then
           begin
              Stack[dwStack]:=dwLo;
              Stack[dwStack+1]:=dwLast;
              Inc(dwStack, 2);
           end;
           dwLast:=dwHi;
        end
        else
        begin
           if (dwFirst < dwHi) then
           begin
              Stack[dwStack]:=dwFirst;
              Stack[dwStack+1]:=dwHi;
              Inc(dwStack, 2);
           end;
           dwFirst:=dwLo;
        end;
     until (dwFirst >= dwLast);
  end;

end;


0
aikimarkCommented:
Russell,

No offense taken.  I was a 'Johnny-come-lately' to this discussion thread and most of the code had alread been shown.  I still consider myself unworthy of submitting original code to compete with you and Alex, so I relegated my time to the more pedagogical and esoteric topics. I wasn't trying to make more work for anyone.  I did think your examples' performance numbers might have bettered Alex and Hypoviax code with a little tweaking.  

However, it was more of a comment for future readers than an expectation of your time.  I've included a little of my own code for everyones' amusement (below).
_______________________
Thank you for including the Stephen Weiss reference.  Steve just stepped down as chair of the UNC Chapel Hill Comp Sci Dept after 15 years.  He is a GREAT teacher, friend, and colleague.
_______________________

My Delphi code example (don't laugh too hard)

// pick a middle-of-three values for the pivot
var
  Pvts: Array [1..3] of Integer;

Randomize;
Pvts[1] := Data^[RandomRange(L, R)];
Pvts[2] := Data^[RandomRange(L, R)];
Pvts[3] := Data^[RandomRange(L, R)];
InsertionSort(Pvts, 3);  //for this small of a list, an insert sort is a good choice

// Pvts[2] contains the middle-of-three value which improves
// quicksort performance by about 14%
0
Russell LibbySoftware Engineer, Advisory Commented:
aikimark ,

Glad no offense was taken, because none was meant...

And you are MORE than worthy of submitting code, as a way for us to all better our skills. The topics that you brought up, with which I am familiar with some (but not all), were pointed and valid, but my problem was in trying to cover such a broad topic as "sorting". Lets face it, given this "contest", and the ranges of data size/max values, there could be no one function that could win for every scenario.

So, what I ended up doing in the end was trying to cover a multitude of sorting functions; linear, log, and quadratic, discussing when to use each, and when not to use them, etc. But, this unfortunately did not leave me with the time required to fine tune each function. Perhaps at a later date I might post a library of two dozen or so sorting functions.... after i clean and tune them all that is. ;-)

Regarding Stephen Weiss; while I do not know him personally, I can say based on his writings that he IS a great teacher. He is well spoken, and is able to apply his topics of teaching to practical coding (vs those that speak totally about theory)

I envy you for having such colleagues...

Best regards and well wishes,
Russell

0
aikimarkCommented:
Russell,

If you get to NC, please contact me through the RTP Delphi Interest Group and I'll be glad to introduce you to Steve and we'll take a tour through the Comp Sci labs (neat VR stuff).

Steve used Mr. Wizard (Don Herbert) as his teaching role model.  I used both Steve and Mr. Wizard (and others) as my teaching role models.

I use Delphi experts, such as you, to assimilate and better my Delphi programming skills.  I'm standing on the shoulders of giants, as it were.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Delphi

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.