Speed optimization of method

The following class is meant to return any combination of P items of a collection of N items. It needs to be fast. In paticular it needs to be as fast as (or faster then) the for-loop equivalent. Currently it is about 3 times as slow. Maybe with some assembly it is possible to get it faster. I'm stuck though, I can't get it any faster. You might need to change the times.

type
  TForRange = record
    Current: Integer;
    Min: Integer;
    Max: Integer;
  end;
  PForRange = ^TForRange;

  TSuperPermut = class
  private
    fLoops: array of TForRange;
    fLen: Integer;
    fLast: Cardinal;
    fFirst: Cardinal;
  public
    constructor Create(N, P: Integer);
    function Next: Boolean;
  end;

{ TSuperPermut }

constructor TSuperPermut.Create(N, P: Integer);
var
  liLoop: Integer;
begin
  fLen := P;
  SetLength(fLoops, fLen);
  for liLoop := fLen - 1 downto 0 do
  begin
    fLoops[liLoop].Max     := N - liLoop;
    fLoops[liLoop].Min     := N - liLoop;
    fLoops[liLoop].Current := fLoops[liLoop].Max;
  end;
  fLoops[fLen - 1].Min := 1;
  fFirst := Cardinal(@fLoops[0]);
  fLast  := Cardinal(@fLoops[fLen-1]);
end;

function TSuperPermut.Next: Boolean;
var
  liMin: Integer;
  lLoop: PForRange;
begin
  Result := True;
  lLoop  := Pointer(fFirst);
  Dec(lLoop.Current);
  while (lLoop.Current < lLoop.Min) do
  begin
    Inc(lLoop);
    if Cardinal(lLoop) > fLast then
    begin
      Result := False;
      Exit;
    end;
    Dec(lLoop.Current);
  end;
  while Cardinal(lLoop) > fFirst do
  begin
    liMin := Succ(lLoop.Current);
    Dec(lLoop);
    lLoop.Min := liMin;
    lLoop.Current := lLoop.Max;
  end;
end;

The test can be done using the following code:

procedure TForm1.Button6Click(Sender: TObject);
var
  liCount: Integer;
  time: TDateTime;
begin
  liCount := 0;
  time := Now;
  with TSuperPermut.Create(31, 10) do
  try
    repeat
      Inc(liCount);
    until not Next;
  finally
    Free;
  end;
  Memo1.Lines.Add(IntToStr(liCount) + ' - ' + FormatDateTime('hh:nn:ss:zzz', Now-time));
end;

The for-loop equivalent:

procedure TForm1.Button4Click(Sender: TObject);
var
  liCount: Integer;
  time: TDateTime;
  i01, i02, i03, i04, i05, i06, i07, i08, i09, i10: Integer;
begin
  time := Now;
  liCount := 0;
  for i01 := 22 downto 1 do
    for i02 := 23 downto i01+1 do
      for i03 := 24 downto i02+1 do
        for i04 := 25 downto i03+1 do
           for i05 := 26 downto i04+1 do
             for i06 := 27 downto i05+1 do
               for i07 := 28 downto i06+1 do
                 for i08 := 29 downto i07+1 do
                   for i09 := 30 downto i08+1 do
                     for i10 := 31 downto i09+1 do
                       Inc(liCount);
  time := Now - time;
  Memo1.Lines.Add(IntToStr(liCount) + ' - ' + FormatDateTime('hh:nn:ss:zzzz', time))
end;

Every run should give a line like this in the memo:
44352165 - 00:00:01:082

If the count is incorrect the algoritm is broken.

Regards Jacco
LVL 10
JaccoAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

JaccoAuthor Commented:
Sorry, you might need to change the types. (correction to last sentence of the Q)
0
ZhaawZSoftware DeveloperCommented:
Do you need to return combinations or count of combinations?
0
ZhaawZSoftware DeveloperCommented:
If you need a count of combinations, then there's much easier way of calculating it (and it takes 0.00 secs on my amd athlon xp 1.6+, because there are only some loops and simple mathematical operations).

  function combinations ( n {total}, k {used} : integer ) : int64;
  var
    loop : integer;
    tmp1, tmp2 : integer;
  begin
  tmp1 := k;
  tmp2 := n - k;
  if tmp1 < tmp2 then begin
    tmp1 := tmp1 xor tmp2;
    tmp2 := tmp1 xor tmp2;
    tmp1 := tmp1 xor tmp2;
  end;
  result := 1;
  for loop := tmp1 + 1 to n do result := result * loop;
  for loop := 2 to tmp2 do result := result div loop;
  end;


  function variations ( n {total}, k {used} : integer ) : int64;
  var
    loop : integer;
  begin
  result := 1;
  for loop := n - k + 1 to n do result := result * loop;
  end;


Use combinations() to get count of combinations, if order of items is not significant, and variations(), if order of items is significant.
There's combinations() used in your example.
0
Get your problem seen by more experts

Be seen. Boost your question’s priority for more expert views and faster solutions

aikimarkCommented:
ZhaawZ,

The count can be calculated from the formula.  This question arose from the need to enumerate the items.

http://www.experts-exchange.com/Programming/Programming_Languages/Delphi/Q_21354398.html
Formula:  Combinatorial Count = N! / (P! * ((N-P)!) )

I made the brash (and wrong) statement that there was a faster way.  I then opened a new question about bitcounting and fastest Combinatorial enumeration that wasn't hard coded (as was the case in the selected answer to the above question).
http://www.experts-exchange.com/Programming/Programming_Languages/Delphi/Q_21355411.html

===================
If you are a wiz at recursive algorithms and methods, I would be interested in seeing that approach to the enumeration problem in my discussion thread above.
0
ZhaawZSoftware DeveloperCommented:
aikimark, I know that formula. Used it in function (with some modifications to make it much easier for CPU).
0
JaccoAuthor Commented:
Hi ZhaawZ,

I need the combinations themselves although this is not in the code. I am still optimising for speed. After every call to Next fLoops[liLoop].Current has part of the combination.

Hi aikimark,

Recursive algorithm tend to be slower because they consume stack space with every level. A very common methjod to optimise recursive methods is to remove the recursion.

Regards Jacco

0
aikimarkCommented:
Jacco,

I think I now have a better understanding of your implementation.  You are using the Current as a replacement for a recursion parameter.  I'm not a Delphi wiz and didn't fully understand your latest code.
0
Russell LibbySoftware Engineer, Advisory Commented:

Jacco,

Its impossible to get the class code to match the speed of the inline code. Running the original code you gave resulted in:

188ms for the inline for loops.
560ms for calling the  Next method of the class instance.

Im sure you realize that there is a small hit that is taken when calling a method, but to measure it, I replaced the Next method code with this (FCount is private to the class):

function TSuperPermut.Next: Boolean;
begin

  Inc(FCount);
  result:=not(FCount = 44352165);
 
end;

The result?

288ms for calling the Next method, even though the method did very little. When you consider the method is called 44 million some odd times, its not too suprising. So while your class is flexible, it gains this flexibility at the price of speed.

Regards,
Russell


0
JaccoAuthor Commented:
Thanks rlibby,

I am still hoping for some speed gain. I would settly for 376 ms on your machine (twice that of for loops). You don't see any chance of improving the speed? Can't I unroll some loop or use ar different type somewhere?

Regards Jacco

aikimark: you're right. After every next the Current values of the ForRange array has the values of the indices of the combination. They could be returned very fast.
0
Russell LibbySoftware Engineer, Advisory Commented:
Jacco,

I can help you somewhat, but you also have to be realistic with this. ;-)

Using your original code, the timing was between 560-580ms. (10 run avg of 577ms / run). Using the code given below, the timing dropped to between 500-515ms (10 run avg of 510ms / run). A small speed gain of ~67 ms, but a gain none the less.

To put it all into perspective though, I added a stubbed method called Test that returned a boolean, and then called this 44352165 times from the button click code. The time for this was between 170-180ms. This is the MINIMUM overhead that you must be willing to accept, in order to have a flexible class (vs hard coded FOR statements) that can have its indices checked at any point. When you take that into consideration:

510ms-175ms = 335ms (time to execute the algorithm)

The ACTUAL algorithm code execution time is only 1.8x slower than the inline coded function (based on 188ms time).

Regards,
Russell

type
  PForRange      =  ^TForRange;
  TForRange      =  packed record
     Current:    Integer;
     Min:        Integer;
     Max:        Integer;
  end;
  PForRangeArray =  ^TForRangeArray;
  TForRangeArray =  Array [0..Pred(MaxInt shr 4)] of TForRange;
  TSuperPermut   =  class
  private
     // Private declarations
     FLoops:     PForRangeArray;
     FCount:     Integer;
  public
     // Public declarations
     constructor Create(N, P: Integer);
     destructor  Destroy; override;
     function    Next: Boolean;
     function    Test: Boolean;
  end;

constructor TSuperPermut.Create(N, P: Integer);
var  dwLoop:        Integer;
     dwValue:       Integer;
begin

  // Inherited
  inherited Create;

  // Save item count
  FCount:=P;

  // Allocate memory for loop array
  FLoops:=AllocMem(SizeOf(TForRange) * P);

  // Set array data
  for dwLoop:=Pred(P) downto 0 do
  begin
     dwValue:=N-dwLoop;
     FLoops^[dwLoop].Max:=dwValue;
     FLoops^[dwLoop].Min:=dwValue;
     FLoops^[dwLoop].Current:=dwValue;
  end;

  // Set min value for first item
  FLoops^[Pred(P)].Min:=1;

end;

destructor TSuperPermut.Destroy;
begin

  // Free loop array memory
  FreeMem(FLoops);

  // Inherited
  inherited Destroy;

end;

function TSuperPermut.Next: Boolean;
var  lpLoop:        PForRange;
     dwValue:       Integer;
     dwLoop:        Integer;
begin

  // Get pointer to first item in the array
  lpLoop:=@FLoops[0];

  // Set current loop index and count
  dwLoop:=0;
  dwValue:=FCount;

  // Walk the array
  while (dwLoop < dwValue) do
  begin
     // Compare current and min
     if (lpLoop^.Current > lpLoop^.Min) then break;
     // Next item
     Inc(lpLoop);
     Inc(dwLoop);
  end;

  // Check for finish
  if (dwLoop < dwValue) then
  begin
     // Decrement the current value for the item
     Dec(lpLoop^.Current);
     // Walk the array back
     while (dwLoop > 0) do
     begin
        // Get current value
        dwValue:=Succ(lpLoop^.Current);
        // Previous item
        Dec(lpLoop);
        Dec(dwLoop);
        // Update the min value
        lpLoop^.Min:=dwValue;
        // Update the current value
        lpLoop^.Current:=lpLoop^.Max;
     end;
     // Success
     result:=True;
  end
  else
     // Finished
     result:=False;

end;

function TSuperPermut.Test: Boolean;
begin

  // Just return true
  result:=True;

end;


0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
aikimarkCommented:
Russell & Jacco,

As always, I'm humbled in the presence of truely advanced Delphi coding expertise.  You are welcome in my (preceding) question as well (Q_21355411).
0
aikimarkCommented:
Russell,

What if we used the TBit class?
0
JaccoAuthor Commented:
Hi rlibby,

Thanks for trying and making clear there is almost no gain possible. I tried code along the way of your solution too but thought the improvements to be so small. I didn't think the call to Next would be that time consuming. I will add methods to inspect the indices now and consider the project finished :-)

Thanks again!

Regards Jacco
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Delphi

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.