Speeding up Delphi floating point variable operations

drtester
I'm a Delphi beginner working on a program that uses floating point variables.  However, when timing part of my code, I'm seeing a simple statement like:

 x := x + 1

taking literally thousands of CPU cycles to execute.  Looking at the compiled code, I see this line turned into:

 fld tbyte ptr [ebp-$20]
 fadd dword ptr [$0045fa98]  //I'm guessing this points to a "1" in FP,
                             //as the data is 3f 80 00 00
 fstp tbyte ptr [ebp-$20]
 wait

Why must I 'wait' for this to execute?  Shouldn't floating point on modern processors be quick?  Is there some config option I need to turn on in Delphi 7 to fix this?

You're waiting for the FPU, more precisely:

"The fwait instruction pauses the system until any currently executing FPU instruction completes. This is required because the FPU on the 80486sx and earlier CPU/FPU combinations can execute instructions in parallel with the CPU. Therefore, any FPU instruction which reads or writes memory could suffer from a data hazard if the main CPU accesses that same memory location before the FPU reads or writes that location. The fwait instruction lets you synchronize the operation of the FPU by waiting until the completion of the current FPU instruction. This resolves the data hazard by, effectively, inserting an explicit "stall" into the execution stream."

Author

Commented:
Ok, but my application will always be running on a P4 or higher, where I assume this wait is no longer necessary.  Is this not true?  

But even if it were waiting, why would it wait for hundreds or thousands of cycles?

Can this somehow be disabled?  

Is there a way within the debugger to change the wait into a noop, at least for testing?
Top Expert 2010
Commented:
You can insert assembler instructions into your Pascal code like this:

procedure TForm1.Button5Click(Sender: TObject);
var
  x, v: Double;
begin
  x:= 1;
  v:= 1;
  asm
    fld  x;
    fadd v;
    fstp x;
  end;
  ShowMessage( FloatToStr(x) );
end;

Unfortunately I can only speculate, but since Delphi 2010 generates the same compiled code, I assume the wait is necessary.
Top Expert 2010

Commented:
Comparison for 100 million iterations (GetTickCount milliseconds): 1st = 1250, 2nd = 1295.
The inline asm version does not seem to be any faster.

procedure TForm1.Button5Click(Sender: TObject);
var
  x, v: Double;
  i: Integer;
  Ticks1, Ticks2: Cardinal;
begin
  x:= 1;
  v:= 1;

  Ticks1:= GetTickCount;
  for i:= 0 to 100000000 do
    x:= x + v;
  Ticks1:= GetTickCount - Ticks1;

  x:= 1;
  Ticks2:= GetTickCount;
  for i:= 0 to 100000000 do
    asm
      fld  x;
      fadd v;
      fstp x;
    end;
  Ticks2:= GetTickCount - Ticks2;

  ShowMessage( Format('1st time %d, 2nd time %d', [Ticks1, Ticks2] ) );
end;
Geert G, Oracle dba
Top Expert 2009

Commented:
Using downto instead of counting up does make a difference:

---------------------------
Project3 -- counting down instead of up
---------------------------
1st time 328, 2nd time 313
---------------------------
OK  
---------------------------
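For reference, a counting-down version of the Pascal loop might look like the sketch below. It is not the exact code behind the numbers above; the console program layout and the helper name SumCountingDown are assumptions made for illustration, written to compile on Delphi 7 as well.

```pascal
program DownToSketch;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Hypothetical helper (not from the thread): adds v to x once per
// iteration, with the loop counter running downwards.
function SumCountingDown(Iterations: Integer): Double;
var
  x, v: Double;
  i: Integer;
begin
  x := 1;
  v := 1;
  for i := Iterations downto 1 do
    x := x + v;
  Result := x;
end;

var
  Ticks: Cardinal;
  Total: Double;
begin
  Ticks := GetTickCount;
  Total := SumCountingDown(100000000);
  Ticks := GetTickCount - Ticks;
  // Print the result as well, so the work being timed is actually used.
  Writeln(Format('result %g, time %d ms', [Total, Ticks]));
end.
```

Declaring x and v as Double keeps the test comparable with the earlier snippets in this thread.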

In Delphi 2010, I end up with 329 for this asm:

  Ticks1:= GetTickCount;
  asm
    mov eax,$05F5E101;
    @startloop:
    fld  x;
    fadd v;
    fstp x;
    dec eax
    jz @next;
    jmp @startloop;
    @next:
  end;
  Ticks1:= GetTickCount - Ticks1;

When debugging, you can see the generated asm code: right-click, Debug, CPU view.

Unless we start using the QueryPerformance functions, differences of this size between GetTickCount measurements are normal.
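A minimal sketch of timing the same loop with QueryPerformanceCounter and QueryPerformanceFrequency from the Windows unit could look like this. The program layout and the helper name CounterToMs are assumptions made for illustration; treat it as a starting point, not a definitive benchmark.

```pascal
program QpcSketch;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Hypothetical helper (not from the thread): converts a difference
// between two counter readings to milliseconds.
function CounterToMs(StartCount, StopCount, Frequency: Int64): Double;
begin
  Result := (StopCount - StartCount) * 1000.0 / Frequency;
end;

var
  Freq, T0, T1: Int64;
  x, v: Double;
  i: Integer;
begin
  // Frequency is in counts per second; the call fails if no
  // high-resolution counter is available.
  if not QueryPerformanceFrequency(Freq) then
  begin
    Writeln('High-resolution counter not available');
    Exit;
  end;

  x := 1;
  v := 1;

  QueryPerformanceCounter(T0);
  for i := 0 to 100000000 do
    x := x + v;
  QueryPerformanceCounter(T1);

  Writeln(Format('x = %g, elapsed %.3f ms', [x, CounterToMs(T0, T1, Freq)]));
end.
```

Running each variant several times and comparing the spread gives a better idea of whether a difference of a few milliseconds is real.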

Author

Commented:
Looks like I'll have to code everything in asm.  Ok, thanks for the tip!
