Solved

Is % faster than /

Posted on 2004-04-22
473 Views
Last Modified: 2010-04-15
I just wanted to know:

Is the mod operation on integers faster than division on integers?
And the reasons for either answer.

Just out of curiosity.
0
Question by:ankuratvb
• 7
• 6
• 5
• +4
26 Comments

LVL 4

Expert Comment

ID: 10887435
ankuratvb,

No, not on IA32.  They both use idiv.  The modulo goes to edx and quotient goes to eax.

b.

0

LVL 2

Expert Comment

ID: 10887495
Just curious......
Even a SAR will do division by shifting the bits right......correct ?

When is an IDIV used and when is a SAR used ?

Cheers
0

LVL 4

Accepted Solution

booki earned 30 total points
ID: 10887496
ankuratvb,

That is.. they are the same in terms of speed.  For unsigned integers div is used for both operations.

b.
0

LVL 9

Author Comment

ID: 10887545
Thats what i was also thinking.For mod,you have to do division.

Would it also be the same for floating point numbers as well.
Other than the overhead of the function call for fmod(),would they both execute at the same speed.
0

LVL 4

Expert Comment

ID: 10887547
anupvijay,

>> Even a SAR will do division by shifting the bits right......correct ?

Yes but only for divisors that are powers of two.

>> When is an IDIV used and when is a SAR used ?

SAR/SHR is used when division is by a divisor that is a power of two and it is an immediate value (constant) or for the '>>' operator.  SAR is used for signed integers.  SHR is used for unsigned integers.

b.
0

LVL 4

Expert Comment

ID: 10887696
ankuratvb,

>> Would it also be the same for floating point numbers as well.

No, fprem determines a 'partial remainder' through iterative subtraction and may require multiple execution. While division is done with a single instruction, fdiv.  In other words, division will likely be faster.

b.
0

LVL 46

Expert Comment

ID: 10887714
Mod always with IDIV?? I thought that a%32 is equivalent to a&31...
0

LVL 22

Expert Comment

ID: 10887731

Be careful, doing mod or div with masks and shifts works a bit differently for negative numbers!

0

LVL 9

Author Comment

ID: 10887734
Thanx.I have one clarification.

I couldnt get this part:

>*fprem* determines a 'partial remainder' through iterative subtraction and may require >multiple execution.

0

LVL 4

Expert Comment

ID: 10887796
sjef_bosman,

Yes, you are correct.  With immediate values that are powers of 2, the compiler can optimize and use 'bitwise and'.  Similarly, for arithmetic (integer) operations '*' and '/' that involve immediate values that are powers of two, the compiler can optimize.  However, this is not possible for the general case.

b.
0

LVL 12

Assisted Solution

stefan73 earned 30 total points
ID: 10887994
Hi sjef_bosman,
> Mod always with IDIV?? I thought that a%32 is equivalent to a&31...

That's an optimizer issue. Sure it replaces those by & (%) and >> (/).

Modulo is basically the same as

#define MOD(a,b) (a-(b*(a/b)))

...so division is faster.

Cheers,
Stefan
0

LVL 4

Expert Comment

ID: 10888023
ankuratvb,

// fprem, is implemented something like the following:

bool fprem(float *a, float b) {
int d = a.exponent - b.exponent; // pseudo-code
if (d < 64) {
*a -=  b*(int)(a/b);
return 0;
} else {
*a -= b*(int)(a/b); // not exact.. see intel manual for exact formula
return 1;
}
}

// so fmod may need to execute fprem more than once.

float fmod (float a, float b) {
float r;
r = a;
while (fprem(&r,b));
return r;
}

main () {
float a,b,rem;
a = VERYBIG;
b = VERYSMALL;
rem = fmod(a,b);
}

b.
0

LVL 46

Expert Comment

ID: 10888476
stefan73,

Division faster than AND? What can be faster than an AND in this case? Microcode division stuff is usually a repeated subtraction, the MOD comes for free.

Btw, I only reacted on the statement that "For mod, you have to do division." Just meant to show that if DIV can be done by shifting, MOD can be done using AND.

Sjef
0

LVL 2

Expert Comment

ID: 10888675
Assuming that divisor is in powers of 2

Is IDIV faster or SAR ?
Or is AND faster, if only MOD is required ?

I am not an expert in low level programming...so if that looks stupid please ignore :-))
May be I am talking of CPU cycles here.....but I do not know :-)

Cheers.
0

LVL 9

Author Comment

ID: 10888978
>For mod, you have to do division

What i meant was to calculate the mod value,you'd have to perform the *division* operation(not necessarily IDIV),whether it does so by IDIV or SAR is irrevelant.
0

LVL 4

Expert Comment

ID: 10892717
Run this test program. Div is always faster than mod on my PC. When in doubt, I always use a timed benchmark.

------------------------------------------------------------------------------------------------------------------------
#include <iostream>
#include <time.h>
using namespace std;

int repetitions=10000;

clock_t start_time;                        // time at test start
clock_t time_elapsed;                  // time at test end
int c;

void main()
{
start_time=clock();                  // test mod time

for(int i=1; i<repetitions; i++)
{
for(int j=1; j<repetitions; j++)
{
c = i%j;
}
}
time_elapsed = clock()-start_time;

cout << "total time for mod: " << time_elapsed << endl;

start_time=clock();                  // test div time

for(i=1; i<repetitions; i++)
{
for(int j=1; j<repetitions; j++)
{
c = i/j;
}
}
time_elapsed = clock()-start_time;

cout << "total time for div: " << time_elapsed << endl;

}
---------------------------------------------------------------------------------------------
0

LVL 4

Expert Comment

ID: 10892767
oops you dont need the "using namespace" heh. Anyway, no amount of reading theory will help you with really specific performance questions like this (at least IMHO). Make test programs =)
0

LVL 4

Expert Comment

ID: 10894825
anupvijay,

Assuming IA32...

>> Is IDIV faster or SAR ?

..SAR is much faster.

>> Or is AND faster, if only MOD is required ?

..there is no MOD instruction.  Generally IDIV/DIV is used for integer modulo and 'AND' is much faster.

>> May be I am talking of CPU cycles here.....

Yes, clock cycles.

Fippy_Darkpaw,

>> ..no amount of reading theory will help you with really specific performance questions like this

Though benchmarking does have its place in comparative perfomance, I'm afraid you're sorely underestimating the value of reading.

ankuratvb,

Adding to my previous post on fprem, I understand that fprem rarely requires multiple execution so the main reason floating point division is faster than a floating point modulo operation is that the modulo requires a division, multiplication and a subtraction as pointed out earlier by stefan73.

b.
0

LVL 9

Author Comment

ID: 10895853
Fippy_Darkpaw,

I had tried timing programs on this but i got variable results that couldnt conclusively say which is faster.

Try running your program multiple times and see the results:

Here is what i got:

reps=10000
total time for mod: 495
total time for div: 504
total time for mod: 495
total time for div: 492
total time for mod: 497
total time for div: 499

reps=100
total time for mod: 5
total time for div: 5
total time for mod: 5
total time for div: 5

reps=5000
total time for mod: 128
total time for div: 123
total time for mod: 124
total time for div: 123
total time for mod: 125
total time for div: 124
total time for mod: 123
total time for div: 123

reps=10000
total time for mod: 497
total time for div: 495
total time for mod: 497
total time for div: 494
total time for mod: 493
total time for div: 493

0

LVL 9

Author Comment

ID: 10895886
>reps=100
should be
reps=1000
0

LVL 46

Expert Comment

ID: 10897411
The only way to find out what's happening is to disassemble the executable, or to generate assembly. Probably, most likely, in both cases IDIV will be used, and the total number of instructions executed won't be different. If there is a difference, it can only be that additional tests are required for negative numbers or so. In this case, differences are only due to the OS, handling interrupts etc. when the loop is busy. If run under DOS, the results should be more reliable.

And don't use an optimizing compiler, a good one might remove these loops completely since the results are discarded immediately so the loops can be considered useless.

By the way, do you think that the next line is a valid C statement?
12345;
0

LVL 9

Author Comment

ID: 10907825
>12345;
Yes.

The compiler treats the 12345 as a constant value.

All of these will work:
'a';
123.4;
"abcdef";

The compiler would create temp. variables for these constant values and on replacing would look something like:

Taking 12345; as an example:
the compiler 'd create a temp variable say temp_t.

So,the statement would become:

temp_t;

which is a perfectly valid C statement.
0

LVL 46

Expert Comment

ID: 10908182
Don't confuse the language C and a compiler for the language C. Indeed
12345;
is a valid C statement, for it is an expression followed by a semicolon, or, in BNF, it is accepted according to the rule
<expression> ';'
just like in fact a function call like
func();
Since all functions return a result, that may be discarded, this line is also an expression followed by a semicolon, hence a valid statement.

At least, in the original C (K&R) language. The current C language, with a lot more typing rules in it, might refuse the statement or at least warn about it. Another ideosyncrasy: I suppose nobody uses the comma as statement separator? Yeah, only in for-statements probably, but the following lines are also valid in C:
12345,6789;
12345+6789;

:)
0

LVL 4

Expert Comment

ID: 11038184
--- Fippy_Darkpaw,
---
---- I had tried timing programs on this but i got variable
---- results that couldnt conclusively say  which is faster.

Wow what comp did you run that on? I ran it on my P4 1.8 laptop with 512 RAM. Division is always substantially faster (like 20% or more). DIv vs. mod must be architecture dependent as well.

0

LVL 4

Expert Comment

ID: 11038192
Oh yeah. And what compiler did you use? I used VC++ 6.0. Perhaps a different compiler optimized differently?
0

LVL 46

Expert Comment

ID: 11038444
I changed the code above to fit a standard C compiler. I ran it on Linux, but the results were unpredictable because of the other processes running.

Nevertheless, did you try to assemble the program only? I.e. cc -S bookmark.c or comparable? If you look at the code generated for the div or the mod, it looks like this (Linux):

For the div:
.L7:
movl      -8(%ebp), %eax
cmpl      repetitions, %eax
jle      .L10
leal      -4(%ebp), %eax
incl      (%eax)
jmp      .L3
.p2align 2
.L10:
movl      -4(%ebp), %edx
leal      -8(%ebp), %eax
movl      %eax, -12(%ebp)
movl      %edx, %eax
movl      -12(%ebp), %ecx
cltd
idivl      (%ecx)
movl      %edx, c
leal      -8(%ebp), %eax
incl      (%eax)
jmp      .L7

And for the mod:
.L15:
movl      -8(%ebp), %eax
cmpl      repetitions, %eax
jle      .L18
leal      -4(%ebp), %eax
incl      (%eax)
jmp      .L11
.p2align 2
.L18:
movl      -4(%ebp), %edx
leal      -8(%ebp), %eax
movl      %eax, -12(%ebp)
movl      %edx, %eax
movl      -12(%ebp), %ecx
cltd
idivl      (%ecx)
movl      %eax, c
leal      -8(%ebp), %eax
incl      (%eax)
jmp      .L15

What's the difference? Nothing! Except for two move instructions, but they are absolutely equivalent. Please assemble your code only, and study the differences. I'm not convinced anyway ;)

Just a suggestion: put another outer loop over both loops, so you can eliminate program startup-stuff (if any).
0

Join & Write a Comment Already a member? Login.

Suggested Solutions

An Outlet in Cocoa is a persistent reference to a GUI control; it connects a property (a variable) to a control.  For example, it is common to create an Outlet for the text field GUI control and change the text that appears in this field via that Ouâ€¦
Preface I don't like visual development tools that are supposed to write a program for me. Even if it is Xcode and I can use Interface Builder. Yes, it is a perfect tool and has helped me a lot, mainly, in the beginning, when my programs were smallâ€¦
The goal of this video is to provide viewers with basic examples to understand opening and writing to files in the C programming language.
The goal of this video is to provide viewers with basic examples to understand how to use strings and some functions related to them in the C programming language.

746 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!