I am trying to test different algorithms for various math functions and I need to time them so I can use the fastest ones. I can't find what I need to do this; with what I have, I just get 0. In this example I am using only a square root function. I know it to be accurate, but I want to see if it is faster than the math library function sqrt().
You might consider using a profiler. Here is a link to a PAQ
http://www.experts-exchang
>> rtime = (end - start/ CLOCKS_PER_SEC)/1000;
If you change this line to :
rtime = (((double) (end - start)) / CLOCKS_PER_SEC) * 1000.0;
you'll get the elapsed time in milliseconds. However, it won't be with millisecond precision, since the system clock usually doesn't have that kind of precision (it's often in the order of 20 ms).
What is often done is to run the same code 1000 times, time the whole run with a timer that has at least one-second precision, and divide the result by 1000.
It could be that your function is really fast. In that case, repeat it, say, 100000 times. To correct the timing, implement the same function with a simpler body (basically empty, but make sure the call is not optimized out) and run the same test the same number of times. Then you can compare the results.
The above snippet tailored to your code is shown below (the repetition not done here).
For sqrt from <math.h> it gives me on my computer
|The square root of 100000 is 316.228
|
|0.000209524 ms (3 cnt)
Notice that you should not trust the precision of 0.000209524 here. The raw count of 3 ticks is a much more honest picture of what was measured. In other words, the function should be tested in a loop to get a more accurate comparison.
When testing your function against the existing function, you want the relation between the two, not necessarily the exact time. If you only want to determine "slower/faster", then looped testing is easier (no need to correct for the code that implements the loop).
>>>> If you change this line to :
>>>> rtime = (((double) (end - start)) / CLOCKS_PER_SEC) * 1000.0;
On 32 bit Windows it simply is
int diff = (end - start);
and gives milliseconds.
The accuracy of clock() is between 15 and 16 milliseconds.
>>>> What is often done, is to run the same code 1000 times,
Normally you would need a million iterations or more on modern hardware to get run times greater than 15 milliseconds.
There is a further issue: the compiler may optimize such a loop away if the results do not change inside the loop or if the result calculated in the loop is not used afterwards. So you might need to sum up the square roots within the loop and print the sum afterwards to avoid being trapped by the optimizer.
>> why not do it properly, and make the code correct for any compiler/platform ?
Well, it is possible to make the code nicely written and portable. On the other hand, intrigue63 wants to test a function that takes much less time than the clock() resolution. If one is making an application for Windows... The pragmatic approach is "if it can be done better, do it better". The QueryPerformanceCounter() has a much finer resolution. The resolution can be obtained using QueryPerformanceFrequency(
The call of the QueryPerformanceCounter() takes about 11 ticks -- hence the correction for the call inside the ElapsedTimeAsString().
Infinity08: Yes, I know. You have cited him. ;) I usually try to write towards the asker, not against some other reasoning or comments. I only wanted to emphasize that it could be reasonable to use the platform-specific solution. And I wanted to show that using performance counters on Windows can be useful in this case, and I wanted to illustrate why it is useful.
>>>> That would only make sense of CLOCKS_PER_SEC has the value 1000. Is that the case ?
I would assume so cause I never did it differently in the last 15 years ;-)
>>>> why not do it properly, and make the code correct for any compiler/platform ?
It is proper on Windows, while the first approach of intrigue63 was wrong. IMO, for testing purposes you should make it stupid simple and don't spoil the tests with portability considerations that are currently off-topic.
>> IMO, for testing purposes you should make it stupid simple and don't spoil the tests with portability considerations that are currently off-topic.
It's not about portability. It's about correctness. It's not because your current platform uses the value 1000, that the next version will use that value, nor does it mean that intrigue63's platform uses that value.
This value (1000) is documented nowhere (at least I couldn't find it), and seems to be an assumption you made, probably based on an observation of the behavior for your platform(s).
If the code will only ever run on Microsoft platforms, there still is no guarantee that CLOCKS_PER_SEC equals 1000, now and/or in the future.
Here's the MSDN reference for the clock function :
http://msdn.microsoft.com/
CLOCKS_PER_SEC is explicitly mentioned, and explicitly used in the example code (as it should). But there is no mention of CLOCKS_PER_SEC being equal to 1000.
Imo, giving a correct solution (that is guaranteed to always work) is better than giving a solution that might work right now on a specific platform, but might fail tomorrow.
After all, learning it the right way is easier (and more efficient) than having to un-learn bad habits.
itsmeandnobodyelse: >> Is it available with standard installation of VC Pro?
The functions are in the Windows kernel (kernel32.dll) and are available since Windows 95 and Windows NT 3.1. Here is the documentation: http://msdn.microsoft.com/
>> Does the timer itself make any impact on the measured time?
The documentation for the QueryPerformanceFrequency(
But definitely, getting the counter value takes some time. Because of that I did the simple correction for the QueryPerformanceCounter() function call inside my implementation of the ElapsedTimeAsString() -- see http:#23865957 lines 46-52 and 57. The truth is that it is rather sloppy implementation because it does not use the higher part of the LARGE_INTEGER. This way it occasionally can produce very surprising (and very wrong) result -- when the first QueryPerformanceCounter() was called before overflow of the highest bit from lower part and the second QueryPerformanceCounter() after incrementing the higher part.
Infinity08: I may be wrong, but I guess that the 1000 comes from converting seconds to milliseconds. If CLOCKS_PER_SEC means what it says, then clock_ticks / CLOCKS_PER_SEC is the number of seconds. Multiplied by 1000, it is the number of milliseconds (regardless of the value of CLOCKS_PER_SEC). Basically, it is exactly the same in principle as using the high resolution timer values and the frequency obtained via QueryPerformanceFrequency(
Of course, pepr, but this is still about Alex's suggestion that :
int diff = (end - start);
gives a value in milliseconds. If CLOCKS_PER_SEC equals 1000, then it does. If CLOCKS_PER_SEC has a different value, it doesn't. My point is that you shouldn't assume that CLOCKS_PER_SEC equals 1000, since there's no guarantee for that. Not even in the Microsoft documentation.
>>>> This value (1000) is documented nowhere (at least I couldn't find it), and seems to be an assumption you made,
Sorry, the statement I posted did work over ten or even fifteen years. I didn't make any assumption cause I used the clock function before knowing of the CLOCKS_PER_SEC macro. If it ever wouldn't work on a platform I easily would recognize that the result is wrong if for example the result would be suddenly in microseconds. It would be strange but surely I would be able to correct it. Hence, correctness never was in danger ;-)
There are more dangerous assumptions than assuming clock will return milliseconds on the Windows platform. For example, any bit shift operation on an unsigned integer will probably assume a 32-bit integer. Also, I don't know any non-trivial program which wouldn't have problems with a 16-bit char type. What about header files, where each compiler may have its own philosophy and where there are only a few cross-platform standards? Shouldn't we focus on the real problems instead of stubbornly searching for problems where there are none?
Alex, there's no point in arguing over this, since you have your idea, and I have mine.
Let me just state that using CLOCKS_PER_SEC to interpret the value returned by clock() is the only correct way to do things. Anything else relies on assumptions, and is thus not reliable.
Since this question can be read by many people with different platforms/environments, now and for years to come, I maintain that suggesting to ignore CLOCKS_PER_SEC, and assume that it has the value 1000 is dangerous.
Furthermore, it might be true that Microsoft platforms have CLOCKS_PER_SEC set to 1000, but that is documented nowhere, and could change with any next version of their compiler and/or C runtime. So, even on Windows, it's not recommended to make assumptions about its value.
That's it. Unless intrigue63 wants to clarify his choice of the accepted reply, I'm out of here ;)
>>>> I maintain that suggesting to ignore CLOCKS_PER_SEC, and assume that it has the value 1000 is dangerous.
I maintain that using a function like clock does not necessarily imply giving any thought to CLOCKS_PER_SEC or considering methods to normalise the return value. If your program requires such a normalisation, things are different. But I couldn't see such a requirement for a simple test program. Hence, it is just 'doing things where no problem exists', which is indeed dangerous (as the sample proves).
Ok this is what works on Linux:
#include <iostream>
#include <time.h>
#include <math.h>
#include "mymath.h"
using namespace std;
int main() {
clock_t start, end;
double rtime = 0;
double num;
double Number = 100000;
//c-lib sqrt(num)
start = time(0);
num = sqrt(Number);
end = time(0);
rtime = (end - start / CLOCKS_PER_SEC);
//printout
cout << "sqRoot(" << Number << ") is " << num << "\n\n";
cout<< rtime <<'\n';
rtime = 0;
//my sqrt1(n)
start = time(0);
num = mymath::sqrt1(Number);
end = time(0);
rtime = (end - start / CLOCKS_PER_SEC);
//printout
cout << "sqRoot1(" << Number << ") is " << num << "\n\n";
cout<< rtime <<'\n';
output:
charybdis[12]% ./run
sqRoot(100000) is 316.228
1.23731e+09
sqRoot1(100000) is 316.228
>> Ok this is what works on Linux:
It's not really correct though :)
For example, time(0) returns a time_t, not a clock_t. It's clock() that returns a clock_t.
In this line :
>> rtime = (end - start / CLOCKS_PER_SEC);
division has higher priority than subtraction, so what you do is this :
rtime = (end - (start / CLOCKS_PER_SEC));
which is not really what you wanted.
So, it seems that you don't have your answer yet, even though you've already closed the question. You can request to get the question re-opened (click the "Request Attention" link, and explain that you want to re-open the question), so we can help you further, and ultimately, you can accept those posts that answered your question.
Allow me to refer back to my earlier post http:#23865554 where I showed the correct use of clock() to get an elapsed time in milliseconds. Just replace that line in your original code (the code you posted in your question), and see what result you get.
ok with
rtime = (end - start / CLOCKS_PER_SEC);
output:
charybdis[21]% make
g++ -c -g -Wall main.cpp
g++ -o run main.o mymath.o
charybdis[22]% ./run
sqRoot(100000) is 316.228
1.23731e+09
sqRoot1(100000) is 316.228
1.23731e+09
charybdis[23]%
output with line: rtime = (end - (start / CLOCKS_PER_SEC));
charybdis[24]% ./run
sqRoot(100000) is 316.228
1.23731e+09
sqRoot1(100000) is 316.228
1.23731e+09
charybdis[25]%
output with line rtime = (((double) (end - start)) / CLOCKS_PER_SEC) * 1000.0;
sqRoot(100000) is 316.228
0
sqRoot1(100000) is 316.228
1.23731e+09
charybdis[27]%
hmmmm
btw thanks for checking this out
>> ok with
>> rtime = (end - start / CLOCKS_PER_SEC);
Is incorrect as I said earlier.
>> output with line: rtime = (end - (start / CLOCKS_PER_SEC));
Is exactly the same, so it's also incorrect.
>> output with line rtime = (((double) (end - start)) / CLOCKS_PER_SEC) * 1000.0;
Is only correct if end and start are values returned by clock(). Are they ?
Did you use your original code as I suggested ?
Note also that if you only measure one execution of the sqrt function, it might be over too fast to be measurable. Try putting the sqrt call in a loop that executes 1000 or 1000000 times, and measure how long it takes to execute the entire loop.
Thanks Infinity08, I overlooked something simple: I forgot to change the time(0) to clock(). I have done that now, put everything in a for loop as you suggested, and changed my number to random numbers. A colleague of mine said that the compiler cached the results in optimization and that was another issue.
so this is what I did:
#include <iostream>
#include <cstdlib>   // rand, srand, RAND_MAX
#include <ctime>     // clock, clock_t, CLOCKS_PER_SEC, time
#include <cmath>     // sqrt
#include "mymath.h"  // mymath::sqrt1
using namespace std;

int main() {
    clock_t start, end;
    double rtime = 0;
    double num;
    double Number;
    int i;
    srand(time(NULL));

    // c-lib sqrt(num)
    start = clock();
    for (i = 0; i < 1000000; i++) {
        Number = (rand() / (RAND_MAX + 1.0)) * 102000000.0;
        num = sqrt(Number);
    }
    end = clock();
    // elapsed processor time in milliseconds
    rtime = ((double)(end - start) / CLOCKS_PER_SEC) * 1000.0;
    // printout
    cout << "sqRoot(random number) ran 1000000 times " << '\n';
    cout << "ran in " << rtime << " units of measurement." << '\n';
    rtime = 0;

    // my sqrt1(num)
    start = clock();
    for (i = 0; i < 1000000; i++) {
        Number = (rand() / (RAND_MAX + 1.0)) * 102000000.0;
        num = mymath::sqrt1(Number);
    }
    end = clock();
    rtime = ((double)(end - start) / CLOCKS_PER_SEC) * 1000.0;
    // printout
    cout << "sqRoot(random number) ran 1000000 times " << '\n';
    cout << rtime << '\n';

    // my sqrt2(num)
    // note: the call below is still mymath::sqrt1, so this repeats the previous test
    start = clock();
    for (i = 0; i < 1000000; i++) {
        Number = (rand() / (RAND_MAX + 1.0)) * 102000000.0;
        num = mymath::sqrt1(Number);
    }
    end = clock();
    rtime = ((double)(end - start) / CLOCKS_PER_SEC) * 1000.0;
    // printout
    cout << "sqRoot(random number) ran 1000000 times " << '\n';
    cout << rtime << '\n';
    rtime = 0;
    return 0;
}
output for three different square root functions:
sqRoot(random number) ran 1000000 times
ran in 10 units of measurement.
sqRoot(random number) ran 1000000 times
240
sqRoot(random number) ran 1000000 times
230
>> sqRoot(random number) ran 1000000 times
>> ran in 10 units of measurement.
>> sqRoot(random number) ran 1000000 times
>> 240
>> sqRoot(random number) ran 1000000 times
>> 230
So, they both finished in around 230-240 ms, while the standard sqrt function finished in around 10 ms (quite a bit faster it seems).
>> a colleague of mine said that the compiler cached the results in optimization and that was another issue.
Yes, that's possible. To avoid that, try compiling the code without optimization (change the compiler flags for that).
by: rafael_acc, posted on 2009-03-11 at 02:35:22, ID: 23855091
The problem is that clock() returns a clock_t type, which I believe is a long int.
Try this instead:
time_t start, end;
double dif;
time(&start); // here you get the time (before your code starts)
/* YOUR CODE HERE */
time(&end); // here you get the time (after your code ends)
dif = difftime(end, start); // elapsed time in seconds
I think this code will work.