I wrote two programs that sum the series (1/2)^1 + (1/2)^2 + ... + (1/2)^100,
one in left-to-right order and one in right-to-left order. Are the results the same or different?
Explain why.
My programs:
First Program
#include <iostream>

int main() {
    double Sum = 0;
    long double Denominator = 1.2676506E30;   // intended to be 2^100
    double Numerator = 1;
    int ValuesProcessed = 0;
    double Fraction;
    while (ValuesProcessed < 100)
    {
        Fraction = Numerator / Denominator;   // smallest term first: 1/2^100, 1/2^99, ...
        Sum += Fraction;
        Denominator /= 2;
        ++ValuesProcessed;
    }
    std::cout << "Sum is: " << Sum << std::endl;
    std::cout << "Values is: " << ValuesProcessed << std::endl;
    return 0;
}
Second program
#include <iostream>

int main() {
    double Sum = 0;
    long double Denominator = 1;
    double Numerator = 1;
    int ValuesProcessed = 0;
    double Fraction;
    while (ValuesProcessed < 100)
    {
        Fraction = Numerator / Denominator;   // largest term first: 1, 1/2, 1/4, ...
        Sum += Fraction;
        Denominator *= 2;
        ++ValuesProcessed;
    }
    std::cout << "Sum is: " << Sum << std::endl;
    std::cout << "Values is: " << ValuesProcessed << std::endl;
    return 0;
}
I assume this is an academic question. We cannot provide answers to academic questions; that would be unethical. However, we can give you some limited assistance, like you might get from your teacher.
I can tell you this much: according to the rules of mathematics there should be no difference in the results, because, as ntdragon said, you can sum numbers in any order and get the same result (commutative property of addition). However, when you do this on a computer you will definitely get a difference. There is a good reason for this ("good" in the sense that it makes sense for this to happen, bad in the sense that it would be best if it didn't happen). Think about it.
nietod
What academic question??
jcardenas has a bug in his program,
and I helped him to find it;
now he has to fix it.
Since the significand of an 80-bit long double holds only 64 bits, the differences will start occurring when you get past roughly the 64th term of the sum. I'm assuming here that everything is done in assembly code AND not by the compiler.
For example, 1/2^100 by itself has a significand of 1 followed by zeroes. BUT when you add 1/2^99 the significand looks like
11000..., and the pattern continues (each further term in the sum appends another 1 bit). You run out of bits after about 64 terms.
SO, since errors are introduced, forget the coding details: why would one expect the sums to be the same?
On the other hand, if you programmed the Intel coprocessor yourself and, as ntdragon correctly pointed out, if the two programs were actually trying to compute the exact same sum, you would get identical results. I have to stress that I'm assuming you don't let the compiler generate the asm code, you do it yourself!
Glenn
>> if you programmed the Intel coprocessor yourself,
>> and, as ntdragon correctly pointed out, if the sums
>> were actually trying to compute the exact same sum,
>> you would get identical results
No, you would not.
There is no difference between writing floating point assembly directly and having the compiler write it for you. (Well, there is one difference: it's easier to have the compiler do it.)
Guys, pay attention.
A) This is an academic question. The person who invented the situation already knew the answer. That means we can help, but we can't give away the answer.
B) This is not some sort of bug or compiler inadequacy. This is a fundamental issue of floating point calculations.
jcardenasAuthor Commented:
OK, friends, thanks for your help. It is a good job locating any bug in my programs; please, I need your help to fix them. Later I will use this code in the mathematical application.
There is a bug, ntdragon described it. But fixing that is not enough to get the two algorithms to agree.
What was the assignment you were given? What material have you been studying?
jcardenasAuthor Commented:
Sorry, this is not an assignment and it is not study material; it is sample code to test some calculations. We are working on a local mathematical application to help our geological group manage the geological samples, and somebody asked about summing a series of big numbers left to right versus right to left. I am a novice with C++ and don't have a clear answer.
This is an Expert Group, and we are looking for help, not for a police investigation or an academic trial.
Summing from the smallest absolute values first will generally be more accurate, but if you care about the accuracy of your result, you should also analyse the numerical stability of your algorithms with an understanding of the error characteristics of the floating point numbers on your machine.
Sorry for the mixup, but this is a frequently asked academic question. (It occurs in textbooks etc.)
The answer is that floating point numbers have very limited precision: an 80-bit long double stores about 19 significant decimal digits, and a plain 64-bit double only about 16. (In this case "decimal" means base 10, not "decimal" as in after the decimal point.) A floating point number stores the most significant digit it needs to and roughly that many digits after it. So if a number is very large, like 1234567890.0, there are 10 digits before the decimal point, leaving only a handful of digits that can be stored after it. But in a small number, like 0.000000001234567890, all of the stored digits sit after the decimal point, so it can store many more decimal places. In other words, the format stores a fixed count of significant digits starting from the most significant digit it has to store. This means that when the number being stored is small, it is stored with much finer absolute precision than when the number is big.
Okay?
Well, if you start adding with the small numbers first (like (1/2)^100, which is very small), you will be working with small numbers that have much greater absolute precision (more digits after the decimal point). As you add these numbers you get bigger and bigger values and start losing absolute precision (digits after the decimal point). However, the contribution of those final small terms is not completely lost; it is just rounded a little. So the result is slightly rounded, slightly inaccurate, but not seriously incorrect.
However, if you add starting with the large numbers, you get much worse results. In this case you start out with big numbers that have lower absolute precision. By the time you reach the smaller terms, the last digit stored in the running total is actually larger than the first digit of the small number, so the small number is just rounded to 0. So what happens is that most of the smaller numbers in the series simply get ignored, because they are too small to be added into the larger total.
Just because a number requires 18+ digits in base 10 does NOT mean it can't be represented exactly in the FPU!
Two simple examples:
1. 0.1 (base 10) needs only 1 digit to write down, yet can't be stored exactly.
2. 2^100 requires 31 digits (base 10), yet CAN be represented exactly.
That was my earlier point. ALL the numbers he wants to sum are represented exactly, and yet the final answer is still off. BUT note the very particular sum he is summing: change the 2 to a 5 and every individual term is off as well.
Glenn
>> does NOT mean it can't be represented in
>> the FPU exactly!
??? I think you mean it "can", not "can't", right? If so, I would agree, but that still is not what he is detecting. This is not a rounding error in the individual terms.
I'm not following you.
The problem occurs because FP is unable to add a large magnitude number to a small magnitude number accurately. When the difference in magnitude exceeds the precision of the format (about 19 decimal orders for an 80-bit long double), the result is simply the large number, unchanged. That is the source of the problem.
Right. I agree that some numbers can be represented exactly, and I agree that some numbers can't be represented exactly.
But no number with more than about 19 decimal orders of magnitude between its least significant and most significant digit can be represented exactly in an 80-bit long double.
The sum in this problem spans about 31 orders of magnitude between its first and last significant digit. Using assembly isn't going to change that.
>> 2^100, even though it is 31 digits long,
But it has only 1 significant digit (in both binary and decimal).
jcardenasAuthor Commented:
Friends, this discussion is very academic, excellent, and very clear for our group.
>> Using assembly isn't going to change that.
Nietod talks about assembly. My question is: do these errors appear only with high-level languages, as in our question, or do the same errors exist in assembly?
>> do these errors appear only with high-level
>> languages, as in our question, or do the same
>> errors exist in assembly?
This is a fundamental issue of floating point numbers. It has nothing to do with the language, so it will exist just the same in assembly as in C++ or any other language. I'm not sure why GlennDean said otherwise, because it clearly is not true and he does not seem to be one to make that sort of mistake.
The only real solution is to use FP numbers with sufficiently large precision (which often is not an option). Otherwise, try to structure the mathematical operations to give you the best results. In the case of sums, you should sort the numbers to be summed according to their absolute value and add the smallest first. This will yield the most accurate result.
Since you probably have a scientific background, you are probably familiar with the rules for significant digits in addition (and other operations), right? I could have explained it with that in mind. When you add two numbers, you have to round the result to the position of the least significant digit of whichever operand has the higher least significant digit. The same is true of FP numbers. However, FP numbers don't vary the number of significant digits they store: an 80-bit FP number stores about 19 significant (decimal) digits. So when a number is large, its least significant digit is also large (about 19 digits down from the most significant digit). When you add a large number and a small number, the rounding occurs at the last digit of the large number, which can actually come BEFORE (be bigger than) the most significant digit of the little number. It is like adding
1.0*10^1 + 1.0*10^-2
The result is still 1.0*10^1.
jcardenasAuthor Commented:
OK, so our solution is to use mathematical operations with enough precision,
right??
>> OK, so our solution is to use mathematical
>> operations with enough precision
That would be ideal, but you probably don't have the option. Most PCs have only 80-bit floating point support, which is not enough for this problem. Mainframes may have 128- or even 256-bit floating point numbers, which would be enough in this case, although not in all cases.
The only other choice is not to use FP at all. There are ways of representing numbers that can potentially avoid this problem, like BCD, but they require much more work and tend to be substantially slower.
Assuming you are limited to 80-bit FP (or any FP), your best bet is to ensure the math is done starting with the numbers with the lowest absolute value and proceeding to the highest. One of your algorithms does that automatically. The other does the opposite, which is the worst thing you can do.
But I know what the mistake in your functions is.
First, 2^100 = 1267650600228229401496703205376.
Second, your functions only go up to 2^99,
I mean 1/2^99.
So in the first program you compute
1/2^100 + 1/2^99 + ... + 1/2,
and in the second you compute
1 + 1/2 + ... + 1/2^99.
And in the first you are wrong about
the Denominator, because 2^100 = 1267650600228229401496703205376,
and you rounded it
to 1.2676506E30, i.e. 1267650600000000000000000000000.