Solved

Exception when converting float to double...

Posted on 2000-05-17
Last Modified: 2008-03-17
(this problem is similar to the previous "So what IS the range of double?")

consider:

float F = 1.0E-44;
double D = (double)F;

Running through this code results in a "Float invalid operation" exception. Notice that F receives an "unsafe" float value (i.e. outside of the designated range).

Thing is, F receives its value as a result of some *legal* calculations. What is going on here? Should one test the value of a float for legality after each float calculation? Something is obviously weird here... please advise!
Question by:gil_mo
13 Comments
 
LVL 4

Expert Comment

by:abancroft
ID: 2818152
What processor & OS are you using?

In Win32, you can tell the co-processor to handle these exceptions in hardware, so you don't have to handle them.
 
LVL 4

Expert Comment

by:wylliker
ID: 2818156
If you know that your calculations are going to exceed the range of a float, then you should be using double or long double as your type.

In C - you are screwed if you exceed the float range - unpredictable results at best, crash at worst.

In C++ the same will occur but you can trap the exception and try to control the situation.

FYI for those who are unaware of the difference between a float and a double.

Type float

Floating-point numbers use the IEEE (Institute of Electrical and Electronics Engineers) format. Single-precision values with float type have 4 bytes, consisting of a sign bit, an 8-bit excess-127 binary exponent, and a 23-bit mantissa. The mantissa represents a number between 1.0 and 2.0. Since the high-order bit of the mantissa is always 1, it is not stored in the number. This representation gives a range of approximately 3.4E–38 to 3.4E+38 for type float.

Type double

Double precision values with double type have 8 bytes. The format is similar to the float format except that it has an 11-bit excess-1023 exponent and a 52-bit mantissa, plus the implied high-order 1 bit. This format gives a range of approximately 1.7E–308 to 1.7E+308 for type double
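The layout described above can be checked directly. The following is only a sketch, assuming a platform where float is a 32-bit IEEE 754 single; FloatFields and decode are hypothetical names, not from the thread. It shows that 1.0E-44 is stored with an all-zero exponent field, which is how the format encodes values below the smallest normalized float (FLT_MIN, about 1.18E-38).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper type for this sketch (not from the thread).
struct FloatFields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, stored with a bias of 127
    std::uint32_t mantissa;  // 23 stored bits; the leading 1 is implied for normal values
};

// Split a 32-bit IEEE 754 single into its three bit fields.
FloatFields decode(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // copy the raw bytes instead of casting pointers
    return FloatFields{bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}

// Usage: 1.0f is a normal value; 1.0E-44f has an all-zero exponent field.
void decode_examples() {
    assert(decode(1.0f).exponent == 127 && decode(1.0f).mantissa == 0);
    assert(decode(1.0e-44f).exponent == 0 && decode(1.0e-44f).mantissa == 7);
}
```

An exponent field of 0 with a nonzero mantissa marks a subnormal (denormal) value: the implied leading 1 is dropped, so precision shrinks as the value approaches zero.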


 

Author Comment

by:gil_mo
ID: 2820350
abancroft: Win32, Pentium III.

wylliker: there is no "exceeding" for small numbers - the value should simply become 0.0 !
You are wrong about the ranges: e.g. float can go down to a magnitude of 1.0E-45.

The real question is, Why does this cast cause an exception ?
 
LVL 4

Expert Comment

by:wylliker
ID: 2820837
Well, I guess the IEEE is wrong about the range for a float then - I just took that info from the MSDN docs.


You said ...

float F = 1.0E-44;
double D = (double)F;

Running through this code results in a "Float invalid operation" exception. Notice that F receives an "unsafe" float value (i.e. outside of the designated range).

....
Pay attention to your parenthetical statement about out of range.



 
LVL 3

Expert Comment

by:sburck
ID: 2820968
I have a few comments:

I tried this using Borland C++ under DOS, and there was no problem.  However,

1.  There is a value in IEEE floating point known as a NaN (not a number); I don't have time to check, but I believe that values very close to zero outside of precision can cause it to occur.

2.  There is also an exception for loss of acceptable precision, which occurs also very close to zero.

3.  The assembler code created by this snippet looked like this:

   ;            float F = 1.0E-44;
   ;      
    mov      word ptr [bp-2],0; this won't cause an exception
    mov      word ptr [bp-4],7; so putting an illegal value might be 'OK'
   ;      
   ;            double D;
   ;      
   ;            D = (double)F;
   ;      
    fld      dword ptr [bp-4]    ; one of these lines, where you use the FP,
    fstp      qword ptr [bp-12]  ; will generate an FP exception.
    fwait      

Since you have a "float invalid", I assume your compiler generated a NaN rather than the value that the BCC compiler gave, which was tiny but not too tiny.
 
LVL 4

Expert Comment

by:abancroft
ID: 2821250
In VC++ (I don't know what you're using), the runtime libraries mask all floating-point exceptions by default (from MSDN) - so this exception should be handled by the FP co-processor.

Are you unmasking the FP exceptions anywhere?

In VC++, _controlfp() (or _control87()) is used to mask/unmask FP exceptions.
 

Author Comment

by:gil_mo
ID: 2821477
wylliker, sburck: in VC++, when viewing the bit representation for F in the watch window using (long *)(&F), it is easily seen that you can assign values to F as low as 1.0E-44. This is NOT presented as NaN. When trying to go below this, you get a 0.0. Thus it seems that the range is as low as 1.0E-44, not 3.4E-38.

abancroft: These exceptions are indeed unmasked, not by me (I'm only a plug-in) but by the host application. Still, why is this operation considered an exception?
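The watch-window observation can be reproduced in code. This is a rough sketch, assuming an IEEE 754 float with subnormal support; is_subnormal and range_examples are hypothetical names. Values between roughly 1.4E-45 and FLT_MIN (about 1.18E-38) are representable but classified as subnormal, and anything that rounds below the smallest subnormal becomes 0.0.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <limits>

// True when f is nonzero but smaller in magnitude than FLT_MIN.
bool is_subnormal(float f) { return std::fpclassify(f) == FP_SUBNORMAL; }

void range_examples() {
    // Smallest positive subnormal float, about 1.4E-45.
    const float tiny = std::numeric_limits<float>::denorm_min();

    assert(is_subnormal(1.0e-44f));   // representable, but not a normal value
    assert(!is_subnormal(FLT_MIN));   // smallest *normalized* float
    assert(tiny > 0.0f && tiny < 1.0e-44f);

    // Halving the smallest subnormal rounds to zero (round-to-nearest-even).
    volatile float half = tiny;       // volatile: force a run-time multiply
    half = half * 0.5f;
    assert(half == 0.0f);
}
```

So both views in the thread are describing the same format: the documented lower bound refers to normalized values, while the smaller values seen in the debugger are subnormals.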
 
LVL 4

Expert Comment

by:abancroft
ID: 2821529
It generates an invalid value, which in "normal" operation shouldn't happen - it indicates an error condition (either in the input or in the calculations).

Why don't you bracket all your entry point functions with code to get/set/reset the FP mask?

e.g.

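#include <float.h>  // _controlfp, _clearfp, _CW_DEFAULT, _MCW_EM (Microsoft CRT)

void EntryPointFunc(void)
{
  // Read the current control word without changing it
  unsigned int uiFPOld = _controlfp(0, 0);
  // Mask all FP exceptions (the CRT default) while our code runs
  (void)_controlfp(_CW_DEFAULT, _MCW_EM);

  // Your code here

  // Clear pending status flags so they don't fire once unmasked again
  (void)_clearfp();
  // Restore the host application's exception mask
  (void)_controlfp(uiFPOld, _MCW_EM);
}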
 

Author Comment

by:gil_mo
ID: 2821664
Why does it generate an invalid value rather than generating 0.0? Say,

float F = 3.4E-38; // 'minimal'
F /= 2.0;

If the result of this is invalid, how come I can view F contents as 1.7E-38 which is the operation result?
If it is valid, however, why the exception?
 
LVL 4

Accepted Solution

by:
abancroft earned 75 total points
ID: 2821758
> why does it generate an invalid value rather than generating 0.0? Say,
Sorry, I've misled you somewhat. The operation won't always generate an "invalid" value. But the outcome of the operation may be invalid. e.g. overflow or underflow. In your example, it underflows. F still has a value, but the processor is signalling that the operation generated an invalid value, which couldn't be stored in F.

>If the result of this is invalid, how come I can view F contents as 1.7E-38 which is the operation result?
F is a location in memory and can therefore be represented as a numeric value by the debugger.

When thinking about this, try to separate the 4 bytes in memory that represent the float from the result of an operation.

Think of it this way:
  void Division(float a, float b, float *pfResult); // may raise an FP exception

  float F;
  Division(3.4E-38, 2.0, &F);
Even if Division() throws an underflow exception, F will still have a value.
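The "value plus a separate signal" idea can be sketched with the standard <cfenv> status flags, which are a portable stand-in here and not the unmasked hardware exception the host application enables. DivResult and checked_divide are hypothetical names. Note also that 3.4E-38 / 2 = 1.7E-38 is still above FLT_MIN (about 1.18E-38), so that particular division should not underflow by itself; the sketch divides FLT_MIN by 3 to get a result that does.

```cpp
#include <cassert>
#include <cfenv>
#include <cfloat>
#include <cmath>

// Result of one division plus the status flags it raised.
struct DivResult {
    float value;
    bool underflow;  // FE_UNDERFLOW was raised
    bool inexact;    // FE_INEXACT was raised
};

// Divide a by b and report which IEEE status flags the operation set.
// volatile keeps the compiler from folding the division at compile time.
DivResult checked_divide(float a, float b) {
    volatile float x = a, y = b;
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile float q = x / y;
    const int raised = std::fetestexcept(FE_UNDERFLOW | FE_INEXACT);
    return DivResult{q, (raised & FE_UNDERFLOW) != 0, (raised & FE_INEXACT) != 0};
}

void underflow_examples() {
    // Tiny and inexact: the flag is raised, yet a usable value is stored.
    const DivResult r = checked_divide(FLT_MIN, 3.0f);
    assert(r.underflow && r.value > 0.0f);
    assert(std::fpclassify(r.value) == FP_SUBNORMAL);

    // 3.4E-38 / 2 stays in the normal range: no underflow flag.
    const DivResult s = checked_divide(3.4e-38f, 2.0f);
    assert(!s.underflow && std::fpclassify(s.value) == FP_NORMAL);
}
```

When the corresponding exception is unmasked, as in the host application described in the thread, the same condition would presumably trap instead of just setting a flag.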
 

Author Comment

by:gil_mo
ID: 2821912
Thanks.
The reason for me NOT re-masking the control word is because, just as I wrote before, I'm simply a plug-in and cannot interfere with the host application's decisions.

So in any case I'm in a kinda mess.
 
LVL 4

Expert Comment

by:abancroft
ID: 2821949
I don't know much about plug-ins, but can't you use the technique I outlined earlier to get/set/reset the control word?

That way you won't interfere with the host app.

You could even wrap it up in a class to simplify it.

e.g.


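#include <float.h>  // _controlfp, _clearfp, _CW_DEFAULT, _MCW_EM (Microsoft CRT)

class CFPControlWord
{
public:  // the constructor and destructor must be accessible to callers
  CFPControlWord(unsigned int uiNew = _CW_DEFAULT, unsigned int uiMask = _MCW_EM)
  {
    m_uiFPOld = _controlfp(0, 0);   // remember the host's control word
    m_uiMask = uiMask;
    (void)_controlfp(uiNew, uiMask);
  }

  ~CFPControlWord()
  {
    (void)_clearfp();               // drop pending flags before unmasking again
    (void)_controlfp(m_uiFPOld, m_uiMask);
  }

protected:
  unsigned int m_uiFPOld;
  unsigned int m_uiMask;
};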

void EntryPointFunc(void)
{
  CFPControlWord FPControl;

  // Your code here
}
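For compilers without _controlfp(), roughly the same scoped save/restore can be sketched with the standard <cfenv> functions. This is an assumption-laden counterpart, not the code suggested above: FpEnvGuard is a hypothetical name, and it relies on std::feholdexcept() saving the environment, clearing the status flags and switching to non-stop (no trap) mode where the implementation supports that.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical portable counterpart of CFPControlWord.
class FpEnvGuard {
public:
    // Save the caller's FP environment, clear flags, request non-stop mode.
    FpEnvGuard() { ok_ = (std::feholdexcept(&saved_) == 0); }

    // Put the caller's environment (flags and trap settings) back.
    ~FpEnvGuard() { if (ok_) std::fesetenv(&saved_); }

    FpEnvGuard(const FpEnvGuard&) = delete;
    FpEnvGuard& operator=(const FpEnvGuard&) = delete;

private:
    std::fenv_t saved_;
    bool ok_;
};

void guard_example() {
    std::feclearexcept(FE_ALL_EXCEPT);
    std::feraiseexcept(FE_INVALID);           // state belonging to the "host"
    {
        FpEnvGuard guard;
        assert(std::fetestexcept(FE_INVALID) == 0);  // flags start clean inside
        std::feraiseexcept(FE_DIVBYZERO);            // raised by "plug-in" code
    }
    assert(std::fetestexcept(FE_INVALID) != 0);      // host state restored
    assert(std::fetestexcept(FE_DIVBYZERO) == 0);    // plug-in flag discarded
}
```

Because the destructor restores the saved environment wholesale, flags raised inside the scope are discarded, which is the same effect the _clearfp() call is meant to achieve in the Microsoft-specific version.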
 

Author Comment

by:gil_mo
ID: 2822043
A very nice idea, I must admit; but this would force me to use this method in ALL the entry-point functions in all the plug-ins' DLLs (and there are many...). Either that or limit this method to only the functions that might include some kind of hazardous float operation. Determining this is even more work!

Rather than that, I've queried the host application's programmers, hoping they would supply some explanation :)