printf format specifiers: outputting long as int gives odd result!

Ah hello.

Please consider the following code:

printf("unsigned long long int: %llu \n", 0);
printf("unsigned long long int: %llu \n", (unsigned long long int)0);

On 64-bit Windows (VS 2005), this outputs

unsigned long long int: 9109355120595828736
unsigned long long int: 0

On a second run, it outputs

unsigned long long int: 6927490199261806592
unsigned long long int: 0

Which shows it is pretty random.

On 64-bit Linux (Netbeans), the output is as expected:

unsigned long long int: 0 
unsigned long long int: 0

I am assuming here the printf is exercising its right to exhibit undefined behaviour, since I have broken my "promise" that I am going to give it an unsigned long long integer by passing 0.

1) Am I correct in my assumption; if so, can someone point me at some documentation stating this?
2) In my real code, I am passing variables to printf whose underlying type depends on a typedef; it may be an int or it may be an unsigned long long. Am I guaranteed safety by always casting to the larger of those two types (unsigned long long here), which seems to work in my code above?
3) Any comments on the difference between Windows and Linux?

TIA

SOLUTION

Guy Hengel [angelIII / a3]

This solution is only available to members.

Guy Hengel [angelIII / a3]

So, the difference between Windows and Linux is first the compiler, and second the hosting OS ...

mrwad99

ASKER

Thanks: so is casting to the larger type always safe?

Guy Hengel [angelIII / a3]

It depends; if you know what you are doing (and what data you have), the explicit cast is perfectly OK.
The value "0" can be cast to ANY numerical data type (afaik) without any issue.
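
A minimal sketch of the cast-at-the-call-site approach, assuming a hypothetical typedef `counter_t` (not from the original question) whose underlying type is `int` or `unsigned long long` depending on the build. The helper `format_counter` and the macro `USE_WIDE_COUNTER` are also made up for illustration; the helper uses C99 `snprintf` so the result can be inspected rather than printed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical typedef: stands in for the asker's build-dependent type. */
#ifdef USE_WIDE_COUNTER
typedef unsigned long long counter_t;
#else
typedef int counter_t;
#endif

/* Formats v into buf. The cast guarantees that the argument really is an
 * unsigned long long, which is what %llu requires, whatever counter_t is. */
int format_counter(char *buf, size_t size, counter_t v)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%llu", (unsigned long long)v);
}
```

Note that a negative `int` converted this way prints as a very large number, so the cast is only "safe" in the sense of matching the format specifier; it suits values known to be non-negative.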

SOLUTION

evilrix

ASKER CERTIFIED SOLUTION

evilrix

SOLUTION

Guy Hengel [angelIII / a3]

mrwad99

ASKER

>> what will eventually happen is that as printf is expecting 64 bits, it will take 64 bits... your 32 bits (from the signed 0) and the next 32 bits also... which might be "just anything"

That is very interesting, but also a bit alarming, kind of like a buffer overflow, and even a possibility to execute malicious code I guess...

Hi Rx! Thanks for popping in on this one :)

>> The solution is to explicitly cast and then it should work because then printf gets the correct type

Excuse my possible dumbness, but isn't that what I am already doing?

evilrix

>> That is very interesting, but also a bit alarming, kind of like a buffer overflow, and even a possibility to execute malicious code I guess...

BINGO!!! Hence C++ introduced type-safe streams :)

sprintf and printf are evil and have historically been a source of many exploits. Put simply, if you can avoid using them, do so.
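
For C code that has to stay with the printf family, a hedged sketch of one way to narrow the risk: a bounded wrapper annotated so that GCC and Clang type-check its arguments against the format string. The names `PRINTF_LIKE` and `format_msg` are invented for this example. The `format` attribute is a GCC/Clang extension, so it is compiled out elsewhere. With `-Wformat` (part of `-Wall`), a call such as `format_msg(buf, n, "%llu", 0)` should then draw a warning at compile time.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* GCC and Clang can type-check calls to printf-like wrappers when the
 * function is annotated; other compilers see an empty macro. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_idx, first_arg) \
    __attribute__((format(printf, fmt_idx, first_arg)))
#else
#define PRINTF_LIKE(fmt_idx, first_arg)
#endif

/* Hypothetical wrapper: formats into buf with a hard size limit. */
int format_msg(char *buf, size_t size, const char *fmt, ...) PRINTF_LIKE(3, 4);

int format_msg(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && size > 0 && fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);  /* never writes more than size bytes */
    va_end(ap);
    return n;  /* length the full output would have had */
}
```

The annotation only helps when the format string is a literal the compiler can see; it does not make a mismatched call any less undefined at run time.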

>> Excuse my possible dumbness, but isn't that what I am already doing?
Exactly, that's why I said, "This is exactly what we see in all the test cases, no?". As in, the cases where you cast it works as you'd expect, no?

evilrix

>> Hi Rx! Thanks for popping in on this one :)
No hay problema, señor wad99 :)

evilrix

Just thinking you should upgrade your username from mrwad99 to mrwad++ :)

mrwad99

ASKER

Fantastic as always, many thanks both :)

evilrix

De nada, amigo!

sarabande

it will take 64 bits... your 32 bits (from the signed 0) and the next 32 bits also... which might be "just anything" on the heap. you might even go into a "bad memory address" crash of your app ...

the printf makes a cast to the expected type at the address given. as the address is a valid 32-bit address, the cast would not fail beside on the most far end of the virtual memory space.

I doubt that Linux would always return the expected output when a wrong 32-bit argument is given. I would guess the results you got are accidental and perhaps depend on debug or release mode, or on how heap memory was used before. The VS debugger explicitly writes non-zero contents to freed heap storage, which increases the probability that uninitialized memory contains "garbage".

I made a little test and checked the memory contents right "behind" valid integer constants (VS10). It contains 0xcccccccc, which is the pattern the debugger fills memory with.

Sara

mrwad99

ASKER

Thanks for participating even though the question has been answered Sara! I don't quite understand what you are saying though with

"as the address is a valid 32-bit address, the cast would not fail beside on the most far end of the virtual memory space. "

Do you mean that the address at which *that zero* is stored is a valid 32 bit memory address, but the next 32 bits in memory are uninitialised, and that casting the 0, combined with value in the next 32 bits (which is uninitialised as far as we know) will fail?

evilrix

>> the printf makes a cast to the expected type at the address given. as the address is a valid 32-bit address, the cast would not fail beside on the most far end of the virtual memory space.

No one is saying anything about casts failing. What we're saying is that if you have a 32-bit type and try to use it as a 64-bit type, the result is undefined and the value you get back is meaningless.

Semantically, it's the same as creating a union of an int32_t and an int64_t, initialising the 32-bit member and reading the 64-bit member. The value you'll get will probably be garbage but will definitely be undefined.
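
The "4 bytes written, 8 bytes read" effect can be imitated with well-defined code. This is only a model, not what printf does internally (real argument passing may use registers rather than memory), and both the function name and the 0xCC fill pattern are arbitrary choices for the sketch.

```c
#include <stdint.h>
#include <string.h>

/* Models "4 bytes written, 8 bytes read": the slot starts out filled with
 * 0xCC, only the first sizeof(int32_t) bytes are overwritten with `value`,
 * and then the whole slot is read back as a 64-bit integer. */
uint64_t read_wide_after_narrow_write(int32_t value)
{
    unsigned char slot[sizeof(uint64_t)];
    uint64_t wide;

    memset(slot, 0xCC, sizeof slot);     /* stand-in for leftover bytes */
    memcpy(slot, &value, sizeof value);  /* the 32 bits the caller supplied */
    memcpy(&wide, slot, sizeof wide);    /* the 64 bits the callee reads */
    return wide;
}
```

For a `value` of 0 the result is non-zero: half of the 64-bit value is the caller's zero and the other half is whatever happened to be in the slot. Which half is which depends on the byte order of the machine.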

The printf function knows nothing about the type of the original variable; all it knows is what you tell it in the format specifier. If the argument doesn't match the type the specifier expects, the C99 standard (7.19.6.1, paragraph 9) is clear: "If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined."

>> Do you mean that the address at which *that zero* is stored is a valid 32 bit memory address

If that is what's being said, it's an incorrect assertion. It may hold on the compiler you use, but all the C99 standard guarantees is that an unsuffixed decimal constant has type int if int can represent the value. If it can't, the standard defines an order of progressively bigger types (long int, then long long int), and the first one that can represent the value is used.
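
A small sketch of how the type of a constant can be observed, and of the suffix alternative to casting. The two function names are invented for the example.

```c
#include <stddef.h>

/* An unsuffixed decimal constant that fits in int has type int. */
size_t size_of_plain_zero(void)    { return sizeof 0; }

/* The ULL suffix makes the constant an unsigned long long, so it can be
 * passed to a %llu conversion without a cast. */
size_t size_of_suffixed_zero(void) { return sizeof 0ULL; }
```

For a literal, writing `printf("%llu\n", 0ULL);` is therefore equivalent to casting; for a variable the cast is still needed.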

The size of an int, however, is platform-specific, so you cannot make any assumptions such as this. More specifically, the standard states:

"A 'plain' int object has the natural size suggested by the architecture of the execution environment (large enough to contain any value in the range INT_MIN to INT_MAX as defined in the header <limits.h>)."
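
Since the width of int cannot be assumed, one option is to use the C99 fixed-width types and the matching format macros from `<inttypes.h>`. This is a sketch under the assumption that the toolchain ships that header (older Visual Studio versions may not); `format_u64` is a made-up helper name.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Formats a 64-bit unsigned value. PRIu64 expands to the correct length
 * modifier and conversion for uint64_t on the current platform. */
int format_u64(char *buf, size_t size, uint64_t v)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%" PRIu64, v);
}
```

The parameter type does the widening here: an int passed to `format_u64` is converted to `uint64_t` at the call, so the variadic call inside always receives the type the format string promises.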

>> but the next 32 bits in memory are uninitialised, and that casting the 0, combined with value in the next 32 bits (which is uninitialised as far as we know) will fail?

If you use an explicit cast when passing the value to printf, a temporary value of the type named in the cast is created and passed as the argument. This temporary is initialised according to the C99 well-defined behaviour for integer conversion:

6.3.1.3 Signed and unsigned integers

1 When a value with integer type is converted to another integer type other than _Bool, if
the value can be represented by the new type, it is unchanged.

2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.

3 Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
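
The first two rules above can be seen in a few lines of C. `widen` is a made-up name for the sketch.

```c
#include <limits.h>  /* ULLONG_MAX, used in the note below */

/* Converts an int to unsigned long long following C99 6.3.1.3:
 * representable values are unchanged (rule 1); negative values have
 * ULLONG_MAX + 1 added to them (rule 2). */
unsigned long long widen(int v)
{
    return (unsigned long long)v;
}
```

So `widen(0)` is 0 and `widen(42)` is 42, while `widen(-1)` is `ULLONG_MAX`. The cast is always well defined, but a negative value will not print as the number it started as.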

phoffric

Whenever you start seeing strange decimal outputs, it is helpful to see what the hex equivalent is:
9109355120595828736 = 0x7E6A EE38 0000 0000
6927490199261806592 = 0x6023 63A2 0000 0000
This often gives a clue by revealing a pattern. In this case the most significant 32 bits are garbage, and the least significant 32 bits are the 0 you actually passed.
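
A short sketch of the same inspection in code, splitting a 64-bit value into its two halves. The helper names are invented for the example.

```c
#include <stdint.h>

/* Upper 32 bits of a 64-bit value. */
uint32_t high32(uint64_t v) { return (uint32_t)(v >> 32); }

/* Lower 32 bits of a 64-bit value. */
uint32_t low32(uint64_t v)  { return (uint32_t)(v & 0xFFFFFFFFu); }
```

Applied to the two values from the question, `low32` gives 0 in both cases, while `high32` gives 0x7E6AEE38 and 0x602363A2. Printing a suspicious value with `"%016llx"` (after casting it to unsigned long long) shows the same split at a glance.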

evilrix

Good point, well made, Paul :)