  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 765

printf format specifiers: outputting long as int gives odd result!

Ah hello.

Please consider the following code:

printf("unsigned long long int: %llu \n", 0);
printf("unsigned long long int: %llu \n", (unsigned long long int)0);


On 64-bit Windows (VS 2005), this outputs

unsigned long long int: 9109355120595828736
unsigned long long int: 0


On a second run, it outputs

unsigned long long int: 6927490199261806592
unsigned long long int: 0


Which shows it is pretty random.

On 64-bit Linux (Netbeans), the output is as expected:

unsigned long long int: 0 
unsigned long long int: 0 


I am assuming here the printf is exercising its right to exhibit undefined behaviour, since I have broken my "promise" that I am going to give it an unsigned long long integer by passing 0.

1) Am I correct in my assumption; if so, can someone point me at some documentation stating this?
2) In my real code, I am passing variables to printf whose underlying type depends on a typedef; it may be an int or it may be an unsigned long long.  Am I guaranteed safety by always casting to the larger of those two types (unsigned long long here), which seems to work in my code above?
3) Any comments on the difference between Windows and Linux?

TIA
mrwad99
Asked:
mrwad99
4 Solutions
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
you should check the compiler, and enable this flag:
-Wformat

which should result in the following warning:
format ‘%llu’ expects type ‘long long unsigned int’, but argument 2 has type ‘int’

I read that you can be "sloppy" on data types that have "int" size or less (char ...), but not on longer type.
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
so, the difference between windows and linux is first the compiler, and second the hosting OS  ...
 
mrwad99Author Commented:
Thanks: so is casting to the larger type always safe?
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
it depends; if you know what you are doing (and what data you have), the explicit casting is perfectly Ok.
the value "0" can be cast to ANY numerical data type (afaik) without any issue.
 
evilrixSenior Software Engineer (Avast)Commented:
Am I right in thinking you're coding using C99 (since unsigned long long is a C99 type)? If so, C99 provides macros for format specifiers that might prove useful in trying to solve your problem.

Each of the following object-like macros expands to a character string literal
containing a conversion specifier, possibly modified by a length modifier, suitable for use
within the format argument of a formatted input/output function when converting the
corresponding integer type. These macro names have the general form of PRI (character
string literals for the fprintf and fwprintf family) or SCN (character string literals
for the fscanf and fwscanf family), followed by the conversion specifier,
followed by a name corresponding to a similar type name in 7.18.1. In these names, N
represents the width of the type as described in 7.18.1. For example, PRIdFAST32 can
be used in a format string to print the value of an integer of type int_fast32_t.
2 The fprintf macros for signed integers are:

PRIdN PRIdLEASTN PRIdFASTN PRIdMAX PRIdPTR
PRIiN PRIiLEASTN PRIiFASTN PRIiMAX PRIiPTR
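As a rough illustration of how those macros could be used with a typedef'd type, here is a minimal sketch. The names counter_t, PRI_COUNTER and format_counter are made up for the example and are not from the question; it assumes the typedef is a fixed-width type from <stdint.h>, so the matching specifier can be defined right next to it.

```c
#include <assert.h>
#include <inttypes.h>  /* PRIu64 and friends; also includes <stdint.h> */
#include <stddef.h>
#include <stdio.h>

/* Hypothetical typedef standing in for a project-specific type. */
typedef uint64_t counter_t;
/* Keep the matching printf specifier next to the typedef, so both
   change together if the underlying type ever changes. */
#define PRI_COUNTER PRIu64

/* Formats value into buf and returns what snprintf returns
   (the number of characters the full output needs). */
int format_counter(char *buf, size_t size, counter_t value)
{
    assert(buf != NULL && size > 0);
    /* Adjacent string literals are concatenated, giving "counter: %" + specifier. */
    return snprintf(buf, size, "counter: %" PRI_COUNTER, value);
}
```

Calling format_counter(buf, sizeof buf, 0) is safe even with a plain 0, because the function has a prototype and the int is converted to counter_t at the call. That conversion does not happen for the variadic arguments of printf itself.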

Meanwhile, a naked zero is of type signed int, whereas the format specifier %llu expects an unsigned long long (a 64-bit unsigned type on these platforms). I'm not surprised the values you are getting are a little odd. Passing an argument whose type doesn't match the format specifier to printf is undefined behaviour.


7.19.6.1

9. If a conversion specification is invalid, the behavior is undefined. If any argument is
not the correct type for the corresponding conversion specification, the behavior is
undefined.

unsigned long long is also a C++11 type, and is supported by std::ostream.
 
evilrixSenior Software Engineer (Avast)Commented:
>> the value "0" can be casted to ANY numerical data type (afaik) without any issue.
Unfortunately, casting isn't what happens when you use printf. The printf function is not typesafe - all it knows about the types are what you tell it in the format specifier.

The values are read using the va_arg mechanism, which does little more than read each argument as whatever type the format specifier says it is. Since 0 is a signed int (normally 32 bits) and unsigned long long is a 64-bit unsigned type, using printf this way is going to result in undefined behaviour.
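To make the va_arg point concrete, here is a small sketch of a variadic function of my own (sum_ull is a hypothetical helper, not part of any library). Like printf, it has no way of checking what the caller really passed; it simply reads each argument as the type it names.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: sums `count` variadic arguments, each of which
   MUST be passed as unsigned long long by the caller. */
unsigned long long sum_ull(int count, ...)
{
    va_list ap;
    unsigned long long total = 0;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; ++i) {
        /* va_arg trusts the type named here; nothing compares it with
           the type of the argument that was actually passed. */
        total += va_arg(ap, unsigned long long);
    }
    va_end(ap);
    return total;
}
```

sum_ull(2, 1ULL, 2ULL) is fine. sum_ull(2, 1, 2) passes two ints and is undefined in the same way as the %llu example in the question.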

The solution is to explicitly cast and then it should work because then printf gets the correct type. This is exactly what we see in all the test cases, no?
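For the typedef case from the question, the cast approach might look something like the sketch below. The names id_type and format_id are invented for illustration, and the typedef is assumed to be int here; the idea is that the cast keeps the argument in step with %llu whichever way the typedef goes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical typedef: int on this build, possibly unsigned long long
   on another. */
typedef int id_type;

/* Formats id into buf and returns what snprintf returns. */
int format_id(char *buf, size_t size, id_type id)
{
    assert(buf != NULL && size > 0);
    /* The cast makes the argument's type match %llu whatever id_type is.
       Caveat: a negative id is converted modulo 2^N and would print as a
       very large number. */
    return snprintf(buf, size, "id: %llu", (unsigned long long)id);
}
```

If the typedef can hold negative values, casting to long long with %lld (or to intmax_t with %jd) may be a better fit than the unsigned type.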
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
> going to result in undefined behaviour.
what will eventually happen is that as printf is expecting 64 bits, it will take 64 bits... your 32 bits (from the signed 0) and the next 32 bits also... which might be "just anything" on the heap. you might even go into a "bad memory address" crash of your app ...

and I agree that printf is not casting the types, hence the "issue".
 
mrwad99Author Commented:
>> what will eventually happen is that as printf is expecting 64 bits, it will take 64 bits... your 32 bits (from the signed 0) and the next 32 bits also... which might be "just anything"

That is very interesting, but also a bit alarming, kind of like a buffer overflow, and even a possibility to execute malicious code I guess...

Hi Rx!  Thanks for popping in on this one :)

>> The solution is to explicitly cast and then it should work because then printf gets the correct type

Excuse my possible dumbness, but isn't that what I am already doing?
 
evilrixSenior Software Engineer (Avast)Commented:
>> That is very interesting, but also a bit alarming, kind of like a buffer overflow, and even a possibility to execute malicious code I guess...


BINGO!!! Hence C++ introduced type-safe streams :)

s/printf are evil and are, historically, a source of many exploits. Put simply, if you can avoid using them do so.

>> Excuse my possible dumbness, but isn't that what I am already doing?
Exactly, that's why I said, "This is exactly what we see in all the test cases, no?". As in, the cases where you cast it works as you'd expect, no?
 
evilrixSenior Software Engineer (Avast)Commented:
>> Hi Rx!  Thanks for popping in on this one :)
No hay problema, señor wad99 :)
 
evilrixSenior Software Engineer (Avast)Commented:
Just thinking you should upgrade your username from mrwad99 to mrwad++ :)
 
mrwad99Author Commented:
Fantastic as always, many thanks both :)
 
evilrixSenior Software Engineer (Avast)Commented:
De nada, amigo!
 
sarabandeCommented:
>> it will take 64 bits... your 32 bits (from the signed 0) and the next 32 bits also... which might be "just anything" on the heap. you might even go into a "bad memory address" crash of your app ...
the printf makes a cast to the expected type at the address given. as the address is a valid 32-bit address, the cast would not fail beside on the most far end of the virtual memory space.

I have doubts that Linux would always return the expected output when a wrong 32-bit argument is given. I would guess the results you got are by accident, perhaps depending on debug or release mode or on the usage of heap memory before. The VS debugger explicitly writes non-zero contents to freed heap storage, which increases the probability that uninitialized memory contains "garbage".

I made a little test and checked the memory contents right "behind" valid integer constants (vs10). It contains 0xcccccccc, which is memory cleared by the debugger.

Sara
 
mrwad99Author Commented:
Thanks for participating even though the question has been answered Sara!  I don't quite understand what you are saying though with

"as the address is a valid 32-bit address, the cast would not fail beside on the most far end of the virtual memory space. "

Do you mean that the address at which *that zero* is stored is a valid 32 bit memory address, but the next 32 bits in memory are uninitialised, and that casting the 0, combined with value in the next 32 bits (which is uninitialised as far as we know) will fail?
 
evilrixSenior Software Engineer (Avast)Commented:
>> the printf makes a cast to the expected type at the address given. as the address is a valid 32-bit address, the cast would not fail beside on the most far end of the virtual memory space.

No one is saying anything about casts failing. What we're saying is that if you have a 32-bit type and try to use it as a 64-bit type, the result is undefined and the value you get back is meaningless.

Semantically, it's the same as creating a union of an int32_t and an int64_t, initialising the 32-bit member and reading the 64-bit member. The value you'll get will probably be garbage but will definitely be undefined.

The printf function knows nothing about the type of the original variable; all it knows is what you tell it in the format specifier. If the argument doesn't match the type expected, the C99 standard is clear: "If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined"

>> Do you mean that the address at which *that zero* is stored is a valid 32 bit memory address

If that is what's being said it's an incorrect assertion. It may be on the compiler you use but all the C99 standard guarantees is that it will default to using a type int if it can represent the value using that type. If it can't, the standard has a well defined order of progressively bigger types that it will try; it will use the first one to match.

The size of an int, however, is platform-specific, so you cannot make any assumptions such as this. More specifically, the standard states:

"A “plain” int object has the natural size suggested by the
architecture of the execution environment (large enough to contain any value in the range
INT_MIN to INT_MAX as defined in the header <limits.h>)."

>> but the next 32 bits in memory are uninitialised, and that casting the 0, combined with value in the next 32 bits (which is uninitialised as far as we know) will fail?

If you use an explicit cast when passing the value to printf, the argument printf receives is a temporary of the type named in the cast, initialised according to the C99 well-defined rules for integer conversion:

6.3.1.3 Signed and unsigned integers

1 When a value with integer type is converted to another integer type other than _Bool, if
the value can be represented by the new type, it is unchanged.

2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.49)

3 Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
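A quick way to see rules 1 and 2 from the quoted section in action is a sketch like the following. demo_conversions is just a name I picked; each assert should hold on any conforming implementation, since these conversions to an unsigned type are fully defined.

```c
#include <assert.h>
#include <limits.h>

/* Exercises the conversion rules quoted above. */
void demo_conversions(void)
{
    /* Rule 1: a value representable in the new type is unchanged. */
    assert((unsigned long long)0 == 0ULL);
    assert((unsigned long long)42 == 42ULL);

    /* Rule 2: a negative value converted to an unsigned type wraps by
       adding (max + 1), so -1 becomes the maximum value. */
    assert((unsigned long long)-1 == ULLONG_MAX);
    assert((unsigned long long)-2 == ULLONG_MAX - 1);
}
```

So casting a 0 of any integer type to unsigned long long is always well defined and always yields 0. It is only negative values where the printed result may surprise.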
 
phoffricCommented:
Whenever you start seeing strange decimal outputs, it is helpful to see what the hex equivalent is:
9109355120595828736 = 0x7E6A EE38 0000 0000
6927490199261806592 = 0x6023 63A2 0000 0000
This often gives a clue by observing a pattern. In this case the most significant bits are garbage, and you actually do get your 32-bit 0's.
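If you want to reproduce that decimal-to-hex check in code, a sketch along these lines should do it. The helper names high32, low32 and format_halves are mine, and it assumes <inttypes.h> is available for the fixed-width format macros.

```c
#include <assert.h>
#include <inttypes.h>  /* PRIX32; also includes <stdint.h> */
#include <stddef.h>
#include <stdio.h>

/* Upper and lower 32-bit halves of a 64-bit value. */
uint32_t high32(uint64_t v) { return (uint32_t)(v >> 32); }
uint32_t low32(uint64_t v)  { return (uint32_t)(v & 0xFFFFFFFFu); }

/* Writes v as "0x" followed by two 8-digit hex groups:
   high half, a space, then low half. */
int format_halves(char *buf, size_t size, uint64_t v)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "0x%08" PRIX32 " %08" PRIX32,
                    high32(v), low32(v));
}
```

For the two values in the question, high32 gives 0x7E6AEE38 and 0x602363A2 respectively, and low32 gives 0 both times, which matches the hex shown above.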
 
evilrixSenior Software Engineer (Avast)Commented:
Good point, well made, Paul :)
Question has a verified solution.
