Solved

What is the compiler doing?

Posted on 2003-11-30
Last Modified: 2012-06-21
Question by:Dexstar
20 Comments
 
LVL 22

Expert Comment

by:grg99
ID: 9849948
What's happening is:

The compiler "knows" that num is supposed to be a constant,
so it doesn't even allocate any memory for its value (since it can't change).
Any time you mention "num", the compiler uses an immediate constant, not a reference to a memory address.

What's really interesting is that the compiler lets you store into "num" without throwing a fatal error!



0
 
LVL 3

Expert Comment

by:terageek
ID: 9854864
My best guess is that in the first pass, the compiler sees the constant num is 20, and then uses 20 everywhere it sees num or (int &) num as the input to an expression.  Then in the second pass it is modifying num to be 10, and then using 10 from then on, like when it sees (long &) num, where more work is needed to interpolate the value.

There are 2 problems with this.  First, the compiler is making an optimization that it shouldn't.  Second, you are trying to modify a constant, which should give you a compile error.

1) Do you get the same compiled code if you explicitly turn off all optimizations?

2) How does the compiled code change if you change the order of the cout statements?

3) Why are you trying to change the value of a constant?  Either it shouldn't be a constant, or you shouldn't modify it.
0
 
LVL 19

Author Comment

by:Dexstar
ID: 9859619
Yes, I agree the compiler should give you an error, but it isn't.

1) No, without the optimizations, it looks more like this:
                  const int num = 20;
      00401004  mov         dword ptr [num],14h
                  (int &)num=10;
      0040100B  mov         dword ptr [num],0Ah
                  cout<<(long &)num;
      00401012  mov         eax,dword ptr [num]
      00401015  push        eax  
      00401016  mov         ecx,offset std::cout (42AB04h)
      0040101B  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (4013A0h)
                  cout<<(int &)num;
      00401020  push        14h  
      00401022  mov         ecx,offset std::cout (42AB04h)
      00401027  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (401040h)
                  cout<<num;
      0040102C  push        14h  
      0040102E  mov         ecx,offset std::cout (42AB04h)
      00401033  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (401040h)

It seems to be using a temporary variable for the first call, and then the hardcoded value for "num" for the other 2.

2) It doesn't change much.  I moved the call where it was using the "updated" value to the end, and it STILL uses the updated value.  It always uses 10 for (long&)num and 20 for the other 2.
                  const int num = 20;
                  (int &)num=10;
                  cout<<(int &)num;
      00401385  mov         esi,offset std::cout (428B5Ch)
      0040138A  push        14h  
      0040138C  mov         ecx,esi
      0040138E  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (40129Eh)
                  cout<<num;
      00401393  push        14h  
      00401395  mov         ecx,esi
      00401397  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (40129Eh)
                  cout<<(long &)num;
      0040139C  push        0Ah  
      0040139E  mov         ecx,esi
      004013A0  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (40129Eh)

3) Yes, I know.  It's a purely academic exercise.  :)

0
 
LVL 3

Expert Comment

by:terageek
ID: 9860236
My first explanation makes the most sense to me.  The compiler seems to use a constant 20 in its first pass through the code (even with optimizations off apparently), replacing "num" and "(int &) num" with 20 wherever it can.  In the second pass it is doing some more interpolation.  Without the optimizations, it is actually creating the variable and using it.  With the optimizations turned on, it is realizing that it has just set num to a constant 10, and that is what is used.

I think that you have successfully confused the compiler.  Good job ;-)
0
 
LVL 19

Author Comment

by:Dexstar
ID: 9860383
Okay, so I'm still trying to understand why the compiler sometimes picks 10 and sometimes it picks 20.  So, I did another experiment.

It uses 10 for these:
      cout << (unsigned int&)num;
      cout << (long&)num;
      cout << (unsigned long&)num;
      cout << (short&)num;
      cout << (unsigned short&)num;

It uses 20 for these:
      cout << num;
      cout << (int&)num;
      cout << (const int&)num;
      cout << (const unsigned int&)num;
      cout << (const long&)num;
      cout << (const unsigned long&)num;
      cout << (const short&)num;
      cout << (const unsigned short&)num;

What in the world is going on!?

Here is the code:
            12:             cout << num;
      00401950  push        14h  
      00401952  mov         ecx,offset std::cout (429B5Ch)
      00401957  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (401630h)
            13:             cout << (int&)num;
      0040195C  push        14h  
      0040195E  mov         ecx,offset std::cout (429B5Ch)
      00401963  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (401630h)
            14:             cout << (const int&)num;
      00401968  push        14h  
      0040196A  mov         ecx,offset std::cout (429B5Ch)
      0040196F  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (401630h)
            15:             cout << (unsigned int&)num;
      00401974  push        0Ah  
      00401976  mov         ecx,offset std::cout (429B5Ch)
      0040197B  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (4017C0h)
            16:             cout << (const unsigned int&)num;
      00401980  push        14h  
      00401982  mov         ecx,offset std::cout (429B5Ch)
      00401987  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (4017C0h)
            17:
            18:             cout << (long&)num;
      0040198C  push        0Ah  
      0040198E  mov         ecx,offset std::cout (429B5Ch)
      00401993  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (401630h)
            19:             cout << (const long&)num;
      00401998  push        14h  
      0040199A  mov         ecx,offset std::cout (429B5Ch)
      0040199F  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (401630h)
            20:             cout << (unsigned long&)num;
      004019A4  push        0Ah  
      004019A6  mov         ecx,offset std::cout (429B5Ch)
      004019AB  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (4017C0h)
            21:             cout << (const unsigned long&)num;
      004019B0  push        14h  
      004019B2  mov         ecx,offset std::cout (429B5Ch)
      004019B7  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (4017C0h)
            22:             
            23:             cout << (short&)num;
      004019BC  push        0Ah  
      004019BE  mov         ecx,offset std::cout (429B5Ch)
      004019C3  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (401300h)
            24:             cout << (const short&)num;
      004019C8  push        14h  
      004019CA  mov         ecx,offset std::cout (429B5Ch)
      004019CF  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (401300h)
            25:             cout << (unsigned short&)num;
      004019D4  push        0Ah  
      004019D6  mov         ecx,offset std::cout (429B5Ch)
      004019DB  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (4014A0h)
            26:             cout << (const unsigned short&)num;
      004019E0  push        14h  
      004019E2  mov         ecx,offset std::cout (429B5Ch)
      004019E7  call        std::basic_ostream<char,std::char_traits<char> >::operator<< (4014A0h)
0
 
LVL 17

Expert Comment

by:rstaveley
ID: 9860699
This might be better back in C++... I notice that the C++ reinterpret_cast<> operator doesn't work for the 10s and it does work for the 20s.... HOWEVER.... it makes the 20s 10!!!

//   cout << reinterpret_cast<unsigned int&>(num);
//   cout << reinterpret_cast<long&>(num);
//   cout << reinterpret_cast<unsigned long&>(num);
//   cout << reinterpret_cast<short&>(num);
//   cout << reinterpret_cast<unsigned short&>(num);

     cout << num;
     cout << const_cast<int&>(num);
     cout << reinterpret_cast<const int&>(num);
     cout << reinterpret_cast<const unsigned int&>(num);
     cout << reinterpret_cast<const long&>(num);
     cout << reinterpret_cast<const unsigned long&>(num);
     cout << reinterpret_cast<const short&>(num);
     cout << reinterpret_cast<const unsigned short&>(num);
0
 
LVL 17

Expert Comment

by:rstaveley
ID: 9860700
> doesn't work

I mean won't compile
0
 
LVL 17

Expert Comment

by:rstaveley
ID: 9860714
Presumably this is a consequence of using C-style casts with C++ (as well as tricking the compiler into changing a const).
0
 
LVL 17

Assisted Solution

by:rstaveley
rstaveley earned 250 total points
ID: 9860760
The error messages if you uncomment the 10s are (for example)...

num2.cpp(26) : error C2440: 'reinterpret_cast' : cannot convert from 'const int'
 to 'unsigned int &'
        Reason: cannot convert from 'const int *' to 'unsigned int *'
        Conversion loses qualifiers

...i.e. It is complaining about losing the const.

Presumably the C-style casts on the 10s effectively do the equivalent of the following:

      cout << reinterpret_cast<unsigned int&>(const_cast<int&>(num));
      cout << reinterpret_cast<long&>(const_cast<int&>(num));
      cout << reinterpret_cast<unsigned long&>(const_cast<int&>(num));
      cout << reinterpret_cast<short&>(const_cast<int&>(num));
      cout << reinterpret_cast<unsigned short&>(const_cast<int&>(num));

Perhaps the additional const_cast<> is what tricks the compiler???

0
 
LVL 19

Author Comment

by:Dexstar
ID: 9860975
Yeah, when you use reinterpret_cast<> it ALWAYS uses the "new" value instead of the original value.  I still don't get how the compiler picks which one to use.

Dex*
0

 
LVL 3

Expert Comment

by:terageek
ID: 9861012
Still looks like a multi-step compile issue.  When you declare num as a const int, it will search for all "num" and "(int &) num" and change them to 20 whenever it appears as an input.  Also, when you type cast num as a "const <anything>", the compiler is being told that this is a constant and it is using the "constant value" 20.

In the second pass, the compiler performs the more complex interpolations of changing the int to a long or unsigned int, or using the C-style reinterpret_cast, but this time it uses num as a variable (as it should have in the first place).  It will also generate the code which modifies the constant value.

The code is then optimized when the compiler looks at the assembly and realizes that you have moved a constant into a variable and then use that variable.  The optimizer decides to use the new constant value instead and eliminate the variable.

By the way, I just noticed that you weren't the one to open the original question.  Sorry for questioning "your" intentions of changing a constant.
0
 
LVL 17

Expert Comment

by:rstaveley
ID: 9861059
All but one of the 20s are almost certainly:

//      cout << static_cast<int&>(num);
      cout << static_cast<const int&>(num);
      cout << static_cast<const unsigned int&>(num);
      cout << static_cast<const long&>(num);
      cout << static_cast<const unsigned long&>(num);
      cout << static_cast<const short&>(num);
      cout << static_cast<const unsigned short&>(num);

The only one I can't figure out is:

      cout << (const int&)num;
0
 
LVL 17

Expert Comment

by:rstaveley
ID: 9861173
I've snagged on cout << (const int&)num, but otherwise I reckon the C-casts work as follows:

--------8<--------
#include <iostream>
using namespace std;

int main()
{
      const int num = 20;
      (int &)num=10;

      cout << "\nThese C-casts:\n";
      cout << (unsigned int&)num;
      cout << (long&)num;
      cout << (unsigned long&)num;
      cout << (short&)num;
      cout << (unsigned short&)num;
      cout << '\n';

      cout << num;
      cout << (int&)num;
      cout << (const int&)num;
      cout << (const unsigned int&)num;
      cout << (const long&)num;
      cout << (const unsigned long&)num;
      cout << (const short&)num;
      cout << (const unsigned short&)num;
      cout << '\n';

      cout << "\nAre almost the same as:\n";
      cout << reinterpret_cast<unsigned int&>(const_cast<int&>(num));
      cout << reinterpret_cast<long&>(const_cast<int&>(num));
      cout << reinterpret_cast<unsigned long&>(const_cast<int&>(num));
      cout << reinterpret_cast<short&>(const_cast<int&>(num));
      cout << reinterpret_cast<unsigned short&>(const_cast<int&>(num));
      cout << '\n';
      cout << num;
      cout << static_cast<int&>(const_cast<int&>(num)); // <- Can't get this one!!!
      cout << static_cast<const int&>(num);
      cout << static_cast<const unsigned int&>(num);
      cout << static_cast<const long&>(num);
      cout << static_cast<const unsigned long&>(num);
      cout << static_cast<const short&>(num);
      cout << static_cast<const unsigned short&>(num);
      cout << '\n';

}
--------8<--------
0
 
LVL 3

Expert Comment

by:terageek
ID: 9861330
That line confused me a bit too.

Perhaps the compiler is first trying to do...

    reinterpret_cast<int&>(const_cast<int&>(num))

But it somehow "simplifies it" when it sees that both casts are to the type int&.

    reinterpret_cast<int&>(const_cast<int&>(num))
    reinterpret_cast<const int&>(num)
    static_cast<const int&>(num)
0
 
LVL 3

Expert Comment

by:terageek
ID: 9861446
By the way, if you look back at the original posts, you will see that different compilers gave different results for that line.  Both GCC and Borland C++ will give you 10, but VC7.1 gives you 20.  So it looks like a VC specific optimization.
0
 
LVL 17

Expert Comment

by:rstaveley
ID: 9862901
I meant of course cout << (int&)num, as you quite rightly picked up from the flagged line in the code snippet.

It occurs to me that it could be because the cast mirrors the initial cast used in the LHS of the assignment.

    (int &)num=10;
0
 
LVL 3

Expert Comment

by:terageek
ID: 9863324
Your theory could be tested by trying (long &)num = 10 instead, and see if anything changes.

One way or another, VC is making an optimization by converting (int &)num on the RHS into a static 20 early on while other compilers leave it as a variable, and then optimize the variable out, changing it to the last assigned static value of 10.
0
 
LVL 17

Expert Comment

by:rstaveley
ID: 9868532
> could be tested by trying..

Reunited with VC7.1, I've tried this at last and...it doesn't. I guess VC7.1 is even quirkier than that.

GCC3.1 uses the adjusted value 10 for all of the C-style casts, and only leaves the uncasted cout << num as 20. Curiously, it also differs in its handling of the static_cast<> for...

      cout << static_cast<const int&>(num);

...which it treats as a 10 and VC7.1 treats it as a 20. All the other static_cast<>s get 20 in GCC.

I think this has exhausted my curiosity now. It does make me realise that it would be a tough job to write your own C++ compiler from scratch :-)
0
 
LVL 3

Accepted Solution

by:
terageek earned 250 total points
ID: 9871250
My final answer:

Each compiler has different optimizations which will immediately replace some variables declared as const with a static value despite the fact that there appears to be a sneaky way to change const variables without getting a compiler error.  In the instances that the compiler cannot figure out what static value to use, the compiler will allocate space for a variable and use it, which it may or may not later optimize into a constant.  As a result, sometimes you will get the initialized value, and other times you will get your sneaky modified value.
0
 
LVL 19

Author Comment

by:Dexstar
ID: 9875726
Thank you, rstaveley and terageek and everyone, for helping me work through this.  I guess the final lesson is this:   The compiler is your friend, and you shouldn't play mean tricks on your friends.

Dex*
0
