rickhill11

why "." vs "->"

I'm an old (emphasis on old) K&R guy, so I am well versed in accessing members of structures, unions, and now classes with "." notation vs "->" notation.  I was asked this morning, "Why the difference?"  My answer started with "That's easy," but for every case I could come up with, I ended up deciding that a modern compiler could easily overcome the issue.

So, given a structure or class foo with a single member "a"

I understand that
       struct foo *pfoo, afoo;
       pfoo=&afoo;
       pfoo->a would be the appropriate call

OR
      struct foo afoo;
      afoo.a would be the appropriate call

But why are they separate?  In case 1, why can't the compiler sort out pfoo.a, or in the second, afoo->a?  There must be a case where this behavior would be unacceptable, but I am trying to fathom what it is.
Karrtik Iyer

-> tells the compiler to perform one level of indirection before accessing the member in your example.
Since pfoo is a pointer, it contains only the address of afoo.
Say afoo was created at address 0x1000, so the memory for the foo object is allocated there. When you write afoo.a, the compiler knows it does not need to go to another address to reach the members of foo, because they live in afoo itself.
But pfoo contains the address of afoo. Say pfoo is created at 0x2000 and contains the value 0x1000. When you write pfoo.a, the compiler cannot find the members of afoo starting at 0x2000; it has to follow the stored address to 0x1000 to find them, hence -> is required for pfoo. When you debug the program, look at the value of pfoo: it will equal the address of afoo. Also try printing the size of pfoo versus the size of afoo, and you will see what I am trying to explain.
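The address picture above can be checked directly. This is only a minimal sketch, assuming a hypothetical struct foo with a single int member a as in the question; the function and constant names are invented for illustration.

```cpp
#include <cassert>

// Hypothetical struct matching the question: a single member named a.
struct foo {
    int a;
};

// Writes through the pointer, then reads back through the object itself.
int write_through_pointer(int value) {
    foo afoo{0};
    foo* pfoo = &afoo;            // pfoo holds the address of afoo, not a copy

    pfoo->a = value;              // follow the stored address, then select a
    assert((*pfoo).a == afoo.a);  // p->a is shorthand for (*p).a
    assert(pfoo == &afoo);        // the pointer's value is afoo's address
    return afoo.a;                // no indirection: afoo is the object itself
}

// The size of an address and the size of the struct are separate quantities.
constexpr auto pointer_size = sizeof(foo*);
constexpr auto object_size  = sizeof(foo);
```

Calling write_through_pointer(7) should give back 7, since the write through pfoo lands in afoo.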
Kent Olsen
Hi Rick,

C wasn't designed from the ground up.  It evolved from a now-dead language called B, and B used the asterisk to designate a pointer.  C just carried on that practice.

I've long maintained that modern compilers don't need separate operators for struct and pointer to struct.  And for a single object that's true.  But a single operator can get really messy when you're mixing pointers and structures in an array.  Every reference would have to be explicitly cast to (struct) or (pointer), unless operator precedence was established to designate a default.  And even that could wind up being some very ugly source code!


Good Luck!
Kent
rickhill11

ASKER

Kent,

I'm trying to understand your array example.  Can you be more specific?  Assuming that the array contained some sort of mish mash of data types, then a union or explicit cast would have to be used.

For instance, ((mystruct *)db)->element is required, but why couldn't the compiler sort out ((mystruct *)db).element?  It seems to me that the keepers of the keys either wanted to keep these separate for simply historical reasons, or there is some place where a pointer dereferenced with a ".", or a structure member accessed with a "->", leads to unwanted side effects.  Since the syntax for structures, unions, and classes is so intertwined, the reason could easily be related to any one of the three.

This is not a burning issue, but I just like to understand the "why" of things.

Rick
Since pointers are a source of many code errors, I like the idea that I can tell that a variable is a pointer by its use of p->a or (*p).a. (I also like the idea of having some prefix to indicate pointers or references.)
Start with the basics.  Does a dynamic array contain an array of structs or an array of pointers?  Is data within a struct in an array another struct or is it a pointer to a struct?


It gets ugly....
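For reference, here is a rough sketch of the two layouts being asked about, assuming a hypothetical element type item with one int member. The function names are made up for the example. In both cases the element type is fixed by the declaration, which is what lets a compiler accept one operator and reject the other.

```cpp
#include <cassert>

// Hypothetical element type, for illustration only.
struct item {
    int value;
};

// Array of structs: objs[i] is an item, so '.' selects the member.
int sum_objects(const item objs[], int n) {
    assert(n >= 0);
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += objs[i].value;
    return total;
}

// Array of pointers: ptrs[i] is a pointer to an item, so '->' is needed.
int sum_pointers(const item* const ptrs[], int n) {
    assert(n >= 0);
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += ptrs[i]->value;
    return total;
}
```

Swapping the operators (objs[i]->value or ptrs[i].value) is rejected at compile time with these declarations.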
Phoffric,

I get your point, and agree to a large extent, but it still doesn't answer why the language enforces this.

Rick
Kent,

Ugly it may be, but regardless of the type, whether an array of pointers, or an array of structures, the compiler will throw an error if "->" is used on an element of the array of structures and also if "." is used on an element of an array of pointers.

Take a linked list where you want access to mystru->ptr->ptr->ptr->element.  I have no problem with doing it; I've been doing so for over 30 years, but I still wonder why the compiler can't understand mystru.ptr.ptr.ptr.element.  Again, I'm not arguing the syntax, I am simply trying to explain the "why" to a colleague.
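To make the linked-list chain concrete, here is a small sketch assuming a hypothetical node type whose members are named ptr and element, as in the post. It shows the arrow chain next to the equivalent spelling that uses only "*" and ".", which is what "->" abbreviates.

```cpp
#include <cassert>

// Hypothetical list node; member names follow the post above.
struct node {
    int element;
    node* ptr;   // next node in the list, or nullptr at the end
};

// Arrow form: each '->' dereferences one pointer and selects a member.
int third_after(const node* mystru) {
    return mystru->ptr->ptr->ptr->element;
}

// The same access written with only '*' and '.'.
int third_after_longhand(const node* mystru) {
    return (*(*(*(*mystru).ptr).ptr).ptr).element;
}
```

Both functions walk three links from the starting node and read element from the fourth node; neither checks for a null link, so the list must be at least four nodes long.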
Let's break the question into two parts.

Why did C originally have different syntax for structures and pointers?  Because Dennis Ritchie designed the language that way.  (Though they are closely related, referencing an object by name and referencing it by address really are different things, and C was intended to be an operating system (Unix) implementation language.  As a former O/S designer/developer, trust me when I say that you don't want that kind of ambiguity in the implementation layer.)

Why does the language continue to maintain these items "as is"?  C is perhaps the most widely known, and used, programming language in the world.  Such a drastic change would have to get past all of the C committees.  That certainly hasn't happened yet.


Another area to consider is parameter passing.  Pass by address and pass by value are two entirely different things.  From the beginning, C required that parameter access be an atomic operation.  That is, the parameter had to be accessible via a single instruction so addresses, integers, floats, and char values are valid parameter types.  Structures, unions, strings, etc. are/were not.  Automatically converting a struct to pass the correct type would violate the pass by value / pass by address rules.
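The value/address distinction can be sketched as follows. Note that standard C and C++ do accept a struct as a by-value parameter, so this is an illustration of the difference in behavior rather than of the original restriction. The point type and both function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical struct used only for this illustration.
struct point {
    int x;
    int y;
};

// Pass by value: p is a private copy, accessed with '.'.
// The caller's object is not affected.
void move_copy(point p, int dx) {
    p.x += dx;
}

// Pass by address: p holds the caller's address, accessed with '->'.
// The caller's object is modified.
void move_in_place(point* p, int dx) {
    assert(p != nullptr);
    p->x += dx;
}
```

The operator used inside each function mirrors how the argument arrived: a copy of the object, or the address of the caller's object.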
SOLUTION
sarabande
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
In the beginning there was C. Well, actually, I think it was B, and before B, there was something else, and before that, there was assembly and before that there was machine code.

Now, Ritchie made up C as a convenience for himself and his crew at AT&T (or was it Bell Labs at the time?), and it was good. It was so good that the relatively few programmers in the world at the time adopted it and said it was good. Some may have even been zealots. After all, it was almost assembly language, but a lot easier to manage.

ANSI C came around in 1989. I know that some ANSI/ISO formulations are done by committees consisting of compiler vendor representatives and other people. Adding new features costs these compiler vendors lots of money, not only to implement but also to write test specifications and run the tests.

Members of the committees can make requests for new features and probably there are forums where ordinary computer scientists and application developers can make requests. If enough people make the request, then maybe, if there isn't too much opposition in terms of complexity and costs, the request may go through.

I guess your idea just never got approved, or it was not suggested. After all, part of the K&R philosophy was to keep C simple (and let C++ be the elephant).
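In C++, you could define a.a, (*a).a, and a->a to all return something different, if you wished to be so perverse.
#include <iostream>
class foo{
  public:
  int a;
  foo(int a):a(a){}
  // (*a) yields a brand-new foo whose member is 2
  foo operator *(){
    return foo(2);
  }
  // a-> redirects to a separate foo whose member is 3
  // (a static object, so nothing is leaked)
  foo *operator ->(){
    static foo three(3);
    return &three;
  }
  // used when a foo is streamed as a plain int
  operator int(){
    return 4;
  }
};
int main(){
  foo a(1);
  std::cout << a.a << std::endl;    // 1
  std::cout << (*a).a << std::endl; // 2
  std::cout << a->a << std::endl;   // 3
  std::cout << a << std::endl;      // 4
}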
Ozo,
What about my simple soln
I think my last post answers your question as to the "why" the compiler rejects your alternative form.

But looking back at kdo's post, http:#a41373505 , I realize that I must have only read his last paragraph. I think I duplicated much of what kdo said. So, unless you find some additional information in my last post, you can give points for my last post to kdo.
What about my simple soln
In your simple solution, which would often be preferable to a perverse solution,
 ptrEx->fun() and (*ptrEx).fun() are synonymous, while ptrEx.fun() would be a syntax error.
Just to be clear, my original question was neither a request for a change, nor a suggestion.

I am simply attempting to answer a question that was put to me.  The question was "why does the language enforce the difference between '.' and '->'?"

I don't mean to be overly picky, but none of the answers resonate with me.  This doesn't make them wrong; they simply don't resonate, and I will have trouble passing along information that I am in doubt about.  In K&R 'C', where I started 30+ years ago, it made a lot of sense; however, even then, a person could make the argument that a compiler faced with a pointer to a structure could easily handle something like foo.bar in lieu of foo->bar.  For instance, in the old compilers, passing an argument like char foo[] and then addressing it as *(foo+10) would throw an error.  Now some, maybe all, compilers seem to be comfortable with this.

To me the answer must be simple and be one of:
     1.  The various committees either didn't consider this, or wanted to maintain a strict notion for the programmer of whether he/she was using a pointer or not.  It may very well be as simple as this, but if so, it would be interesting to read the notes from the meeting(s) to understand their logic.
     2.  It was simply too hard to accomplish.  I really doubt this.  There have been many more changes to the language that, to me, seem to be at the margin and that had to be much harder to implement.
     3.  There is some case, without redefining operators, where interpreting foo.bar as if the programmer had written foo->bar, or vice versa, would lead to unexpected or invalid results.  To me this is the most likely option, but I haven't seen a good example.
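One candidate for option 3 comes from C++ rather than C, and it does rely on a library class that overloads operator->, which the list above sets aside. With that caveat: on a std::unique_ptr, "." addresses the smart pointer and "->" addresses the object it owns, so the two operators can reach two different members with the same name. The counter type and the helper names below are hypothetical.

```cpp
#include <cassert>
#include <memory>

// Hypothetical pointee that happens to have its own reset() member.
struct counter {
    int count = 5;
    void reset() { count = 0; }   // zeroes the counter, object stays alive
};

// '->' reaches the pointee: counter::reset() runs, the object survives.
bool arrow_keeps_object(std::unique_ptr<counter>& p) {
    p->reset();
    return p != nullptr && p->count == 0;
}

// '.' addresses the smart pointer: unique_ptr::reset() destroys the pointee.
bool dot_releases_object(std::unique_ptr<counter>& p) {
    p.reset();
    return p == nullptr;
}
```

If "." and "->" were interchangeable, p.reset() and p->reset() could not be told apart here. A raw pointer in plain C has no members of its own, so this particular collision does not arise there.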

Sarabande had a good example when she mentioned variables not beginning with a digit.  However, this might be more related to compiler speed.  Her suggestion, which was echoed by others that it is simply one of the requirements of the syntax, may be true.  Usually though, if you dig deep enough into an issue like this, the underlying reason stands out.  That is what I am trying to discover.

It has been an interesting conversation though.  I'll leave it open for a few more days just to see if anything pops.
I will reverse the question...

Having a.b technically means that you have an address for a, plus the offset and length of the item b.

Having p->b means that p holds another address, and only then is b applied as an offset to that address.

Given a library's source code, how, in your opinion, should the two syntax/semantic variants be unified?
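The address-plus-offset description can be spelled out by hand. This is only a sketch of the idea, assuming a hypothetical struct rec with two int members; it uses the standard offsetof macro, and the function names are invented. Both paths end with the same "base + offset" step and differ only in where the base address comes from.

```cpp
#include <cassert>
#include <cstddef>   // offsetof

// Hypothetical record type for the illustration.
struct rec {
    int a;
    int b;
};

// What a.b amounts to: the object's own address plus a fixed offset.
int load_via_object(const rec& a) {
    const char* base = reinterpret_cast<const char*>(&a);
    return *reinterpret_cast<const int*>(base + offsetof(rec, b));
}

// What p->b amounts to: the address stored in p plus the same fixed offset.
int load_via_pointer(const rec* p) {
    const char* base = reinterpret_cast<const char*>(p);
    return *reinterpret_cast<const int*>(base + offsetof(rec, b));
}
```

In ordinary code one would simply write a.b and p->b; the casts are here only to expose the arithmetic.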
There does not have to be an underlying reason to not do something. The "something" may simply have never been considered. One could look at C++ features (and other questions) and ask why some of those features were not included in C. I am guessing that not all of these features have been considered and then rejected.
One more comment before I consider this horse dead and beaten.

C was developed as the implementation language for Unix, replacing assembly language.  If you think of it in this context, the structure feature would have been implemented to organize key kernel parameters before the need to pass structures (or structure pointers) existed.  The development of task management, device drivers, etc. soon made structure pointers a necessity.  But structures as static entities were already defined.  Ritchie made the decision to use different operators for accessing static structures and dynamic structures, probably because they are, in fact, different operations.
Again, thanks for your thoughtful answers.