sieglej


Efficient access to member data

I would like an efficient, generic way to access structure elements within an array of structures that lives in a class.  The elements may be of different types (int, char, etc.).  The typical C++ books show examples that create a Get() and a Set() function for each member variable.

e.g.
int MyClass::GetIntItem(int index)
{
   return DataStuct[index].IntItem;
}
etc.
This seems like a lot of functions if the data you are encapsulating has many discrete elements which need to be accessed individually (a Get and Set function for each element).  It would seem a common problem that folks would have an elegant solution for...  Data encapsulation is a key element of C++, yes?

It's possible to do something like this with lookup tables and a generic function that processes an element based on its type and address, but this seems slow for setting or getting a single element.  This class is also intended to be shared among multiple threads.  I intend to use a CRITICAL_SECTION to safeguard access to each element.

Thanks for your help!

Jeff
nietod

If I understand you correctly, you have an array of structures inside a class, and you want to access the members of different elements of the array.  What I would do is write one procedure that returns a reference to one of the array elements (that is, it returns a reference to a structure), then from there invoke a member procedure of the structure to access the appropriate member.  This means you only have to write a minimal number of procedures.

An example follows.
Say the structure has two members you want to access and two functions used to access them, like

struct Struct
{
   int I;
   char Ch;
   int GetI() { return I; }     // was "return i;", which does not compile.
   char GetCh() { return Ch; }
};

Note that I and Ch are public, so the functions that access them are not necessary, but I like to give you options.

Then you have a class that has an array of these structures and a function that returns one of the structures in the array.  (Note this might be a good time to use operator [], but I used a regular function because it makes a clearer example.)

class Class
{
   Struct Array[100];
public:
   Struct & GetItm(int i) { return Array[i]; }  // no bounds check on i.
};

Now to use this to access a member of a structure in the array you would do.

Class TheClass;
int X = TheClass.GetItm(5).GetI();
char Ch = TheClass.GetItm(37).GetCh();

Since the I and Ch members are public, you don't need to use the GetXX() functions to access them and could do.

int X2 = TheClass.GetItm(15).I;
TheClass.GetItm(11).I  = TheClass.GetItm(3).I + 1;

Let me know if you have questions.
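
For reference, the operator [] variant mentioned above might look roughly like the sketch below.  It reuses the Struct and Class names from the example; the assert-based bounds check, the default member values and the const overload are additions for illustration and were not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>

struct Struct
{
   int I = 0;       // default values are an assumption, added so the sketch is well defined.
   char Ch = '\0';
};

class Class
{
   Struct Array[100];
public:
   // Read/write access: returns a reference to one element of the array.
   Struct & operator[](std::size_t i)
   {
      assert(i < 100);  // debug-only bounds check.
      return Array[i];
   }
   // Read-only access for const objects.
   const Struct & operator[](std::size_t i) const
   {
      assert(i < 100);
      return Array[i];
   }
};
```

With this in place, TheClass[11].I = TheClass[3].I + 1; reads like ordinary array access while the array itself stays private.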

Here's an alternative answer:

>> This seems like a lot of functions if the data you are encapsulating has many discrete elements which need to be accessed individually.

What do you worry about?  Speed?  Code size?
In that case, use inline functions for accessors (get...) and mutators (set...).  The compiler will generate code as if you accessed them individually.

>> It would seem a common problem that folks would have an elegant solution for...

Well, if you're concerned with lots of get...() and set...() methods you *could* make the data items public (however, see below...)

>> Data encapsulation is a key element of C++, yes?

Correct.  That's why you have to ask yourself why you expose the implementation details (variables) at all.

Consider a 2D position class having X and Y coordinates as members.  You could create getX(), getY(), setX() and setY() methods, but that's exposing IMPLEMENTATION details.  What happens when you move to polar coordinates?

Having said that, the most *efficient* implementation of what you want (and screw OO design principles) will be something like:

    class MyClass
    {
    public:
        int& IntItem(int index) { return DataStuct[index].IntItem; } // Return a reference
        //...
    };

Which allows things like:

    MyClass mc;
    //...
    mc.IntItem(200) += 17;

>>Having said that, the most *efficient* implementation of what you
>> want (and screw OO design principles) will be something like

How does that violate OO design principles?
>> How does that violate OO design principles?
Exposing implementation details (cf. the polar coordinates example above).

It is not much better than (assume all members public):

    mc.DataStuct[index].IntItem += 17;

(BTW, the same code will be generated).

The user of the class is aware of the internal members and can access and modify them at will.  Keeping the index in bounds is also the user's responsibility.
It is much better than

  mc.DataStuct[index].IntItem += 17;

As long as changes in the class can be handled in the interface function so that client code doesn't have to be changed, you haven't violated anything.  And that is the case.  You can change the way the data is stored without changing the interface to the class.  If, for example, you changed it so the data was stored in a linked list rather than an array, the interface function could still take the same parameter and look up the item it needed (more slowly, of course).  If you changed it to store the number in a different format (change from int to float, for example), the interface function could return a proxy class.  The proxy class could be converted to an int so that "read uses" of the function would continue to work.  The proxy class could support all of int's operators and then would update the stored float, so that "write uses" of the function would continue to work.  Thus nothing is violated.  (This is true of the solution I presented as well.)
Todd, I won't start a religious war over this but it is my firm belief that accessors and mutators are a sign of bad design in all but the most trivial cases.

That said, I respectfully disagree with your position.  Returning a proxy class can break existing code.  Consider:

    int& ref = mc.IntItem(200);
    // ...
    ref += 17;

How do you handle that?
I believe the following overly simplified proxy class will work to adjust the floating point number when the  "int" that is returned is adjusted. (Of course, I've never done this  : - )   )

class IntProxy
{
   float *FPtr;  // -> float that is being proxied (bad design, but makes for a short example..)
public:
   IntProxy(float *FPtr) : FPtr(FPtr) { }
   int operator += (int i)
   {
      *FPtr += (float) i;
      return (int) *FPtr;
   }
};

Not a religious war?  Would you settle for an atheistic police action?
Todd, check my example again.  The user of the class saves the result in a reference-to-int then, after some time, adjusts the variable via the reference.  This is a certified proxy killer.

sieglej, the arguments between Todd and myself tend to be longish but are usually informative.  I prefer not to take them to private email because I feel that that information should be public.  However, if you feel we "pollute" your question, say so.
>> sieglej, the arguments between Todd and myself tend to be
>> longish but are usually informative.

Well, to me at any rate... touché.

sieglej

ASKER

May I comment before accepting an answer... Thanks for the discussion, BTW.  Back to my main concern, which may be more aesthetics than anything.  That is, my structure will have, say, 50 elements.  The solution seems to always get back to a Get and Set type access function for each element -> this would be 100 functions.  OK, so I can do them inline, etc., but I need to lock the structure, so I was intending something like this for each element:

void CStatus::SetSlotStatus(WORD slot, WORD stat)
{
   EnterCriticalSection(&m_Lock);
   m_pStatus[slot].m_slotstatus=stat;
   LeaveCriticalSection(&m_Lock);
}
WORD CStatus::GetSlotStatus(WORD slot)
{
   EnterCriticalSection(&m_Lock);
   WORD tempstat = m_pStatus[slot].m_slotstatus;
   LeaveCriticalSection(&m_Lock);
   return tempstat;
}

Simple and fairly common, but I look at it and say "garsh, I'm duplicating a lot here to access one stinking element" and now have 100 of these rascals hanging around.  Am I relegated to 100 functions to deal with my 50 elements in an efficient manner?  Where could the code that is common to all of them live (i.e. the Lock/Unlock that needs to occur for all accesses)?

Thanks for the input!
JS
I think it could be done using a proxy class.  The class would lock the structure, then update it, and then unlock it.  I'm not sure if it's worth going to that trouble, but it might be.  But I would like to hear what Alex has to say first, before we go that route.

By the way, what I do to avoid this is I have a utility that goes through my source code looking for class definitions and creates functions for accessing the data members (depending on things that I place inside the class definition).
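
One way to keep the Lock/Unlock in a single place without a proxy is a pair of member templates that select the field through a pointer-to-member.  The following is only a sketch under several assumptions: it needs C++11, it uses std::mutex as a portable stand-in for the CRITICAL_SECTION, and the SlotStatus struct, its m_errorCount and m_mode fields, and kSlots are hypothetical names invented for the example (only m_slotstatus comes from the question).

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical per-slot record; only m_slotstatus appears in the question.
struct SlotStatus
{
   unsigned short m_slotstatus = 0;
   int m_errorCount = 0;
   char m_mode = 'A';
};

class CStatus
{
   static constexpr std::size_t kSlots = 16;  // assumed array size.
   SlotStatus m_status[kSlots];
   mutable std::mutex m_lock;                 // stand-in for CRITICAL_SECTION.
public:
   // One getter for every field: the caller names the field with &SlotStatus::field.
   template <typename T>
   T Get(std::size_t slot, T SlotStatus::*member) const
   {
      assert(slot < kSlots);
      std::lock_guard<std::mutex> guard(m_lock);  // unlocks when guard goes out of scope.
      return m_status[slot].*member;              // the copy is made while locked.
   }

   // One setter for every field.
   template <typename T, typename U>
   void Set(std::size_t slot, T SlotStatus::*member, const U & value)
   {
      assert(slot < kSlots);
      std::lock_guard<std::mutex> guard(m_lock);
      m_status[slot].*member = static_cast<T>(value);
   }
};
```

A call then looks like Status.Set(3, &SlotStatus::m_slotstatus, 7); or WORD s = Status.Get(3, &SlotStatus::m_slotstatus);.  The trade-off is that callers must name the struct's fields, so the struct layout becomes part of the interface, which is the encapsulation concern raised earlier in this thread.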
sieglej

ASKER

I'm not familiar with using proxy classes - are they too ugly for a situation like this?  But I did gloss over one point of your answers (read with comprehension, kids...) and that would be that by returning a reference to the element with a Get...() function you could both get and set an element, thus requiring only one function per element (duh)...
e.g.
int &GetInt(int index) { return Mystruc[index].Int; } // also bury bounds checking and lock functions here...

int main()
{
   int i;
   i = GetInt(3);  // get Int
   GetInt(3) = 5;  // set Int to 5
}

>> by returning a reference to the element with a Get...() function you could
>> both get and set an element, thus requiring only one function per element

That is correct.  But this approach does not provide as much protection when the implementation changes as using separate get/set functions.  The problem is that if you were to change the way the data is stored, you would no longer have something that you could return a reference to.  For example, if you changed from int to float, you would be okay if you had separate get/set functions, like

class C
{
public:
   float UsedToBeInt;
   int GetInt() { return (int) UsedToBeInt; };
   void SetInt(int i) { UsedToBeInt = (float) i; };
};
You see, in this case the int was changed to a float, but as far as code that uses the class is concerned, no change was made, because the GetInt and SetInt functions haven't changed interfaces.

But if you had one function that returned a reference like

class C
{
   int i;
public:
   int & GetInt() { return i; }
};

If you changed i to a float, you would not have an int to return a reference to any longer.  

You can get around this using a fancy technique called proxy classes.  Basically a proxy class is a short-lived object that acts as an interface between a function and the code that calls the function.  In this case the proxy class would pretend to be an int, but would manage to update the float stored in the class.

However, as Alex pointed out, this is not perfect.  It is possible to have code that would not work with the proxy class and you would be forced to change the code.  (In other words, you violated an OO principle because a change in the implementation of the class forced a change in the code that used the class.)  However, I'm pretty sure that A) that isn't likely (the code Alex produced that would break the proxy class was poor code that I would hope would not actually be used) and B) if the code does break the proxy class, it will cause a compiler error, so you don't have to worry about run-time errors lurking in your program.  Although perhaps Alex may correct me on that point.
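
Point B can be checked at compile time.  The sketch below extends the simplified IntProxy from earlier in the thread with an int conversion operator (an addition, not in the original) and uses std::is_convertible from C++11 to show that a proxy converts to int but cannot bind to a non-const int&, so Alex's int& ref example would be rejected by the compiler rather than fail at run time.

```cpp
#include <cassert>
#include <type_traits>

class IntProxy
{
   float *FPtr;  // -> float that is being proxied.
public:
   explicit IntProxy(float *p) : FPtr(p) { assert(p != nullptr); }

   // "Read use": the proxy converts to a temporary int.
   operator int() const { return static_cast<int>(*FPtr); }

   // "Write use": the update goes through to the stored float.
   int operator += (int i)
   {
      *FPtr += static_cast<float>(i);
      return static_cast<int>(*FPtr);
   }
};

// Reads keep compiling.
static_assert(std::is_convertible<IntProxy, int>::value,
              "a proxy can be used where an int value is expected");

// int& ref = proxy; does not compile: the conversion yields a temporary,
// and a temporary cannot bind to a non-const reference.
static_assert(!std::is_convertible<IntProxy, int &>::value,
              "a proxy cannot be bound to int&");
```

One caveat: const int & ref = proxy; does compile, and it binds to a temporary copy, so ref silently keeps the old value when the stored float changes later.  That case is a stale read rather than a compiler error.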

I would be happy to explain a bit about the proxy class, but I would like to hear from Alex first.

One more thing: the single Get function that returns a reference doesn't let you lock and unlock the data.  You could do that if you returned a proxy class, however, so it is not impossible.
sieglej

ASKER

>>One more thing the single Get function that returns a reference doesn't let you lock and unlock the data.  You could do that if you return a proxy class however, so it is not impossible.

Good point.  As Alex mentioned, you can't trust the user to not use the reference again bypassing the access function (including any locks)... I'm curious about proxy functions and then will cease and desist on this question and let you guys get back to your lives...


Thanks for all your great input...
JS
We're programmers; we have no lives.  (Actually Alex is a new father and therefore has less than no life.)

I'll keep it simple.  We'll start with a function that returns a reference to an int member and change it to return a proxy instead.  The purpose of the proxy is to convert from a float to an int and to lock the member while it is accessed.

First the proxy class will be made a friend of the class it works with and will be constructed with a pointer to the object it works on, like

class C;  // forward declaration so CProxy can hold a pointer to a C.

class CProxy
{
   C *CPtr;  // -> object to work with.
public:
   CProxy(C *CPtr) : CPtr(CPtr) {} // Construct for a C class object to work on.
};

class C
{
   float F;
   friend class CProxy;
public:
   CProxy GetInt() { return CProxy(this); }
};

There are two ways that the value returned (before an int, now a CProxy) can be used.  It can be "queried", that is, its value can be taken and used for something, and it can be "updated", that is its value can be altered.  

When the return value is queried, its value is copied to an int or is compared to an int, like

int X = TheC.GetInt(); // query.
if (TheC.GetInt() > 5) { /* ... */ } // query.
SomeFunctionWithIntParameter(TheC.GetInt()); // query.

The proxy class provides an int conversion operator that allows the proxy to be converted to an int.  If the value is queried, this conversion operator will be called, and so the proxy class knows the value was queried.  It will lock the class it is working with, then convert the class's data to an int and return it.  That looks like

class CProxy
{
// existing stuff.
public:
   operator int() { LockClass(); return (int) CPtr->F; }
};

Note, this leaves the class locked.  We'll get to that later.  

The return value can be updated in lots of ways, like

TheC.GetInt() = 5; // = with an int.
TheC.GetInt()++; // ++ operator.

and many more. I'll deal with just the = operator.  The others are handled similarly.

When the return value is updated, things are more complex, because the proxy must detect that the return value (itself) is being altered and it must then make the change to the F member of the C class object it was created for.  If it is altered with the  = operator, the proxy class's = operator will be invoked, the class will lock the data and update the F member like

class CProxy
{
// existing stuff.
public:
   int operator = (int i) { LockClass(); CPtr->F = (float) i; return (int) CPtr->F; }
};

Again, this leaves the data locked.

One last thing, the destructor for the proxy class will unlock the data (if it is locked, it might not be).

class CProxy
{
   bool Lock;  // must start out false; LockClass() sets it to true.
public:
   ~CProxy() { if (Lock) UnLockClass(); }
};

Now how does this actually work?  if you do

int X;
X = TheClass.GetInt();

GetInt() returns a CProxy class object.  Since this object is not saved, it is a temporary object; that is, the compiler creates it and destroys it when it is no longer needed.  Since the proxy object is assigned to an int, its int conversion operator is called.  This conversion operator locks the data and converts the float to an int, which it returns and which is stored in X.  Then, at the end of the statement, the compiler destroys the temporary proxy.  The proxy's destructor unlocks the data.

if you do

TheClass.GetInt() = 6;

then GetInt() returns a temporary proxy again.  The proxy's operator = (int) is called.  This function locks the data and updates the F member of the associated class.  Then the compiler destroys the temporary proxy, causing its destructor to run so the data can be unlocked.

Simple enough?  As you can see, it can get quite involved and is not likely to make your case any simpler.  However, there are times when proxy classes are of great value and can increase efficiency greatly.  I recommend that you get hold of the "Effective C++" and "More Effective C++" books by Scott Meyers.  They deal with these sorts of topics (and few books do).  Each book covers a few dozen topics that tend to be pitfalls that intermediate/advanced C++ programmers should be concerned with.  (They're not for beginners.)
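
Putting the pieces above together, a compilable version of the locking proxy might look like the sketch below.  It is written under assumptions that differ from the fragments above: it needs C++11, it holds a reference instead of a pointer, and it replaces LockClass(), UnLockClass() and the bool Lock flag with a std::unique_lock, which remembers whether it owns the lock and releases it in its own destructor.  std::recursive_mutex stands in for the CRITICAL_SECTION because a critical section can be re-entered by the thread that already owns it.

```cpp
#include <cassert>
#include <mutex>

class C;  // forward declaration so CProxy can refer to C.

class CProxy
{
   C & Obj;                                      // object to work with.
   std::unique_lock<std::recursive_mutex> Lock;  // releases the lock, if held, when the proxy dies.
   void LockOnce() { if (!Lock.owns_lock()) Lock.lock(); }
public:
   explicit CProxy(C & Obj);
   operator int();               // "query" use.
   CProxy & operator = (int i);  // "update" use.
};

class C
{
   float F = 0.0f;
   std::recursive_mutex Mutex;  // stand-in for CRITICAL_SECTION.
   friend class CProxy;
public:
   CProxy GetInt() { return CProxy(*this); }
};

// The proxy starts out unlocked (std::defer_lock) and locks on first use.
inline CProxy::CProxy(C & Obj) : Obj(Obj), Lock(Obj.Mutex, std::defer_lock) {}

inline CProxy::operator int()
{
   LockOnce();
   assert(Lock.owns_lock());  // data stays locked until the proxy is destroyed.
   return static_cast<int>(Obj.F);
}

inline CProxy & CProxy::operator = (int i)
{
   LockOnce();
   Obj.F = static_cast<float>(i);
   return *this;
}
```

Both TheC.GetInt() = 6; and int X = TheC.GetInt(); work as described above.  A statement such as TheC.GetInt() = TheC.GetInt() + 1; creates two proxies that are alive at the same time on one thread, which is why the sketch uses a recursive mutex; with a plain std::mutex that statement would try to lock twice from the same thread.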
sieglej

ASKER

Proxies are an interesting idea for what I'm doing.  I'll have to digest it a bit and doodle with it.  Thanks also for your tips on the books.  I'll relent on this question and bump the points for your perseverance.

Thanks,
JS
Some comments:

1. I prefer to use references instead of pointers.

2. There is a [hideously ugly] technique of automatically generating accessors and mutators.  Consider:

    class C
    {
    public:
        int Get_x() { return x; }
        void Set_x(int val) { x = val; }
        float Get_y() { return y; }
        void Set_y(float val) { y = val; }
    private:
        int x;
        float y;
    };

You can create a macro:

    #define DECLARE(type, name)                      \
    private:                                         \
        type name;                                   \
    public:                                          \
        type Get_##name() { return name; }           \
        void Set_##name(type val) { name = val; }

And then invoke it for each member.  Thus:

    class C
    {
        DECLARE(int, x);
        DECLARE(float, y);
    };

Yuck!
I make it look nicer.  I have a utility that does it.  In my source code I have a pseudo class definition (it's in comments; it's not real C++ code).  This lists the data members with various attributes about them, like whether or not they are public, etc.  I have a utility that searches the source code and creates a header file that has a true class definition generated from this pseudo one.  It also adds a set of accessor and mutator functions as appropriate.  Another thing it does is scan the source code file for the member procedures of the class and add the procedure declarations to the header file.  Thus I never have to worry about making sure my class definition matches my function definitions.  It does lots of other great things too.
sieglej

ASKER

Thanks for the further insights