Union initialization

Hello,

My problem is the following.
I want to allow static initialization of an array of records.
Each record has one field that is a union of different types.

Its structure is the following:

typedef struct {
    char* name;
    int type;
    union {
        int x;
        char* str;
        double flt;
    } val;
} fieldRec;

fieldRec recordDef[] = {
    { "field1", intType, 3 },
    { "field2", strType, "a string" },
    { "field3", fltType, 0.123456789 }
};

This is wrapped inside a macro, but this is what is effectively written.

The problem is that a union can only be statically initialized through its first member.
Casting to the first type is not a good solution.
It may work with pointers on most computers, but it will fail with double, since the cast performs a value conversion rather than just reinterpreting the bits.
(I use __int64 under Visual C++.) The value given in the example will be converted to zero.
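
To make the conversion problem concrete, here is a minimal sketch in C. The union mirrors the one in the question; the names `fieldVal`, `converted` and `stored_int` are made up for this illustration. It only shows that casting the double to the first member's type stores the integer part, so 0.123456789 ends up as 0.

```c
typedef union {
    int    x;      /* first member: the only one a C89 initializer can set */
    char  *str;
    double flt;
} fieldVal;

/* The cast converts the value; it does not reinterpret the bits of the double. */
static fieldVal converted = { (int)0.123456789 };

/* Returns what the initializer above actually stored in the first member. */
int stored_int(void)
{
    return converted.x;
}
```

Calling `stored_int()` returns 0: the fractional value is lost before it ever reaches the union.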

My objective is to provide an easy and intuitive way to define and initialize this field definition vector.
I want this initialization to be static because there are plenty of them.

Thus I would like to avoid constructors if possible, and a solution that is also compatible with C would be best.

Using a macro to wrap the field declaration is OK. It would be good to avoid requiring the user to cast the given value.
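
As a hedged aside, the macro wrapper described above could look like the following sketch, assuming a compiler that supports C99 designated initializers (which is newer than the compilers discussed in this thread). The macro names `INT_FIELD`, `STR_FIELD` and `FLT_FIELD` and the enum values are assumptions made for illustration, and the string members are declared `const char *` here.

```c
/* Assumed type tags; the thread never shows how these are defined. */
enum { intType, strType, fltType };

typedef struct {
    const char *name;
    int type;
    union {
        int x;
        const char *str;
        double flt;
    } val;
} fieldRec;

/* Hypothetical wrapper macros: each one names the union member it fills,
   so no cast is needed and no value conversion takes place. */
#define INT_FIELD(n, v) { (n), intType, { .x   = (v) } }
#define STR_FIELD(n, v) { (n), strType, { .str = (v) } }
#define FLT_FIELD(n, v) { (n), fltType, { .flt = (v) } }

/* Plain static data: no constructor and no assignment statements. */
static const fieldRec recordDef[] = {
    INT_FIELD("field1", 3),
    STR_FIELD("field2", "a string"),
    FLT_FIELD("field3", 0.123456789)
};
```

The user picks the macro instead of passing a type tag, which keeps the tag and the initialized member from drifting apart.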

Any suggestions?

The points will be increased if the answer satisfies all requirements.

Mirkwood

Sorry. Read the KB article below

INFO: Initializing Unions Initializes First Member of the Union
Last reviewed: September 2, 1997
Article ID: Q47693
The information in this article applies to:
Microsoft C for MS-DOS, versions 6.0, 6.0a, 6.0ax
Microsoft C/C++ for MS-DOS, version 7.0
Microsoft Visual C++ for Windows, versions 1.0, 1.5, 1.51, 1.52
Microsoft Visual C++ 32-bit Edition, versions 1.0, 2.0, 2.1, 4.0, 5.0

SUMMARY
When initializing a union, the initialization value is applied to the first member of the union even if the type of the value matches a subsequent member. As stated in the ANSI Standard, Section 3.5.7:

A brace-enclosed initializer for a union object initializes the
member that appears first in the declaration list of the union
type.

Because you cannot initialize the value of any member of a union other than the first one, you must assign their values in a separate statement. Initializing a union with a value intended for a subsequent member causes that value to be converted to the type of the first member.

MORE INFORMATION
The following example demonstrates the issue:

Sample Code

/* Compile options needed: none
*/

#include <stdio.h>

union {
    int a;
    float b;
} test = {3.6};  /* This is intended to initialize 'b' */
                 /* however, the value will be converted */
                 /* (first to a long and then to an int) */
                 /* in order to initialize 'a'. */

void main (void)
{
    float dummy = 0.0;  /* This causes the floating point */
                        /* math package to be initialized. */
                        /* Not necessary with VC++ for */
                        /* Windows NT. */

    printf ("test.a = %d, test.b = %f\n", test.a, test.b);
}

The output from the example, though not what is intended, is as follows:

test.a = 3, test.b = 0.000000

To associate a value with "b", you can reverse the order of the members, as in the following:

union {
    float b;
    int a;
} test = {3.6};

Or, you can retain the order of the elements and assign the value in a separate statement, as in the following:

test.b = 3.6;

Either of these methods creates the following output:

test.a = 26214, test.b = 3.600000

Under Windows NT, the output would be as follows:

test.a = 1080452710, test.b = 3.600000
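
The two values of test.a quoted above can be cross-checked with a small sketch. It assumes a 32-bit IEEE 754 float and, for the 16-bit case, a little-endian layout in which a 16-bit int overlays the low half of the float; the function names `float_bits` and `low_half` are invented for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a float (assumes a 32-bit IEEE 754 float). */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* copy the bytes, no value conversion */
    return bits;
}

/* What a 16-bit little-endian int sharing storage with the float would read. */
uint16_t low_half(uint32_t bits)
{
    return (uint16_t)(bits & 0xFFFFu);
}
```

Under those assumptions, `float_bits(3.6f)` is 0x40666666, which is 1080452710 in decimal, and its low half 0x6666 is 26214.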

REFERENCES
For examples and explanation of possible compiler errors and warnings generated when attempting to initialize a non-primary union element, please see the following article in the Microsoft Knowledge Base:

ARTICLE-ID: Q39910
TITLE : PRB: Initializing Non-Primary Union Element Produces Errors
Keywords : CLngIss kbcode kbfasttip
Version : MS-DOS:6.0,6.00a,6.00ax,7.0; WINDOWS:1.0,1.5,1.51,1.52; WINDOWS NT:1.0,2.0,2.1,4.0,5.0
Platform : MS-DOS NT WINDOWS
Issue type : kbinfo


nietod

It can't be done that way. But there is a way to reach your goal.

Use a class instead of a struct (actually not necessary, you can do it with a struct) and provide constructors for each of the cases you want to support. Those constructors can perform the initialization.

nietod

It would be used like

FieldRec recordDef[] = {
    FieldRec( "field1", intType, 3 ),
    FieldRec( "field2", strType, "a string" ),
    FieldRec( "field3", fltType, 0.123456789 )
};

Not quite as convenient, but pretty good. Let me know if you have any questions.

meessen

ASKER

Mirkwood, I rejected your answer not because what you say is wrong, but because it does not answer my question. I already knew what your answer says about the static initialization constraint on unions.

In your document it said the following:

"...
Or, you can retain the order of the elements and assign the value in a separate statement, as in the following:
test.b = 3.6;
."

Just to be sure: this is an instruction, not a static initialization?
I don't want instructions because there might be hundreds (up to one thousand) of such vectors defined in a program. Initializing all of these with instructions would make program initialization heavy.

I thought static initializers would be much more efficient, since they can all be copied into memory in one block.

About using classes
-----------------------------
This looks like a good solution.

However, the solution will not be C compatible. And how is the class initialized? Will it be a static initialization?

Suppose I define the constructors as

field( char* name, int type, char* str ) : name(name), type(type) { val.str = str; }
field( char* name, int type, int x ) : name(name), type(type) { val.x = x; }

where val is a union inside the structure or class, with one member named str of type char* and another member named x of type int.

Will this be equivalent to a static initialization?

Mirkwood

Nope, what you describe is code and code is executed while initialization is just there. Code cannot be outside a code block so that won't work.

nietod

>> The solution will not be C compatible and how is the class initialized ?
>> Will it be a static initialization
Nope. Constructors are performed only at run time.

meessen

ASKER

Related question

If I define a global

int ar[] = {1,2,3,4,5,6,7,8,9 };

Will instructions be executed to initialize each of these globals?
In VC++ the debugger steps to each global initialization as if it were an instruction.
Does it mean that each initialization is equivalent to an instruction, or will all these variables be initialized with a single block copy into memory? I know that this is how it worked with older Mac compilers, and it may work like this with Unix compilers: this is the data block in the binary file.

If this is equivalent to an instruction, it may be wiser to avoid static initialization, because there are many of these definitions and most of the time only a few of them are actually used. In this case a just-in-time initialization would be more appropriate.

The question would then move to :

What macro should I define in order to wrap the definition of a function that initializes the array, which might also be dynamic?

The function might look like this

fieldRec* myArray(){
    static fieldRec* a = NULL;
    /* already initialized, or allocation failed: return what we have */
    if( a || !(a = malloc( X )) ) return a;
    a[0].name = "myName";
    a[0].type = 1;
    a[0].val.flt = 0.452;
    ....
    return a;
}

But I would like the user to write and see only this

MESSAGEDEF(NAME){
FIELDDEF(TYPE,VALUE);
FIELDDEF(TYPE,VALUE);
FIELDDEF(TYPE,VALUE);
FIELDDEF(TYPE,VALUE);
} ENDMESSAGEDEF;

where the FIELDDEF entries and the value types may vary from message to message.

nietod

>>If I define a global

>> int ar[] = {1,2,3,4,5,6,7,8,9 };

>> Will there be instructions executed to initialize each of these globals ?
That is implementation defined (meaning that it is up to the compiler manufacturer and we cannot say). However, I can't imagine that any compiler will not have the information pre-initialized in a release version. I can't promise you that, but I would count on it. In release builds, VC currently does have static data pre-initialized. I'm surprised it doesn't in debug builds. Now, this only applies to POD (plain old data), that is, data that does not have a constructor. If there is a constructor, it will be run at load time.

nietod

>> If this is equivalent to an instruction, it may be wiser to
>> avoid static initialization because there are
>> many of them and most of the time only a few number
>> of them are effectively used. In this case
>> a Just In Time initialization would be more appropriate.
You can safely use static initialization for POD. Reserve just-in-time initialization for classes with constructors, and even then only if the initialization process is really long. Have you tried any timing studies to see whether this is a problem? You may be worried about something that is going to be incredibly fast.

A Pentium should be able to initialize a memory location with an immediate value in a single clock cycle. There is not likely to be any wait for EA (effective address) calculation in these cases, but there may be a wait for the bus (not usually) and there may be a wait if the data is aligned wrong (again, that should not happen), so let's say the worst case is 3 clock cycles (1 is probably far more likely). Then on a 300 MHz computer you should be able to initialize 100 million integers in a second. You have to share CPU time with other programs and the OS, so let's cut that down to 25 million. Isn't that sufficient for your needs? And that's a worst case. I expect the average to be far better.
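
A timing study of the kind suggested above could start from a sketch like this one. It is only an illustration: the function names are invented, `clock()` measures CPU time at whatever resolution the platform offers, and the figures it produces depend on the machine and compiler settings.

```c
#include <stdlib.h>
#include <time.h>

/* Back-of-the-envelope estimate: stores per second for a given clock rate
   and an assumed number of cycles per store. */
double estimated_stores_per_second(double clock_hz, double cycles_per_store)
{
    return clock_hz / cycles_per_store;
}

/* Store a constant into n ints and return the elapsed CPU time in seconds,
   or -1.0 if the buffer cannot be allocated. */
double time_int_stores(size_t n)
{
    /* volatile keeps the compiler from removing the stores */
    volatile int *buf = malloc(n * sizeof *buf);
    clock_t start, stop;
    size_t i;

    if (!buf) return -1.0;
    start = clock();
    for (i = 0; i < n; i++)
        buf[i] = 42;
    stop = clock();
    free((void *)buf);
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

With the numbers from the post, `estimated_stores_per_second(300e6, 3.0)` gives 100 million, and `time_int_stores` lets that estimate be compared against a measurement on the actual target machine.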

meessen

ASKER

I didn't try any timing studies. Your numbers are reassuring.

Yesterday I implemented a version using dynamic initialization.
The definition macro now defines a function.
It is written like this

defPtr myMessageDef(){
    static defPtr it = NULL;
    if( it ) return it;
    it = malloc(.....);
    ...

It works well. A definition now looks like

MR_DEF( myMessage )
MR_CST( field1, mr_INT, 12345 );
MR_CST( field2, mr_DBL, 0.12345 );
MR_CST( field3, mr_STR, "This is a string");
MR_DEF_END;

I now need this MR_DEF_END because I have to close the brace and return the initialized dynamic block.

The good news is that I only initialize when I really need it, and that I can now also add a hash value of the definition, which might be useful to distinguish between historical variations of the same message definition across compiled programs of different generations. This was a problem left open by using static initializers.
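
For readers who want to see how such macros might expand, here is one possible sketch in C. It is not the actual implementation, which the post does not show: the `msgDef` layout, the fixed capacity `MR_MAX_FIELDS`, the `set_mr_*` helpers and the enum values are all assumptions made for illustration, and the hash value mentioned above is left out.

```c
#include <assert.h>
#include <stdlib.h>

enum { mr_INT, mr_DBL, mr_STR };   /* assumed type tags */
#define MR_MAX_FIELDS 16           /* assumed fixed capacity, keeps the sketch short */

typedef struct {
    const char *name;
    int type;
    union { int x; const char *str; double flt; } val;
} fieldRec;

typedef struct {
    size_t count;
    fieldRec fields[MR_MAX_FIELDS];
} msgDef, *defPtr;

/* One setter per type tag; MR_CST picks one by pasting the tag name. */
static void set_mr_INT(fieldRec *f, int v)         { f->val.x = v; }
static void set_mr_DBL(fieldRec *f, double v)      { f->val.flt = v; }
static void set_mr_STR(fieldRec *f, const char *v) { f->val.str = v; }

/* Opens a function that builds the definition on first call only. */
#define MR_DEF(NAME)                          \
    defPtr NAME(void) {                       \
        static defPtr it = NULL;              \
        fieldRec *f;                          \
        if (it) return it;                    \
        it = calloc(1, sizeof *it);           \
        if (!it) return NULL;

/* Appends one field and stores the value in the matching union member. */
#define MR_CST(FIELD, TYPE, VALUE)            \
        assert(it->count < MR_MAX_FIELDS);    \
        f = &it->fields[it->count++];         \
        f->name = #FIELD;                     \
        f->type = TYPE;                       \
        set_##TYPE(f, VALUE)

/* Returns the finished block and closes the function body. */
#define MR_DEF_END                            \
        return it;                            \
    }

MR_DEF( myMessage )
    MR_CST( field1, mr_INT, 12345 );
    MR_CST( field2, mr_DBL, 0.12345 );
    MR_CST( field3, mr_STR, "This is a string" );
MR_DEF_END
```

In this sketch MR_DEF_END is written without a trailing semicolon, because a stray semicolon after a function body is rejected by strict ISO C compilers; many compilers accept it anyway. Calling `myMessage()` twice returns the same block, since the second call finds `it` already set.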

So now I'm happy with the solution I came up with. I would like to be honest and share the points between the people who offered their help and spent some precious time on this for me.

The same question is posted on the C list. Nietod, you can answer here and get the points; if Mirkwood answers on the C list, he gets the points there.
I'm not satisfied at all with the answers I got on the C list.

ASKER CERTIFIED SOLUTION

nietod
