Bit fields in C/C++

This is very basic, but I could not find a convincing answer on the internet so far.

I have the test code below -

#include <iostream>

using namespace std;

struct flags {
    unsigned int a : 3;
    unsigned int b : 3;
    unsigned int c : 2;
};

int main()
{
    flags f;
    f.a = f.b = f.c = 4;

    cout << "Size of flags is " << sizeof(f) << endl;
    cout << f.a << " " << f.b << " " << f.c << endl;

    return 0;
}


I get the output below -

Size of flags is 4
0 0 0

If I assign a value less than 4, I get the correct output. It's only from 4 upward that the output is zero. I also get the warning below during compilation with my gcc compiler (probably 4.8.1) -

warning: large integer implicitly truncated to unsigned type [-Woverflow]

My understanding is that since it's an unsigned int, 3 bits can represent values up to 7.

Is that correct? If not, why?

ASKER

I have understood why the output is zero, so there is no need to answer that. I have another question: why is the size of the structure 4 bytes? Why not 1?

There may be unnamed padding at the end of a structure or union.

ASKER

Even in the case of bit fields? Does that mean the allocated size is not reduced as was probably intended?

In many implementations
sizeof(struct flags {
unsigned int a;
unsigned int b;
unsigned int c;
}) would be > 4, so size allocation does seem to be reduced.

ASKER

So 4 bytes is a fair and accurate allocation, even though I am not using, and cannot use, the remaining 24 bits? Isn't that the waste of space I was primarily trying to avoid? Could you please elaborate a little in your answers?

If you are getting a size of 4 for your structure, I suspect you are using a 32-bit machine. Most implementations will pad a structure to be word-aligned, i.e. 32-bit boundaries on a 32-bit machine.

So, the bitfields are probably being packed to use a single byte, but the structure is then being padded to the next 32 bit boundary.

Try creating a structure with a single char in it - what size does the structure report as?

ASKER

Size of struct with 1 char member is 1 on my 64 bit Windows 8 machine.

#include <iostream>

using namespace std;

struct justChar {
    char c;
};

int main()
{
    justChar s;
    s.c = '0';
    cout << "size of struct is " << sizeof(s) << endl;
    return 0;
}


Output -

size of struct is 1

Process returned 0 (0x0) execution time : 0.073 s
Press any key to continue.

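// Maybe you wanted something more like this:
#include <iostream>

using namespace std;

struct flags {
    unsigned char a : 3;
    unsigned char b : 3;
    unsigned char c : 2;
};

int main()
{
    flags f;
    f.c = f.b = f.a = 4;  // a is assigned first, so a and b keep 4; c (2 bits) truncates to 0

    cout << "Size of flags is " << sizeof(f) << endl;
    // Cast to int so the fields print as numbers rather than as characters.
    cout << (int)f.a << " " << (int)f.b << " " << (int)f.c << endl;

    return 0;
}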

ASKER

No, that is not what I want. I want to know exactly how bit fields are treated by the compiler, OS, or machine. Is there really a space-saving advantage in using them? The sizeof results above do not confirm that, and I am looking for a sound explanation.

ASKER

Sorry ozo, I meant to add this to my comment above, but here it is. Changing from int to char as per your suggestion does return a size of 1 byte. But is that because of using char, rather than because the bit fields explicitly total 1 byte? I want to know why the explicit bit-field size is ignored by the system.

on my compiler
struct flags {
unsigned int a : 3;
unsigned int b : 3;
unsigned int c : 2;
};
saves space compared to
struct flags {
unsigned int a;
unsigned int b;
unsigned int c;
};
and
struct flags {
unsigned char a : 3;
unsigned char b : 3;
unsigned char c : 2;
};
saves space compared to
struct flags {
unsigned char a;
unsigned char b;
unsigned char c;
};

but how much space it saves can depend on the compiler implementation.

Bitfield size is not ignored by the compiler, it only packs to the basic type supplied.

So in your original example the structure of 3 ints should have used 12 bytes, but the bitfields have caused it to be packed down to a single int of 4 bytes.

Once changed to a basic type of char, the compiler packs all 3 bitfields in to a single char which only takes a single byte.

ASKER

OK. I have tried multiple data types for the bit fields in the same struct, and it behaves like a union: it takes the basic type if the type is the same for all bit-field members, or the biggest one otherwise, as per my test. Thanks for your answers.

ASKER

I've requested that this question be closed as follows:

Accepted answer: 250 points for Richard Keeble's comment #a40285241
Assisted answer: 250 points for ozo's comment #a40285185
Assisted answer: 0 points for mumbaikar's comment #a40285248

for the following reason:

My comment confirms and ends this discussion with more valuable use cases tested. It proves that a structure with bit fields is treated like a union.

Allocation of bit-fields within a class object is implementation-defined. Alignment of bit-fields is
implementation-defined. Bit-fields are packed into some addressable allocation unit.

In the case of
struct flags {
unsigned int a : 3;
unsigned int b : 3;
unsigned int c : 2;
};
a 4 byte addressable allocation unit was apparently used.

Bit-fields are not treated like union.

ASKER

ozo:

I tried with below struct -

struct flags {
unsigned char a : 3;
unsigned int b : 3;
unsigned long c : 2;
};

And it gave me a size of 8 bytes. It has taken the biggest data type as the size of the struct. Isn't this union behavior?


Hm - I'm not sure I understand what you mean by already defined data size.

Let's say I try the following with the Visual Studio 2010 C++ compiler:

	struct bf1
	{
		char c1 : 1;
		char c2 : 1;
		char c3 : 1;
	};

	struct bf2
	{
		int i1 : 1;
		int i2 : 1;
		int i3 : 1;
	};

	TRACE( "sizeof( bf1 ) = %i\n", sizeof( bf1 ) );
	TRACE( "sizeof( bf2 ) = %i\n", sizeof( bf2 ) );


From your statement above I would expect the sizes to be the same, but in fact the output of this code is

sizeof( bf1 ) = 1
sizeof( bf2 ) = 4


So it seems the type used does make a difference here.

ZOPPO

ASKER

When I said a bit-field struct is treated like a union, I meant that it behaves like a union in that, when multiple types are used, the biggest type is chosen as the container. When no type is given (as suggested by HKDK), int is chosen by default, since only integral types are allowed for bit fields. Also, I am not sure why it is 'wrong' to specify types when all the compilers that I tried and read about today support that. Or maybe it is not good coding practice. Is it?

What I mean by "already defined data size" is that when you declare a bit field, you specify the size of the field in bits.
So why are you trying to add an additional size qualifier?

That's sort of like defining a char like this:
unsigned int char ch = 'A';

Of course the compiler will complain about this type of variable declaration, because you've tried to declare a variable as both an int and a char.

But you're effectively doing the same thing with ...
struct flags{ unsigned int a : 2; };
...because you've used both the 'int' key word and the ':2' bit-field syntax.

However, the compiler allows you to add a char, int, or long to the bit field definitions as a way of overriding the default behavior of using an int.

So the following two declarations result in exactly the same compiled code:
struct flags{ unsigned a : 2; };
struct flags{ unsigned int a : 2; };

But when you use...
struct flags{ unsigned char a : 2; }; //or
struct flags{ unsigned long a : 2; };
... you are requesting the compiler to override the default of using an 'int' to store the bit field and specifying that char or long be used.

In the case of ...
struct flags{
unsigned char a : 2;
unsigned int b : 2;
unsigned long c : 2;
};
... the compiler will use the largest of the different data sizes requested. In other words, rather than using a char, an int, and a long, it starts with a size of long. Of course 'c' is placed in this long, and if 'b' and 'a' can also fit inside the long, they will get included as well within one long.

Actually, I'm wrong about a, b, and c being combined into one long.
When I actually tried that on my compiler, it used 8 bytes to store the above structure. But if I changed all three to long, it used only 4 bytes.

struct flags{
      unsigned long a:2;
      unsigned long b:2;
      unsigned long c:2;
};
int i = sizeof(flags); //i = 4

struct flags{
      unsigned char a:2;
      unsigned int b:2;
      unsigned long c:2;
};
int i = sizeof(flags); //i = 8

IMO there may be reasons why the programmer wants control over the size, e.g. if the structs are going to be saved to a file. If, say, I know I will never need more than 8 bits in a bit field, I would appreciate that it doesn't use 4 bytes and bloat my files; if I now use only 3 bits but can imagine having to add 10 more bits later, I would choose a 4-byte struct now to ensure the file format won't break.

You can also add additional padding by using the bit syntax without using a name:

struct {
  unsigned field1 : 1;  // 1-bit field
  unsigned        : 1;  // 1 bit of padding
  unsigned field2 : 1;  // second 1-bit field
  unsigned        : 5;  // 5 bits of padding
  unsigned field3 : 2;  // 2-bit field
  unsigned        : 6;  // 6 bits of padding
  };


ASKER

ok. So I was wrong when I thought it behaves like union.

So if I make all the bit fields long, they are stored in an INT container even if the total bit-field size is less than 1 byte (and could have fit in a CHAR container). But if I make all the fields char, they are stored in a CHAR container. If no type is given, an INT container is chosen by default. If multiple types are given, the largest type is chosen as the container. Any idea why it behaves this way?

No claim that this will work on your system.
http://msdn.microsoft.com/en-us/library/aa273913(v=vs.60).aspx
I ran the following on Cygwin using g++.

#include <iostream>

using namespace std;

#pragma pack(push, 1)
struct flags {
    unsigned int a : 3;
    unsigned int b : 3;
    unsigned int c : 2;
};

int main()
{
    flags f;
    f.a = f.b = f.c = 3;  // reduced to 3 from 4 since 4 does not fit into 2-bits

    cout << "Size of flags is " << sizeof(f) << endl;
    cout << f.a << " " << f.b << " " << f.c << endl;

    return 0;
}


Output:

$ ./a
Size of flags is 1
3 3 3


@HooKooDooKu
Could you justify your claim that others are wrong by quoting from the standard?

BTW - using bit fields within a program is OK. It is not necessarily faster and could be slower due to the bit-twiddling involved. Bit-field usage is often not portable.

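#include <iostream>

using namespace std;

struct bitflags {
    unsigned char a : 3;
    unsigned char b : 3;
    unsigned char c : 3;
};
union unionflags {
    unsigned char a : 3;
    unsigned char b : 3;
    unsigned char c : 3;
};

int main()
{
    {
        // struct: each field has its own bits, so changing b leaves a and c alone
        bitflags f;
        f.c = f.b = f.a = 4;
        f.b = 2;

        cout << "Size of bitflags is " << sizeof(f) << endl;
        cout << (int)f.a << " " << (int)f.b << " " << (int)f.c << endl;
    }
    {
        // union: all members share the same bits, so changing b changes a and c too
        // (reading a member other than the last one written is formally undefined in C++,
        // but it shows the shared storage on common compilers)
        unionflags f;
        f.c = f.b = f.a = 4;
        f.b = 2;

        cout << "Size of unionflags is " << sizeof(f) << endl;
        cout << (int)f.a << " " << (int)f.b << " " << (int)f.c << endl;
    }
    return 0;
}
//Size of bitflags is 2
//4 2 4
//Size of unionflags is 1
//2 2 2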

ASKER

@Ozo:
What are you trying to demonstrate with that code? Please explain. Thanks. I think in a union we can only access one member at a time, and the most recently written one is what is stored in memory. In this case, f.b.

@phoffric:
The pragma didn't work on my system. I am using the GNU g++ 4.8.1 compiler with the Code::Blocks IDE on a 64-bit Windows 8 machine.

Demonstrating differences between union and bit-field
On my system, CHAR_BIT==8, so 9 bits requires another storage unit

When you say that the #pragma did not work, did it not compile or was the size still 4?

Looks like I am using g++ 4.8.2. See if this works.
#pragma pack(1)

More complete documentation for your 4.8.1 compiler for packing is here:
https://gcc.gnu.org/onlinedocs/gcc-4.8.1/gcc/Structure_002dPacking-Pragmas.html#Structure_002dPacking-Pragmas

ASKER

It compiled but size remained 4. Let me try without push.

@HooKooDooKu
Could you justify your claim that others are wrong by quoting from the standard?
"Wrong" could be considered too strong a word... but it is a simple fact that keywords like char, int, long, and others are not needed to define bit fields.

And in this discussion... their use seems to be adding to confusion rather than clarifying anything.

The fact that these keywords are allowed means the programmer is being given some freedom in helping to determine how much memory is allocated to hold the bit fields. (But then again, there was a good reason why, years ago, an author wrote a book about C++ programming and entitled it "Enough Rope to Shoot Yourself in the Foot".)

I can find no good reason to utilize these additional keywords except to save some memory allocation if the sum of the bit fields will fit in a variable type that is smaller than the default int.

So I could understand anyone making an argument for something like...
struct flags {
unsigned char a : 3;
unsigned char b : 3;
unsigned char c : 2;
};

because that would allow the compiler to only use 1 byte of memory for the bit fields rather than the default of 4 bytes (on a 32-bit compiler).

It would be harder to justify using something like ...
struct flags{
unsigned long long a : 3;
unsigned long long b : 3;
unsigned long long c : 2;
};

It would be harder to justify using something like ...
struct flags{
unsigned long long a : 3;
unsigned long long b : 3;
unsigned long long c : 2;
};

Why is it hard to justify? It documents the intended usage, as in:
unsigned long long ull = flags.a;

No one claimed that key words like char, int, long, are needed to define bit fields
but since "unsigned" by itself means “unsigned int”, it seems clearer to be explicit.

ASKER

Mine is a 64-bit compiler, and the default is 4 bytes when no explicit type is specified for the bit fields.
Also, the pragma without push gives me the same 4 bytes.

Why is it hard to justify? It documents the intended usage, as in:
unsigned long long ull = flags.a;

char ch = 1;
int i = ch;

...That's conceptually the same thing as...

struct {
unsigned a : 2;
}x;
x.a = 1;
unsigned long long ull = x.a;

In the 1st case, you're taking an 8 bit value and copying it to a 32 bit value (assume 32 bit compiler).
In the second case, you're taking a 2 bit value and copying it to a 64 bit value.

What have you gained by using...
struct{
unsigned long long a : 2;
}x;
...? Other than that you have now allocated 8 bytes of memory to store a 2-bit value rather than 4 bytes of memory to store a 2-bit value.

But if you use...
struct{
unsigned char a : 2;
}x;
x.a = 1;
unsigned long long ull = x.a;
...you are still copying a 2 bit value to a 64 bit value, but you've saved some memory by only setting aside 1 byte to store x rather than 4 (default) or 8 (if you define 'a' with a long long prefix).

Sometimes using 8 bytes rather than 4 bytes can be useful for alignment purposes.
And even in situations where you prefer using less memory,
some implementations may use less memory for
struct flags {
unsigned long long a : 17;
unsigned long long b : 17;
unsigned long long c : 17;
unsigned long long d : 17;
unsigned long long e : 17;
};
than for
struct flags {
unsigned int a : 17;
unsigned int b : 17;
unsigned int c : 17;
unsigned int d : 17;
unsigned int e : 17;
};
because with 32 bit ints it may put each field into a separate storage unit
rather than fitting more fields into a larger storage unit.

I've managed to somewhat figure out what is going on...

When you designate bit fields, the compiler will try to combine adjacent bit fields that share the same data type.

So the sizeof() the following two structures is identical...
struct flags{
unsigned char a:2;
unsigned char b:2;
unsigned int c:2;
unsigned int d:2;
unsigned char e:2;
unsigned char f:2;
};
struct f{
unsigned char a;
unsigned int b;
unsigned char c;
};

What that exact size is depends upon how your compiler aligns data within a structure. In my case, everything gets aligned on 4-byte boundaries, so the compiler allocates 4 bytes for the 1st char, 4 bytes for the int, and another 4 bytes for the 2nd char.

However, if you define a structure that will fit within a byte, it seems that the compiler can decide to ignore the 4 byte alignment and allow you to create an array of that structure where each successive byte is the next element of the array.

@mumbaikar,
If you give me your OS (including release), and the exact command line used to compile/build the sample program I provided using the pragma pack approach, if I have that lying around (or can download), then maybe I can get to the bottom of why you still get 4 bytes when I got one byte size.

Depending on how you build, there are hidden flags that may affect the compilation.

mumbaikar seems to have gotten one byte size in http:#a40285235

ASKER

My system Info -

Hardware information:
CPU Type: Intel(R) Core(TM) i3-3217U CPU @ 1.80GHz
Number of Processors : 2
Number of Cores: 4
Total System Memory: 3.8821GB
OS : Windows 8 Single Language, 64-bit Operating System, x64 based processor

Data Type Size on this Machine
----------------------------------------------------
CHAR 1
SHORT 2
INTEGER 4
LONG 4
FLOAT 4
DOUBLE 8
LONG LONG 8
LONG DOUBLE 16
DOUBLE LONG 16

Compiler Info - MinGW-w64 - for 32 and 64 bit Windows
Downloaded a month ago from the link below to integrate with Code::Blocks -
http://sourceforge.net/projects/mingw-w64/

The compiler flags for this application that I am aware of are -
-c = compile
-o = object file
-g = debugging info
-Wall = Warnings
-std=c++11 = C++11 support

Can you write down the exact command line used to build and run the small program in http:#a40285873 that you ran with and without the push and still got size 4? I will try to download your environment as soon as I get time. (Or if another expert has it already, and can get a size of 1 from that program, or explain why the pack did not work, then that would also work.)

If you are actually testing in a different program, then try the simple program that I posted. Be sure to include in the command line, the entire filename with extension.

ASKER

I tested it in an isolated single program, as given below -

#include <iostream>

using namespace std;

#pragma pack(1)          //Tried #pragma pack(push,1) but same result

struct flags {
    unsigned int a : 3;
    unsigned int b : 3;
    unsigned int c : 2;
};

int main() {
    flags f;

    f.a = 1;
    f.b = 1;
    f.c = 1;

    cout << "Size of flags is " << sizeof(f) << endl;
    cout << f.a << " " << f.b << " " << f.c << endl;

    return 0;
}


Output -

Size of flags is 4
1 1 1

Process returned 0 (0x0) execution time : 0.115 s
Press any key to continue.

Here is the compiler command line -

x86_64-w64-mingw32-g++.exe -Wall -fexceptions -std=c++11 -Wall -g -pg -c C:\Mumbaikar\CPP\BitFields.1\main.cpp -o obj\Debug\main.o

Linker command -

x86_64-w64-mingw32-g++.exe -o bin\Debug\BitFields.1..exe obj\Debug\main.o -pg -lgmon

I ran your commands in Cygwin, but still got size of 1 using #pragma pack(1).
Does anyone here have mingw32 for trying those commands on your systems? If not, I will have to install it when I get time and see for myself what is going on.

ASKER

@phoffric
Any update on this please?

ASKER

Appreciate everyone's efforts for nailing this down to the last bit!