padding tradeoff

Compilers pad structure members to improve performance at the expense of memory (it's a well-known trade-off).


    #include <stdint.h>

    typedef struct {
          uint32_t a;   // 4 bytes
          uint64_t b;   // 8 bytes
    } TEST;

On a 32-bit machine, sizeof(TEST) will typically be 12 (uint64_t is only 4-byte aligned there).
On a 64-bit machine, sizeof(TEST) will be 16, because 4 bytes of padding are added after member a so that b starts on an 8-byte boundary.
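A quick way to confirm these numbers on your own target (a minimal sketch of my own, not part of the original post):

    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint32_t a;   // 4 bytes
        uint64_t b;   // 8 bytes
    } TEST;

    int main(void)
    {
        /* Typically prints 12 on a 32-bit ABI where uint64_t is 4-byte
           aligned, and 16 on a 64-bit ABI where it is 8-byte aligned. */
        printf("sizeof(TEST) = %zu\n", sizeof(TEST));
        return 0;
    }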

1) What is the default behavior of the compiler in gcc? Will it add padding by default, or will it compile it as a packed structure without padding? Is there a #define to control this behavior?

2) Let's say it does add padding: does it zero those extra padded bytes? How would the runtime know how much data to read for member "a", and what the starting address of member "b" is, given that 4 bytes of padding follow "a"?
jkr Commented:
The default behaviour is to add padding - if you don't want that, you can turn it off or fine-tune the behaviour by using '-fpack-struct[=n]' (see the GCC documentation for details).
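That flag applies to every struct in the translation unit. A per-type alternative in GCC is the packed attribute; here is a minimal sketch of my own (not from the thread) comparing the two layouts:

    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint32_t a;
        uint64_t b;
    } Padded;                             /* typically 16 bytes on x86-64 */

    typedef struct {
        uint32_t a;
        uint64_t b;
    } __attribute__((packed)) Packed;     /* 12 bytes, no padding */

    int main(void)
    {
        printf("padded: %zu, packed: %zu\n", sizeof(Padded), sizeof(Packed));
        return 0;
    }

Note that accessing members of a packed struct may be slower (or even fault on some architectures) because they can end up misaligned, which is exactly the trade-off the question describes.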

And to address the 2nd part of your question: the runtime does not care whether *your* structs are padded or not, since it is solely your code (compiled by gcc/g++) that accesses them, and every piece of code that deals with these structs has the correct member offsets baked in by gcc/g++ at compile time.
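To make that concrete (my own sketch, not from the thread): offsetof() shows the fixed offsets the compiler chose, and member access is simply generated as "base address + fixed offset". As for the zeroing question, padding bytes are not guaranteed to hold any particular value unless you zero the whole object yourself (e.g. with memset or calloc).

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint32_t a;   // 4 bytes
        uint64_t b;   // 8 bytes
    } TEST;

    int main(void)
    {
        /* offsetof() is a compile-time constant; the compiler already
           decided where each member lives. */
        printf("offsetof(TEST, a) = %zu\n", offsetof(TEST, a));  /* 0 */
        printf("offsetof(TEST, b) = %zu\n", offsetof(TEST, b));  /* 8 on a 64-bit ABI */
        return 0;
    }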
perlperl (Author) Commented:
I am a little confused.
So basically, if we don't specify any option to gcc at compile time, it will add padding to struct members for performance optimization. Correct?
Yes, that's right. IMO turning off padding nowadays only makes sense on embedded systems with extremely low amounts of memory, and you'll hardly encounter these "in the wild" any more.
To reduce padding in large structures, cluster the largest data types first (e.g., double), followed by a cluster of the next largest data type (e.g., float or long), and so on down to char.

One area where you may want to add a lot of padding is in a multithreaded program where locking and shared data variables sit near each other. You may want to add, say, char padding[128], between the locking variable (e.g., semaphore or mutex) and the shared data variable so that the synchronization variables are in a different cache line than the shared variable. Then, when one thread modifies the shared variable, the other threads do not have to invalidate their cache lines for the synchronization variable.
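A minimal sketch of both techniques (my own example; the 128-byte figure simply mirrors the suggestion above, and the real cache-line size is platform dependent):

    #include <pthread.h>
    #include <stdint.h>

    /* 1) Largest members first: only trailing padding is needed. */
    struct ordered {
        double   d;   /* 8 bytes */
        uint32_t u;   /* 4 bytes */
        char     c;   /* 1 byte  */
    };

    /* 2) Keep the lock and the data it protects on different cache
       lines, so a thread that writes 'counter' does not force other
       threads to re-fetch the line holding 'lock'. */
    struct shared_state {
        pthread_mutex_t lock;
        char            pad[128];   /* separates lock from shared data */
        uint64_t        counter;    /* shared data protected by lock */
    };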