Background: The compiler may insert padding into a struct to align its members. As a result, the sizeof the struct can be larger than the sum of the sizes of its members. Reordering the members so they pack better can remove the need for this padding and make the struct smaller, saving memory. I need to get those memory savings.
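To make the effect concrete, here is a minimal sketch. The struct and function names are hypothetical, and the actual sizes depend on the target ABI (on many 32-bit and 64-bit targets with a 4-byte `int` the two structs come out at 12 and 8 bytes, but that is an assumption about the platform, not a guarantee).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example types, not taken from any real codebase. */
struct padded {
    char flag;   /* typically followed by padding so that count is aligned */
    int  count;
    char tag;    /* typically followed by tail padding */
};

/* Same members, ordered by decreasing alignment requirement. */
struct reordered {
    int  count;
    char flag;
    char tag;
};

/* Number of bytes saved per instance by reordering, on the current target. */
size_t bytes_saved_by_reordering(void)
{
    /* Guard against unsigned underflow on an unusual ABI. */
    assert(sizeof(struct reordered) <= sizeof(struct padded));
    return sizeof(struct padded) - sizeof(struct reordered);
}
```

Printing `bytes_saved_by_reordering()` on the real target is a quick way to confirm whether a given reordering is worth making.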

The fallback option is to check every struct by hand. I'm looking for an automated approach that can cut down the effort.

Even if it only reduces the number of structs to be checked by hand that would help.

So, for example, a process or tool that lists all the structs that are bigger than the sum of the sizes of their members, while not perfect, would still be helpful, as it would limit the ones that need to be checked manually.

Does anyone know of any tools that can do this, or can anyone suggest any approaches that might help?

P.S. I need to do this on an embedded C codebase containing over 1 million lines of code.

+1  A: 

You can write a program that in turn writes out a small C program for every permutation of the fields in the struct; when each output program is compiled and run, it prints out the struct size. This becomes impractical once the number of fields grows much beyond 10 or so, because the number of permutations grows factorially.

GregS
This doesn't make too much sense. In general, ordering fields largest-to-smallest generates optimal padding.
Anon.
@Anon does that remain true if some type sizes are not powers of two, for instance 10-byte extended doubles or even `char t[5]` ?
Pascal Cuoq
An array of `char` should be ordered with other `char`s - perhaps it would be more accurate to say fields should be ordered by alignment requirements.
Anon.
@Anon I see. I think the rephrased version can be mathematically proved, assuming (1) alignments are powers of two and (2) sizes are multiples of alignments. I once encountered a compiler whose documentation seemed to contradict (2), but in fact the compiler did satisfy it, the documentation was just wrong.
Pascal Cuoq
@Pascal: sizes have to be multiples of alignments, otherwise the second element in an array would be misaligned. C has the identity `(char *)(p+1) == (char *)p + sizeof(*p)`. At least, that's what I think "contiguously allocated" in the standard is supposed to mean - if there are bytes in between the end of p[0] and the start of p[1], then they aren't contiguous.
Steve Jessop
@Steve That's how we reasoned. For what it's worth, the buggy documentation we encountered seemed to say that char arrays (of any size) would be aligned on a 4-byte boundary. And that was for a "major" (as the protagonist in Fight Club would say) embedded compiler.
Pascal Cuoq
AFAIK it's legal to do that - say that char arrays will be aligned in that way, even though an individual char would not. `struct Foo { char a; char b[1];};` is allowed to have padding between a and b, even though `char c[2]` isn't allowed to have padding between c[0] and c[1]. Whether it's a good idea, and whether this compiler actually did what it said, are independent of whether it's legal :-)
Steve Jessop
Come to think of it, there is a reason - in the same way that malloc guarantees that allocations are aligned for any object even though it can only allocate a char array, you might want to create a big char array up front as an automatic or global, or in a struct which is one of those, and then later use it for other types. Embedded programming is approximately 85% writing custom allocators ;-)
Steve Jessop
+1  A: 

CIL is a robust C parser written in OCaml that understands the padding of structs. It comes with a detection C program. Struct padding is platform-specific; I do not doubt you know it, but you could have made it clearer in your question. The detection program packaged with CIL detects the size of types. The algorithm that CIL assumes for struct padding is that the n-th field's offset is computed by rounding up (offset of the (n-1)-th field + size of the (n-1)-th field) to the nearest multiple of (alignment of the n-th field).
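The rounding rule above can be sketched in a few lines of C. This is not CIL's code: `field_info`, `round_up` and `predicted_struct_size` are hypothetical names, bit-fields are ignored, and the final rounding of the total size up to the largest member alignment is an added assumption that matches common ABIs rather than something stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one field: its size and alignment in bytes. */
struct field_info {
    size_t size;
    size_t align;   /* assumed non-zero; in practice a power of two */
};

/* Round n up to the nearest multiple of align. */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/* Predicted sizeof for a struct whose fields appear in the given order. */
size_t predicted_struct_size(const struct field_info *fields, size_t count)
{
    size_t end = 0;         /* offset just past the previous field */
    size_t max_align = 1;   /* assumed alignment of the whole struct */

    for (size_t i = 0; i < count; i++) {
        /* Offset of field i: end of field i-1 rounded up to field i's alignment. */
        size_t offset = round_up(end, fields[i].align);
        end = offset + fields[i].size;
        if (fields[i].align > max_align)
            max_align = fields[i].align;
    }
    /* Assumption: tail padding rounds the size up to the struct's alignment. */
    return round_up(end, max_align);
}
```

Feeding the same fields in two different orders shows how much a reordering would save under these assumptions, without compiling anything for the target.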

It would be less than 200 lines of OCaml to make the tool you need, starting from CIL. But there may be better solutions yet.

Pascal Cuoq
+4  A: 

gcc's -Wpadded warning option can be used to tell you when a structure is being padded. This won't tell you when the structure can be made smaller, but it can help reduce the work.
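As a rough illustration, a file like the one below should trigger the warning on targets where `long` is wider than `char`. The file name, struct and helper are made up for the example; the helper is just one way to cross-check how many bytes the padding costs.

```c
/* wpadded_demo.c (hypothetical file name)
 * Compile with:  gcc -Wpadded -c wpadded_demo.c
 * gcc should then warn about the padding in struct sensor_reading.
 */
#include <assert.h>
#include <stddef.h>

struct sensor_reading {   /* hypothetical struct */
    char channel;         /* padding is usually inserted after this member */
    long timestamp;
    char status;          /* tail padding usually follows this member */
};

/* Bytes in the struct that are not occupied by any member. */
size_t sensor_reading_padding(void)
{
    size_t members = sizeof(char) + sizeof(long) + sizeof(char);
    assert(sizeof(struct sensor_reading) >= members);
    return sizeof(struct sensor_reading) - members;
}
```

A non-zero result from `sensor_reading_padding()` only says that padding exists; as noted above, it does not say whether reordering can remove it.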

ergosys
I feel like an idiot for not having found that option already.
tolomea
+3  A: 

pahole is a utility written for this specific purpose. It'll analyse your compiled object files (compiled with debugging enabled), and show you the structure holes.
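A minimal way to try it is sketched below. The file name, struct and helper are hypothetical; the global instance is there on the assumption that the compiler may otherwise leave an unused type out of the debug information. The `offsetof` helper is only a hand-rolled cross-check of one hole, not something pahole needs.

```c
/* holes_demo.c (hypothetical file name)
 * Build with debug info, then inspect the object file:
 *   gcc -g -c holes_demo.c
 *   pahole holes_demo.o
 */
#include <assert.h>
#include <stddef.h>

struct packet_header {   /* hypothetical struct */
    char  version;
    int   length;
    short flags;
};

/* Instantiate the type so it is described in the debug info. */
struct packet_header example_header;

/* Bytes of padding between `version` and `length` on the current target. */
size_t hole_after_version(void)
{
    size_t end_of_version  = offsetof(struct packet_header, version) + sizeof(char);
    size_t start_of_length = offsetof(struct packet_header, length);
    assert(start_of_length >= end_of_version);
    return start_of_length - end_of_version;
}
```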

caf
Came back and upvoted you. Having used it for a bit now, I have to say pahole is awesome.
tolomea