views:

645

answers:

8

I've spent a few minutes manually re-ordering fields in a struct in order to reduce padding effects[1], which feels like a few minutes too much. My gut feeling says that my time could probably be better spent writing up a Perl script or whatnot to do this kind of optimization for me.

My question is whether this too is redundant; is there already some tool that I'm not aware of, or some compiler feature that I should be able to turn on[2] to pack structs?

The issue is even more complicated by the fact that this needs to be consistently optimized across a few different architectures, so whatever tool used needs to be able to account for different struct alignments and pointer sizes as well.

EDIT: A quick clarification -- what I want to do is re-order the field in the source code in order to avoid padding, not "pack" the struct as is compiling without padding.

EDIT #2: Another complication: depending on the configuration, sizes of some data types may also change. The obvious ones are pointers and pointer-diffs for different architectures, but also floating-point types (16, 32 or 64-bit depending on the 'exactness'), checksums (8 or 16-bit depending on 'speed') and some other non-obvious stuff.

[1] The struct in question is instantiated thousands of times on an embedded device, so each 4-byte reduction of the struct could mean the difference between a go and no-go for this project.

[2] Available compilers are GCC 3.* and 4.* , Visual Studio, TCC, ARM ADS 1.2, RVCT 3.* and a few others more obscure.

+1  A: 

Have a look at #pragma pack. This changes how the compiler aligns elements in the structure. You can use it to force them to be closely packed together without spaces.

See more details here

Howard May
Structures aren't packed by default because accessing aligned members is more efficient. Reordering a struct can reduce the size of the struct without actually breaking the alignment of any member.
Artelius
Not what he was asking... though it will give him the optimal packing.
Dan Olson
+3  A: 

If every single word you can squeeze out of the storage is critical, then I have to recommend optimizing the struct by hand. A tool could arrange the members optimally for you, but it doesn't know, for example, that this value here that you're storing in 16 bits actually never goes above 1024, so you could steal the upper 6 bits for this value over here...

So a human will almost certainly beat a robot on this job.

[Edit] But it seems like you really don't want to hand-optimize your structs for each architecture. Maybe you really have a great many architectures to support?

I do think this problem isn't amenable to a general solution, but you might be able to encode your domain knowledge into a custom Perl/Python/something script that generates the struct definition for each architecture.

Also, if all your members have sizes that are powers of two, then you will get optimal packing simply by sorting members by size (largest first.) In that case, you can just use good old-fashioned macro-based struct-building - something like this:

#define MYSTRUCT_POINTERS      \
    Something*  m_pSomeThing;  \
    OtherThing* m_pOtherThing; 

#define MYSTRUCT_FLOATS        \
    FLOAT m_aFloat;            \
    FLOAT m_bFloat;

#if 64_BIT_POINTERS && 64_BIT_FLOATS
    #define MYSTRUCT_64_BIT_MEMBERS MYSTRUCT_POINTERS MYSTRUCT_FLOATS
#else if 64_BIT_POINTERS
    #define MYSTRUCT_64_BIT_MEMBERS MYSTRUCT_POINTERS
#else if 64_BIT_FLOATS
    #define MYSTRUCT_64_BIT_MEMBERS MYSTRUCT_FLOATS
#else
    #define MYSTRUCT_64_BIT_MEMBERS
#endif

// blah blah blah

struct MyStruct
{
    MYSTRUCT_64_BIT_MEMBERS
    MYSTRUCT_32_BIT_MEMBERS
    MYSTRUCT_16_BIT_MEMBERS
    MYSTRUCT_8_BIT_MEMBERS
};
Charlie Tangora
Until someone builds a smarter robot (for this job)!
SDX2000
Agreed; there's a lot of context-dependent knowledge involved here. Of course, if you have a very large number of structures, and you can embed all of that knowledge in a format that a tool could use, then it might be possible to automate it.
Steve Melnikoff
+1  A: 

Most C compilers won't do this based on the fact that you can do weird stuff (like taking the address of an element in the struct and then use pointer magic to access the rest, bypassing the compiler). A famous example are the double linked lists in the AmigaOS which used guardian nodes as head and tail of the list (this makes it possible to avoid ifs when traversing the list). The guardian head node would always have pred == null and the tail node would have next == null, the developers rolled the two nodes into a single three-pointer struct head_next null tail_pred. By using the address of head_next or the null as the address of the head and tail nodes, they saved four bytes and one memory allocation (since they needed the whole structure only once).

So your best bet is probably to write the structures as pseudo code and then write a preprocessor script that creates the real structures from that.

Aaron Digulla
No C compiler will do this, as that would break the specification, which requires fields of a struct to appear in memory in the order they are declared in the struct.
unwind
Didn't feel like breaking out the specs.
Aaron Digulla
+3  A: 

There is Perl script called pstruct that is usually included with Perl installations. The script will dump out structure member offsets and sizes. You could either modify pstruct or use its output as a starting point for making a utility that packs your structures the way you want.

$ cat foo.h 
struct foo {
    int x;
    char y; 
    int b[5];
    char c;
};

$ pstruct foo.h
struct foo {
  int                foo.x                      0       4
  char               foo.y                      4       1
                     foo.b                      8      20
  char               foo.c                     28       1
}
sigjuice
A: 

It'll depend on the platform/compiler too. As noted, most compilers pad everything to a 4-byte alignment (or worse!), so assuming a struct with 2 shorts and a long:

short
long
short

will take up 12 bytes (with 2*2 bytes of padding).

reordering it to be

short
short
long

will still take up 12 bytes as the compiler will pad it to make data access quicker (which is the default for most desktops, as they prefer quick access over memory usage). Your embedded system has different needs, so you will have to use the #pragma pack regardless.

As for a tool to reorder, I would simply (manually) reorganise your struct layout so that different types are placed together. Put all the shorts in first, then put all the longs in, etc. If you're going to get packing done, that's what a tool would do anyway. You might have 2 bytes of padding in the middle at the transition points between types, but I wouldn't consider that to be worth worrying about.

gbjbaanb
Good advice, but see the latest edit ...
Christoffer
and to think i deleted my answer concerning different datatype sizes! Regardless, if you put all the fields of the same type together, you're going to get optimum packing no matter how large each field is.
gbjbaanb
I'm not sure about "everything to 4-byte alignment"; the compiler will ensure that each member meets its minimal alignment requirement. For example, if long double needs 16-byte alignment, then a char followed by a long double will leave a 15 byte hole; but a short typically needs a 2-byte alignment and a char followed by a short leaves a 1 byte hole (and the ensemble - char, short - followed by long double would leave a 12 byte hole, but followed by a 32-bit int would leave no hole between the short and the int). Etc.
Jonathan Leffler
no, a char followed by a long-double will typically use up 1byte+3padbytes+16bytes. Processor line alignment works like that so the char can be extracted without any bitshifting, but you can tell it to do it differently, align to 0 instead of 4 and your app will just execute slower. You're thinking that everything needs to be aligned by the largest individual type.
gbjbaanb
A: 

The compiler may not reorder fields in structs by its own head. The standard mandates that fields should be layed out in the order they are defined. Doing something else might break code in subtle ways.

As you write, it's of course entirely possible to make some kind of code generator that shuffles around fields in an efficient way. But I prefer to do this manually.

kotlinski
A: 

Thinking about how I'd go about making such a tool... I think I'd start with the debugging info.

Getting the size of each structure from the source is a pain. It overlaps a lot of work that the compiler already does. I'm not familiar enough with ELF to say exactly how to extract the structure size info from a debug binary, but I know that the info exists because debuggers can display it. Perhaps objdump or something else in the binutils package can get this for you trivially (for platforms that use ELF, at least).

After you've got the info, the rest is pretty straightforward. Order the members from largest to smallest, trying to keep as much as the ordering of the original struct as possible. With perl or python it'd even be easy to crossreference it with the rest of the source and maybe even preserve comments or #ifdefs depending on how cleanly they were used. The biggest pain would be changing all initializations of the struct in the entire codebase. Yikes.

Here's the thing. It sounds really nice, but I don't know of any such existing tool that does this, and by the time you write your own... I think you'll have been able to manually reorder most of the structs in your program.

Dan Olson