I have a linked list, which stores groups of settings for my application:

typedef struct settings {
  struct settings* next;
  char* name;
  char* title;
  char* desc;
  char* bkfolder;
  char* srclist;
  char* arcall;
  char* incfold;
} settings_row;
settings_row* first_profile = { 0 };

#define SETTINGS_PER_ROW 7

When I load values into this structure, I don't want to have to name all the elements. I would rather treat it like a named array -- the values are loaded in order from a file and placed incrementally into the struct. Then, when I need to use the values, I access them by name.

//putting values incrementally into the struct
void read_settings_file(settings_row* settings){
    char* field = settings + sizeof(void*);
    int i = 0;
    while(read_value_into(field[i]) && i++ < SETTINGS_PER_ROW);
}

//accessing components by name
void settings_info(settings_row* settings){
    printf("Settings 'profile': %s\n", settings->title);
    printf("Description: %s\n", settings->desc);
    printf("Folder to backup to: %s\n", settings->bkfolder);
}

But I wonder, since these are all pointers (and there will only ever be pointers in this struct), will the compiler add padding to any of these values? Are they guaranteed to be in this order, and have nothing between the values? Will my approach work sometimes, but fail intermittently?

edit for clarification

I realize that the compiler can pad any values of a struct--but given the nature of the struct (a struct of pointers) I thought this might not be a problem. Since the most efficient way for a 32 bit processor to address data is in 32 bit chunks, this is how the compiler pads values in a struct (ie. an int, short, int in a struct will add 2 bytes of padding after the short, to make it into a 32 bit chunk, and align the next int to the next 32 bit chunk). But since a 32 bit processor uses 32 bit addresses (and a 64 bit processor uses 64 bit addresses (I think)), would padding be totally unnecessary since all of the values of the struct (addresses, which are efficient by their very nature) are in ideal 32 bit chunks?

I am hoping some memory-representation / compiler-behavior guru can come shed some light on whether a compiler would ever have a reason to pad these values

A: 

You can't do that the way you are trying. The compiler is allowed to insert padding after any member of the struct (though never before the first one). It is not allowed to reorder the fields.

Most compilers have an attribute that can be applied to the struct to pack it (i.e., to turn it into tightly packed storage with no padding), but the downside is that this generally hurts performance. The packed attribute will probably allow you to use the struct the way you want, but it is not portable across compilers and platforms.

Padding is designed to make field access as efficient as possible on the target architecture. It's best not to fight it unless you have to (e.g., when the struct goes to disk or over a network).
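
As a rough illustration of the padding being described, the sketch below measures the gap a compiler leaves in the int, short, int case mentioned in the question. The struct name mixed and both helper functions are made up for this example, and the numbers printed depend entirely on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */
#include <stdio.h>

/* The int/short/int layout from the question (hypothetical name). */
struct mixed {
    int   a;
    short b;
    int   c;
};

/* Bytes of padding the compiler placed between b and c. */
size_t padding_after_short(void)
{
    size_t end_of_b = offsetof(struct mixed, b) + sizeof(short);

    /* Members never overlap and keep their declared order. */
    assert(offsetof(struct mixed, c) >= end_of_b);
    return offsetof(struct mixed, c) - end_of_b;
}

/* Print the layout chosen by the current compiler and target. */
void print_mixed_layout(void)
{
    printf("a at %zu, b at %zu, c at %zu, sizeof = %zu, padding after b = %zu\n",
           offsetof(struct mixed, a), offsetof(struct mixed, b),
           offsetof(struct mixed, c), sizeof(struct mixed),
           padding_after_short());
}
```

Calling print_mixed_layout() on a given build shows whether any padding was inserted there; a different compiler or target may report different offsets.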

Christopher
+1  A: 

Although not a duplicate, this probably answers your question:

http://stackoverflow.com/questions/119123/why-isnt-sizeof-for-a-struct-equal-to-the-sum-of-sizeof-of-each-member

It's not uncommon for applications to write an entire struct into a file and read it back out again. But this suffers from the possibility that one day the file will need to be read back on another platform, or by another version of the compiler that packs the struct differently. (Although this can be dealt with by specially-written code that understands the original packing format).

Daniel Earwicker
+1  A: 

Technically, you can rely only on the order; the compiler could insert padding. If different pointers were of different size, or if the pointer size wasn't a natural word size, it might insert padding.

Practically speaking, you could get away with it. I wouldn't recommend it; it's a bad, dirty trick.

You could achieve your goal with another level of indirection (what doesn't that solve?), or by using a temporary array initialized to point to the various members of the structure.
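
A minimal sketch of the temporary-array idea, assuming the values arrive as an array of strings in file order. The function load_settings and its parameters are hypothetical stand-ins for whatever actually reads the file; only the struct comes from the question.

```c
#include <assert.h>
#include <stddef.h>

#define SETTINGS_PER_ROW 7

typedef struct settings {
    struct settings *next;
    char *name;
    char *title;
    char *desc;
    char *bkfolder;
    char *srclist;
    char *arcall;
    char *incfold;
} settings_row;

/* Hypothetical loader: stores up to SETTINGS_PER_ROW values, in file order,
 * into the named members. Returns the number of members that were set. */
size_t load_settings(settings_row *row, char *const values[], size_t count)
{
    assert(row != NULL);

    /* Temporary array of pointers to the members. Because each entry is the
     * real address of a member, padding in the struct does not matter. */
    char **fields[SETTINGS_PER_ROW] = {
        &row->name, &row->title, &row->desc, &row->bkfolder,
        &row->srclist, &row->arcall, &row->incfold
    };

    size_t i;
    for (i = 0; i < SETTINGS_PER_ROW && i < count; i++)
        *fields[i] = values[i];
    return i;
}
```

The members are still named once, but only in this one table; everything else can loop over positions or use the names as before.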

Chris Arguin
+1  A: 

It's not guaranteed, but it will work fine in most cases. It won't be intermittent; it will either work or not work on a particular platform with a particular build. Since you're using all pointers, most compilers won't insert any padding.

Also, if you wanted to be safer, you could make it a union.
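
One possible reading of the union suggestion, as a sketch rather than a definitive layout: overlay the named members with an array of the same type. This assumes a compiler that accepts anonymous structs and unions and _Static_assert (C11, or an extension in older modes). The names fields and set_field are invented for the example. The union by itself does not rule out padding between the named members, so the sketch checks that at compile time.

```c
#include <assert.h>
#include <stddef.h>

#define SETTINGS_PER_ROW 7

typedef struct settings {
    struct settings *next;
    union {
        /* Named view of the storage. */
        struct {
            char *name;
            char *title;
            char *desc;
            char *bkfolder;
            char *srclist;
            char *arcall;
            char *incfold;
        };
        /* Indexed view of the same storage. */
        char *fields[SETTINGS_PER_ROW];
    };
} settings_row;

/* If the last named member lines up with the last array slot, no padding
 * was inserted anywhere between the named members. */
_Static_assert(offsetof(settings_row, incfold) ==
               offsetof(settings_row, fields) +
               (SETTINGS_PER_ROW - 1) * sizeof(char *),
               "named members do not line up with fields[]");

/* Store a value by position; returns 0 on success, -1 for a bad index. */
int set_field(settings_row *row, size_t index, char *value)
{
    assert(row != NULL);
    if (index >= SETTINGS_PER_ROW)
        return -1;
    row->fields[index] = value;
    return 0;
}
```

With this layout, set_field(&row, 1, value) and row.title = value refer to the same storage on any build where the static assertion passes.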

Gerald
+4  A: 

In many cases pointers are natural word sizes, so the compiler is unlikely to pad each member, but that doesn't make it a good idea. If you want to treat it like an array you should use an array.

I'm thinking out loud here, so there are probably many mistakes, but perhaps you could try this approach:

enum
{
    kName = 0,
    kTitle,
    kDesc,
    kBkFolder,
    kSrcList,
    kArcAll,
    kIncFold,
    kSettingsCount
};

typedef struct settings {
    struct settings* next;
    char *settingsData[kSettingsCount];
} settings_row;

Set the data:

settings_row myRow;
myRow.settingsData[kName] = "Bob";
myRow.settingsData[kDesc] = "Hurrrrr";
...

Reading the data:

void read_settings_file(settings_row* settings){
    char** field = settings->settingsData;
    int i = 0;
    while(read_value_into(field[i]) && i++ < SETTINGS_PER_ROW);
}
dreamlax
this is brilliant, although your two examples of the use are a little off. Under your "set the data," is the way I intended to use the data in the application. And under "reading the data", I wouldn't need the field declaration, I could just directly use settings.settingsdata--and SETTINGS_PER_ROW would just be settings.kSettingsCount. But thanks for your new and safer approach :)
Carson Myers
+4  A: 

Under POSIX rules, all pointers (both function pointers and data pointers) are required to be the same size; under ISO C alone, all data pointers are convertible to 'void *' and back without loss of information (but function pointers need not be convertible to 'void *' without loss of information, nor vice versa).

Therefore, if written correctly, your code would work. It isn't written quite correctly, though! Consider:

void read_settings_file(settings_row* settings)
{
    char* field = settings + sizeof(void*);
    int i = 0;
    while(read_value_into(field[i]) && i++ < SETTINGS_PER_ROW)
        ;
}

Let's assume you're using a 32-bit machine with 8-bit characters; the argument is not significantly different on a 64-bit machine. The assignment to 'field' is all wrong, because pointer arithmetic is scaled by the size of the pointed-to type: settings + 4 is a pointer to the fifth element (index 4) of an array of 'settings_row' structures, not to the address 4 bytes past 'settings'. What you need to write is:

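void read_settings_file(settings_row* settings)
{
    /* Step past 'next' in bytes, then view the rest as an array of char*. */
    char** field = (char **)((char *)settings + sizeof(void*));
    int i = 0;
    /* Test the bound first so field[SETTINGS_PER_ROW] is never touched. */
    while(i < SETTINGS_PER_ROW && read_value_into(field[i]))
        i++;
}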

The cast before addition is crucial!


C Standard (ISO/IEC 9899:1999):

6.3.2.3 Pointers

A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.

[...]

A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the pointed-to type, the behavior is undefined.

Jonathan Leffler
A: 

It seems to me that this approach creates more problems than it solves. When you read this code six months from now, will you still be aware of all the subtleties of how the compiler pads a struct? Would someone else, who didn't write the code?

If you must use the struct, use it in the canonical way and just write a function which assigns values to each field separately. You could also use an array and create macros to give field names to indices.
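
A sketch of the array-plus-macros variant, under the assumption that every setting stays a char pointer. The macro names, the member name field, and fill_settings are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SETTINGS_PER_ROW 7

typedef struct settings {
    struct settings *next;
    char *field[SETTINGS_PER_ROW];
} settings_row;

/* Field names mapped to array indices (hypothetical macro names). */
#define SETTING_NAME(s)     ((s)->field[0])
#define SETTING_TITLE(s)    ((s)->field[1])
#define SETTING_DESC(s)     ((s)->field[2])
#define SETTING_BKFOLDER(s) ((s)->field[3])
#define SETTING_SRCLIST(s)  ((s)->field[4])
#define SETTING_ARCALL(s)   ((s)->field[5])
#define SETTING_INCFOLD(s)  ((s)->field[6])

/* Load by position: copy up to SETTINGS_PER_ROW values in file order. */
size_t fill_settings(settings_row *row, char *const values[], size_t count)
{
    assert(row != NULL);
    size_t i;
    for (i = 0; i < SETTINGS_PER_ROW && i < count; i++)
        row->field[i] = values[i];
    return i;
}

/* Use by name: the macros hide the indices from the caller. */
void settings_info(const settings_row *row)
{
    printf("Settings 'profile': %s\n", SETTING_TITLE(row));
    printf("Description: %s\n", SETTING_DESC(row));
    printf("Folder to backup to: %s\n", SETTING_BKFOLDER(row));
}
```

Because the storage really is an array, no assumption about struct padding is involved; the cost is that the names live in macros rather than in the type.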

If you get too "clever" about optimizing your code, you will end up with slower code anyway, since the compiler won't be able to optimize it as well.

Jørgen Fogh
Regarding your first point, the reason I'm learning C is so I can learn about and deal with subtleties like struct padding and other pitfalls before moving up (just so I know how everything works down here). Nobody else will likely read my code, but I can't see why they would need to know those rules, since I didn't have to make any workarounds. Also, I'm not trying to optimize, as much as I am trying to shrink the code and reduce coupling.
Carson Myers
They would need to know the rules if they were to, say, add a new char field to the struct, which would then break (possibly). Shrinking the code is good, iff it enhances readability. It does not in this case, IMHO. As for how things work... If you want this level of detail, you would probably be better off with assembly language anyway. Even C compilers rearrange your code during optimization.
Jørgen Fogh
+2  A: 

It's not guaranteed by the C standard. I've a sneaking suspicion, that I don't have time to check right now either way, that it guarantees no padding between the char* fields, i.e. that consecutive fields of the same type in a struct are guaranteed to be layout-compatible with an array of that type. But even if so, you're on your own between the settings* and the first char*, and also between the last char* and the end of the struct. But you could use offsetof to deal with the first issue, and I don't think the second affects your current code.

However, what you want is almost certainly guaranteed by your compiler, which somewhere in its documentation will set out its rules for struct layout, and will almost certainly say that all pointers to data are word sized, and that a struct can be the size of 8 words without additional padding. But if you want to write highly portable code, you have to use only the guarantees in the standard.

The order of fields is guaranteed. I also don't think you'll see intermittent failure - AFAIK the offset of each field in that struct will be consistent for a given implementation (meaning the combination of compiler and platform).

You could assert that sizeof(settings*) == sizeof(char*) and sizeof(settings_row) == sizeof(char*)*8. If both those hold, there is no room for any padding in the struct, since fields are not allowed to "overlap". If you ever hit a platform where they don't hold, you'll find out.
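
Those two checks could be written as compile-time assertions, roughly as below. This is a sketch assuming a compiler that supports _Static_assert (C11, or an extension in older modes). Note that settings is a struct tag rather than a typedef, so the pointer type has to be spelled struct settings *. The helper settings_fields is made up for the example, and even when both assertions pass, indexing across separate members this way is a layout assumption rather than something the standard promises.

```c
#include <assert.h>
#include <stddef.h>

#define SETTINGS_PER_ROW 7

typedef struct settings {
    struct settings *next;
    char *name;
    char *title;
    char *desc;
    char *bkfolder;
    char *srclist;
    char *arcall;
    char *incfold;
} settings_row;

/* If both hold, eight pointer-sized members fill the struct exactly,
 * leaving no room for padding anywhere. */
_Static_assert(sizeof(struct settings *) == sizeof(char *),
               "struct pointer and char pointer differ in size");
_Static_assert(sizeof(settings_row) == sizeof(char *) * (SETTINGS_PER_ROW + 1),
               "settings_row contains padding");

/* Hypothetical helper: view the seven char* members as an array. */
char **settings_fields(settings_row *row)
{
    assert(row != NULL);
    return (char **)((char *)row + offsetof(settings_row, name));
}
```

On a platform where either assertion fails, the build stops instead of the loader silently writing to the wrong offsets.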

Even so, if you want an array, I'd be inclined to say use an array, with inline accessor functions or macros to get the individual fields. Whether your trick works or not, it's even easier not to think about it at all.

Steve Jessop
well said in the last point, and I'm going to start setting up permanent asserts as well as some kind of test structure. Thanks for the advice
Carson Myers
+1 for "assert that sizeof(settings*) == sizeof(char*) and sizeof(settings_row) == sizeof(char*)*8."
Tom Leys