I have a very large struct in an existing program. This struct includes a great number of bitfields.

I wish to save a part of it (say, 10 fields out of 150).

An example of the code I would use to save the subset is:

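typedef struct {int a; int b; char c;} bigstruct;
typedef struct {int a; char c;} smallstruct;
/* Copy the chosen fields from the big struct into the small one. */
void substruct(smallstruct *s, const bigstruct *b) {
    s->a = b->a;
    s->c = b->c;
}
/* Hypothetical routine that actually writes the subset out. */
int save_smallstruct(const smallstruct *s);
int save_struct(const bigstruct *bs) {
    smallstruct s;
    substruct(&s, bs);
    return save_smallstruct(&s);
}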

I would also like selecting the saved fields to be low-hassle, since I expect to change the selection every now and then. The naive approach above is fragile and hard to maintain: when scaling up to 20 fields, every change has to be made both in smallstruct and in the substruct function.

I thought of two better approaches. Unfortunately, both require an external CIL-like tool to parse my structs.

The first approach is to generate the substruct function automatically. I would only define smallstruct, and a program would parse it and generate the substruct function from its fields.

The second approach is to build (with a C parser) meta-information about bigstruct, and then write a library that lets me access a specific field in the struct. It would be like an ad-hoc implementation of Java's class reflection.

For example, assuming no alignment padding, for the struct

struct st {
    int a;
    char c1:5;
    char c2:3;
    long d;
};

I'll generate the following meta information:

int field2distance[] = {0, sizeof(int), sizeof(int), sizeof(int)+sizeof(char)};
int field2size[] = {sizeof(int), 1, 1, sizeof(long)};
int field2bitmask[] = {0, 0x1F, 0xE0, 0};
char *fieldNames[] = {"a", "c1", "c2", "d"};

I'll get the ith field with this function:

long getFieldData(void *strct,int i) {
    int distance = field2distance[i];
    int size = field2size[i];
    int bitmask = field2bitmask[i];
    void *ptr = ((char *)strct + distance);
    long result;
    switch (size) {
        case 1: //char
             result = *(char*)ptr;
             break;
        case 2: //short
             result = *(short*)ptr;
             break;
        ...
    }
    if (bitmask == 0) return result;
    return (result & bitmask) >> num_of_trailing_zeros(bitmask);
}

Both methods require extra work, but once the parser is in your makefile, changing the substruct is a breeze.

However I'd rather do that without any external dependencies.

Does anyone have a better idea? Were my ideas any good, and is there an available implementation of them on the internet?

+2  A: 

If changing the order of the fields isn't out of the question, you can rearrange the bigstruct fields so that the smallstruct fields are together, and then it's simply a matter of casting from one to the other (possibly adding an offset). Something like:

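typedef struct {int a; char c; int b;} bigstruct;
typedef struct {int a; char c;} smallstruct;

/* Hypothetical routine that writes the subset out. */
int save_smallstruct(const smallstruct *s);
int save_struct(bigstruct *bs) {
    return save_smallstruct((smallstruct *)bs);
}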
Blindy
This requires that all the subset fields are always the first fields defined in the big structure. If they are scattered all over the big structure, this will go wrong.
rikh
Cast bs not s ;)
Magnus Skog
@rikh: I said as much in the opening sentence. But if it's his code, changing the order of member variables is both easy and safe (nothing sane should really break by doing it).
Blindy
@Blindy: If the code was sane, I wouldn't have the big structure altogether ;-). This could work.
Elazar Leibovich
+1  A: 

Macros are your friend.

One solution would be to move the big struct out into its own include file and then have a macro party.

Instead of defining the structure normally, come up with a selection of macros, such as BEGIN_STRUCTURE, END_STRUCTURE, NORMAL_FIELD and SUBSET_FIELD.

You can then include the file a few times, redefining those macros for each pass. The first pass turns the defines into a normal structure, with both types of field being output as normal. The second defines NORMAL_FIELD as nothing and so creates your subset. The third creates the appropriate code to copy the subset fields over.

You'll end up with a single definition of the structure, that lets you control which fields are in the subset and automatically creates suitable code for you.
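The three passes can be sketched roughly as follows. This is a single-file variant that passes the per-pass macros into a field list instead of re-including a header; the names BIG_FIELDS, DECLARE_FIELD, SKIP_FIELD and COPY_FIELD, and the example fields, are made up for illustration and are not from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical field list. F() marks a normal field, S() a subset field.
   The third argument is an optional bitfield width (may be empty). */
#define BIG_FIELDS(F, S) \
    S(int,      a,  ) \
    F(int,      b,  ) \
    S(unsigned, c1, : 5) \
    F(unsigned, c2, : 3) \
    S(char,     c,  )

#define DECLARE_FIELD(type, name, bits) type name bits;
#define SKIP_FIELD(type, name, bits)
#define COPY_FIELD(type, name, bits)    dst->name = src->name;

/* Pass 1: every field goes into the big struct. */
typedef struct { BIG_FIELDS(DECLARE_FIELD, DECLARE_FIELD) } bigstruct;

/* Pass 2: normal fields expand to nothing, leaving only the subset. */
typedef struct { BIG_FIELDS(SKIP_FIELD, DECLARE_FIELD) } smallstruct;

/* Pass 3: generate one assignment per subset field. */
static void substruct(smallstruct *dst, const bigstruct *src) {
    assert(dst != NULL && src != NULL);
    BIG_FIELDS(SKIP_FIELD, COPY_FIELD)
}
```

Moving a field in or out of the subset is then a one-letter change (F to S or back) in the list. Because the copy is a plain member assignment, it should work for bitfields as well.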

rikh
I don't downmod you, but Macros are your fiend, not your friend. Just wanted to mention that. :)
Randolpho
+10  A: 

From your description, it looks like you have access to and can modify your original structure. I suggest you refactor your substructure into a complete type (as you did in your example), and then make that structure a field on your big structure, encapsulating all of those fields in the original structure into the smaller structure.

Expanding on your small example:

typedef struct 
{
  int a;
  char c;
} smallstruct;

typedef struct 
{
  int b;
  smallstruct mysub;
} bigstruct;

Accessing the smallstruct info would be done like so:

/* stack-based allocation */
bigstruct mybig;
mybig.mysub.a = 1;
mybig.mysub.c = '1';
mybig.b = 2;

/* heap-based allocation */
bigstruct * mybig = (bigstruct *)malloc(sizeof(bigstruct));
mybig->mysub.a = 1;
mybig->mysub.c = '1';
mybig->b = 2;

But you could also pass around pointers to the small struct:

void dosomething(smallstruct * small)
{ 
  small->a = 3;
  small->c = '3';
}

/* stack based */    
dosomething(&(mybig.mysub));

/* heap based */    
dosomething(&((*mybig).mysub));

Benefits:

  • No Macros
  • No external dependencies
  • No memory-order casting hacks
  • Cleaner, easier-to-read and use code.
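
With this layout, extracting the subset for saving should need no per-field code at all. A minimal sketch, assuming the two typedefs above; extract_subset is a hypothetical helper name, not something from the question:

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int a;
    char c;
} smallstruct;

typedef struct {
    int b;
    smallstruct mysub;
} bigstruct;

/* Hypothetical helper: returns a copy of the subset by value. */
static smallstruct extract_subset(const bigstruct *bs) {
    assert(bs != NULL);
    return bs->mysub;   /* plain struct assignment copies every member */
}
```

The returned copy can be handed to whatever save routine already exists. Adding a field to smallstruct changes what gets saved without touching this function.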
Randolpho
A: 

Just to help you get your metadata: you can use the offsetof() macro, which also has the benefit of taking care of any padding you may have.
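
A small sketch of what an offsetof-based table could look like. The struct is a simplified version of the one in the question with the bitfield members left out, since offsetof is not defined for bitfields; field_offset and read_d are illustrative names only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct st {
    int a;
    char c;
    long d;
};

/* Offsets computed by the compiler, so padding is already accounted for. */
static const size_t field_offset[] = {
    offsetof(struct st, a),
    offsetof(struct st, c),
    offsetof(struct st, d),
};

/* Reads field d through its table offset instead of by name. */
static long read_d(const struct st *s) {
    long value;
    assert(s != NULL);
    memcpy(&value, (const char *)s + field_offset[2], sizeof value);
    return value;
}
```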

Metiu
The offsetof macro doesn't work with bitfields. Didn't find anything equivalent.
Elazar Leibovich
A: 

I suggest taking this approach:

  1. Curse the guy who wrote the big structure. Get a voodoo doll and have some fun.
  2. Mark each field of the big structure that you need somehow (macro or comment or whatever)
  3. Write a small tool which reads the header file and extracts the marked fields. If you use comments, you can give each field a priority or something to sort them.
  4. Write a new header file for the substructure (using a fixed header and footer).
  5. Write a new C file which contains a function createSubStruct which takes a pointer to the big struct and returns a pointer to the substruct
  6. In the function, loop over the fields collected and emit ss.field = bs.field (i.e. copy the fields one by one).
  7. Add the small tool to your makefile and add the new header and C source file to your build

I suggest using gawk, or any scripting language you're comfortable with, as the tool; that should take half an hour to build.

[EDIT] If you really want to try reflection (which I suggest against; it'll be a whole lot of work to get that working in C), then the offsetof() macro is your friend. This macro returns the offset of a field in a structure (which is most often not the sum of the sizes of the fields before it). See this article.

[EDIT2] Don't write your own parser. Getting your own parser right will take months; I know, since I've written lots of parsers in my life. Instead, mark the parts of the original header file which need to be copied and then rely on the one parser which you know works: the one in your C compiler. Here are a couple of ideas on how to make this work:

struct big_struct {
    /**BEGIN_COPY*/
    int i;
    int j : 3;
    int k : 2;
    char * str;
    /**END_COPY*/
    ...
    struct x y; /**COPY_STRUCT*/
};

Just have your tool copy anything between /**BEGIN_COPY*/ and /**END_COPY*/.

Use special comments like /**COPY_STRUCT*/ to instruct your tool to generate a memcpy() instead of an assignment, etc.

This can be written and debugged in a few hours. It would take as long just to set up a parser for C without any functionality; that is, you'd have something which can read valid C, but you'd still have to write the part of the parser which understands C, and the part which does something useful with the data.
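
For illustration, here is roughly what the tool might emit for the marked header above. This is only a sketch: struct x, the unrelated field and the name small_struct are made up, and the elided fields of big_struct are not reproduced.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct x { int v[4]; };            /* hypothetical nested type */

struct big_struct {
    int i;
    int j : 3;
    int k : 2;
    char *str;
    double unrelated;              /* hypothetical field outside the markers */
    struct x y;
};

/* What the generated header might contain. */
struct small_struct {
    int i;
    int j : 3;
    int k : 2;
    char *str;
    struct x y;
};

/* What the generated C file might contain. The caller frees the result. */
struct small_struct *createSubStruct(const struct big_struct *bs) {
    struct small_struct *ss;
    assert(bs != NULL);
    ss = malloc(sizeof *ss);
    if (ss == NULL) return NULL;
    ss->i = bs->i;
    ss->j = bs->j;
    ss->k = bs->k;
    ss->str = bs->str;                       /* shallow copy of the pointer */
    memcpy(&ss->y, &bs->y, sizeof ss->y);    /* emitted for COPY_STRUCT */
    return ss;
}
```

The bitfields are copied by ordinary assignment, so the tool never needs to know their offsets or masks.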

Aaron Digulla
This is a good idea, but I don't like the fragility of not parsing the C file. I know of the offsetof macro, but the offsetof macro broke up with me after I used bitfields :-)
Elazar Leibovich
I have written several C parsers and parsers for my own languages plus an XML parser. Writing a parser takes at least a week. After the week, you have something that builds and can grok simple cases. For this task, I'd figure it takes about a month to write something which parses enough of C to solve your problem. Conclusion: Don't go down that road unless you have plenty of time.
Aaron Digulla