There is this macro offsetof in C/C++ which gives you the byte offset of a member within a POD structure. Here is an example adapted from the C FAQ:

struct foo {
    int a;
    int b;
};

struct foo foo;

/* Set the b member of foo indirectly (offsetof comes from <stddef.h>) */
*(int *)((char *)&foo + offsetof(struct foo, b)) = 0xDEADBEEF;

Now this just seems evil to me and I can't see many legit uses of this macro.

One legit example I have seen is its use in the container_of macro in the Linux kernel, which gets the address of the parent object in which a structure is embedded:

/* get the address of the cmos device struct in which the cdev
   structure which inode points to is embedded */
struct cmos_dev *cmos_devp = 
     container_of(inode->i_cdev, struct cmos_dev, cdev);
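
For readers who have not seen how container_of is built, here is a simplified sketch. It is not the kernel's actual definition (which adds type checking on top); the struct names cdev_like and cmos_dev_like are made up for illustration. The idea is just to subtract the member's offset from the member's address:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from a member to its enclosing struct.
   Assumption: ptr really points at `member` inside an object of `type`. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for the kernel structures in the snippet above. */
struct cdev_like {
    int minor;
};

struct cmos_dev_like {
    int bank;
    struct cdev_like cdev; /* embedded, not a pointer */
};

/* Recover the enclosing device from a pointer to its embedded cdev. */
struct cmos_dev_like *dev_from_cdev(struct cdev_like *c)
{
    return container_of(c, struct cmos_dev_like, cdev);
}

/* Usage check: the recovered pointer is the original object. */
void container_of_example(void)
{
    struct cmos_dev_like dev = { 3, { 7 } };
    assert(dev_from_cdev(&dev.cdev) == &dev);
}
```

This only works when the pointer really does refer to a member embedded in the outer struct; handing it a standalone cdev_like would yield a bogus pointer.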

What other legit uses are there for this macro? When should you not use this macro?

EDIT: This answer to a different SO question is the best one I've seen so far.

+3  A: 

Well ... In C, it's very useful anywhere you need code to describe a data structure. I've used it, for example, to build run-time-generated GUIs for setting options.

It worked like this: a command that needs options defines a local structure holding its options, and then describes that structure to the code that generates the GUI, using offsetof to indicate where the fields are. Using offsets rather than absolute addresses allows the GUI code to work with any instance of the struct, not just one.

This is a bit hard to sketch quickly in an example (I tried), but since comments indicate an example is in order, I'll try again.

Assume we have a self-contained module, called a "command", that implements some action in the application. This command has a bunch of options that control its general behaviour, which should be exposed to the user through a graphical user interface. For the purposes of this example, assume the application is a file manager, and the command could be e.g. "Copy".

The idea is that the copy code lives in one C file, and the GUI code in another, and the GUI code does not need to be hard-coded to "support" the copy command's options. Instead, we define the options in the copy file, like so:

struct copy_options
{
  unsigned int buffer_size;     /* Number of bytes to read/write at a time. */
  unsigned int copy_attributes; /* Attempt to copy attributes. */
  /* more, omitted */
};

static struct copy_options options; /* Actual instance holding current values. */

Then, the copy command registers its configuration settings with the GUI module:

void copy_register_options(GUIModule *gui)
{
  gui_command_begin(gui, "Copy");
  gui_command_add_unsigned_int(gui, "Buffer size", offsetof(struct copy_options, buffer_size));
  gui_command_add_boolean(gui, "Copy attributes", offsetof(struct copy_options, copy_attributes));
  gui_command_end(gui);
}

Then, let's say the user asks to set the copy command's options. We can then first copy the current options, to support cancelling, and ask the GUI module for a dialog holding controls, built at run-time, suitable for editing this command's options:

void copy_configure(GUIModule *gui)
{
  struct copy_options edit = options;

  /* Assume this opens a modal dialog, showing proper controls for editing the
   * named command's options, at the address provided. The function returns 1
   * if the user clicked "OK", 0 if the operation was cancelled.
  */
  if(gui_config_dialog(gui, "Copy", &edit))
  {
    /* GUI module changed values in here, make edit results new current. */
    options = edit;
  }
}

Of course, this code assumes the settings to be pure value-types, so we can copy the struct using simple struct assignment. If we also supported dynamic strings, we'd need a function to do the copying. For configuration data though, any string would probably best be expressed as a statically-sized char array in the struct, which would be fine.

Note that because the GUI module only knows where each value lives as an offset, we can hand the dialog function a temporary on-stack copy. Had we instead set up the GUI module with direct pointers to each field, this would not be possible, and the design would be far less flexible.
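
To make the GUI side a little more concrete, here is a minimal sketch of what the module might store and how it could read and write a field in whatever instance it is handed. The names field_desc, field_set_unsigned and field_get_unsigned are invented for this illustration and do not come from any real toolkit; a table stands in for the gui_command_add_xxx() registration calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum field_type { FIELD_UNSIGNED_INT, FIELD_BOOLEAN };

/* What the GUI module remembers about one option (hypothetical layout). */
struct field_desc {
    const char *label;    /* shown next to the control */
    enum field_type type; /* decides which control to build */
    size_t offset;        /* where the value lives inside the struct */
};

struct copy_options {
    unsigned int buffer_size;
    unsigned int copy_attributes;
};

/* The description is built once and is valid for every instance. */
const struct field_desc copy_fields[] = {
    { "Buffer size", FIELD_UNSIGNED_INT, offsetof(struct copy_options, buffer_size) },
    { "Copy attributes", FIELD_BOOLEAN, offsetof(struct copy_options, copy_attributes) },
};

/* Store a value into the described field of any instance. */
void field_set_unsigned(void *instance, const struct field_desc *f, unsigned int value)
{
    if (f->type == FIELD_BOOLEAN)
        value = (value != 0); /* normalise booleans to 0 or 1 */
    memcpy((char *)instance + f->offset, &value, sizeof value);
}

/* Read the described field back from any instance. */
unsigned int field_get_unsigned(const void *instance, const struct field_desc *f)
{
    unsigned int value;
    memcpy(&value, (const char *)instance + f->offset, sizeof value);
    return value;
}

/* Usage check: editing a temporary copy leaves the current options alone. */
void field_desc_example(void)
{
    struct copy_options current = { 4096, 0 };
    struct copy_options edit = current;

    field_set_unsigned(&edit, &copy_fields[0], 8192);
    assert(edit.buffer_size == 8192);
    assert(current.buffer_size == 4096);
}
```

The same two accessor functions would serve a second command with its own options struct, as long as that command supplies its own table of descriptors.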

unwind
Wouldn't just using a pointer also allow you to 'work with any instance of the struct'?
Robert S. Barnes
@Robert: Because it would only work with one; the one you gave it a pointer to. By using offsets, you can build the GUI for two different instances of the same struct, without having to re-describe the struct to the GUI-building code.
unwind
I'm just not getting it for some reason - could you add a short code example to your post?
Robert S. Barnes
Maybe an example is in order?
SDX2000
One problem here is a lack of type safety. Also, the description needs size information as well as offset because you may not have all members exposed. Finally, offsetof probably fails when applied to bit fields. Personally, I never used it nor had a situation where it would be useful and there was no other solution.
Skizz
@Skizz: This is C, so kind of hard to do this in a type-safe manner, I think. The description's size information is implicit in the function name, i.e. add_unsigned_integer(). If the size could vary, like for a string, you'd need to specify the size in the call, of course. You can't take the address of a bit field, so that would be problematic indeed. I have used this in production code, I find it very useful.
unwind
@unwind: Correct me if I'm wrong, but what you're proposing is similar to this http://stackoverflow.com/questions/400116/what-is-the-purpose-and-return-type-of-the-builtinoffsetof-operator/400683#400683 ? If that's the case then showing that you're basically looking up the options by name in a table at run time to get a stored offset for the named parameter would make your example way clearer.
Robert S. Barnes
@Robert: Well ... Kind of, I guess. I was not intending the names in the add_xxx() calls to be names used for lookups, though. Those are intended as GUI labels, i.e. you'd get a slider or a spin button letting you edit an integer, and that control would be labeled "Buffer size". The GUI-building module doesn't do any name-based look-ups; it's just a sequence of n parameters made available for editing, and it uses the provided type (implicit in the add_xxx() function name) and offset into the provided memory block to know where to read and write the edited values.
unwind
@unwind: The thing that's throwing me for a loop in your example is that I don't see the point of using `offsetof()` when you know the name of the member at compile time, i.e. you could just directly take the address of the member, i.e. `gui_command_add_unsigned_int(gui, "Buffer size", ` Using `offsetof()` only seems to make sense if you're doing a run-time look up of the member as in the example I linked to above.
Robert S. Barnes
@Robert: no, since the GUI module now has the ability to build, display an editing GUI for, and store data in *any* instance of the correct struct, which it would not have if it was supplied with direct pointers into a struct. See the configure() function last in my answer, where a copy is first made, the copy is edited by the GUI, and then if accepted the copy is made current. I guess this could be rewritten in this particular case (by backing up the settings, editing the current settings, and overwriting with the backup on cancel), but in general the ability to access any instance is key.
unwind
@unwind: Went back and reread your answer and finally got it, I think. The basic point is to allow the GUI code, say for modifying an int value, to write that int directly into a command's custom local struct without having to know the layout of that struct. So for example the copy command and the move command could both have local parameter structs with multiple ints in them, and the same GUI code can modify all of them just by getting the offset of the member it's working on. You could get similar results, although less efficiently, by passing the GUI code function pointers to copy routines.
Robert S. Barnes
@unwind: Showing the same GUI code working on two different structs for two different commands might make the example easier to understand. Thanks for taking the time to write the example and answer my questions!
Robert S. Barnes
+1  A: 

offsetof is used fairly often in device driver programming, where you usually have to write plain C but sometimes need some "other" features. Suppose you have a callback function which gets a pointer to some structure, and this structure is itself a member of another, bigger "outer" structure. With offsetof you can change the members of the "outer" structure when you only have access to the "inner" member.

Something like this:

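struct A
{
 int a1;
 int a2;
};

struct B
{
 int b1;
 int b2;
 struct A a;
};

void some_API_callback_func(struct A *a)
{
 /* Step back from the embedded member to the enclosing struct B.
    Only valid if a really points at the a member of a struct B. */
 struct B *b = (struct B *)((char *)a - offsetof(struct B, a));

 /* b->b1 and b->b2 are now accessible */
 (void)b;
}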

Of course this is dangerous if there is any possibility that a struct A is used on its own rather than as part of a struct B. But in the many places where the framework behind "some_API_callback_func" is well documented, this works fine.

AlexKR
Thanks, but this is exactly what I already described with regard to the container_of macro in the Linux kernel.
Robert S. Barnes
+1  A: 

Basically, anything you'd do with a pointer to member (T::*) in C++ is a good candidate for the use of offsetof in C. For that reason, offsetof is much rarer in C++.

Now this is of course a bit circular, so here are some examples:

  • Semi-generic sorting functions for structs. qsort uses a callback, which isn't ideal. Often you just need to sort by the natural order of one member, e.g. the third int in a structure. A hypothetical qsort_int could accept an offsetof argument for this purpose.
  • Similarly, it's possible to write a macro extract such that you can say int out[10]; extract(int, &MyFoo[0], &MyFoo[10], out, offsetof(struct Foo, Bar));
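
As a rough sketch of the first bullet, a qsort_int could be layered on top of qsort like this. The function is hypothetical (it is not a standard library routine), the struct Foo layout is made up for the demonstration, and the file-scope key_offset variable means this version is neither reentrant nor thread-safe:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Offset of the int member currently used as the sort key. */
static size_t key_offset;

/* qsort callback: compare the int found at key_offset in each element. */
static int compare_int_member(const void *lhs, const void *rhs)
{
    int a, b;
    memcpy(&a, (const char *)lhs + key_offset, sizeof a);
    memcpy(&b, (const char *)rhs + key_offset, sizeof b);
    return (a > b) - (a < b);
}

/* Sort nmemb elements of `size` bytes by the int member at `offset`. */
void qsort_int(void *base, size_t nmemb, size_t size, size_t offset)
{
    key_offset = offset;
    qsort(base, nmemb, size, compare_int_member);
}

/* Made-up struct for the usage check. */
struct Foo {
    char tag;
    int Bar;
};

void qsort_int_example(void)
{
    struct Foo items[] = { { 'c', 30 }, { 'a', 10 }, { 'b', 20 } };

    qsort_int(items, 3, sizeof items[0], offsetof(struct Foo, Bar));
    assert(items[0].tag == 'a');
    assert(items[1].tag == 'b');
    assert(items[2].tag == 'c');
}
```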
MSalters
+2  A: 

One of the ways I've used it in embedded systems is where I have a struct which represents the layout of non-volatile memory (e.g. EEPROM), but where I don't want to actually create an instance of this struct in RAM. You can use various nice macro tricks to allow you to read and write specific fields from the EEPROM, where offsetof does the work of calculating the address of a field within the struct.
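
A sketch of the sort of macro trick meant here, under stated assumptions: the field names are invented, and eeprom_read/eeprom_write are stand-ins backed by a plain array rather than a real EEPROM driver. The layout struct is only ever used as a type, never instantiated:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Describes the EEPROM contents; no object of this type is created. */
struct eeprom_layout {
    uint16_t serial;
    uint8_t  brightness;
    uint8_t  name[12];
};

/* Stand-in for the device: a real driver would talk to hardware instead. */
static uint8_t fake_eeprom[sizeof(struct eeprom_layout)];

void eeprom_read(size_t addr, void *dst, size_t len)
{
    memcpy(dst, &fake_eeprom[addr], len);
}

void eeprom_write(size_t addr, const void *src, size_t len)
{
    memcpy(&fake_eeprom[addr], src, len);
}

/* sizeof does not evaluate its operand, so no null pointer is dereferenced. */
#define EE_FIELD_SIZE(field) sizeof(((struct eeprom_layout *)0)->field)

/* Read or write one named field; offsetof supplies the device address. */
#define EE_READ(field, dst) \
    eeprom_read(offsetof(struct eeprom_layout, field), (dst), EE_FIELD_SIZE(field))
#define EE_WRITE(field, src) \
    eeprom_write(offsetof(struct eeprom_layout, field), (src), EE_FIELD_SIZE(field))

void eeprom_example(void)
{
    uint8_t level = 42;
    uint8_t readback = 0;

    EE_WRITE(brightness, &level);
    EE_READ(brightness, &readback);
    assert(readback == 42);
}
```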

With regard to 'evil', you have to remember that lots of stuff which was traditionally done in C programming, particularly on resource-limited platforms, now looks like evil hackery when viewed from the luxurious surroundings of modern computing.

Will Dean
+3  A: 

One legitimate use of offsetof() is to determine the alignment of a type:

#define ALIGNMENT_OF( t ) offsetof( struct { char x; t test; }, test )

It may be a bit low-level to need the alignment of an object, but in any case I'd consider this a legitimate use.
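
Here is a variant of the same idea that declares a named helper struct first instead of defining the struct inside the offsetof call, which some compilers (and C++) are less happy about. The macro names are made up for this sketch. Since C11 the language also has _Alignof, which answers the question directly.

```c
#include <assert.h>
#include <stddef.h>

/* Declare a named probe struct: one char followed by the type of interest. */
#define DECLARE_ALIGN_PROBE(name, t) \
    struct align_probe_##name { char x; t test; }

/* Padding inserted after the char places `test` on its alignment boundary. */
#define ALIGNMENT_OF_PROBE(name) offsetof(struct align_probe_##name, test)

DECLARE_ALIGN_PROBE(char, char);
DECLARE_ALIGN_PROBE(int, int);
DECLARE_ALIGN_PROBE(double, double);

void alignment_example(void)
{
    /* A char never needs padding in front of it. */
    assert(ALIGNMENT_OF_PROBE(char) == 1);
    /* An object's size is always a multiple of its alignment. */
    assert(sizeof(int) % ALIGNMENT_OF_PROBE(int) == 0);
    assert(sizeof(double) % ALIGNMENT_OF_PROBE(double) == 0);
}
```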

Michael Burr