Hello, I'm writing a FUSE plugin in C. I'm keeping track of data structures in the filesystem through structs like:

typedef struct {
    block_number_t inode;
    filename_t filename; //char[SOME_SIZE]
    some_other_field_t other_field;
} fs_directory_table_item_t;

Obviously, I have to read (write) these structs from (to) disk at some point. I could treat the struct as a sequence of bytes and do something like this:

read(disk_fd, directory_table_item, sizeof(fs_directory_table_item_t));

...except that cannot possibly work as filename is actually a pointer to the char array.

I'd really like to avoid having to write code like:

read(disk_fd, &directory_table_item->inode,       sizeof(block_number_t));
read(disk_fd,  directory_table_item->filename,    sizeof(filename_t));
read(disk_fd, &directory_table_item->other_field, sizeof(some_other_field_t));

...for each struct in the code, because I'd have to replicate code and changes in no less than three different places (definition, reading, writing).

Any DRYer but still maintainable ideas?

+5  A: 

The memory for the string is part of your struct. Although an array decays to a pointer in many expressions, the type stored in the struct is the array itself, not a pointer.

typedef struct {
    block_number_t inode;
    filename_t filename; //char[SOME_SIZE]
    some_other_field_t other_field;
} fs_directory_table_item_t;

So your read statement:

read(disk_fd, directory_table_item, sizeof(fs_directory_table_item_t));

will work and bring in the data.
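
To illustrate why the single read() works, here is a minimal sketch. It assumes SOME_SIZE is 32 and block_number_t is a uint32_t (the question specifies neither), leaves out other_field for brevity, and uses the hypothetical helper names write_item and read_item.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define SOME_SIZE 32              /* assumed; the question leaves it unspecified */

typedef uint32_t block_number_t;  /* assumed underlying type */
typedef char filename_t[SOME_SIZE];

typedef struct {
    block_number_t inode;
    filename_t filename;          /* array stored inline, not a pointer */
} dir_item_t;

/* The array's bytes are part of the struct, so sizeof covers them. */
static_assert(sizeof(dir_item_t) >= sizeof(block_number_t) + SOME_SIZE,
              "filename is stored inside the struct");

/* Write one whole item to fd. Returns 0 on success, -1 on a short or failed write. */
int write_item(int fd, const dir_item_t *item)
{
    return write(fd, item, sizeof *item) == (ssize_t)sizeof *item ? 0 : -1;
}

/* Read one whole item from fd. Returns 0 on success, -1 on a short or failed read. */
int read_item(int fd, dir_item_t *item)
{
    return read(fd, item, sizeof *item) == (ssize_t)sizeof *item ? 0 : -1;
}
```

Writing an item with write_item and reading it back with read_item over the same file or pipe returns the inode and the filename bytes unchanged. Note that read() may legitimately return fewer bytes than requested; this sketch simply treats that as an error rather than retrying.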

When reading and writing memory blocks you should take padding into consideration. Padding is extra, unused bytes that the compiler inserts to align data on suitable boundaries; e.g. a 32-bit value should often start at a 4-byte boundary in memory so that the processor can read it efficiently. This is normally nothing to be concerned about, but when persisting the struct to disk it can cause problems if you recompile the code with different settings. Compilers often provide a #pragma directive that disables padding; I think it is named #pragma pack in MS Visual C++.
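
A small sketch of how padding shows up and how to guard against layout changes at compile time. The record type padded_record_t and the function padding_bytes are hypothetical, and the expected offset and size are assumptions that hold on common 32-bit and 64-bit ABIs, not guarantees of the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record, used only to illustrate padding. */
typedef struct {
    uint8_t  tag;    /* 1 byte */
    uint32_t value;  /* 4 bytes, usually aligned to a 4-byte boundary */
} padded_record_t;

/* Bytes of the struct that belong to no member (compiler-inserted padding). */
size_t padding_bytes(void)
{
    return sizeof(padded_record_t) - (sizeof(uint8_t) + sizeof(uint32_t));
}

/* Pin the layout the on-disk format relies on. The build fails if a
   different compiler or packing setting changes it. The expected values
   are assumptions about the target ABI. */
static_assert(offsetof(padded_record_t, value) == 4, "unexpected offset of value");
static_assert(sizeof(padded_record_t) == 8, "unexpected record size");
```

With these assertions in place, a layout change becomes a compile error instead of silently corrupting files written by an earlier build.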

Anders Abel
The padding problem can also appear if you simply change your compiler.
Baltasarq
Thanks for clearing up my confusion :)
badp
Actually, we should not use read() like this in the first place. read() is unbuffered I/O, so it can be very inefficient when it is called frequently on small chunks of data. fread() is the right call.
When in doubt, use `sizeof` to tell you the size of the structure. If the bytes from the string are stored inside your data structure, then this should noticeably and significantly increase the size of your `struct`. If you are using a structure that only stores a pointer, the easiest thing to do is to store the array length in the structure and then concatenate the array to the end of the `struct` when you write it to a file (so that it is easy to find and extract later). Be careful storing pointers to disk; they will be wrong when you read the data back in.
bta
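
The length-prefix idea from the comment above could be sketched roughly as follows. The type var_item_t and the functions write_var_item and read_var_item are hypothetical names, and the sketch assumes the file is read back on a machine with the same byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory record holding a pointer to variable-length data. */
typedef struct {
    uint32_t inode;
    uint32_t name_len;   /* number of bytes in name, excluding the terminator */
    char    *name;       /* heap pointer: its value is never written to disk */
} var_item_t;

/* Write the fixed fields, then the name bytes right after them. */
int write_var_item(FILE *fp, const var_item_t *item)
{
    if (fwrite(&item->inode, sizeof item->inode, 1, fp) != 1) return -1;
    if (fwrite(&item->name_len, sizeof item->name_len, 1, fp) != 1) return -1;
    if (item->name_len > 0 &&
        fwrite(item->name, 1, item->name_len, fp) != item->name_len) return -1;
    return 0;
}

/* Read the fixed fields, allocate name_len + 1 bytes, then read the name.
   On success the caller owns item->name and must free() it.
   A real reader should also sanity-check name_len before allocating. */
int read_var_item(FILE *fp, var_item_t *item)
{
    if (fread(&item->inode, sizeof item->inode, 1, fp) != 1) return -1;
    if (fread(&item->name_len, sizeof item->name_len, 1, fp) != 1) return -1;
    item->name = malloc((size_t)item->name_len + 1);
    if (item->name == NULL) return -1;
    if (item->name_len > 0 &&
        fread(item->name, 1, item->name_len, fp) != item->name_len) {
        free(item->name);
        item->name = NULL;
        return -1;
    }
    item->name[item->name_len] = '\0';
    return 0;
}
```

Only the length and the bytes reach the disk; the pointer is rebuilt by malloc() when the record is read back, which avoids the stale-pointer problem mentioned above.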
+2  A: 

One way to do this is to make static const tables of data that describe your structures, so that a simple read/write engine can work with them.

You need to define a structure that can represent everything you need to know to read or write a single field of a single structure.

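typedef struct {
    const char *name;          // field name, for diagnostics and pretty-printing
    size_t      offset;        // byte offset of the field within the structure
    size_t      size;          // size of the field in bytes
    int         format_as;     // one AS_* type, optionally OR'd with IS_* flags
    const void *format_struct; // if format_as & IS_STRUCT, the table describing that structure type
} field_info_t;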

enum {
    AS_CHAR =1,
    AS_SHORT,
    AS_LONG,
    // add other types here
    AS_MASK = 0xFF,

    // these flags can be OR'd with type to refine the behavior
    IS_POINTER = 0x100,
    IS_STRUCT  = 0x200,
    };

Then build tables of these that describe all of your data structures.

// offsetof is declared in <stddef.h>
#define FIELD_OFF(type, field)    offsetof(type, field)
#define FIELD_SIZE(type, field)   (sizeof(((type *)0)->field))

static const field_info_t g_fs_directory_table_item_table[] = {
    { "inode",
      FIELD_OFF(fs_directory_table_item_t, inode),
      FIELD_SIZE(fs_directory_table_item_t, inode),
      AS_LONG,
      NULL
    },

    { "filename",
      FIELD_OFF(fs_directory_table_item_t, filename),
      sizeof(filename_t),
      AS_CHAR,   // stored inline as char[SOME_SIZE], so IS_POINTER is not set
      NULL
    },

    { "other_field", 
      FIELD_OFF(fs_directory_table_item_t, other_field),
      FIELD_SIZE(fs_directory_table_item_t, other_field),
      IS_STRUCT,
      &some_other_field_table,
    },
};

Then write read and write engines that take a pointer to a structure and a pointer to the table describing it, and read or write the various fields.

void ReadStructure(FILE * fh, void * pStruct, const field_info_t * pFields, int num_fields)
{
    // this is just a rough sketch of the code.
    for (int ii = 0; ii < num_fields; ++ii)
    {
       size_t field_size = pFields[ii].size;
       char * pfield = (char*)pStruct + pFields[ii].offset;
       if (pFields[ii].format_as & IS_POINTER)
           pfield = *(char**)pfield;  // follow the pointer stored in the field

       switch (pFields[ii].format_as & AS_MASK)
       {
           case AS_CHAR:
           ....
       }
    }
}
void WriteStructure(FILE * fh, const void * pStruct, const field_info_t * pFields, int num_fields);

You still end up having to maintain a field_info_t array for each of your data structures, but once you have it, you can read, write, validate and pretty-print your data with a set of fairly simple functions.
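
As a rough, self-contained illustration of the idea, here is a stripped-down variant. It is not the full engine described above: it drops format_as and nested structures and just copies each field's raw bytes, so it skips padding but does not handle byte order. The names item_t, field_desc_t, FIELD, write_fields and read_fields are hypothetical, and item_t only stands in for the question's struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the question's directory item. */
typedef struct {
    uint32_t inode;
    char     filename[32];
    uint16_t flags;
} item_t;

typedef struct {
    const char *name;   /* for diagnostics or pretty-printing */
    size_t offset;      /* byte offset of the field inside the struct */
    size_t size;        /* number of bytes the field occupies */
} field_desc_t;

/* Build one table entry from a struct type and member name. */
#define FIELD(type, member) \
    { #member, offsetof(type, member), sizeof(((type *)0)->member) }

static const field_desc_t item_fields[] = {
    FIELD(item_t, inode),
    FIELD(item_t, filename),
    FIELD(item_t, flags),
};
#define ITEM_FIELD_COUNT (sizeof item_fields / sizeof item_fields[0])

/* Write each described field in table order; padding is never written. */
int write_fields(FILE *fp, const void *obj, const field_desc_t *fields, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        const char *p = (const char *)obj + fields[i].offset;
        if (fwrite(p, fields[i].size, 1, fp) != 1)
            return -1;
    }
    return 0;
}

/* Read each described field in table order into obj. */
int read_fields(FILE *fp, void *obj, const field_desc_t *fields, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        char *p = (char *)obj + fields[i].offset;
        if (fread(p, fields[i].size, 1, fp) != 1)
            return -1;
    }
    return 0;
}
```

Adding a field to item_t then means one extra FIELD line in the table; the read and write functions stay untouched.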

John Knoeller