ansaurus

Question

Can someone explain this definition of the 'dirent' struct in Solaris?

Answer 1

+3 A:

The dirent struct will be immediately followed in memory by a block of memory that contains the rest of the name, and that memory is accessible through the d_name field.

Rob K 2009-02-18 22:10:25

Answer 2

+2 A:

This is a pattern used in C to indicate an arbitrary-length array at the end of a structure. Arrays in C have no built-in bounds checking, so when your code tries to access the string starting at d_name, it will continue past the end of the structure. This relies on readdir() allocating enough memory to hold the entire string plus the terminating nul.

Commodore Jaeger 2009-02-18 22:11:09

Why not just use a pointer in the struct though? To save a few bytes? I suppose at the OS level that may be the case.

TURBOxSPOOL 2009-02-18 22:49:56

A pointer doesn't do the same thing. To use a pointer would require multiple memory allocations -- one for the dirent structure and one for the name, with the dirent pointing to the name. Using the single-byte-array pattern means one single allocation with d_name being the first byte of the name.
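To make the allocation difference concrete, here is a minimal sketch. The types entry_ptr and entry_inline and their helper functions are hypothetical names made up for illustration, not the Solaris definitions, and the inline variant uses a C99 flexible array member rather than the one-element array so that the example stays within standard C.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration only; not the Solaris dirent. */
struct entry_ptr {
    long  id;
    char *name;              /* points at a separately allocated string */
};

struct entry_inline {
    long id;
    char name[];             /* name bytes follow the struct in the same block */
};

/* Pointer version: two allocations, one for the struct and one for the name. */
struct entry_ptr *entry_ptr_new(long id, const char *name)
{
    struct entry_ptr *e;

    assert(name != NULL);
    e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->name = malloc(strlen(name) + 1);
    if (e->name == NULL) {
        free(e);
        return NULL;
    }
    strcpy(e->name, name);
    e->id = id;
    return e;
}

/* The pointer version also needs two frees, in the right order. */
void entry_ptr_free(struct entry_ptr *e)
{
    if (e != NULL) {
        free(e->name);
        free(e);
    }
}

/* Inline version: one allocation holds both the header and the name. */
struct entry_inline *entry_inline_new(long id, const char *name)
{
    struct entry_inline *e;
    size_t len;

    assert(name != NULL);
    len = strlen(name) + 1;              /* include the terminating nul */
    e = malloc(sizeof *e + len);
    if (e == NULL)
        return NULL;
    e->id = id;
    memcpy(e->name, name, len);
    return e;
}
```

The inline entry is released with a single free(), and it can be copied or handed across an interface as one contiguous block, which the pointer version cannot.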

Andrew 2009-02-18 23:14:01

Got ya. Thanks for the clarification, I wasn't considering the extra allocation needed.

TURBOxSPOOL 2009-02-19 00:50:12

Answer 3

A:

It looks like a micro-optimization to me. Names are commonly short, so why allocate space that you know will go unused? Also, Solaris may support names longer than 255 characters. To use such a struct you just allocate the needed space and ignore the supposed array size.

dwc 2009-02-18 22:12:38

Answer 4

+4 A:

As others have pointed out, the last member of the struct doesn't have any set size. The array is however long the implementation decides it needs to be to accommodate the characters it wants to put in it. It does this by dynamically allocating the memory for the struct, such as with malloc.

It's convenient to declare the member as having size 1, though, because it's easy to determine how much memory is occupied by any dirent variable d:

sizeof(dirent) + strlen(d.d_name)

Using size 1 also discourages the recipient of such struct values from trying to store their own names in it instead of allocating their own dirent values. Using the Linux definition, it's reasonable to assume that any dirent value you have will accept a 255-character string, but Solaris makes no guarantee that its dirent values will store any more characters than they need to.

I think it was C99 that introduced a special case for the last member of a struct. The struct could be declared like this instead:

typedef struct dirent {
  ino_t d_ino;
  off_t d_off;
  unsigned short d_reclen;
  char d_name[];
} dirent_t;

The array has no declared size. This is known as the flexible array member. It accomplishes the same thing as the Solaris version, except that there's no illusion that the struct by itself could hold any name. You know by looking at it that there's more to it.

Using the "flexible" declaration, the amount of memory occupied would be adjusted like so:

sizeof(dirent) + strlen(d.d_name) + 1

That's because the flexible array member does not factor into the size of the struct.
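One caveat about the size formulas above: sizeof may include trailing padding after the last member, so a sizeof-based sum can come out a few bytes larger than strictly needed. That is harmless for an allocation, but if an exact figure is wanted, offsetof gives the position where the name starts. A rough sketch, where struct rec is a hypothetical stand-in that uses plain integer types in place of ino_t and off_t:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the flexible declaration; not a system type. */
struct rec {
    unsigned long  ino;
    long           off;
    unsigned short reclen;
    char           name[];   /* flexible array member */
};

/* Exact byte count: header up to the name, the name itself, and the nul. */
size_t rec_size_exact(const char *name)
{
    assert(name != NULL);
    return offsetof(struct rec, name) + strlen(name) + 1;
}

/* The sizeof-based formula; it may also count trailing padding. */
size_t rec_size_sizeof(const char *name)
{
    assert(name != NULL);
    return sizeof(struct rec) + strlen(name) + 1;
}
```

Either value is safe to pass to malloc. The sizeof version is simply an upper bound on the exact one.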

The reason you don't see flexible declarations like that more often, especially in OS library code, is likely for the sake of compatibility with older compilers that don't support that facility. It's also for compatibility with code written to target the current definition, which would break if the size of the struct changed like that.

Rob Kennedy 2009-02-18 23:58:47

Actually, the d_reclen field holds the real size of this struct instance, so you do not have to compute it yourself.
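A stored record length is also what lets a reader step through variable-length records packed back to back in one buffer. The sketch below shows the general idea under stated assumptions: struct rec, rec_append and rec_nth are hypothetical names, and the layout is made up for illustration rather than taken from any real directory format.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variable-length record; reclen plays the role of d_reclen. */
struct rec {
    unsigned short reclen;   /* total bytes in this record, padding included */
    char           name[];
};

/* Write a record at offset `used` in buf. Returns the new offset, or 0 if
 * the record does not fit. buf must be suitably aligned for struct rec. */
size_t rec_append(void *buf, size_t cap, size_t used, const char *name)
{
    const size_t align = sizeof(unsigned short);
    size_t need;
    struct rec *r;

    assert(buf != NULL && name != NULL);
    need = offsetof(struct rec, name) + strlen(name) + 1;
    need = (need + align - 1) / align * align;   /* keep the next record aligned */
    if (used > cap || need > cap - used || need > USHRT_MAX)
        return 0;
    r = (struct rec *)((char *)buf + used);
    r->reclen = (unsigned short)need;
    strcpy(r->name, name);
    return used + need;
}

/* Return the name of the n-th record (0-based) by hopping reclen bytes at a
 * time, or NULL if there are fewer than n + 1 records. */
const char *rec_nth(const void *buf, size_t used, size_t n)
{
    size_t pos = 0;

    while (pos < used) {
        const struct rec *r = (const struct rec *)((const char *)buf + pos);
        if (r->reclen == 0)      /* malformed record; stop rather than loop forever */
            return NULL;
        if (n == 0)
            return r->name;
        n--;
        pos += r->reclen;
    }
    return NULL;
}
```

The reader never calls strlen to find the next record. It trusts the stored length, which is why the writer has to fill it in correctly.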

Raim 2009-02-19 02:43:09

Ah, you're right. The one allocating the struct still needs to figure out how much to allocate, though, so it's still nice to have an easy way to calculate it.

Rob Kennedy 2009-02-19 04:42:37
