I don't understand how the reallocation of memory for a struct allows me to insert a larger char array into my struct.

Struct definition:

typedef struct props
{
    char northTexture[1];
    char southTexture[1];
    char eastTexture[1];
    char westTexture[1];
    char floorTexture[1];
    char ceilingTexture[1];
} PROPDATA;

example:

void SetNorthTexture( PROPDATA* propData, char* northTexture )
{
    if( strlen( northTexture ) != strlen( propData->northTexture ) )
    {
        PROPDATA* propPtr = (PROPDATA*)realloc( propData, sizeof( PROPDATA ) +
            sizeof( northTexture ) );
        if( propPtr != NULL )
        {
            strcpy( propData->northTexture, northTexture );
        } 
    }
    else
    {
        strcpy( propData->northTexture, northTexture );
    }
}

I have tested something similar to this and it appears to work; I just don't understand how it works. Now I expect some people are thinking "just use a char*", but I can't, for whatever reason. The string has to be stored in the struct itself.

My confusion comes from the fact that I haven't resized my struct for any specific purpose. I haven't somehow indicated that I want the extra space to be allocated to the north texture char array in that example. I imagine the extra bit of memory I allocated is used for actually storing the string, and somehow when I call strcpy, it realises there is not enough space...

Any explanations on how this works (or how this is flawed even) would be great.

+1  A: 

strcpy is not that intelligent, and it is not really working.

The call to realloc() allocates enough space for the string, so it doesn't actually crash, but when you strcpy the string to propData->northTexture you may be overwriting anything following northTexture in propData: propData->southTexture, propData->eastTexture, etc.

For example, if you called SetNorthTexture(prop, "texture"); and printed out the different textures, then you would probably find that:

 northTexture is "texture"
 southTexture is "exture"
 eastTexture is "xture" etc (assuming that the arrays are byte aligned). 
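The overlap can be reproduced without invoking undefined behaviour by doing the same copy into a plain byte buffer and looking at the offsets the compiler assigned to each field. This is only a sketch: south_view_after_north_write is a hypothetical helper, and it assumes the compiler packs the one-byte arrays with no padding (typical, but not guaranteed).

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <string.h>

typedef struct props
{
    char northTexture[1];
    char southTexture[1];
    char eastTexture[1];
    char westTexture[1];
    char floorTexture[1];
    char ceilingTexture[1];
} PROPDATA;

/* Hypothetical helper: copies `text` to the byte offset where northTexture
   lives inside the raw buffer `raw`, then returns a pointer to the offset
   where southTexture lives, i.e. what a reader of that field would see. */
const char *south_view_after_north_write(char *raw, size_t rawSize,
                                         const char *text)
{
    size_t northOffset = offsetof(PROPDATA, northTexture);

    /* The raw buffer must be large enough for the whole string. */
    assert(northOffset + strlen(text) + 1 <= rawSize);

    strcpy(raw + northOffset, text);
    return raw + offsetof(PROPDATA, southTexture);
}
```

With the fields one byte apart, writing "texture" at northTexture makes the southTexture view read "exture", because it simply starts one byte further into the same string.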

Assuming you don't want to statically allocate char arrays big enough to hold the largest strings, and if you absolutely must have the strings in the structure then you can store the strings one after the other at the end of the structure. Obviously you will need to dynamically malloc your structure to have enough space to hold all the strings + offsets to their locations.

This is very messy and inefficient as you need to shuffle things around if strings are added, deleted or changed.

Dipstick
Ah, ok. That's no good. This is actually what I would have expected to be the case. Any ideas on how to resize it properly?
Matt
+6  A: 

Is this C or C++? The code you've posted is C, but if it's actually C++ (as the tag implies) then use std::string. If it's C, then there are two options.

If (as you say) you must store the strings in the structure itself, then you can't resize them. C structures simply don't allow that. That "array of size 1" trick is sometimes used to bolt a single variable-length field onto the end of a structure, but can't be used anywhere else because each field has a fixed offset within the structure. The best you can do is decide on a maximum size, and make each an array of that size.
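A minimal sketch of the fixed-maximum-size option, assuming a limit of 64 bytes per name. MAX_TEXTURE_NAME and the return-value convention are assumptions made for illustration, not part of the original code.

```c
#include <assert.h>
#include <string.h>

#define MAX_TEXTURE_NAME 64   /* assumed limit, including the terminating '\0' */

typedef struct props
{
    char northTexture[MAX_TEXTURE_NAME];
    char southTexture[MAX_TEXTURE_NAME];
    char eastTexture[MAX_TEXTURE_NAME];
    char westTexture[MAX_TEXTURE_NAME];
    char floorTexture[MAX_TEXTURE_NAME];
    char ceilingTexture[MAX_TEXTURE_NAME];
} PROPDATA;

/* Returns 1 if the name fits and was stored, 0 if it is too long
   (in which case the field is left unchanged). */
int SetNorthTexture(PROPDATA *propData, const char *northTexture)
{
    assert(propData != NULL && northTexture != NULL);

    if (strlen(northTexture) >= sizeof propData->northTexture)
        return 0;

    strcpy(propData->northTexture, northTexture);
    return 1;
}
```

No realloc is needed here: every field has room for the longest permitted name, at the cost of wasting space for short ones.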

Otherwise, store each string as a char*, and resize with realloc.
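The char* option might look roughly like this. PROPDATA_PTR and SetTexture are hypothetical names, only two fields are shown, and each pointer is assumed to be either NULL or a heap string owned by the struct.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct props_ptr
{
    char *northTexture;   /* NULL or a heap-allocated string */
    char *southTexture;   /* NULL or a heap-allocated string */
} PROPDATA_PTR;

/* Replaces the string held in *field with a copy of `value`.
   Returns 1 on success, 0 if the allocation failed (the old
   string is then still valid and still owned by *field). */
int SetTexture(char **field, const char *value)
{
    assert(field != NULL && value != NULL);

    size_t len = strlen(value) + 1;          /* include the '\0' */
    char *resized = realloc(*field, len);    /* realloc(NULL, n) behaves like malloc(n) */
    if (resized == NULL)
        return 0;

    memcpy(resized, value, len);
    *field = resized;
    return 1;
}
```

A call such as SetTexture(&props.northTexture, "brick.png") resizes only that one string; the struct itself never changes size, and each string must eventually be passed to free.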

Mike Seymour
+1 for mentioning the size 1 array at end of struct convention, and the correct fallbacks for c++ and c
Matt Joiner
A: 

My confusion comes from the fact that I haven't resized my struct for any specific purpose.

In low level languages like C there is some kind of distinction between structs (or types in general) and actual memory. Allocation basically consists of two steps:

  1. Allocation of raw memory buffer of right size
  2. Telling the compiler that this piece of raw bytes should be treated as a structure

When you call realloc, you do not change the structure, but you change the buffer it is stored in, so you can use extra space beyond the structure.

Note that, although your program will not crash, it's not correct. When you put text into northTexture, you will overwrite other structure fields.
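To make the buffer/type distinction concrete, here is a small sketch with a hypothetical GrowProps helper. It also shows a detail the question's code misses: realloc may move the block, so the returned pointer has to be used afterwards, not the old one.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct props
{
    char northTexture[1];
    char southTexture[1];
    char eastTexture[1];
    char westTexture[1];
    char floorTexture[1];
    char ceilingTexture[1];
} PROPDATA;

/* Grows the memory block holding *propData by extraBytes.
   Returns the (possibly moved) block, or NULL on failure, in which
   case the old block is untouched and still valid.
   sizeof(PROPDATA) is a compile-time constant: the struct layout and
   field offsets are the same before and after the call. */
PROPDATA *GrowProps(PROPDATA *propData, size_t extraBytes)
{
    assert(propData != NULL);
    return realloc(propData, sizeof(PROPDATA) + extraBytes);
}
```

A caller would write something like `PROPDATA *grown = GrowProps(p, 100); if (grown != NULL) p = grown;`. The extra 100 bytes sit after the last field; nothing associates them with northTexture.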

el.pescado
A: 

This answer is not to promote the practice described below, but to explain things. There are good reasons not to use malloc, and the suggestions in other answers to use std::string are valid.

I think You have come across the trick used, for example, by Microsoft to avoid the cost of a pointer dereference. In the case of Unsized Arrays in Structures (please check the link) it relies on a non-standard extension to the language. You can use a trick like that even without the extension, but only for the struct member that is positioned at its end in memory. Usually the last member in the structure declaration is also the last in memory, but check this question to know more about it. For the trick to work, You also have to make sure the compiler won't add padding bytes at the end of the structure.

The general idea is like this: Suppose You have a structure with an array at the end like

struct MyStruct
{
    int someIntField;
    char someStr[1];
};

When allocating on the heap, You would normally say something like this

MyStruct* msp = (MyStruct*)malloc(sizeof(MyStruct));

However, if You allocate more space than Your struct actually occupies, You can reference the bytes that are laid out in memory right behind the struct with "out of bounds" access to the array elements. Assuming some typical sizes for the int and the char, and lack of padding bytes at the end, if You write this:

MyStruct* msp = (MyStruct*)malloc(sizeof(MyStruct) + someMoreBytes);

The memory layout should look like:

|    msp   |   msp+1  |   msp+2  |   msp+3  |   msp+4  |   msp+5  |   msp+6  | ... |
|    <-         someIntField         ->     |someStr[0]|  <-   someMoreBytes  ->   |

In that case, You can reference the byte at the address msp+6 (counting in bytes from the start of the struct, as in the diagram) like this:

msp->someStr[2];
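For comparison, C99 standardised this pattern as the flexible array member: the last member is declared with empty brackets instead of [1]. The sketch below uses that form rather than the size-1 trick described above; MyStruct_Create is a hypothetical helper, and the code is C, not C++.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct MyStruct
{
    int someIntField;
    char someStr[];   /* C99 flexible array member; must be the last member */
};

/* Allocates one block holding the struct followed by a copy of `text`.
   Returns NULL if the allocation fails; the caller frees the result
   with a single free(). */
struct MyStruct *MyStruct_Create(int value, const char *text)
{
    assert(text != NULL);

    size_t len = strlen(text) + 1;                      /* include the '\0' */
    struct MyStruct *msp = malloc(sizeof *msp + len);   /* header + string bytes */
    if (msp == NULL)
        return NULL;

    msp->someIntField = value;
    memcpy(msp->someStr, text, len);
    return msp;
}
```

After `struct MyStruct *msp = MyStruct_Create(7, "texture");`, the expression msp->someStr[2] is the third character of the stored string, and the access is within the allocated block.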
Maciej Hehl
A: 

NOTE: This has no char array example but it is the same principle. It is just a guess of mine of what you are trying to achieve.

My opinion is that you have seen somewhere something like this:

typedef struct tagBITMAPINFO {
  BITMAPINFOHEADER bmiHeader;
  RGBQUAD          bmiColors[1];
} BITMAPINFO, *PBITMAPINFO;

What you are trying to achieve works only when the array is at the end of the struct (and there is only one such array).

For example, you allocate sizeof(BITMAPINFO)+15*sizeof(RGBQUAD) when you need to store 16 RGBQUAD structures (1 from the structure and 15 extra).

PBITMAPINFO info = (PBITMAPINFO)malloc(sizeof(BITMAPINFO)+15*sizeof(RGBQUAD));

You can access all the RGBQUAD structures like they are inside the BITMAPINFO structure:

info->bmiColors[0]
info->bmiColors[1]
...
info->bmiColors[15]

You can do something similar to an array declared as char bufStr[1] at the end of a struct.

Hope it helps.

Iulian Şerbănoiu
A: 

One approach to keeping a struct and all its strings together in a single allocated memory block is something like this:

struct foo {
    ptrdiff_t s1, s2, s3, s4;
    size_t bufsize;
    char buf[1];
} bar;

Allocate sizeof(struct foo)+total_string_size bytes and store the offset of each string in the s1, s2, etc. members. bar.buf+bar.s1 is then a pointer to the first string, bar.buf+bar.s2 a pointer to the second string, and so on.
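A rough sketch of that layout with two strings. It assumes a C99 flexible array member in place of buf[1], and foo_create, foo_first and foo_second are hypothetical helpers added for illustration.

```c
#include <assert.h>
#include <stddef.h>   /* ptrdiff_t */
#include <stdlib.h>
#include <string.h>

struct foo {
    ptrdiff_t s1, s2;   /* offsets of each string within buf */
    size_t bufsize;     /* number of bytes used in buf */
    char buf[];         /* flexible array member, used here instead of buf[1] */
};

/* Builds one block containing the header and both strings back to back.
   Returns NULL if the allocation fails. */
struct foo *foo_create(const char *first, const char *second)
{
    assert(first != NULL && second != NULL);

    size_t n1 = strlen(first) + 1;    /* sizes include the '\0' */
    size_t n2 = strlen(second) + 1;

    struct foo *f = malloc(sizeof *f + n1 + n2);
    if (f == NULL)
        return NULL;

    f->bufsize = n1 + n2;
    f->s1 = 0;
    f->s2 = (ptrdiff_t)n1;            /* second string starts right after the first */
    memcpy(f->buf + f->s1, first, n1);
    memcpy(f->buf + f->s2, second, n2);
    return f;
}

/* Offsets are resolved on every access, so they stay valid even if the
   block is later moved by realloc. */
const char *foo_first(const struct foo *f)  { return f->buf + f->s1; }
const char *foo_second(const struct foo *f) { return f->buf + f->s2; }
```

Changing the length of the first string would mean moving the second one and updating s2, which is the shuffling cost of keeping everything in a single block.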

You can use pointers rather than offsets if you know you won't need to realloc the struct.

Whether it makes sense to do something like this at all is debatable. One benefit is that it may help fight memory fragmentation or malloc/free overhead when you have a huge number of tiny data objects (especially in threaded environments). It also reduces error handling cleanup complexity if you have a single malloc failure to check for. There may be cache benefits to ensuring data locality. And it's possible (if you use offsets rather than pointers) to store the object on disk without any serialization (keeping in mind that your files are then machine/compiler-specific).

R..