views:

804

answers:

3

Classic memcpy gotcha with C arrays as function arguments. As pointed out below, I have an error in my code but the erroneous code worked in a local context!

I just encountered this weird behaviour in a porting job, where I'm emulating the Macintosh Picture opcode playback using objects. My DrawString object was drawing garbage on playback because it apparently failed to copy the string argument. The following is a test case I wrote - note how a manual copying loop works but memcpy fails. Tracing in the Visual Studio debugger shows the memcpy ovewrites the destination with garbage.

Memcpy on two local Str255 arrays works fine.

When one of them is a member in an object on the stack, it fails (in other testing it also fails when the object is on the heap).

The following sample code shows the memcpy being invoked in an operator=. I moved it there after it failed in a constructor but there was no difference.

typedef unsigned char Str255[257];

// snippet that works fine with two local vars
Str255 Blah("\004Blah");
Str255 dest;
memcpy(&dest, &Blah, sizeof(Str255));  // THIS WORKS - WHY HERE AND NOT IN THE OBJECT?

/*!
class to help test  CanCopyStr255AsMember
*/
class HasMemberStr255  {
public:
    HasMemberStr255()
    {
        mStr255[0] = 0;
    }

    HasMemberStr255(const Str255 s)
    {
        for (int i = 0; i<257; ++i)
        {
            mStr255[i] = s[i];
            if (s[i]==0)
                return;
        }
    }

    /// fails
    void operator=(const Str255 s)  {
        memcpy(&mStr255, &s, sizeof(Str255));
    };
    operator const Str255&() { return mStr255; }

private:
    Str255 mStr255;
};
-

/*!
Test trivial copying technique to duplicate a string
Added this variant using an object because of an apparent Visual C++ bug.
*/
void TestMacTypes::CanCopyStr255AsMember()
{
    Str255 initBlah("\004Blah");
    HasMemberStr255 blahObj(initBlah);
// using the operator= which does a memcpy fails   blahObj = initBlah;

    const Str255& dest = blahObj;  // invoke cast operator to get private back out
    CPPUNIT_ASSERT( dest[0]=='\004' );
    CPPUNIT_ASSERT( dest[1]=='B' );
    CPPUNIT_ASSERT( dest[2]=='l' );
    CPPUNIT_ASSERT( dest[3]=='a' );
    CPPUNIT_ASSERT( dest[4]=='h' );
    CPPUNIT_ASSERT( dest[5]=='\0' );  //  trailing null
}
+4  A: 

You should write memcpy(mStr255, s, sizeof(Str255));. Without '&'. Str255 is already a pointer. That's according to C++ Standard 4.2:

An lvalue or rvalue of type “array of N T” or “array of unknown bound of T” can be converted to an rvalue of type “pointer to T.” The result is a pointer to the first element of the array.

Why does it work somewhere? There are two different pointers (for mStr255 and &mStr255) and they has different types — unsigned char * and unsigned char (*)[257]. The address of the array is the same as the address of the first element in the array, but when you pass it as argument to a function you will get address of variable on the stack. By typefing Str255 you are hide the difference. Check the following sample:

unsigned char Blah[10] = "\004Blah";

struct X
{
    void f1( unsigned char(&a)[10] ) // first case (1)
    {
      void* x1 = &a; // pointer to array of unsigned char
      void* x2 = a;  // pointer to unsigned char due to implicit conversion array-to-pointer
    }
    void f2( unsigned char* a )     // second case (2)
    {
      void* x1 = &a; // pointer to variable 'a' which is on the stack
      void* x2 = a;  // pointer to unsigned char
    }
    unsigned char x[10];
};

int main( int argc, char ** argv )
{
    X m;
    m.f1( Blah ); // pass by reference
    m.f2( Blah ); // implicit array-to-pointer conversion

    return 0;
}

When you are write void f( Str255 a ), it is equal to the second case.

Kirill V. Lyadvinsky
Yes, that worked!So, why does it work with two local variables?I am feeling stupid, I cut this code down from memcpy( which was moving stuff up as well as copying.
Andy Dent
in the case of local variables, you are overwriting the unused stack, say [...used stack space|Blaha|dest|unused stack space...], that is, you written 257 bytes from the position of "dest" toward unused direction.
EffoStaff Effo
EffoStaff Effo
I don't think that 'luck' has anything to do with it... just seen your edit, you make the rules clearer, now. Perhaps you can just remove the reference to being lucky?
Charles Bailey
A: 

If I'm reading correctly (and my C++ is a little rusty), your class never actually allocates space for the mStr variable. You declare it (but don't appear to allocate it) in the private section, and you initialize the first element to 0 in the constructor, but you don't appear to every actually construct a Str255 object.

You may need to replace the private declaration with Str255 mStr(), or you may need to do something in the constructor, like mStr = new Str255()

atk
Yes, you're rusty. Str255 is an old-fashioned C array so allocates space inherently.
Andy Dent
The member "Str255 mStr255" in the original post's code snippet could be expanded to "unsigned char mStr255[255]", which does use memory within that allocated for it's HasMemberStr255 instance.
Will
Ah - thanks for the corrections!
atk
+3  A: 

This is probably a good example of why (in my opinion) it's a bad idea to typedef array types.

Unlike in other contexts, in function declarations a parameter of array type is always adjusted to an equivalent pointer type. When an array is passed to the function it always decays into a pointer to the first element.

These two snippets are equivalent:

typedef unsigned char Str[257];
Str src = "blah";
Str dst;
memcpy( &dst, &src, sizeof(Str) ); // unconventional


unsigned char src[257] = "blah";
unsigned char dst[257];
memcpy(&dst, &src, sizeof(unsigned char[257])); // unconventional

In this latter case &dst and &src are both of type unsigned char (*)[257] but the value of these pointers are the same as the value of pointers to the first element of each array, which is what dst and src would decay into if passed directly into memcpy like this.

memcpy(dst, src, sizeof(unsigned char[257])); // more usual

memcpy takes void* arguments so the types of the original pointers don't matter, only their values.

Because of the rule for parameter declarations (an array type of any or unspecified size is adjusted to the equivalent pointer type), these declarations for fn are all equivalent:

typedef unsigned char Str[257];
void fn( Str dst, Str src );


void fn( unsigned char dst[257], unsigned char src[257] );


void fn( unsigned char dst[], unsigned char src[] );


void fn( unsigned char* dst, unsigned char* src );

Looking at this code, it is more obvious that the values being passed into memcpy in this case are pointers to the passed pointers, and not pointers to the actual unsigned char arrays.

// Incorrect
void fn( unsigned char* dst, unsigned char* src )
{
    memcpy(&dst, &src, sizeof(unsigned char[257]));
}

With a typedef, the error is not so obvious, but still present.

// Still incorrect
typedef unsigned char Str[257];
void fn( Str dst, Str src )
{
    memcpy(&dst, &src, sizeof(Str));
}
Charles Bailey
Andy Dent
EffoStaff Effo
"if you wrote good c" - I'm porting it, I didn't write it! I can't use structs for things which assume the original Apple definitions which are typedefs. As I said, normally when porting code like this I use objects and operator overloading to get robust types.
Andy Dent