After learning both that strncmp is not what it seems to be and that strlcpy is not available on my operating system (Linux), I figured I could try to write it myself.

I found a quote from Ulrich Drepper, the libc maintainer, who posted an alternative to strlcpy using mempcpy. I don't have mempcpy either, but its behaviour was easy to replicate. First off, this is the test case I have:

#include <stdio.h>
#include <string.h>

#define BSIZE 10

void insp(const char* s, int n)
{
   int i;

   for (i = 0; i < n; i++)
      printf("%c  ", s[i]);

   printf("\n");

   for (i = 0; i < n; i++)
      printf("%02X ", (unsigned char)s[i]);   /* cast so bytes >= 0x80 are not sign-extended */

   printf("\n");

   return;
}

int copy_string(char *dest, const char *src, int n)
{
   int r = strlen(memcpy(dest, src, n-1));
   dest[r] = 0;

   return r;
}

int main()
{
   char b[BSIZE];
   memset(b, 0, BSIZE);

   printf("Buffer size is %d", BSIZE);

   insp(b, BSIZE);

   printf("\nFirst copy:\n");
   copy_string(b, "First", BSIZE);
   insp(b, BSIZE);
   printf("b = '%s'\n", b);

   printf("\nSecond copy:\n");
   copy_string(b, "Second", BSIZE);
   insp(b, BSIZE);

   printf("b = '%s'\n", b);

   return 0;
}

And this is its result:

Buffer size is 10                    
00 00 00 00 00 00 00 00 00 00 

First copy:
F  i  r  s  t     b     =    
46 69 72 73 74 00 62 20 3D 00 
b = 'First'

Second copy:
S  e  c  o  n  d          
53 65 63 6F 6E 64 00 00 01 00 
b = 'Second'

You can see in the internal representation (the lines insp() created) that there's some noise mixed in, like the printf() format string in the inspection after the first copy, and a foreign 0x01 in the second copy.

The strings are copied intact, and source strings that are too long are handled correctly (let's ignore the possible issue with passing 0 as the length to copy_string for now; I'll fix that later).

But why are there foreign array contents (from the format string) inside my destination? It's as if the destination was actually RESIZED to match the new length.

+4  A: 

The end of the string is marked by a \0. The memory after that can be anything: unless your OS deliberately blanks it, it's just whatever random junk was left there.

Note that in this case the 'problem' isn't in copy_string itself: you are copying exactly 9 chars (n-1), but the memory after 'First' in your main code is just random.

Martin Beckett
Oh god, I didn't consider that memcpy() doesn't stop at '\0', stupid, stupid me.
LukeN
+1  A: 

Because you are not stopping at the source size; you are stopping at the destination size, which happens to be bigger than the source, so you are copying the source string plus a bit of garbage past it.

You can easily see that you are copying your source string, with its null terminator. But since you are copying 9 bytes (n-1) with memcpy and both strings "First" and "Second" are shorter than that, you are also copying the extra bytes past them.
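A minimal sketch of a copy that stops at whichever is smaller, the source length or the destination capacity. It keeps the question's copy_string name, but it assumes n is the full size of the dest buffer and uses size_t rather than int; neither choice comes from the original post.

```c
#include <stddef.h>
#include <string.h>

/* Copy at most n-1 bytes of src into dest and always terminate dest.
 * Assumption: n is the total size of the dest buffer.
 * Returns the number of bytes copied, not counting the terminator. */
size_t copy_string(char *dest, const char *src, size_t n)
{
    size_t len;

    if (n == 0)
        return 0;               /* no room, not even for the terminator */

    len = strlen(src);          /* stop at the source's own length... */
    if (len > n - 1)
        len = n - 1;            /* ...or at the destination's capacity */

    memcpy(dest, src, len);     /* never reads past the source's terminator */
    dest[len] = '\0';
    return len;
}
```

With the question's 10-byte buffer, copy_string(b, "First", BSIZE) would copy five bytes, write the terminator at b[5], and leave b[6] through b[9] untouched.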

Francisco Soto
+1  A: 

The use of memcpy(dest, src, n-1) invokes undefined behavior if dest and src are not both at least n-1 bytes in length.

For example, First\0 is six characters in length, but you read n-1 (9) characters from it; the contents of the memory past the end of the string literal are undefined, as is the behavior of your program when you read that memory.
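One way to avoid reading past the end of the source, sketched under the assumption that n is the size of the dest buffer: look for the terminator within the first n-1 bytes with memchr and copy only up to it. The name copy_string_bounded is hypothetical. As far as I know, C11 specifies that memchr behaves as if it reads bytes sequentially and stops at the first match, which is what this relies on.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dest without reading src beyond
 * its terminator or beyond n-1 bytes, whichever comes first.
 * Assumption: n is the total size of the dest buffer. */
size_t copy_string_bounded(char *dest, const char *src, size_t n)
{
    const char *end;
    size_t len;

    if (n == 0)
        return 0;                       /* nothing can be written */

    end = memchr(src, '\0', n - 1);     /* terminator within the first n-1 bytes? */
    len = end ? (size_t)(end - src)     /* yes: copy up to the terminator */
              : n - 1;                  /* no: truncate to what fits */

    memcpy(dest, src, len);
    dest[len] = '\0';
    return len;
}
```

Compared with calling strlen first, this never scans more than n-1 bytes of the source, so a very long source string costs no more than a short one.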

James McNellis
A: 

The extra "stuff" is there because you've passed the buffer size to memcpy. It's going to copy that many characters, even when the source is shorter.

I'd do things a bit differently:

void copy_string(char *dest, char const *src, size_t n) { 
    *dest = '\0';
    strncat(dest, src, n);
}

Unlike strncpy, strncat is defined to work how most people would reasonably expect.

Jerry Coffin
People often expect `strncat` to work as `strlcat`, i.e. they expect it to take the *full* length of the target buffer, while in reality it takes the *remainder* of the length available for concatenation.
AndreyT
You actually want `if (n > 0) strncat(dest, src, n - 1)` (assuming that `n` is the size of the destination buffer).
caf
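Putting the last comment's correction into the strncat approach gives roughly the following. This is only a sketch: it assumes n is the size of the destination buffer, and the name copy_string_cat is made up here to keep it apart from the versions above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of the strncat-based copy.
 * Assumption: n is the total size of the dest buffer.
 * strncat appends at most n-1 characters and then adds the terminator,
 * so at most n bytes of dest are written. */
void copy_string_cat(char *dest, const char *src, size_t n)
{
    if (n == 0)
        return;                 /* no room for even the terminator */

    *dest = '\0';               /* start from an empty string */
    strncat(dest, src, n - 1);  /* append, leaving room for '\0' */
}
```

Passing n unchanged to strncat, as in the answer's original snippet, could write n+1 bytes when the source is n characters or longer, which is the overflow the comment points out.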