tags:

views:

215

answers:

7

Given:

char test[]="bla-bla-bla";

Which is more correct:

(1)

char *test1 = malloc(strlen(test));
strcpy(test1,test);

or (2)

char *test1 = malloc(sizeof(test));
strcpy(test1,test);
+1  A: 
char test[]="bla-bla-bla";
char *test1 = malloc(strlen(test) + 1); // +1 for the extra NULL character
strcpy(test1, test);
mmonem
+6  A: 
char test[]="bla-bla-bla";
char *test1 = malloc(strlen(test) + 1);
strcpy(test1,test);
Tomasz Kowalczyk
strlen(test) + 1. strlen doesn't count the necessary NULL.
Aram Hăvărneanu
corrected, sir!
Tomasz Kowalczyk
-1 for `sizeof(char)`.
R..
Ok, as you wish. I always use sizeof(type) to have more variable-OS-friendly code.
Tomasz Kowalczyk
@Tomasz: Except `sizeof(char)` is defined as equal to `1` byte. As for the number of *bits* in it, however...
Jon Purdy
`sizeof(char)` is defined by the C standard to always be 1.
Tyler McHenry
So what is wrong with putting sizeof(char) into the code? It makes it more explicit in what you are trying to do. Thus if the code is ever changed to use wchar_t for example then it would be easier to spot where the code needs to be changed. So: Though not required here I see it as a benefit to include it in the code as it makes the meaning of the code more precise.
Martin York
I've strongly argued against writing `sizeof(char)` for years. It adds visual clutter, and the semantics of object size in C have been units of `sizeof(char)` since the beginning. That makes an expression clear and simple in a common case, and calls attention to the dependence on a typed allocation in the other cases.
RBerteig
@Martin York: Since you are using the return value of `strlen()`, `char` is implied - if you changed to `wchar_t` you would need to change to `wcslen()` anyway. If you do want to use `sizeof`, I suggest `sizeof test1[0]` (since that will automatically change if you change the type of `test1`).
caf
+3  A: 

Neither:

#include <string.h>
char *mine = strdup(test);
Matt Joiner
This is essentially equivalent to the version using `strlen` -- *on the implementations that provide it.* The problem, of course, is that it's an extension, so you can't depend on its presence.
Jerry Coffin
Of course you can easily write your own `strdup` if it's missing.
R..
`strdup` is not defined by the Standard. After using it you need to call `free()`, so you must also `#include <stdlib.h>`.
pmg
@pmg: You'd need to #include `<stdlib.h>` to use `malloc` so that's kind of a given already.
Billy ONeal
@R: Although it's easy, if you write your own `strdup`, you get undefined behavior; names starting with `str` are reserved. Fortunately, it's just as easy to write your one `dupe_string` (or whatever other name you prefer).
Jerry Coffin
On most good systems, strdup is defined as a weak reference. This means you can override it for your own code. Of course this doesn't make it a good idea, but a quick fix for crappy systems that don't provide it.
Matt Joiner
+3  A: 

I think sizeof is the correct one.Reason behind that is strlen(str) will give you length of the string( excluding the terminating null).And if you are using strcpy,it actually copy the whole string including the terminating null,so you will allocate one byte less if you use strlen in malloc.But sizeof gives the size of the string pointed by test,including the terminating null,so you will get correct size malloc chunk to copy the string including the terminating null.

Anil Vishnoi
sizeof() will return a surprisingly 'wrong' result if you pass a (char *) object to it. It will return "the number of bytes necessary to store a pointer to 'char'", which is not necessarily equal to the size of the string pointed at by the pointer.
Giorgos Keramidas
+1: In the code shown, the `sizeof()` solution is correct and the `strlen()` one is incorrect. The `sizeof()` solution might over-allocate memory if the string has been shortened since it was created. However, the definition of the array must be in scope for it to work - so the more generally safe solution uses `strlen(str)+1`, or assumes a POSIX environment and uses `strdup()`.
Jonathan Leffler
A: 

(1) with strlen but not adding 1 is definitely incorrect. If you add 1, it would have the added benefit that it also works for pointers, not just arrays.

On the other hand, (2) is preferred as long as your string is actually an array, as it results in a compile-time constant, rather than a call to strlen (and thus faster and smaller code). Actually a modern compiler like gcc can probably optimize the strlen out if it knows the string is constant, but it may be hard for the compiler to determine this, so I'd always use sizeof when possible.

R..
In all situations where you can correctly use `sizeof`, the compiler **will** figure out that `strlen` can be optimized. So there’s simply no reason to use `sizeof` *at all*.
Konrad Rudolph
@Konrad: `char test[] = "bla bla \0 bla";`; `strlen(test) == 8`
pmg
@Konrad: not true. It's possible the contents of the string are variable, but always fit in a small fixed-size array. In this case, `strlen` **cannot** be optimized out, but allocating a fixed-size object the same size as the array is always suitable.
R..
@pmg: that’s a completely different use-case. We were (implicitly) talking about zero-delimited strings here, otherwise `strlen` would make no sense at all. When dealing with raw data that may contain null bytes, treating that data as a (zero-delimited) string is obviously a bug.
Konrad Rudolph
A: 

1) definitely causes UB

2) may cause UB (if malloc fails)

I'd go with 2) as there is a better chance of the construct working as intended; or even better I'd write a version that works as intended (without UB) in all situations.


Edit

  • Undefined Behaviour in 1)

    test1 will have space for the characters in test, but not for the terminating '\0'. The call to strcpy() will try to write a '\0' to memory that does not belong to test1, hence UB.

  • Undefined Behaviour in 2)

    If the call to malloc() fails to reserve the requested memory, test1 will be assigned NULL. Passing NULL to strcpy() invokes UB.

The return value of calls to malloc() (and calloc() and friends) should always be tested to ensure the operation worked as expected.

pmg
You know, it would probably be helpful to the OP if you would explain why UB is getting involved (the fact that he's not handling malloc failures).
Billy ONeal
Thanks, Billy. Post edited.
pmg
Actually, both cases have the same UB due to failure to validate the return of `malloc()`. The first case has the bonus UB from not allocating room for the NUL.
RBerteig
+2  A: 

You should use strlen, because sizeof() will fail silently if you change test to be a run-time defined string. This means that strlen is a far safer idea than sizeof as it will keep working.

DeadMG
at run-time the variable also can be uninitialized, strlen/strcpy will crash.
At least in that case you'd get a crash instead of silently the wrong value. In addition, it's much easier for the compiler to warn you about potential uninitialized variable use. Finally, if you pass an uninitialized variable, it doesn't matter how you determined it's size.
DeadMG