This seems like it should be really simple, but for some reason, I'm not getting it to work. I have a string called seq, which looks like this:

ala
ile
val

I want to take the first 3 characters and copy them into a different string. I use the command:

memcpy(fileName, seq, 3 * sizeof(char));

That should make fileName = "ala", right? But for some reason, I get fileName = "ala9". I'm currently working around it by just saying fileName[4] = '\0', but was wondering why I'm getting that 9.

Note: After changing seq to

ala
ile
val
ser

and rerunning the same code, fileName becomes "alaK". Not 9 anymore, but still an erroneous character.

+5  A: 

You need to set

fileName[3] = 0;

Make sure fileName has enough space for the end of string NUL byte.

florin
This is not a great solution. What if @wolfPack88 wants to use this code in another part of the project? Must the length be calculated and typed in as a constant each time?
Svisstack
@Svisstack In the context of the question asked, it is the right answer. Of course, in real life a constant should be defined and used to derive the size of the array, the number of characters in the memcpy and where to put the \0.
JeremyP
+5  A: 

You should be using fileName[3] = '\0';. As to why it's necessary: because nothing else has set the NUL terminator for the string, so you have to.

Edit: of course for real use, you don't use a constant like I've shown above. Typically you'd use something like:

char *substring(char *out, char const *in, size_t len) { 
    memcpy(out, in, len);
    out[len] = '\0';
    return out;
}

Note that you did have pretty much the right idea using memcpy though. strncpy (for an obvious example) is not really the right thing to use for this (or almost any other) purpose. On the list of standard library functions to avoid, strncpy comes second on the list, behind only gets (though, in fairness I have to point out that strtok is a close third).

Also note that (like most standard library functions) this makes no attempt at verifying the parameters you pass -- for example, if you tell it to copy 99 characters from a string that's only 10 characters long into a buffer that's only 5 characters long, it'll try to copy 99 characters anyway, producing undefined behavior.
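For illustration only, here is a rough sketch of what a checked variant might look like. The name substring_checked and the extra out_size parameter are hypothetical and not part of the answer above; the sketch assumes in is a valid null-terminated string and that out_size is the full capacity of out in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical checked variant of substring(): returns out on success,
   or NULL if the request does not fit. out_size is the total capacity
   of out in bytes, including room for the terminator. */
static char *substring_checked(char *out, size_t out_size,
                               const char *in, size_t len)
{
    assert(out != NULL && in != NULL);
    if (len >= out_size)   /* no room for len bytes plus the '\0' */
        return NULL;
    if (strlen(in) < len)  /* source string is shorter than requested */
        return NULL;
    memcpy(out, in, len);
    out[len] = '\0';
    return out;
}
```

With this version, the "99 characters from a 10-character string into a 5-byte buffer" case returns NULL instead of writing past the end of the buffer. The cost is an extra strlen over the source on every call.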

Edit2: One alternative is to use sprintf.

Jerry Coffin
I guess I was under the impression that memcpy would terminate the string for me after I moved it over... Oh well. Thanks for the help.
wolfPack88
@wolfpack88: nope -- `memcpy` is for copying memory in general, and assumes nothing about the data, so it does nothing but copy what you've told it to, verbatim.
Jerry Coffin
@Jerry Coffin: except where memory overlaps.
dreamlax
@dreamlax: yes, if there's a possibility of overlap, you want `memmove` instead.
Jerry Coffin
+16  A: 

C uses a null terminator to denote the end of a string. memcpy doesn't know that you're copying strings (it just copies bytes) so it doesn't think to put one on. The workaround you have is actually the right answer.

Edit: wolfPack88 has a good point. You really need to be changing fileName[3]. Also, the comments below bring up some great points about strncpy, which is worth learning about too.
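A minimal sketch of the fix, assuming fileName is a char array with room for at least four bytes and seq holds at least three characters. The helper name copy_first3 is made up for this example and does not appear in the question.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies the first 3 bytes of seq into fileName and
   terminates the result. fileName must have room for at least 4 bytes and
   seq must contain at least 3 characters. */
static void copy_first3(char *fileName, const char *seq)
{
    assert(fileName != NULL && seq != NULL);
    memcpy(fileName, seq, 3);  /* copies 'a', 'l', 'a' and nothing else */
    fileName[3] = '\0';        /* memcpy never adds this for you */
}
```

Without the second statement, whatever byte already sits in fileName[3] is treated as part of the string, which is where the stray '9' or 'K' comes from.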

Pace
Sorry to disagree, but nope, the workaround isn't the right answer. It's a workaround. The right answer is strncpy, as posted by Svisstack.
Bruno Brant
@Bruno: no, it's not. In fact, about the only time `strncpy` is the right answer is to a question about functions to avoid.
Jerry Coffin
@Jerry: hmm... and why is that?
Bruno Brant
@Bruno: because `strncpy` is specified to do something that's rarely useful. If your source is shorter than was specified, it pads the result to the specified size. If your source is longer than was specified, it doesn't include the NUL terminator in the result. It works fine for its original purpose (converting a string to a UNIX file name) but that's about all.
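Both behaviours can be checked with a small sketch like the one below. The function names are invented for this illustration, and some compilers may emit a truncation warning for the first case, which is exactly the behaviour being shown.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if buf[0..n-1] contains a '\0', else 0. */
static int has_terminator(const char *buf, size_t n)
{
    assert(buf != NULL);
    return memchr(buf, '\0', n) != NULL;
}

/* Source longer than the count: strncpy copies exactly sizeof dst bytes
   and writes no terminator. */
static int long_source_is_terminated(void)
{
    char dst[3];
    strncpy(dst, "alanine", sizeof dst);
    return has_terminator(dst, sizeof dst);
}

/* Source shorter than the count: strncpy fills the rest of dst with '\0'. */
static int short_source_is_padded(void)
{
    char dst[8];
    memset(dst, 'x', sizeof dst);
    strncpy(dst, "al", sizeof dst);
    for (size_t i = 2; i < sizeof dst; i++)
        if (dst[i] != '\0')
            return 0;
    return 1;
}
```

Here long_source_is_terminated() reports that dst holds 'a', 'l', 'a' with no terminator, and short_source_is_padded() reports that all six bytes after "al" were overwritten with '\0'.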
Jerry Coffin
@Bruno: There are several good explanations on stack overflow about what `strncpy()` is and isn't; one of mine is here: http://stackoverflow.com/questions/1258550/why-on-earth-would-anyone-use-strncpy-instead-of-strcpy/1258577#1258577 - but if you can dig up one of AndreyT's ones they are very comprehensive (I just can't conjure up the search mojo to find one).
caf
Jerry: Non-zero-terminated fixed length strings (or arrays of characters if you object to my calling them strings) are commonly used in embedded programming, and strncpy works fine on them.
Jeanne Pindar
@Jeanne: based on your description, `memcpy` (or `memmove` if you might have overlap) will work better. While `strncpy` can sort of work for fixed-length strings, the very best you can hope for is that it does exactly what `memcpy`/`memmove` would have.
Jerry Coffin
@Jerry and @caf: Thanks a lot for the info. It seems I have been using `strncpy` inadvertently. And to @Pace, I'm most sorry... Yours is the right answer after all.
Bruno Brant
+2  A: 

The reason is that you copy the three character bytes from seq, but there is no terminating null character. So your workaround is not a workaround but a correct solution.

C-Strings should be null-terminated. If they're not, then the "user" of the strings reads until he cannot read any further, which results in undefined behaviour.

Btw, why not use strncpy?

Henri
+2  A: 

The unexpected character is an artifact of not properly null-terminating fileName.

In this case, fileName must be a char buffer having length at least 4 (three for the three characters ala and one for the terminating null character). To set the null character, you can use:

fileName[3] = '\0';

after the memcpy.

Daniel Trebbien
+2  A: 

In addition to null-terminating your string,

fileName[3] = '\0';

You may also want to consider using strncpy instead of memcpy. Also, sizeof(char) should always evaluate to 1, so it is redundant.

Good luck!

Parappa
You mean NUL, not NULL.
lhf
@lhf: The C standard contains no mention of `NUL`; it is called the *null character*.
dreamlax
[4]? Should be [3].
Syd
@Syd - thanks. @lhf - I googled NUL vs NULL and they're both good but NULL seems to be more common
Parappa
@Parappa: `NUL` is the ASCII moniker for the null character. `NULL` usually refers to the null pointer constant. The two are fundamentally different but they share the same name. `NUL` is spelled with one L because all ASCII control characters have 2 or 3 character abbreviations. The C standard does not use the term `NUL` for the null character because the name is specific to ASCII and supersets or variations of ASCII, which is an implementation detail.
dreamlax
"NULL" is wrong to refer to the terminating character of a C string. When typed in upper case like that it refers to a macro that defines the C null pointer. You should probably write "null" (lower case) or "the null character", or on any implementation that uses the ASCII character set, or a superset thereof e.g. UTF-8, NUL would be OK.
JeremyP
Alright, I edited NULL to "null".
Parappa
+3  A: 

Strings in C are nul-terminated, meaning that you need the nul character at the end of the string. It seems you were lucky enough that a nul character happened to follow one byte later, so you only got one extra garbage character. You could just as well have gotten thousands of garbage characters...

Guffa
+8  A: 

sprintf is your friend for extracting characters from the middle of one string and putting them in a character buffer with null termination.

sprintf(fileName, "%.3s", seq);

or

sprintf(fileName, "%.*s", 3, seq);

or even

snprintf(fileName, sizeof(fileName), "%.*s", len, seq);

will give you what you want. The * version allows a variable length, and snprintf is safer for avoiding buffer overflows
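As a rough sketch, the snprintf form could be wrapped like this. The wrapper name copy_prefix and its parameter names are hypothetical. Note that the precision passed for %.*s must be an int, not a size_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper: writes at most len characters of seq into out,
   which has a capacity of out_size bytes. The result is always terminated
   when out_size > 0. Returns what snprintf returns: the number of
   characters that would have been written given enough space. */
static int copy_prefix(char *out, size_t out_size, const char *seq, int len)
{
    assert(out != NULL && seq != NULL && len >= 0);
    return snprintf(out, out_size, "%.*s", len, seq);
}
```

If the return value is greater than or equal to out_size, the output was truncated to fit. That gives the caller a way to detect a buffer that is too small, which the plain sprintf forms do not offer.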

Chris Dodd
+4  A: 

The standard library of the C language has no dedicated function for copying part of a string. The proper way to do it is to use memcpy (as you did already) and then explicitly null-terminate the result. You forgot to terminate the result, which is why you see the strange extra characters after the copied portion of the string.

Note that memcpy will only work if you know the length of the source string in advance, i.e. you know that the copied portion of the string lies entirely inside the source string. If there's a chance that the copied portion of the source contains the terminating null-character (i.e. source string ends in the middle of the copied portion), then you'd have to either write your own function for copying or use the non-standard but widely available strlcpy.
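One possible shape for such a hand-written function is sketched below. The name copy_at_most is invented for this example, and the sketch assumes out has room for len + 1 bytes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies at most len characters of in, stopping early
   if in's terminator is reached first. out must have room for len + 1
   bytes. Returns the number of characters copied, not counting the '\0'. */
static size_t copy_at_most(char *out, const char *in, size_t len)
{
    assert(out != NULL && in != NULL);
    size_t n = 0;
    while (n < len && in[n] != '\0')  /* never reads past in's terminator */
        n++;
    memcpy(out, in, n);
    out[n] = '\0';
    return n;
}
```

Unlike a bare memcpy of len bytes, this never reads beyond the end of a source string that is shorter than len, and unlike strncpy it always terminates the result and does no padding.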

Sometimes you can come across code samples that attempt to use the strncpy function for that purpose. While it might appear to "work" in some cases, there's absolutely no point in using strncpy, taking into account that it is not intended to be used that way.

AndreyT
+5  A: 

If you want to use memcpy to copy strings, you must set the character '\0' manually after the last character of the string. If you don't want to handle '\0' manually, use strcpy or strncpy instead.

gclello
Beware: strncpy() does not guarantee null termination if the source string is too long.
Jonathan Leffler
Regarding strncpy, it guarantees the destination string will be null-terminated provided that the parameter which specifies the maximum number of characters to be copied from source string is greater than the length of the source string. This behavior makes sense, since in C the memory allocated for the string must be enough to handle both the string characters and the null character. Take a look at the strncpy definition in cplusplus.com (http://cplusplus.com/reference/clibrary/cstring/strncpy/).
gclello
@gclello: Yes, but in this case the asker is copying 3 characters from a string with more than 3 chars. `strncpy` is of limited use in that case. It's faster to `memcpy` 3 chars and always set the 4th to `'\0'`.
tomlogic
@tomlogic: That's right. I only pointed out strncpy to explain why using memcpy requires the user to set the '\0' character manually.
gclello