I am new to C and quite confused by C strings. Here are my questions.

Finding last character from a string

How can I find the last character of a string? I came up with something like this:

char *str = "hello";
printf("%c", str[strlen(str) - 1]);
return 0;

Is this the way to go? I suspect it is not the right way, because strlen has to iterate over the characters to get the length, so this operation has O(n) complexity.

Converting char to char*

I have a string and need to append a char to it. How can I do that? strcat accepts only char*. I tried the following:

char delimiter = ',';
char text[6];
strcpy(text, "hello");
strcat(text, delimiter);

Using strcat with variables that have local scope

Please consider the following code:

void foo(char *output)
{
   char *delimiter = ',';
   strcpy(output, "hello");
   strcat(output, delimiter);
}

In the above code, delimiter is a local variable which gets destroyed after foo returns. Is it OK to append it to the variable output?

How does strcat handle the null terminating character?

If I am concatenating two null terminated strings, will strcat append two null terminating characters to the resultant string?

Is there a good beginner-level article that explains how strings work in C and how to perform the usual string manipulations?

Any help would be great!

+5  A: 
  1. O(n) is the best you can do, because of the way C strings work.
  2. char delimiter[] = ",";. This makes delimiter a character array holding a comma and a NUL. Also, text needs to have length 7: hello is 5, then you have the comma, and a NUL.
  3. If you define delimiter correctly, that's fine (as is, you're assigning a character to a pointer, which is wrong). The contents of output won't depend on delimiter later on.
  4. It will overwrite the first NUL.
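
Putting points 2 to 4 together, a minimal sketch of the corrected snippet might look like the following. The function name build_hello_comma is made up for illustration; the 7-byte destination and the array-style delimiter are the fixes described above.

```c
#include <string.h>

/* Hypothetical helper (not from the question): writes "hello," into out.
   out must have room for 7 bytes: 5 letters, the comma, and the NUL. */
void build_hello_comma(char out[7])
{
    const char delimiter[] = ",";  /* two bytes: ',' and the terminating NUL */

    strcpy(out, "hello");          /* copies 5 characters plus the NUL */
    strcat(out, delimiter);        /* overwrites that NUL, then appends "," and a new NUL */
}
```

Calling it with a char text[7] buffer should leave the string "hello," in text, with exactly one terminator at text[6].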

You're on the right track. I highly recommend you read K&R C 2nd Edition. It will help you with strings, pointers, and more. And don't forget man pages and documentation. They will answer questions like the one on strcat quite clearly. Two good sites are The Open Group and cplusplus.com.

Matthew Flaschen
Thanks. It was very helpful. On point 3, when you say "If you define delimiter correctly, that's fine", do you mean `char delimiter[] = ";"`? I guess `const char *delimiter = "'"` should also be fine.
LearningMan
@Learning, yes, either of those is fine.
Matthew Flaschen
I am trying to understand why `char *delimiter = ',';` is problematic. Won't `strcat` copy the values?
LearningMan
`','` is a character constant, which has type int. Most likely, it is the integer 44. So basically you're assigning the address 44 to a pointer-to-char. That will lead to undefined behavior (probably, but not necessarily, a segfault). A good compiler will warn you about this by default (gcc says "initialization makes pointer from integer without a cast"), and provide options (e.g. `-pedantic-errors`) to turn this into an error.
Matthew Flaschen
Ahh, it makes more sense now. Thanks again, Matthew.
LearningMan
+7  A: 
  1. Last character: your approach is correct. If you will need to do this a lot on large strings, your data structure containing strings should store lengths with them. If not, it doesn't matter that it's O(n).

  2. Appending a character: you have several bugs. For one thing, your buffer is too small to hold another character. As for how to call strcat, you can either put the character in a string (an array with 2 entries, the second being 0), or you can just manually use the length to write the character to the end.

  3. Your worry about 2 nul terminators is unfounded. While it occupies memory contiguous with the string and is necessary, the nul byte at the end is NOT "part of the string" in the sense of length, etc. It's purely a marker of the end. strcat will overwrite the old nul and put a new one at the very end, after the concatenated string. Again, you need to make sure your buffer is large enough before you call strcat!
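
As a rough sketch of the "store lengths with the strings" idea from point 1, and of writing the character at the end by hand from point 2, something like this could work. The struct lstring type, its fixed 64-byte capacity, and the function names are all assumptions made for illustration, not an existing library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type: a fixed-capacity string that remembers its length. */
struct lstring {
    char   data[64];  /* NUL-terminated contents */
    size_t len;       /* number of characters before the NUL */
};

/* Copies src into s. Returns 0 on success, -1 if src does not fit. */
int lstring_set(struct lstring *s, const char *src)
{
    size_t n = strlen(src);        /* the only O(n) scan */
    if (n >= sizeof s->data)
        return -1;
    memcpy(s->data, src, n + 1);   /* n characters plus the NUL */
    s->len = n;
    return 0;
}

/* Last character in O(1), or '\0' if the string is empty. */
char lstring_last(const struct lstring *s)
{
    return s->len ? s->data[s->len - 1] : '\0';
}

/* Appends one character in O(1). Returns -1 if the buffer is full. */
int lstring_push(struct lstring *s, char ch)
{
    if (s->len + 1 >= sizeof s->data)
        return -1;
    s->data[s->len++] = ch;        /* overwrite the old NUL */
    s->data[s->len] = '\0';        /* write the new one */
    return 0;
}
```

After lstring_set(&s, "hello"), lstring_last(&s) gives 'o' without rescanning, and lstring_push(&s, ',') turns the contents into "hello,".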

R..
+1  A: 

Finding last character from a string

I propose a thought experiment: if it were generally possible to find the last character of a string in better than O(n) time, then could you not also implement strlen in better than O(n) time?

Converting char to char*

You can temporarily store the char in an array-of-char, and that will decay into a pointer-to-char:

char delimiterBuf[2] = "";
delimiterBuf[0] = delimiter;
...
strcat(text, delimiterBuf);

If you're just using character literals, though, you can simply use string literals instead.

Using strcat with variables that have local scope

The variable itself isn't referenced outside the scope. When the function returns, that local variable has already been evaluated and its contents have already been copied.

How does strcat handle the null terminating character?

"Strings" in C are NUL-terminated sequences of characters. Both inputs to strcat must be NUL-terminated, and the result will be NUL-terminated. It wouldn't be useful for strcat to write an extra NUL byte to the result if it doesn't need to.

(And if you're wondering what if the input strings have multiple trailing NUL bytes already, I propose another thought experiment: how would strcat know how many trailing NUL-bytes there are in a string?)
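
To make the single-terminator behaviour concrete, here is a small sketch. The helper name concat_and_find_nul is invented for this example; it pre-fills the buffer with non-zero bytes so that any stray NUL would be visible, then reports where the first NUL ends up.

```c
#include <string.h>

/* Hypothetical helper: builds "foobar" in out (which must hold 7 bytes)
   and returns the index of the first NUL byte in the result. */
size_t concat_and_find_nul(char out[7])
{
    memset(out, 'x', 7);   /* non-zero filler: x x x x x x x */
    strcpy(out, "foo");    /* f o o \0 x x x */
    strcat(out, "bar");    /* f o o b a r \0  (old NUL at index 3 is overwritten) */
    return (size_t)((char *)memchr(out, '\0', 7) - out);
}
```

The first and only NUL is at index 6, right after "foobar"; the NUL that terminated "foo" at index 3 has been replaced by 'b'.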

BTW, since you tagged this with "best-practices", I'll also recommend that you take care not to write past the end of your destination buffers. Typically this means avoiding strcat and strcpy (unless you've already checked that the input strings won't overflow the destination) and using safer versions (e.g. strncat). Note that strncpy has its own pitfalls, so it's a poor substitute. There are also safer versions that are non-standard, such as strlcpy/strlcat and strcpy_s/strcat_s.

Similarly, functions like your foo function always should take an additional argument specifying what the size of the destination buffer is (and documentation should make it explicitly clear whether that size accounts for a NUL terminator or not).
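
One possible shape for such a size-aware foo is sketched below. Note that this swaps in snprintf rather than any of the functions named above, simply because it is standard and reports truncation; the name foo_bounded and the return convention are assumptions for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical size-aware variant of foo.
   size is the full capacity of output in bytes, INCLUDING room for the NUL.
   Returns 0 on success, -1 if the result was truncated or formatting failed. */
int foo_bounded(char *output, size_t size)
{
    const char *delimiter = ",";

    /* snprintf never writes more than size bytes and returns the length
       the full result would have had (not counting the NUL). */
    int n = snprintf(output, size, "%s%s", "hello", delimiter);

    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

With a 7-byte buffer this succeeds and produces "hello,"; with a 6-byte buffer it returns -1 and leaves the truncated but still NUL-terminated "hello" behind.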

jamesdlin
+1  A: 

How can I find out the last character from a string?

Your technique with str[strlen(str) - 1] is fine. As pointed out, you should avoid repeated, unnecessary calls to strlen and store the results.

I somehow think that, this is not the correct way because strlen has to iterate over the characters to get the length. So this operation will have a O(n) complexity.

Repeated calls to strlen can be a bane of C programs. However, you should avoid premature optimization. If a profiler actually demonstrates a hotspot where strlen is expensive, then you can do something like this for your literal string case:

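const char test[] = "foo";
sizeof test;     /* 4: three characters plus the terminating NUL */
sizeof test - 1; /* 3: the string length, known at compile time */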

Of course if you create 'test' on the stack, it incurs a little overhead (incrementing/decrementing stack pointer), but no linear time operation involved.

Literal strings are generally not going to be so gigantic. For other cases, like reading a large string from a file, you can store the length of the string in advance to avoid recomputing it. This can also be helpful because it tells you in advance how much memory to allocate for your character buffer.

I have a string and need to append a char to it. How can i do that? strcat accepts only char*.

If you have a char and cannot make a string out of it (char* c = "a"), then you can use strncat, which reads at most n characters from the source and so does not need the source to be NUL-terminated:

char ch = 'a';
strncat(str, &ch, 1);
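
For completeness, here is the same strncat call wrapped with a capacity check, as a sketch only. The wrapper name append_char_strncat and its cap parameter (the total size of the destination array) are assumptions, not part of the standard library.

```c
#include <string.h>

/* Hypothetical wrapper: appends ch to the string in buf.
   cap is the total size of buf in bytes. buf must already hold a valid string.
   Returns 0 on success, -1 if there is no room for ch plus the NUL. */
int append_char_strncat(char *buf, size_t cap, char ch)
{
    if (strlen(buf) + 2 > cap)  /* existing text + ch + NUL must fit */
        return -1;
    strncat(buf, &ch, 1);       /* reads one byte from &ch, then writes a NUL */
    return 0;
}
```

With char s[8] = "hello", appending 'a' yields "helloa"; with a 6-byte buffer holding "hello" the call is refused instead of overflowing.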

In the above code, delimiter is a local variable which gets destroyed after foo returned. Is it OK to append it to variable output?

Yes: functions like strcat and strcpy make deep copies of the source string. They don't leave shallow pointers behind, so it's fine for the local data to be destroyed after these operations are performed.

If I am concatenating two null terminated strings, will strcat append two null terminating characters to the resultant string?

No, strcat will basically overwrite the null terminator on the dest string and write past it, then append a new null terminator when it's finished.

+3  A: 

A "C string" is in reality a simple array of chars, with str[0] containing the first character, str[1] the second and so on. After the last character, the array contains one more element, which holds a zero. This zero by convention signifies the end of the string. For example, these two lines are equivalent:

char str[] = "foo"; //str is 4 bytes
char str[] = {'f', 'o', 'o', 0};
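
If you want to convince yourself of that equivalence, a quick check along these lines should do. The function name initializers_match is made up for the example.

```c
#include <string.h>

/* Hypothetical check: returns 1 if both initializer forms produce
   identical 4-byte arrays, 0 otherwise. */
int initializers_match(void)
{
    char a[] = "foo";               /* size deduced from the literal: 3 + NUL */
    char b[] = {'f', 'o', 'o', 0};  /* size deduced from the 4 initializers */

    return sizeof a == 4 && sizeof b == 4 && memcmp(a, b, sizeof a) == 0;
}
```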

And now for your questions:

Finding last character from a string

Your way is the right one. There is no faster way to know where the string ends than scanning through it to find the final zero.

Converting char to char*

As said before, a "string" is simply an array of chars, with a zero terminator added to the end. So if you want a string of one character, you declare an array of two chars - your character and the final zero, like this:

char str[2];
str[0] = ',';
str[1] = 0;

Or simply:

char str[2] = {',', 0};

Using strcat with variables that have local scope

strcat() simply copies the contents of the source array to the destination array, at the offset of the null character in the destination array. So it is irrelevant what happens to the source after the operation. But you DO need to worry if the destination array is big enough to hold the data - otherwise strcat() will overwrite whatever data sits in memory right after the array! The needed size is strlen(str1) + strlen(str2) + 1.
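
Here is a minimal sketch of using that size formula to allocate a destination that is guaranteed to be big enough. The function name concat_alloc is hypothetical, and the caller is assumed to free the result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding str1
   followed by str2, or NULL if allocation fails. The caller must free it. */
char *concat_alloc(const char *str1, const char *str2)
{
    /* strlen(str1) + strlen(str2) characters, plus 1 for the final zero */
    char *result = malloc(strlen(str1) + strlen(str2) + 1);
    if (result == NULL)
        return NULL;

    strcpy(result, str1);  /* copies str1 including its zero */
    strcat(result, str2);  /* overwrites that zero, appends str2 and a new zero */
    return result;
}
```

For example, concat_alloc("hello", ",") allocates 7 bytes and returns "hello,".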

How does strcat handle the null terminating character?

The final zero is expected to terminate both input strings, and is appended to the output string.

slacker
+1: Very clear answer.
Donal Fellows
+1  A: 

How can I find out the last character from a string?

Your approach is almost correct. The only way to find the end of a C string is to iterate through the characters, looking for the nul.

There is a bug in your code though (in the general case). If strlen(str) is zero, the index strlen(str) - 1 is out of bounds and you read outside the string.
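
A guarded version might look like this sketch. The name last_char and the choice to return '\0' for an empty string are assumptions; you may prefer a different way to signal "no last character".

```c
#include <string.h>

/* Hypothetical helper: returns the last character of str,
   or '\0' if str is empty (so str[-1] is never touched). */
char last_char(const char *str)
{
    size_t len = strlen(str);
    return len > 0 ? str[len - 1] : '\0';
}
```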

I have a string and need to append a char to it. How can i do that?

Your approach is wrong. A C string is just an array of C characters with the last one being '\0'. So in theory, you can append a character like this:

char delimiter = ',';
char text[7];
strcpy(text, "hello");
size_t textSize = strlen(text);
text[textSize] = delimiter;
text[textSize + 1] = '\0';

However, if I leave it like that I'll get zillions of down votes because there are three places where I have a potential buffer overflow (if I didn't know that my initial string was "hello"). Before doing the copy, you need to put in a check that text is big enough to contain all the characters from the string plus one for the delimiter plus one for the terminating nul.
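
A sketch of what that check could look like is below. The function name copy_with_delimiter and its cap parameter (the total size of text in bytes) are invented for illustration.

```c
#include <string.h>

/* Hypothetical helper: copies src into text and appends delimiter.
   cap is the total size of text in bytes.
   Returns 0 on success, -1 if src + delimiter + nul would not fit. */
int copy_with_delimiter(char *text, size_t cap, const char *src, char delimiter)
{
    size_t len = strlen(src);

    if (cap < 2 || len > cap - 2)  /* need len + 1 (delimiter) + 1 (nul) bytes */
        return -1;

    memcpy(text, src, len);        /* the characters, without the nul */
    text[len] = delimiter;
    text[len + 1] = '\0';
    return 0;
}
```

With char text[7], copying "hello" and appending ',' succeeds; with char text[6] the function returns -1 before writing anything.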

... delimiter is a local variable which gets destroyed after foo returned. Is it OK to append it to variable output?

Yes that's fine. strcat copies characters. But your code sample does no checks that output is big enough for all the stuff you are putting into it.

If I am concatenating two null terminated strings, will strcat append two null terminating characters to the resultant string?

No.

JeremyP
+1  A: 

I somehow think that, this is not the correct way because strlen has to iterate over the characters to get the length. So this operation will have a O(n) complexity.

You are right; read Joel Spolsky on why C strings suck. There are a few ways around it: either don't use C strings (for example, use Pascal strings and create your own library to handle them), or don't use C (use, say, C++, which has a string class that is slow for different reasons, though you could also write your own class to handle Pascal strings more easily than in C).

Regarding adding a char to a C string; a C string is simply a char array with a nul terminator, so long as you preserve the terminator it is a string, there's no magic.

char* straddch( char* str, char ch )
{
    char* end = &str[strlen(str)] ;
    *end = ch ;
    end++ ;
    *end = 0 ;
    return str ;
}

Just like strcat(), you have to know that the array that str is created in is long enough to accommodate the longer string, the compiler will not help you. It is both inelegant and unsafe.

If I am concatenating two null terminated strings, will strcat append two null terminating characters to the resultant string?

No, just one, but whatever follows that may just happen to be nul, or whatever else happened to be in memory. Consider the following equivalent:

char* my_strcat( char* s1, const char* s2 )
{
    /* Start copying at s1's terminator, so s2 overwrites it. */
    strcpy( &s1[strlen(s1)], s2 ) ;
    return s1 ;
}

The first character of s2 overwrites the terminator in s1.

In the above code, delimiter is a local variable which gets destroyed after foo returned. Is it OK to append it to variable output?

In your example delimiter is not a string, and initialising a pointer with a char makes no sense. However if it were a string, the code would be fine, strcat() copies the data from the second string, so the lifetime of the second argument is irrelevant. Of course you could in your example use a char (not a char*) and the straddch() function suggested above.

Clifford