Hi all. I have a question about the correct way to initialize C strings. For example, the following code isn't always correct:

char *something;
something = "zzzzzzzzzzzzzzzzzz";

I tested a little, increasing the number of z's, and the program does crash once the string is about two lines long. So what is the real size limit of this char array? How can I be sure that it is not going to crash? Is this limit implementation-dependent? Is the following code the correct approach that I must always use?

char something[FIXEDSIZE];
strcpy(something, "zzzzzzzzzzzzzzzzzzz");
A: 

The first should never crash. The second will crash as soon as the number of 'z' characters + 1 exceeds the available space on the stack page, or when you try to return from the function.

vlabrecque
The second one might crash if you invoke undefined behavior somehow by overflowing the stack, but "returning from the function" is no more related to the crash than adding two variables.
Thanatos
It'd be pretty related if you were to assign 100 chars to a 30-char array. The return address would more than likely have been clobbered by the overflowing string -- that's how buffer overflows work, and they used to be like the #1 way of breaking into servers and such. Random contents cause a crash; specially malformed contents can get you root/system access.
cHao
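
A minimal sketch of one way to guard against the overflow described in the comment above. It assumes a hypothetical helper named `copy_checked`, which is not part of the standard library and not something anyone in this thread wrote. It refuses to copy when the source string, plus its terminating `'\0'`, does not fit in the destination:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): copies src into dst
 * only if the whole string, including the terminating '\0', fits.
 * Returns 0 on success, -1 if src is too long for dst. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t needed = strlen(src) + 1;   /* characters plus the terminator */
    if (needed > dst_size) {
        return -1;                     /* would overflow: refuse to copy */
    }
    memcpy(dst, src, needed);
    return 0;
}
```

With `char something[30];`, a call such as `copy_checked(something, sizeof something, src)` returns -1 for a 100-character `src` instead of writing past the end of the array, which is what a bare `strcpy` would do.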
Well, in my test the first one crashes with a really big string.
voodoomsr
@cHao: Ah, you are right... I always visualize stack stuff incorrectly. Nonetheless, this is an implementation detail, albeit of a very popular implementation.
Thanatos
That said, I don't really feel like "assign 100 chars to a 30-char array" is the problem at hand here...
Thanatos
@thanatos: did you just (1) say it could crash with a stack overflow in your answer, (2) not understand what that means, and (3) critique my explanation of a stack overflow as "unrelated"? (As well as probably voting down my answer, since you are the only critic.) Without a correct explanation of what the poster actually _did_ rather than what he thinks he did, there's not much that can be said.
vlabrecque
+1  A: 

The first example is only incorrect in that char *something should really be const char *something. Otherwise, this:

const char *something = "fooooooooooooooooooooooobar";

...should work, and should not crash.

char something[FIXEDSIZE];

...this one, however, can typically crash with a stack overflow if you, well, overflow the stack, which depends on how big that stack is, how big that array is, where this gets called, etc.

Thanatos
Can you explain a little more? What is the reason for the const?
voodoomsr
That's not a stack.
BC
@BC, it might be a stack if the declaration is an automatic variable. Since the OP didn't provide much context, we don't know whether the declarations are at file scope or at some smaller scope within a function.
RBerteig
'const' informs the compiler that you promise not to modify the memory pointed to by 'something'. If you make this change, you should get a warning message on the code that is actually *wrong*, which is where you are now crashing.
Zack
@voodoomsr, `const char *` is the type to use for a constant string such as `"abc"`. It represents a promise that the characters pointed at by `something` are immutable. It lets the compiler reject code like `something[3] = 'd'` (which probably should have been `something[3] == 'd'`) on the grounds that you can't do that because you said so.
RBerteig
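
A small sketch of the compile-time check that `const` buys you, under the assumption of a hypothetical function `char_at_is` that only needs to read the string. Because its parameter is `const char *`, the commented-out assignment would be rejected by the compiler, while the comparison is fine:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): reports whether
 * s[index] equals c. The parameter is const char *, so the function can
 * only read the characters, never write them. */
int char_at_is(const char *s, size_t index, char c)
{
    assert(s != NULL);

    /* s[index] = c;   <- with const, the compiler rejects this assignment */
    return index < strlen(s) && s[index] == c;
}
```

A string literal can be passed directly, for example `char_at_is("abcd", 3, 'd')`, and a mistyped `=` in place of `==` inside the function becomes a build error rather than a crash at run time.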
In summary, both ways are always correct. If I know that I'm not going to modify the string, I should use the const keyword for the compile-time check and better semantics of my program, but not because it is mandatory. Am I right? I still have a contradiction in my mind... I wrote a program that asks for a string using scanf("%s", something). With really big strings it crashes; with short strings it works fine. Why?
voodoomsr
@voodoomsr: To clarify: while a compiler may not warn you about assigning a string literal to a `char *` (i.e., `char *s = "foo"`), the string it points to is still *read only*. It is not valid to write through `s` in my example. On the `scanf` question: `something` has a fixed length when it is passed to `scanf`, but `scanf` cannot know that length, nor how many characters I will enter. I can always enter more than the buffer holds, thus overflowing it and writing data to spots in memory where I shouldn't. (`scanf("%s", something);` is typically vulnerable to buffer overflows.)
Thanatos
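
One common way to avoid that overflow is to give the `%s` conversion a maximum field width. The sketch below is only an illustration: the helper name `read_word` and the buffer size `WORD_SIZE` are assumptions, and it uses `sscanf` on a string so it can be tried without typing input. The same format string works with `scanf` on standard input.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define WORD_SIZE 16   /* assumed buffer size, chosen only for illustration */

/* Hypothetical helper: reads one whitespace-delimited word from `input`
 * into `word`, storing at most WORD_SIZE - 1 characters plus '\0'.
 * Returns 1 if a word was read, 0 otherwise. */
int read_word(const char *input, char word[WORD_SIZE])
{
    assert(input != NULL && word != NULL);

    /* The width 15 is WORD_SIZE - 1: the scanf family writes the
     * terminating '\0' in addition to the characters it counts. */
    return sscanf(input, "%15s", word) == 1;
}
```

Input longer than 15 characters is cut off at 15 rather than written past the end of the buffer. The width has to be kept in sync with the array size by hand, which is one reason many people prefer `fgets` with `sizeof` for line input.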
Thanks Thanatos. So in conclusion, I could use scanf("%s", something) without telling the compiler the size of something, but doing that is a risk, because if I enter more characters than the buffer holds it will crash. In the other situation, if I have a char *something; and then on another line I assign its value directly in the code, something = "zzzzzzzzz", there will be no problem with the number of z's.
voodoomsr
+1  A: 

The second is always correct.

The first is correct only if you never change the string, since you've assigned a pointer to fixed data.

Joel
Interesting how people downvote without leaving a reason.
Joel
Yeah, even when I marked it as the correct answer.
voodoomsr
My mistake. (And apparently someone else's too =P) I down-voted because I think this answer doesn't address the OP's actual questions (i.e. "what is the real size limit in this char array?" and "Is the following code the correct approach that i always must use?"). Also, the OP already stated that the first is only sometimes correct. But I guess he marked it right, so it must have satisfied him!
Chris Cooper
+8  A: 

As you say, manipulating this string leads to undefined behaviour:

char *something;
something = "zzzzzzzzzzzzzzzzzz";

If you are curious as to why, see "C String literals: Where do they go?".

If you plan to manipulate your string at all, (i.e. if you want it to be mutable) you should use this:

char something[] = "skjdghskfjhgfsj";

Otherwise, simply declare your char * as a const char * to indicate that it points to a constant.

In the second example, the compiler will be smart enough to declare this as an array on the stack of exactly the size needed to hold the string, including its terminating '\0'. Thus, the size of this is limited by your stack.

Of course, you will likely want to specify the size anyway, since it is usually useful to know when manipulating the string.

Chris Cooper
`char something[] = "zzzzzzzzzzzzzzz"` works for mutable strings, and is a lot harder to screw up.
Thanatos
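
A short sketch of the array form mentioned above, assuming a hypothetical demo function `demo_mutable_array` written only for illustration. It shows that the array gets its own writable copy of the literal and that `sizeof` reports the string length plus one for the terminating `'\0'`:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical demo (an assumption for illustration): modifies a local
 * array initialized from a literal, copies the result to `out` if it fits,
 * and returns the size the compiler chose for the array. */
size_t demo_mutable_array(char *out, size_t out_size)
{
    char something[] = "skjdghskfjhgfsj";  /* sized by the compiler */

    something[0] = 'S';                    /* fine: the array is writable */

    if (out != NULL && out_size >= sizeof something) {
        memcpy(out, something, sizeof something);
    }
    return sizeof something;               /* strlen of the literal + 1 */
}
```

Doing the same write through `char *something = "skjdghskfjhgfsj";` would be undefined behaviour, because the pointer refers to the literal itself rather than to a copy.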