I have had real trouble understanding char* lately. Say I wrote a recursive function to reverse a char*: depending on how I initialize the string I get access violations, and my C++ primer didn't point me toward an explanation, so I am asking for your help.

CASE 1 The first case, where I got an access violation when trying to swap letters around:

char * bob = "hello";

CASE 2 Then I tried this to get it to work

char * bob = new char[5];
bob[0] = 'h';
bob[1] = 'e';
bob[2] = 'l';
bob[3] = 'l';
bob[4] = 'o';

CASE 3 But then when I did a cout I got some random crap at the end, so I changed it to

char * bob = new char[6];
bob[0] = 'h';
bob[1] = 'e';
bob[2] = 'l';
bob[3] = 'l';
bob[4] = 'o';
bob[5] = '\0';

CASE 4 That worked so I told myself why wouldn't this work then

 char * bob = new char[6];
 bob = "hello\0";

That failed.

CASE 5 I have also read somewhere that you could do something like

char* bob[];

Then add something to that. My question is: why do some of these fail and others not, and what is the best way to do it?

+10  A: 

The key is that some of these pointers are pointing at allocated memory (which is read/write) and some of them are pointing at string constants. String constants are stored in a different location than the allocated memory, and can't be changed. Well, most of the time. Vulnerabilities in systems are often the result of code or constants being changed, but that is another story.

In any case, the key is the use of the new keyword: it allocates space in read/write memory, and thus you can change that memory.

This statement is wrong

char * bob = new char[6];
bob = "hello\0";

because you are changing the pointer not copying the data. What you want is this:

char * bob = new char[6];
strcpy(bob,"hello");

or

strncpy(bob,"hello",6);

You don't need to write the nul yourself here because the compiler already places a nul terminator at the end of the string constant "hello".
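As a rough sketch of the point above, copying the literal into memory obtained from new gives a buffer that a recursive reverse can safely modify. The helper names `make_writable_copy`, `reverse_range` and `reverse_in_place` are made up for illustration; they are not from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns a heap copy of `text` that the caller may modify.
// The caller owns the result and must release it with delete[].
char* make_writable_copy(const char* text) {
    assert(text != nullptr);
    std::size_t length = std::strlen(text);
    char* copy = new char[length + 1];   // +1 leaves room for the '\0' terminator
    std::strcpy(copy, text);             // copies the characters and the terminator
    return copy;
}

// Hypothetical helper: recursively swaps the characters in [first, last].
void reverse_range(char* first, char* last) {
    if (first >= last) return;           // zero or one character left: done
    char tmp = *first;
    *first = *last;
    *last = tmp;
    reverse_range(first + 1, last - 1);
}

// Reverses a nul-terminated string in place; the buffer must be writable.
void reverse_in_place(char* text) {
    std::size_t length = std::strlen(text);
    if (length > 1) reverse_range(text, text + length - 1);
}
```

With these, `char* bob = make_writable_copy("hello"); reverse_in_place(bob);` leaves `bob` holding "olleh", and `delete[] bob;` releases the buffer afterwards.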

Hogan
@Hogan: To be nitpicky, you are using null as in the pointer context...for a nul with one l that is '\0', for a null with two l that is NULL...that would be confusing to beginners...
tommieb75
@tommieb75: no prob. I changed it.
Hogan
@tommieb75: actually "nul" might be wrong, that is the term for the ASCII 0, but I think that `bob[5] = NULL;` and `bob[5] = 0;` would both work but `bob[5] = NUL;` would fail. I'm too lazy to test it or check the standard, however.
Hogan
Is `char * bob = new char[]; strcpy(bob,"hello");` supposed to work? Because it does.
Apoc
It will compile and run Apoc, but you are writing over random memory. This way lies buggy code. Allocate the space.
Hogan
Be aware of the demons in strncpy. If you had done strncpy(bob,"Hello!",6); 'bob' would _not_ be nul terminated.
nos
@nos: true dat, but at the same time strcpy(bob"Hello!") would write over the end of the 6 character buffer he has.
Hogan
@nos: I always did this: `strncpy(s1,s2,x); s1[x] = 0;`, two lines, happy together.
Hogan
@Hogan: no, `bob` is 6 chars, "Hello!" is 6 chars. strncpy will fill the buffer but not overflow it. It just won't write a nul terminator at either `bob[5]` or `bob[6]`. If you ever do `s1[x] = 0`, make sure s1 is at least `x+1` big.
nos
@nos: ok, I said above `strcpy(bob,"Hello!")` would write over the end of the 6 character buffer and it would. `strcpy()` copies n+1 where n is the size of the 2nd argument as we both know. So what are you saying I got wrong (besides leaving out a comma)?
Hogan
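The strncpy pitfall discussed in these comments can be sketched as follows. `copy_terminated` is a hypothetical helper name, and it assumes the caller passes the real size of the destination buffer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies at most size-1 characters of `src` into `dest`
// and always terminates the result. `dest` must have room for `size` chars.
void copy_terminated(char* dest, const char* src, std::size_t size) {
    assert(dest != nullptr && src != nullptr);
    if (size == 0) return;               // no room even for the terminator
    std::strncpy(dest, src, size - 1);   // may leave dest unterminated if src is long
    dest[size - 1] = '\0';               // so terminate explicitly, inside the buffer
}
```

Copying the six-character "Hello!" into a `char[6]` this way yields "Hello": the last character is dropped rather than the terminator, and nothing is written past the end of the buffer.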
+1  A: 
char * bob = "hello"; 

This is actually translated to:

const char __hello[] = "hello";
char * bob = (char*) __hello;

You can't change it, because if you'd written:

char * bob = "hello"; 
char * sam = "hello"; 

It could be translated to:

const char __hello[] = "hello";
char * bob = (char*) __hello;
char * sam = (char*) __hello;

Now, when you write:

char * bob = new char[6];    
bob = "hello\0";

First you assign one value to bob, then you assign a new value to it, so bob ends up pointing at the literal and the array from new is leaked. What you really want to do here is:

char * bob = new char[6];    
strcpy(bob, "hello");
James Curran
The key here is the `const` keyword. Constant merging optimizations are not really at issue IMHO.
Hogan
I'm not a C++ expert so this intrigues me. Does the standard mandate that duplicate string literals in a translation unit must share the same storage?
dreamlax
Just checked, 2.13.4.2 says "Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined. The effect of attempting to modify a string literal is undefined.".
dreamlax
@dreamlax: exactly, that is why I called them optimizations.
Hogan
Changed "would be translated" to "could be translated" for that reason.
MSalters
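One way to observe the behavior quoted from the standard is to compare the addresses of two identical literals. This is only a sketch: the function names are invented for illustration, and the result of `literals_share_storage` may differ between compilers and optimization settings.

```cpp
#include <cassert>
#include <cstring>

// Two separate literals with identical contents.
const char* first_literal()  { return "hello"; }
const char* second_literal() { return "hello"; }

// Returns true if the compiler merged both literals into a single object.
bool literals_share_storage() {
    const char* bob = first_literal();
    const char* sam = second_literal();
    assert(std::strcmp(bob, sam) == 0);  // contents match either way
    return bob == sam;                   // addresses match only if merged
}
```

Either answer is allowed, which is one more reason not to write through a pointer to a literal: a change made through `bob` could also show up through `sam`.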
A: 

Edit: The question was originally tagged C and has since been retagged as C++....

Ok. You have got a couple of things mixed up... new is used by C++, not C.

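  • Case #1. That declares a pointer to char that points at a string literal. You can read the string through it, but writing to a literal is undefined behavior, which is the likely cause of your access violation...can you show the code you used to swap the characters?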
  • Case #2/#3. You got random crap and discovered the nul terminator, i.e. '\0', which ends every single C-style string you'll encounter for the duration of C/C++, possibly for the rest of your life...
+-+-+-+-+-+--+
|H|e|l|l|o|\0|
+-+-+-+-+-+--+
            ^
            |
         Nul Terminator
  • Case #4 did not work because you need strcpy to do that job; you cannot simply assign a string like that after calling new. When you declare a string with char *s = "foo"; the pointer is initialized to point at a literal that was laid out at compile time. But when you do it this way, char *s = new char[6]; strcpy(s, "hello"); the characters get copied into the memory that s points to.

You will eventually discover that the memory block s points to is easily overrun, which will induce a fit of conniptions as you realize that you have to be careful to prevent buffer overflows...Remember Case #3 in relation to the nul terminator...don't forget that the string really needs 6 chars of storage, not 5, as we have to take the nul terminator into account.
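To make the 5-versus-6 distinction concrete, here is a small sketch; `storage_needed` is a made-up helper name, not a standard function:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Number of chars a buffer needs in order to hold `text`, terminator included.
std::size_t storage_needed(const char* text) {
    assert(text != nullptr);
    return std::strlen(text) + 1;        // strlen does not count the '\0'
}
```

For "hello", `std::strlen` reports 5 while `storage_needed` returns 6, the same value as `sizeof("hello")`.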

  • Case #5. That declares an array of pointers to char (it needs a size or an initializer to compile), i.e. an array of strings; think of it like this
*(bob + 0) = "foo";
*(bob + 1) = "bar";
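A declaration of the form `char* bob[]` with an initializer gives an array of pointers, one per string. The sketch below uses the illustrative names `words`, `word_count` and `word_at` (none of them from the question) and makes the pointers const because they point at literals:

```cpp
#include <cassert>
#include <cstddef>

// An array of pointers to string literals; the size is deduced from the initializer.
const char* const words[] = { "foo", "bar" };
const std::size_t word_count = sizeof(words) / sizeof(words[0]);

// Hypothetical accessor showing that *(words + index) and words[index] are the same.
const char* word_at(std::size_t index) {
    assert(index < word_count);          // out-of-range access would be a bug
    return *(words + index);             // same as words[index]
}
```

Here `word_at(0)` points at "foo" and `word_at(1)` points at "bar".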

I know there is a lot to digest...but feel free to post any further thoughts... :) And best of luck in learning...

Hope this helps, Best regards, Tom.

tommieb75
Actually, that was tagged as 'C' originally when I was posting the answer.... so easy...
tommieb75
@tommie: one more for your hit parade - `bob[5] = 0; // this is better`
Hogan
+1  A: 

You should always use char const* for pointers to string literals (stuff in double quotes). Even though the standard allows char* as well, it does not allow writing to the string literal. GCC gives a compile warning for assigning a literal address into char*, but apparently some other compilers don't.
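A brief sketch of this advice follows. The names `greeting` and `write_modified_greeting` are hypothetical, and the char array initialized from a literal is an addition for illustration, not something the answer itself mentions:

```cpp
#include <cassert>
#include <cstring>

// Pointing at a literal through const turns accidental writes
// into compile-time errors instead of run-time crashes.
const char* const greeting = "hello";

// Hypothetical helper: writes "jello" into `out`, which must hold at least 6 chars.
void write_modified_greeting(char* out) {
    assert(out != nullptr);
    char local[] = "hello";   // array initialized from the literal: a writable copy
    local[0] = 'j';           // modifies the array, never the literal itself
    std::strcpy(out, local);
}
```

After `char out[6]; write_modified_greeting(out);` the buffer holds "jello", while `greeting` still reads "hello".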

Tronic