I'm starting to learn C by reading K&R and going through some of the exercises. After some struggling, I was finally able to complete exercise 1-19 with the code below:

/* reverse: reverse the character string s */
void reverse(char s[], int slen)
{
  char tmp[slen];
  int i, j;

  i = 0;
  j = slen - 2;    /* skip '\0' and \n */

  tmp[i] = s[j];
  while (i <= slen) {
    ++i;
    --j;
    tmp[i] = s[j];
  }

  /* code from copy function p 29 */
  i = 0;
  while ((s[i] = tmp[i]) != '\0')
    ++i;

}

My question is regarding that last bit of code where the tmp char array is copied to s. Why doesn't a simple s = tmp; work instead? Why does one have to iterate through the array copying index by index?

+4  A: 

Because both s and tmp are memory addresses. If you write s = tmp, both pointers would point to the same array.

Suppose that we have

char s[] ="ab"; 

/*
* Only for explanatory purposes.
* 
*/
void foo(char s[]){ 
    char tmp [] = "cd";
    s= tmp;
 }

foo(s);

after s= tmp you would have

s[0] : 'c'
s[1] : 'd'
s[2] : '\0'

Even though both arrays have the same data, a change in tmp will affect both of them, because both arrays are actually the same. They both refer to data that's at the same memory address. So changing any position of the tmp array, or destroying the tmp array, would affect s in the same way.

By looping over the array, what you are doing is moving a piece of data from one memory address to another.
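
As a small sketch of that last point (the names copy_chars, dst and src are made up for illustration and are not from K&R), an element-by-element copy leaves the destination with its own storage, so later changes to the source no longer show through:

```c
#include <stddef.h>

/* Hypothetical helper: copy the string in src, including its '\0',
 * into dst one element at a time. dst must have room for it. */
void copy_chars(char dst[], const char src[])
{
    size_t i = 0;

    while ((dst[i] = src[i]) != '\0')
        ++i;
}
```

After copy_chars(s, tmp), s and tmp hold equal contents at different addresses, so writing to tmp afterwards leaves s alone.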

In my copy of K & R, pointers are explained in chapter 4. A quick glance through the first pages may be of help.

Tom
I find this confusing, and I understand pointer arithmetic.
Ben S
Ben S: Thanks for your feedback. I've re arranged my answer.
Tom
You cannot do the assignment s = tmp; arrays cannot be assigned in C. Pointers can, though, but a pointer is not an array. (The name of an array gets promoted to a pointer to its first element when used as a value, though, so if 's' were a pointer you'd be right.)
nos
I recommend this pet peeve http://stackoverflow.com/questions/423823/whats-your-favorite-programmer-ignorance-pet-peeve/484900#484900
Johannes Schaub - litb
@nos: Thanks for remarking that. Editing..
Tom
@litb: Now I know what a pet peeve is :).
Tom
"s will point to a stack variable when foo ends, that will yield into undefined results." That is not right. Only foo's s variable ever points to "cd". The caller's s is not modified. It will still point to ab. Try it!
Matthew Flaschen
@Matthew Flaschen: Thanks, you are right. This has been very educating.
Tom
@Tom, i added a new section to my pet peeve, explaining how "s" of foo is different from the global "s"
Johannes Schaub - litb
A: 

because tmp is a pointer, and you need to get a copy, not a "link".

Macarse
tmp is not a pointer; it is an array.
Jonathan Leffler
tmp is a char *, a pointer.
Macarse
I recommend this pet peeve http://stackoverflow.com/questions/423823/whats-your-favorite-programmer-ignorance-pet-peeve/484900#484900
Johannes Schaub - litb
k, thanks for the link.
Macarse
A: 

A very straightforward answer would be: both s and tmp are pointers to a memory location and not the arrays themselves. In other words, s and tmp are memory addresses where the array values are stored, but not the values themselves. One of the common ways to access these array values is by using indices like s[0] or tmp[0].

Now, if you try to simply copy with s = tmp, the memory address of the tmp array will be copied over to s. This means that the original s array will be lost and the s pointer will now point to the tmp array.

You will understand these concepts well with due time so keep going through the book. I hope this elementary explanation helps.

Elitecoder
Lol, am I too confusing??
Elitecoder
I recommend this pet peeve http://stackoverflow.com/questions/423823/whats-your-favorite-programmer-ignorance-pet-peeve/484900#484900
Johannes Schaub - litb
+7  A: 

Your tmp array was declared on stack and so when your method completes, the memory used to hold the values will be freed because of scoping.

s = tmp means that s should point to the same memory location as tmp. This means that when tmp is freed, s will still be pointing to a now possibly invalid, freed memory location.

This type of error is referred to as a dangling pointer.

Edit: This isn't a dangling pointer, as pointed out in the comments on this answer. The issue is that saying s = tmp only changes what the parameter points to, not the actual array that was passed.

Also, you could perform your reverse with a single pass and without allocating a whole array in memory by just swapping the values in place one by one:

void reverse(char s[], int slen) {
    int i = 0;        // First char
    int j = slen - 2; // Last char minus \n\0
    char tmp = 0;     // Temp for the value being swapped

    // Iterate over the array from the start until the two indexes collide.
    while(i < j) {
        tmp = s[i];  // Save the earlier char
        s[i] = s[j]; // Replace it with the later char
        s[j] = tmp;  // Place the earlier char in the later char's spot
        i++;         // Move forwards with the early char
        j--;         // Move backwards with the later char
    }
}
Ben S
+1 - because unlike the loads of other answers, you actually say tmp *is an array*, and *not* a pointer. The other answers would confuse me if i were him, thinking "huh, what pointer do they talk about?? tmp is an array!"
Johannes Schaub - litb
Wouldn't speaking of tmp as an array just keep him confused for longer? tmp is a pointer to an array, that's the correct definition. The sooner he learns, the better, isn't it?
Elitecoder
@litb: temp is definitely a pointer. It's a pointer to the first character of the array. Understanding how to write the [] operator as a macro would require you to fully understand this.
llamaoo7
There is no dangling pointer in his code, assigning tmp to s just modifies the local copy of the pointer to the array. When the function returns, the original s pointer and the array are completely untouched.
Eclipse
@Eclipse: I added a correction to my answer.
Ben S
Ah of course. I had read about this the other day on a C++ thread. Also thanks for the style/optimization advice!
felideon
Thanks for the clarification, Eclipse.
felideon
Elitecoder, the definition is that it *is* an array (a variable length array). The definition of a pointer to an array (that array) would be char (*ptmp)[slen]. I don't see why it would "keep him under confusion" if we speak of tmp as an array, because it *is* an array. sizeof(tmp) will give you slen. What isn't an array is s. The truth can be confusing, but that doesn't mean we should make it appear simpler by telling the wrong things - *that* will just confuse him! Start reading the Standard before you make fun of me, please
Johannes Schaub - litb
Jonathan Leffler
I'm not using the strlen() function since it hasn't been introduced yet in Ch. 1; as far as what the exercise stated, I'm not really doing this to get the answer perfect (is there even an official list of answers?) but for practice. For me, it wouldn't make sense to put a newline at the beginning of a line/string.
felideon
@llamaoo7: there is no variable temp. In the question, the variable tmp is an array; that is what 'char tmp[slen];' defines, and no weasel wording gets around it. It is an ARRAY - a variable length array, indeed, and those are a feature of C99, not C89. You are correct that when you pass tmp to a function, it decays to a pointer to the first character, but it is not itself a pointer. If it were a pointer, it would be declared 'char *tmp;'. You are setting yourself up for serious misunderstandings and a lifetime of misery if you don't realize the difference.
Jonathan Leffler
A: 

Hi

In the case of s = tmp, the value of tmp, which is also the beginning address of the array, would get copied to s.

That way both s and tmp will point to the same address in memory, which I think is not the purpose.

cheers

Andriyev
A: 

There's an interesting sub-thread in this thread about arrays and pointers. I found this link on Wikipedia with a peculiar code snippet showing just how 'plasticine' C can be!

/* x designates an array */
x[i] = 1;
*(x + i) = 1;
*(i + x) = 1;
i[x] = 1; /* strange, but correct: i[x] is equivalent to *(i + x) */

Of course what's even more confusing in C is that I can do this:

unsigned int someval = 0xDEADD00D;
char *p = (char *)&someval;

p[2] = (char)0xF0;

So the interchangeability of pointers and arrays seems so deep-set in the C language as to be almost intentional.
What does everyone else think?

---Original Post---
s and tmp are both pointers so doing s = tmp will simply make s point at the address where tmp lives in memory.
Another problem with what you outlined is that tmp is a local variable so will become 'undefined' when it goes out of scope i.e when the function returns.

Make sure you thoroughly grasp these three concepts and you won't go far wrong

  1. Scope
  2. The difference between the stack and the heap
  3. Pointers

Hope that helps and keep going!

zebrabox
tmp is an array - not a pointer.
Jonathan Leffler
A: 

Try experimenting and see what happens when you do things like this:

#include <stdio.h>

void modifyArrayValues(char x[], int len)
{
    /* Writes through the pointer, so the caller's array changes. */
    for (int i = 0; i < len; ++i)
        x[i] = i;
}

void attemptModifyArray(char x[], int len)
{
    char y[10];
    for (int i = 0; i < len; ++i)
        y[i] = i;
    x = y; /* only rebinds the local parameter; the caller's array is untouched */
}


int main()
{
    int i = 0;
    char x[10];
    for (i = 0; i < 10; ++i)
        x[i] = 0;

    attemptModifyArray(x, 10);
    for (i = 0; i < 10; ++i)
        printf("%d\n", x[i]); // x is still all 0's

    modifyArrayValues(x, 10);
    for (i = 0; i < 10; ++i)
        printf("%d\n", x[i]); // now x has 0-9 in it

    return 0;
}

When you assign to the array parameter directly in attemptModifyArray, you are just overwriting a local copy of the address of the array x. When you return, the original address is still in main's copy of x.

When you modify the values in the array in modifyArrayValues, you are modifying the actual array itself, whose address is stored in modifyArrayValues' local copy of x. When you return, x is still holding on to the same array, but you have modified the values in that array.
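
One way to see that the parameter holds only an address is to compare sizeof on both sides of the call. This is a rough sketch; the two function names are invented for the example, and the parameter is spelled char *x, which is what char x[] means in a parameter list:

```c
#include <stddef.h>

/* Hypothetical helper: a parameter written as char x[] is adjusted to
 * char *x, so sizeof here measures a pointer, not the caller's array. */
size_t size_seen_by_callee(char *x)
{
    return sizeof x;
}

/* Hypothetical helper: here x really is an array of 10 chars,
 * so sizeof reports the size of the whole array. */
size_t size_seen_by_owner(void)
{
    char x[10] = {0};
    return sizeof x;
}
```

The second function returns 10, while the first returns sizeof(char *) no matter how large the caller's array is.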

Eclipse
+1  A: 

To round out the discussion, here are two other possible ways to reverse a string:

void reverse(char string1[], char string2[])
{
  int i = 0, len = 0;

  while(string2[len] != '\0')   // get the length of the string
      len++;

  while(len > 0)
  {
    string1[i] = string2[len-1]; // copy the elements in reverse
    i++;
    len--;
  }
  string1[i] = '\0'; // terminate the copied string 
}

Or recursively:

void reverse (const char *const sPtr)
{
  //if end of string
  if (sPtr[0] == '\0')
  {
    return;
  }
  else  //not end of the string...
   {
    reverse(&sPtr[1]);  //recursive step
    putchar(sPtr[0]);   //display character
   }
}
iwanttoprogram
The second solution displays the string in reverse order without changing the string - whereas the exercise requests that the string is reversed without printing it. The first solution would be better if it used strlen() and if the subscript '[len-1]' avoided the repeated subtraction (by decrementing len before starting the loop).
Jonathan Leffler
+7  A: 

Maybe I'm just old and grumpy, but the other answers I've seen seem to miss the point completely.

C does not do array assignments, period. You cannot assign one array to another array by a simple assignment, unlike some other languages (PL/1, for instance; Pascal and many of its descendants too - Ada, Modula, Oberon, etc.). Nor does C really have a string type. It only has arrays of characters, and you can't copy arrays of characters (any more than you can copy arrays of any other type) without using a loop or a function call. [String literals don't really count as a string type.]
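
As an illustration of the "function call" route, here is a minimal sketch using memcpy() from <string.h>. The function name and BUF_SIZE are assumptions made up for this example:

```c
#include <string.h>

enum { BUF_SIZE = 8 };   /* arbitrary size chosen for the example */

/* Hypothetical demo: returns 1 if the copy matches the original. */
int copy_with_library_call(void)
{
    char src[BUF_SIZE] = "abc";
    char dst[BUF_SIZE];

    /* dst = src;   <- rejected by the compiler: arrays are not assignable */
    memcpy(dst, src, sizeof dst);   /* copies all BUF_SIZE bytes */

    return strcmp(dst, src) == 0;
}
```

For character strings, strcpy() would do the same job; memcpy() works for arrays of any element type.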

The only time arrays are copied is when the array is embedded in a structure and you do a structure assignment.
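
A short sketch of that exception, with a made-up struct line type used purely for illustration:

```c
#include <string.h>

/* Hypothetical wrapper type: the array is a member of a struct. */
struct line {
    char text[16];
};

/* Hypothetical demo: returns 1 if assigning the struct copied
 * the embedded array into separate storage. */
int struct_assignment_copies_array(void)
{
    struct line a = { "hello" };
    struct line b;

    b = a;               /* structure assignment copies text[] too */
    a.text[0] = 'j';     /* changing a afterwards does not affect b */

    return strcmp(b.text, "hello") == 0 && strcmp(a.text, "jello") == 0;
}
```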

In my copy of K&R 2nd Edition, exercise 1-19 asks for a function reverse(s); in my copy of K&R 1st Edition, it was exercise 1-17 instead of 1-19, but the same question was asked.

Since pointers have not been covered at this stage, the solution should use indexes instead of pointers. I believe that leads to:

#include <string.h>
void reverse(char *s)
{
    int i = 0;
    int j = (int)strlen(s) - 1;  /* cast first so an empty string gives -1 */
    while (i < j)
    {
        char c = s[i];
        s[i++] = s[j];
        s[j--] = c;
    }
}

#ifdef TEST
#include <stdio.h>
int main(void)
{
    char buffer[256];
    while (fgets(buffer, sizeof(buffer), stdin) != 0)
    {
        int len = strlen(buffer);
        if (len == 0)
            break;
        buffer[len-1] = '\0';  /* Zap newline */
        printf("In:  <<%s>>\n", buffer);
        reverse(buffer);
        printf("Out: <<%s>>\n", buffer);
    }
    return(0);
}
#endif /* TEST */

Compile this with -DTEST to include the test program and without to have just the function reverse() defined.

With the function signature given in the question, you avoid calling strlen() twice per line of input. Note the use of fgets() -- even in test programs, it is a bad idea to use gets(). The downside of fgets() compared to gets() is that fgets() does not remove the trailing newline where gets() does. The upsides of fgets() are that you don't get array overflows and you can tell whether the program found a newline or whether it ran out of space (or data) before encountering a newline.
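
To make the newline check concrete, here is one possible helper; strip_newline is a hypothetical name, not something from K&R or the standard library. It removes the newline only when fgets() actually stored one and reports which case occurred:

```c
#include <string.h>

/* Hypothetical helper: remove a trailing '\n' left by fgets(), if any.
 * Returns 1 if a newline was found and removed, 0 otherwise. */
int strip_newline(char line[])
{
    size_t len = strlen(line);

    if (len > 0 && line[len - 1] == '\n') {
        line[len - 1] = '\0';
        return 1;
    }
    return 0;
}
```

A return value of 0 after a successful fgets() means the line was longer than the buffer or the input ended without a newline.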

Jonathan Leffler