Using printf to print "\4unix\5lancs\2ac\2uk\0", I expect output of the form ♦unix♣lancs☻ac☻uk, but instead I get garbage (♫ ,►E¦§Qh ↕).

I cannot find an explanation for this; I use the following method to tokenise a string:

/**
 * Encode the passed string into a string as defined in the RFC.
 */
char * encodeString(char *string) {
    char stringCopy[128];
    char encodedString[128] = "";
    char *token;

    /* We copy the passed string as strtok mutates its argument. */
    strcpy(stringCopy, string);

    /* We tokenise the string on periods. */
    token = strtok(stringCopy, ".");

    while (token != NULL) {
        char encodedToken[128] = "";

        /* Encode the token: a length byte followed by the token itself. */
        encodedToken[0] = (char) strlen(token);
        strcat(encodedToken, token);

        /* Add the encoded token to the encodedString string. */
        strcat(encodedString, encodedToken);

        /* Prepare for the next iteration. */
        token = strtok(NULL, ".");
    }

    /* A null character is appended already to the encoded string. */

    return encodedString;
}

And the following code in my driver to print the result when tokenising "unix.lancs.ac.uk":

int main(int argc, char *argv[]) {
    char *foo = "unix.lancs.ac.uk";
    char *bar = encodeString(foo);

    printf("%s\n", bar);

    return 0;
}

If I add a printf to print encodedString at the end of the encodeString method, I don't get garbage printed out (rather, ♦unix♣lancs☻ac☻uk twice).

(Upon debugging I notice the actual memory contents is changed.)

Can anyone explain this phenomenon to me?

+4  A: 

When you say:

 return encodedString;

you are returning a local variable, which will have ceased to exist by the time you come to use it. A quick hack would be to make encodedString static.

anon
+3  A: 

DO NOT use functions that return strings.

Pass the address of the destination string as a parameter and fill it in inside the function.

The compiler should display a warning. Don't ignore compiler warnings, especially in C.

Your function should be like this:

void encodeString(char *string, char *encodedString)
{
.
.
.
}
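
As a rough sketch of this approach, the function from the question could be reworked along the following lines. The size parameter and the const qualifier are additions that are not in the signature above (assumptions made so the function cannot overrun the caller's buffer), and because a void function cannot report an error, this version simply stops at the last label that fits.

```c
#include <string.h>

/*
 * Sketch only: writes the encoded form of `string` into the caller's buffer.
 * `size` is a hypothetical extra parameter giving the capacity of
 * `encodedString` in bytes, including the terminating null character.
 */
void encodeString(const char *string, char *encodedString, size_t size)
{
    char stringCopy[128];
    char *token;
    size_t used = 0;

    if (size == 0) {
        return;
    }
    encodedString[0] = '\0';

    /* strtok mutates its argument, so work on a bounded copy. */
    strncpy(stringCopy, string, sizeof stringCopy - 1);
    stringCopy[sizeof stringCopy - 1] = '\0';

    for (token = strtok(stringCopy, "."); token != NULL;
         token = strtok(NULL, ".")) {
        size_t len = strlen(token);

        /* Need room for the length byte, the label and the terminator. */
        if (used + len + 2 > size) {
            break;
        }
        encodedString[used++] = (char) len;
        memcpy(encodedString + used, token, len + 1); /* copies the '\0' too */
        used += len;
    }
}
```

The caller now owns the storage: declare `char bar[128];` in main, call `encodeString(foo, bar, sizeof bar);` and print `bar` afterwards. The buffer lives as long as main does, so nothing dangles.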

JCasso
Cheers for the advice! I see that's how a lot of the standard methods work anyway.
Beau Martínez
The first sentence is a little bit strict; the second, however, is VERY TRUE. Try this: char* encodeString(const char *string, char *encodedString); where the return value is encodedString (so you can still use it in expressions, etc.)
MaR
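
A minimal sketch of the variant suggested in the comment above, where the function returns the caller's buffer so that the call can sit inside an expression. The size parameter, the NULL return on overflow and the single-pass scan with strcspn (used here instead of the question's strtok loop) are assumptions of this sketch, not part of the comment.

```c
#include <string.h>

/*
 * Sketch only: encodes `string` into `encodedString` and returns that same
 * pointer, or NULL if the result does not fit in `size` bytes.
 */
char *encodeString(const char *string, char *encodedString, size_t size)
{
    size_t out = 0;

    if (size == 0) {
        return NULL;
    }

    while (*string != '\0') {
        /* Length of the next label, up to a period or the end of input. */
        size_t len = strcspn(string, ".");

        if (len > 0) { /* skip empty labels, as strtok would */
            /* The length must fit in one byte and the label in the buffer. */
            if (len > 255 || out + len + 2 > size) {
                return NULL;
            }
            encodedString[out++] = (char) len;
            memcpy(encodedString + out, string, len);
            out += len;
        }

        string += len;
        if (*string == '.') {
            string++;
        }
    }

    encodedString[out] = '\0';
    return encodedString;
}
```

With this shape a caller can write `printf("%s\n", encodeString(foo, bar, sizeof bar));` once it knows the buffer is large enough, or check the result against NULL first.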
+8  A: 

You are returning a pointer to the array encodedString, which is local to the encodeString() function and has automatic storage duration.

This memory is no longer valid after that function exits, which is what is causing your problem.

You can fix it by giving encodedString static storage duration:

static char encodedString[128];

encodedString[0] = '\0';

(You can no longer use the initialiser to empty the array, because arrays with static storage duration maintain their values from one invocation of the function to the next.)
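
Put together, the static-buffer fix might look like the sketch below. It keeps the question's 128-byte arrays; the bounds checks and the const qualifier are additions assumed here for safety rather than part of the original function.

```c
#include <string.h>

/*
 * Sketch only: returns a pointer to a static buffer. The buffer outlives the
 * call, but it is overwritten by the next call, so copy the result if it
 * must be kept.
 */
char *encodeString(const char *string)
{
    char stringCopy[128];
    static char encodedString[128];
    char *token;
    size_t used = 0;

    /* Reset on every call: a static array keeps its old contents. */
    encodedString[0] = '\0';

    /* strtok mutates its argument, so work on a bounded copy. */
    strncpy(stringCopy, string, sizeof stringCopy - 1);
    stringCopy[sizeof stringCopy - 1] = '\0';

    for (token = strtok(stringCopy, "."); token != NULL;
         token = strtok(NULL, ".")) {
        size_t len = strlen(token);

        /* Need room for the length byte, the label and the terminator. */
        if (used + len + 2 > sizeof encodedString) {
            break;
        }
        encodedString[used++] = (char) len;
        memcpy(encodedString + used, token, len + 1); /* copies the '\0' too */
        used += len;
    }

    return encodedString;
}
```

The driver in the question works unchanged with this version. The trade-off is that the function is no longer reentrant: two results cannot be held at once, and it is not safe to call from several threads.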

caf
Cheers! But that doesn't explain why, if I call printf within the encodeString method, the memory hasn't been freed by the time I call printf in main. Do you know why that happens?
Beau Martínez
That's just one of the many weird manifestations you can get when you break the rules in C. The difference is because the extra call to `printf()` inside the function alters what's on the stack, and it's this left-over stuff on the stack that the erroneous `printf` call in the main function runs into.
caf