tags:

views:

283

answers:

8

I need some help with this, since it baffles me in my C program

I have 2 strings(base, and path)

BASE: /home/steve/cps730
PATH: /page2.html

this is how printf reads then just before I call a sprintf to join their content together. here is the code block

        int memory_alloc = strlen(filepath)+1;
        memory_alloc += strlen(BASE_DIR)+1;
        printf("\n\nAlloc: %d",memory_alloc);
        char *input = (char*)malloc(memory_alloc+9000);
        printf("\n\nBASE: %s\nPATH: %s\n\n",BASE_DIR,filepath);
        sprintf(input, "%s%s",BASE_DIR,filepath); //   :(

        printf("\n\nPATH: %s\n\n",input);

Now, can you explain how the final printf statement returns

PATH: e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/stev

because it dont understand it at all.

** I added 9000 in the malloc statement to prevent program from crashing (since the size of the string is obviously bigger then 31 bytes.

Full Output

Alloc: 31

BASE: /home/steve/cps730
PATH: /page2.html



PATH: /home/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/steve/cps730e/stev

Sending: 
HTTP/1.0 404 Not Found
Date: Sat, 12 Sep 2009 19:01:53 GMT
Connection: close

EDIT...................All the code that uses these variables

const char *BASE_DIR = "/home/steve/cps730";
 char* handleHeader(char *header){
    //Method given by browser (will only take GET, POST, and HEAD)
    char *method;
    method = (char*)malloc(strlen(header)+1);
    strcpy(method,header);
    method = strtok(method," ");

    if(!strcmp(method,"GET")){
     char *path = strtok(NULL," ");
     if(!strcmp(path,"/")){
      path = (char*)malloc(strlen(BASE_DIR)+1+12);
      strcpy(path,"/index.html");
     }
     free(method);
     return readPage(path);
    }
    else if(!strcmp(method,"POST")){

    }
    else if(!strcmp(method,"HEAD")){

    }
    else{
     strcat(contents,"HTTP/1.1 501 Not Implemented\n");
                strcat(contents, "Date: Sat, 12 Sep 2009 19:01:53 GMT\n");
                strcat(contents, "Connection: close\n\n");
    }
    free(method);

}

//Return the contents of an HTML file
char* readPage(char* filepath){
    int memory_alloc = strlen(filepath)+1;
    memory_alloc += strlen(BASE_DIR)+1;
    printf("\n\nAlloc: %d",memory_alloc);
    char *input = (char*)malloc(memory_alloc+9000); 
    printf("\n\nBASE: %s\nPATH: %s\n\n",BASE_DIR,filepath);
    sprintf(input, "%s%s\0",BASE_DIR,filepath);

    printf("\n\nPATH: %s\n\n",input);

    FILE *file;
    file = fopen(input, "r");
    char temp[255];
    strcat(contents,"");

    if(file){
     strcat(contents, "HTTP/1.1 200 OK\n");
                strcat(contents, "Date: Sat, 12 Sep 2009 19:01:53 GMT\n");
                strcat(contents, "Content-Type: text/html; charset=utf-8\n");
                strcat(contents, "Connection: close\n\n");

     //Read the requested file line by line
     while(fgets(temp, 255, file)!=NULL) { 
      strcat(contents, temp);   
     }
    }
    else{
     strcat(contents, "HTTP/1.0 404 Not Found\n");
                strcat(contents, "Date: Sat, 12 Sep 2009 19:01:53 GMT\n");
                strcat(contents, "Connection: close\n\n");
    }

    return contents;
}
+1  A: 

Aaah - the thrill of the chase as the question morphs while we're trying to resolve the problem!

The current code looks like:

const char *BASE_DIR = "/home/steve/cps730";

//Handles the header sent by the browser
char* handleHeader(char *header){
    //Method given by browser (will only take GET, POST, and HEAD)
    char *method;
    method = (char*)malloc(strlen(header)+1);
    strcpy(method,header);
    method = strtok(method," ");

    if(!strcmp(method,"GET")){
        char *path = strtok(NULL," ");
        if(!strcmp(path,"/")){
                path = (char*)malloc(strlen(BASE_DIR)+1+12);
                strcpy(path,"/index.html");
        }
        free(method);
        return readPage(path);
    }
    ...

Question: if this is running in a web server, is it safe to be using the thread-unsafe function strtok()? I'm going to assume 'Yes, it is safe', though I'm not wholly convinced. Have you printed the header string? Have you printed the value of path? Did you really intend to leak the allocated path? Did you realize that the malloc() + strcpy() sequence does not copy BASE_DIR into path?


The original version of the code ended with:

 printf("\n\nPATH: %s\n\n", filepath);

Hence the original suggested partial answer:

You format into input; you print from filepath?


What is the chance that filepath points to already released memory? When you allocate the memory, you could be getting anything happening to the quasi-random area that filepath used to point to. Another possibility could be that filepath is a pointer to a local variable in a function that has returned - so it points to somewhere random in the stack that is being reused by other code, such as sprintf().

I also mentioned in a comment that you might conceivably need to ensure that malloc() is declared and check the return value from it. The '(char *)' cast is not mandatory in C (it is in C++), and many prefer not to include the cast if the code is strictly C and not bilingual in C and C++.


This code works for me:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char *BASE_DIR = "/home/steve/cps730";
    const char *filepath = "/page2.html";

    int memory_alloc = strlen(filepath) + 1;
    memory_alloc += strlen(BASE_DIR) + 1;
    printf("\n\nAlloc: %d", memory_alloc);
    char *input = (char*)malloc(memory_alloc + 9000);
    printf("\n\nBASE: %s\nPATH: %s\n\n", BASE_DIR, filepath);
    sprintf(input, "%s%s", BASE_DIR, filepath);

    printf("\n\nPATH: %s\n\n", filepath);
    printf("\n\nPATH: %s\n\n", input);

    return(0);
}

It produces extraneous empty lines plus:

Alloc: 31
BASE: /home/steve/cps730
PATH: /page2.html
PATH: /page2.html
PATH: /home/steve/cps730/page2.html
Jonathan Leffler
I thought that, too, but the problem it seems is that `filepath` is changing from being used in `sprintf`. He prints `filepath` before using `sprintf`, and I assume it prints correctly before the call. After it's not.
GMan
I just realized that, but it didnt change anything
Steven1350
I wonder if the `char *` cast was hiding the fact that `malloc()` was not declared; that can lead to problems. There is no check on the return value of `malloc()`; granted, it is unlikely to be the problem, but... I don't have a good explanation for why `filepath` appears to change - but then we don't have a 'working' code sample from the questioner. Fragmentary program snippets lead to fragmentary answers...
Jonathan Leffler
I don't suppose you are related to Sam Leffler? :-)
DigitalRoss
@digitalross: Sam Leffler and I share the same last name - there might be a common ancestor within recorded history, but I'm not aware of it; any relationship is extremely distant.
Jonathan Leffler
+4  A: 

Well, clearly this can't happen :-)

My guess is that your heap is horribly corrupted already.

I would look at the actual pointer values used by filepath, input and base. I wonder if you'll find that input is very close to filepath?

I would also look at how filepath, base etc were originally created, could you have a buffer over-run there?

djna
That's my suspicion too. When variables change when they shouldn't, it's often because something else changed that accidentally ran over the old variable. Check for uninitialized variables, erroneous pointer arithmetic, or buffer overruns.
GMan
+1 for corrupted heap. There are many tools for diagnosing this early (eg. Valgrind).
Filip Navara
Well, clearly this is happening - the question is why is it happening.
Jonathan Leffler
@Jonathan Leffler That there was a smiley, the joke may not have been funny, but it was a joke. The rest of my answer did try to suggest "why?".
djna
A: 

Suggestions


There is nothing obviously wrong with the program. (Update: well, there is something obvious now. For the first hour only a few lines were posted, and they had no serious bugs.) You will have to post more of it. Here are some ideas:

  1. malloc(3) returns void * so it should not be necessary to cast it. If you are getting a warning, it most likely means you did not include <stdlib.h>. If you aren't, you should. (For example, on a 64-bit system, not prototyping malloc(3) can be quite serious. Some of the 64-bit environments don't really support K&R C. :-)
  2. Speaking of warnings, please be sure you are turning them all on. With gcc you can turn most of them on with -Wall.
  3. You are not checking the return value of malloc(3) for an error.
  4. Use a memory debugger like Electric Fence. There are many choices, see my link.
DigitalRoss
He's also not initializing the memory he gets from malloc. Shouldn't he zero it out if he plans to use it for strings to help prevent any sort of corruption?
mrduclaw
No need to zero the string; sprintf() assigns to the space, not concatenates. And if it was concatenating, a simple *input = '\0'; would be sufficient.
Jonathan Leffler
No. It's not necessary given his logic and so it will either have no effect or will hide the real problem. Zeroing the memory makes lots of sense when allocating structures but not so much for string storage.
DigitalRoss
A: 

Try this code:

#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    const char* BASE_DIR = "/home/steve/cps730";
    const char* filepath = "/page2.html";
    int memory_alloc = strlen(filepath);
    memory_alloc += strlen(BASE_DIR)+1;
    printf("\n\nAlloc: %d",memory_alloc);
    char *input = (char*)malloc(memory_alloc);
    printf("\n\nBASE: %s\nPATH: %s\n\n",BASE_DIR,filepath);
    sprintf(input, "%s%s",BASE_DIR,filepath); //   :(

    printf("\n\nPATH: %s\n\n",input);

    return 0;
}

If this doesn't have a problem, then there must be something wrong elsewhere in the code. That's how undefined behavior sometimes may manifest itself (messing up how unrelated code works).

(BTW, I didn't add +1 to both strlen calls, since the concatenated string is still going to have only one null-terminator.)

UncleBens
A: 

Since the BASE_DIR value is repeating itself, either BASE_DIR or filepath is probably overlapping the in input memory.

Make sure both BASE_DIR and filepath really has allocated memory.

A first try is to just make a local copy of BASE_DIR and filepath before calling sprintf.

Peter Olsson
+6  A: 

You call readPage with an invalid pointer path - it points into the memory previously allocated with the method pointer, which is freed right before the call to readPage. The next malloc can reuse this memory and then anything can happen...

hjhill
How do you recommend I change this? I should also add that it does work normally when the index.html stuff is executed.
Steven1350
you sir, are my hero
Steven1350
There's also a leak from the allocated path... @hjhill - well done on guessing that the original code was the readpage() function.
Jonathan Leffler
It works normally when the index.html stuff is executed because you allocate another chunk of memory for path in this case (which is never freed by the way). Hmm... looks like some bigger changes are needed - I'll use another answer for this...
hjhill
A: 

The easiest way to figure out what's going on is to trace through the execution in a debugger (possibly dropping to tracing the assembly code).

A few guesses as to what might be going on:

  • memory corruption by another thread (seems unlikely if this is readily repeatable)
  • corrupt heap (also seems unlikely, as you dump the 2 component strings after the malloc() call)
  • as mentioned by Jonathan Leffler in a comment, you might be missing a header (perhaps stdio.h) and the compiler is generating the incorrect calling /stack clean up sequence for the printf/sprintf calls. I would expect that you would see some compile time warnings if this were the case - ones that you should take note of.

What compiler/target are you using?

Michael Burr
A: 

To do this correctly, I'd change the code to:

/* CHANGED: allocate additional space for "index.html" */
method = (char*)malloc(strlen(header)+1+10);
strcpy(method,header);
method = strtok(method," ");

if(!strcmp(method,"GET")){
    char *path = strtok(NULL," ");
    if(!strcmp(path,"/")){
             /* CHANGED: don't allocate new memory, use previously allocated */
             strcpy(path,"/index.html");
    }
    /* CHANGED: call function first and free memory _after_ the call */
    char *result = readPage(path);
    free(method);
    return result;
}
hjhill