views:

109

answers:

6

I wrote this function to read a line from a file:

const char *readLine(FILE *file) {

    if (file == NULL) {
        printf("Error: file pointer is null.");
        exit(1);
    }

    int maximumLineLength = 128;
    char *lineBuffer = (char *)malloc(sizeof(char) * maximumLineLength);

    if (lineBuffer == NULL) {
        printf("Error allocating memory for line buffer.");
        exit(1);
    }

    char ch = getc(file);
    int count = 0;

    while ((ch != '\n') && (ch != EOF)) {
        if (count == maximumLineLength) {
            maximumLineLength += 128;
            lineBuffer = realloc(lineBuffer, maximumLineLength);
            if (lineBuffer == NULL) {
                printf("Error reallocating space for line buffer.");
                exit(1);
            }
        }
        lineBuffer[count] = ch;
        count++;

        ch = getc(file);
    }

    lineBuffer[count] = '\0';
    char line[count + 1];
    strncpy(line, lineBuffer, (count + 1));
    free(lineBuffer);
    const char *constLine = line;
    return constLine;
}

The function reads the file correctly, and using printf I see that the constLine string did get read correctly as well.

However, if I use the function e.g. like this:

while (!feof(myFile)) {
    const char *line = readLine(myFile);
    printf("%s\n", line);
}

printf outputs gibberish. Why?

A: 

readLine() returns pointer to local variable, which causes undefined behaviour.

To get around you can:

  1. Create variable in caller function and pass its address to readLine()
  2. Allocate memory for line using malloc() - in this case line will be persistent
  3. Use global variable, although it is generally a bad practice
qrdl
How can I change that? (Sorry, I'm new to C.)
lron
A: 
const char *readLine(FILE *file, char* line) {

    if (file == NULL) {
        printf("Error: file pointer is null.");
        exit(1);
    }

    int maximumLineLength = 128;
    char *lineBuffer = (char *)malloc(sizeof(char) * maximumLineLength);

    if (lineBuffer == NULL) {
        printf("Error allocating memory for line buffer.");
        exit(1);
    }

    char ch = getc(file);
    int count = 0;

    while ((ch != '\n') && (ch != EOF)) {
        if (count == maximumLineLength) {
            maximumLineLength += 128;
            lineBuffer = realloc(lineBuffer, maximumLineLength);
            if (lineBuffer == NULL) {
                printf("Error reallocating space for line buffer.");
                exit(1);
            }
        }
        lineBuffer[count] = ch;
        count++;

        ch = getc(file);
    }

    lineBuffer[count] = '\0';
    char line[count + 1];
    strncpy(line, lineBuffer, (count + 1));
    free(lineBuffer);
    return line;

}


char linebuffer[256];
while (!feof(myFile)) {
    const char *line = readLine(myFile, linebuffer);
    printf("%s\n", line);
}

note that the 'line' variable is declared in calling function and then passed, so your readLine function fills predefined buffer and just returns it. This is the way most of C libraries work.

There are other ways, which I'm aware of:

  • defining the char line[] as static (static char line[MAX_LINE_LENGTH] -> it will hold it's value AFTER returning from the function). -> bad, the function is not reentrant, and race condition can occur -> if you call it twice from two threads, it will overwrite it's results
  • malloc()ing the char line[], and freeing it in calling functions -> too many expensive mallocs, and, delegating the responsibility to free the buffer to another function (the most elegant solution is to call malloc and free on any buffers in same function)

btw, 'explicit' casting from char* to const char* is redundant.

btw2, there is no need to malloc() the lineBuffer, just define it char lineBuffer[128], so you don't need to free it

btw3 do not use 'dynamic sized stack arrays' (defining the array as char arrayName[some_nonconstant_variable]), if you don't exactly know what are you doing, it works only in C99.

Yossarian
*note that the 'line' variable is declared in calling function and then passed,* - you probably should have deleted the local declaration of line in the function then. Also, you need to tell the function how long the buffer is that you are passing and think of a strategy for handling lines that are too long for the buffer you pass in.
JeremyP
BTW it was not me that down voted this answer.
JeremyP
+2  A: 

In your readLine function, you return a pointer to the line array (Strictly speaking, a pointer to its first character, but the difference is irrelevant here). Since it's an automatic variable (i.e., it's “on the stack”), the memory is reclaimed when the function returns. You see gibberish because printf has put its own stuff on the stack.

You need to return a dynamically allocated buffer from the function. You already have one, it's lineBuffer; all you have to do is truncate it to the desired length.

    lineBuffer[count] = '\0';
    realloc(lineBuffer, count + 1);
    return lineBuffer;
}

ADDED (response to follow-up question in comment): readLine returns a pointer to the characters that make up the line. This pointer is what you need to work with the contents of the line. It's also what you must pass to free when you've finished using the memory taken by these characters. Here's how you might use the readLine function:

char *line = readLine(file);
printf("LOG: read a line: %s\n", line);
if (strchr(line, 'a')) { puts("The line contains an a"); }
/* etc. */
free(line);
/* After this point, the memory allocated for the line has been reclaimed.
   You can't use the value of `line` again (though you can assign a new value
   to the `line` variable if you want). */
Gilles
How would I `free()` the memory after I'm done with it?
lron
@Iron: I've added something to my answer, but I'm not sure what your difficulty is so it may be off the mark.
Gilles
@Iron: the answer is that you don't free it. You document (in the API documentation) the fact that the returned buffer is malloc'd ansd needs to be freed by the caller. Then people who use your readLine function will (hopefully!) write code similar to the snippet that Gilles has added to his answer.
JeremyP
A: 

If your task is not to invent the line-by-line reading function, but just to read the file line-by-line, you may use a typical code snippet involving th getline() function (see the manual page at http://linuxmanpages.com/man3/getline.3.php):

   #define _GNU_SOURCE
   #include <stdio.h>
   #include <stdlib.h>

   int
   main(void)
   {
       FILE * fp;
       char * line = NULL;
       size_t len = 0;
       ssize_t read;

       fp = fopen("/etc/motd", "r");
       if (fp == NULL)
           exit(EXIT_FAILURE);

       while ((read = getline(&line, &len, fp)) != -1) {
           printf("Retrieved line of length %zu :\n", read);
           printf("%s", line);
       }

       if (line)
           free(line);
       exit(EXIT_SUCCESS);
   }
mbaitoff
That's not portable.
JeremyP
More precisely, this `getline` is specific to GNU libc, i.e., to Linux. However, if the intent is to have a line-reading function (as opposed to learning C), there are several public domain line-reading functions available on the web.
Gilles
+2  A: 

Some things wrong with the example:

  • you forgot to add \n to your printfs. Also error messages should go to stderr i.e. fprintf(stderr, ....
  • (not a biggy but) consider using fgetc() rather than getc(). getc() is a macro, fgetc() is a proper function
  • getc() returns an int so ch should be declared as an int. This is important since the comparison with EOF will be handled correctly. Some 8 bit character sets use 0xFF as a valid character (ISO-LATIN-1 would be an example) and EOF which is -1, will be 0xFF if assigned to a char.
  • There is a potential buffer overflow at the line

    lineBuffer[count] = '\0';
    

    If the line is exactly 128 characters long, count is 128 at the point that gets executed.

  • As others have pointed out, line is a locally declared array. You can't return a pointer to it.

  • strncpy(count + 1) will copy at most count + 1 characters but will terminate if it hits '\0' Because you set lineBuffer[count] to '\0' you know it will never get to count + 1. However, if it did, it would not put a terminating '\0' on, so you need to do it. You often see something like the following:

    char buffer [BUFFER_SIZE];
    strncpy(buffer, sourceString, BUFFER_SIZE - 1);
    buffer[BUFFER_SIZE - 1] = '\0';
    
  • if you malloc() a line to return (in place of your local char array), your return type should be char* - drop the const.

JeremyP
A: 

You should use the ANSI functions for reading a line, eg. fgets. After calling you need free() in calling context, eg:

...
const char *entirecontent=readLine(myFile);
puts(entirecontent);
free(entirecontent);
...

const char *readLine(FILE *file)
{
  char *lineBuffer=calloc(1,1), line[128];

  if ( !file || !lineBuffer )
  {
    fprintf(stderr,"an ErrorNo 1: ...");
    exit(1);
  }

  for(; fgets(line,sizeof line,file) ; strcat(lineBuffer,line) )
  {
    if( strchr(line,'\n') ) *strchr(line,'\n')=0;
    lineBuffer=realloc(lineBuffer,strlen(lineBuffer)+strlen(line)+1);
    if( !lineBuffer )
    {
      fprintf(stderr,"an ErrorNo 2: ...");
      exit(2);
    }
  }
  return lineBuffer;
}