ansaurus

Question

buffering when passing input from standard input to a function

Answer 1

A:

You'll have to do the pre-buffering yourself: read stdin until EOF is seen, then pass one long string (probably consisting of \n-separated lines) to your function. Alternatively, your pre-buffer routine could allocate an array of char *'s that point to allocated lines, or it could parse stdin and return preprocessed information. Which is best depends on what you want to do with the data.
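The array-of-lines option could look roughly like the sketch below. The names read_all_lines(), push_line() and free_lines() are hypothetical helpers made up for this illustration, not part of any library. The sketch assumes text input without embedded NUL bytes (fgets() cannot report them reliably) and strips the trailing newline from each stored line.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Append one line pointer to a growable array; 0 on success, -1 on failure. */
static int push_line(char ***lines, size_t *n, size_t *cap, char *line)
{
    if (*n == *cap) {
        size_t newcap = (*cap != 0) ? *cap * 2 : 16;
        char **tmp = realloc(*lines, newcap * sizeof **lines);
        if (tmp == NULL)
            return -1;
        *lines = tmp;
        *cap = newcap;
    }
    (*lines)[(*n)++] = line;
    return 0;
}

/* Release everything allocated by read_all_lines(). */
void free_lines(char **lines, size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(lines[i]);
    free(lines);
}

/* Read fp to EOF; store an array of malloc'd lines in *out and the
 * number of lines in *count. Returns 0 on success, -1 on failure. */
int read_all_lines(FILE *fp, char ***out, size_t *count)
{
    char **lines = NULL;
    size_t nlines = 0, cap = 0;
    char chunk[256];
    char *cur = NULL;           /* line currently being assembled */
    size_t curlen = 0;

    assert(fp != NULL && out != NULL && count != NULL);

    while (fgets(chunk, sizeof chunk, fp) != NULL) {
        size_t n = strlen(chunk);
        char *tmp = realloc(cur, curlen + n + 1);
        if (tmp == NULL)
            goto fail;
        cur = tmp;
        memcpy(cur + curlen, chunk, n + 1);     /* copy the '\0' too */
        curlen += n;
        if (curlen > 0 && cur[curlen - 1] == '\n') {
            cur[curlen - 1] = '\0';             /* line complete: strip newline */
            if (push_line(&lines, &nlines, &cap, cur) != 0)
                goto fail;
            cur = NULL;
            curlen = 0;
        }
        /* otherwise the line was longer than chunk: keep appending */
    }
    if (ferror(fp))
        goto fail;
    /* A final line with no trailing newline is still pending in cur. */
    if (cur != NULL && push_line(&lines, &nlines, &cap, cur) != 0)
        goto fail;

    *out = lines;
    *count = nlines;
    return 0;

fail:
    free(cur);
    free_lines(lines, nlines);
    return -1;
}
```

A caller would invoke read_all_lines(stdin, &lines, &count), hand lines and count to the processing function, and call free_lines(lines, count) when done.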

Karel Kubat 2009-09-27 22:59:10

Answer 2

A:

What you have now is a 'filter'. Filters are wonderful programs but, of course, not applicable to all situations. Anyway, see if you can keep your program working as a filter.

If you really must read all the input before processing, you have to save the input somewhere. At that point it makes no sense to call the processing function with a FILE * (all the data in the stream has already been read); instead, read all the input into a char array and pass that array, with its length, to your function.

void process_data(char data[], size_t data_len) { /* do work */ }
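One way to fill such an array is sketched below. The helper name slurp() is hypothetical, and the 4096-byte starting size and the doubling growth policy are assumptions chosen for illustration. The returned buffer is not NUL-terminated, which is why the length is returned alongside it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read fp to EOF into a malloc'd buffer.
 * On success returns the buffer and stores the byte count in *len.
 * On failure returns NULL. The caller frees the buffer. */
char *slurp(FILE *fp, size_t *len)
{
    size_t cap = 4096;      /* assumed starting capacity */
    size_t used = 0;
    char *buf;

    assert(fp != NULL && len != NULL);

    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        used += fread(buf + used, 1, cap - used, fp);
        if (used < cap)
            break;          /* fread() came up short: EOF or error */

        /* Buffer is full: double it, guarding against size_t overflow. */
        if (cap > SIZE_MAX / 2) {
            free(buf);
            return NULL;
        }
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    *len = used;
    return buf;
}
```

A caller would then write something like data = slurp(stdin, &len), check data for NULL, call process_data(data, len), and free(data) afterwards.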

pmg 2009-09-27 23:08:53

Pass in the length too - to avoid buffer overflows and to handle embedded nulls (`'\0'`) in the data.

Jonathan Leffler 2009-09-27 23:09:37

If the data is 'allowed' to have embedded NULs, I'd use `unsigned char data[]` rather than (plain) `char data[]`. The length is always nice to have.
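To illustrate why the explicit length matters once NULs are allowed, here is a small sketch. Both function names are hypothetical and exist only for this example: naive_length() shows what a string function reports, count_byte() shows a loop driven by the length instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What a string function sees: it stops at the first NUL byte, so any
 * data after that byte is invisible to it. */
size_t naive_length(const unsigned char data[])
{
    return strlen((const char *)data);
}

/* Count how often `target` occurs in a buffer that may contain NULs.
 * The explicit length is the only reliable end marker here. */
size_t count_byte(const unsigned char data[], size_t data_len,
                  unsigned char target)
{
    size_t hits = 0;

    assert(data != NULL || data_len == 0);

    for (size_t i = 0; i < data_len; i++)
        if (data[i] == target)
            hits++;
    return hits;
}
```

For the five bytes 'a', NUL, 'b', NUL, 'c', naive_length() reports 1 while count_byte() with target 0 finds both NULs.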

pmg 2009-09-27 23:15:00

Answer 3

A:

Suppose you were to open a file and then pass the file handle to your function. Your code in the function would still have to read to EOF on that regular file. Further, it would have to deal with allocating enough space to store the file, and deal with short reads.

All this is the same set of issues that you must deal with for stdin. The only likely difference is in where the short reads occur: stdin coming from a terminal gives you a short read for each line of input; a pipe gives you a short read for whatever is in the pipe buffer (or for each atomic write smaller than the buffer size); and a plain disk file usually gives you a short read only on the last block of the file. Since your function cannot tell ahead of time how much space is needed (certainly not for pipe or terminal input), you have to be prepared to deal with dynamic memory allocation - malloc() and realloc().
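Short reads are easiest to see at the system-call level. The sketch below assumes a POSIX system and uses read() directly rather than stdio; read_full() is a hypothetical helper name. It keeps reading until the requested number of bytes has arrived, EOF is reached, or an error occurs.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read up to `want` bytes from fd, looping over
 * short reads. Returns the number of bytes stored (less than `want`
 * only at EOF), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t want)
{
    char *p = buf;
    size_t got = 0;

    assert(buf != NULL || want == 0);

    while (got < want) {
        ssize_t n = read(fd, p + got, want - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;              /* EOF */
        got += (size_t)n;       /* possibly a short read: go round again */
    }
    return (ssize_t)got;
}
```

Note that on a terminal this loop blocks until the buffer is full or the user signals EOF, which may or may not be what an interactive program wants.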

Also, if your function is expecting to get the data already read for it, why is it being passed a file handle (FILE pointer) and not a character buffer and its length? You pass a file handle to a function when you need the function to use it - to read from a readable handle, or to write to a writable handle (and, just occasionally, both if the handle is open for reading and writing).

Here's a working example program. I had to work out something that needed to slurp the whole file into memory, process it, and spew out some answer - so I've chosen to sort the file by characters. Moderately pointless, but it demonstrates what to do. It also has an operational variable arguments error reporting function in it.

Have fun!

/*
 * Demo code for StackOverflow question 1484693
 */

#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <errno.h>
#include <string.h>

static char *arg0;

static void error(const char *fmt, ...)
{
    va_list args;
    int errnum = errno;  /* Catch errno before it changes */

    fprintf(stderr, "%s: ", arg0);
    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
    if (errnum != 0)
        fprintf(stderr, " (%d: %s)", errnum, strerror(errnum));
    fputc('\n', stderr);
    exit(1);
}

static int char_compare(const void *v1, const void *v2)
{
    char c1 = *(const char *)v1;
    char c2 = *(const char *)v2;
    if (c1 < c2)
        return -1;
    else if (c1 > c2)
        return +1;
    else
        return 0;
}

static void process_my_file(FILE *fp)
{
    char   *buffer;
    size_t  buflen = 1024;
    size_t  in_use = 0;
    size_t  nbytes;     /* fread() returns size_t, never a negative value */

    if ((buffer = malloc(buflen)) == 0)
        error("out of memory - malloc()");

    while ((nbytes = fread(buffer + in_use, sizeof(char), buflen - in_use, fp)) > 0)
    {
        in_use += nbytes;
        if (in_use >= buflen)
        {
            char *newbuf;
            buflen += 1024;
            if ((newbuf = realloc(buffer, buflen)) == 0)
                error("out of memory - realloc()");
            buffer = newbuf;
        }
    }
    /* A zero return means EOF or an error; ferror() tells them apart */
    if (ferror(fp))
        error("error from fread()");

    /* Consistency - number/size vs size/number! */
    qsort(buffer, in_use, sizeof(char), char_compare);
    fwrite(buffer, sizeof(char), in_use, stdout);
    putchar('\n');

    free(buffer);
}

int main(int argc, char **argv)
{
    arg0 = argv[0];

    if (argc > 1)
    {
        for (int i = 1; i < argc; i++)
        {
            FILE *fp;
            if ((fp = fopen(argv[i], "r")) == 0)
                error("failed to open file %s", argv[i]);
            process_my_file(fp);
            fclose(fp);
        }
    }
    else
        process_my_file(stdin);
    return(0);
}

You can call this with one or more file names as arguments; each file name is sorted separately. You can pipe something into it; you can let it read from standard input. I choose to ignore the possibility that fwrite() and fclose() might fail; I also choose to ignore the possibility of overflow on buflen in process_my_file(). You can check them if you choose. (Note that the output for each file contains one more newline than the input does.)

Exercises for the reader:

Print non-printable characters as '\xXX' escape sequences.
Break the output into lines of not more than 64 characters each.
Devise or research alternative allocation strategies, such as doubling the space on each allocation (see 'The Practice of Programming').

Jonathan Leffler 2009-09-27 23:08:57
