ansaurus

Question

How do I handle a stream of data internal to a C-based app?

Answer 1

+2 A:

I don't think there's a smarter approach (short of finding an automata library that already does this for you). Be careful to size the "last line" buffer properly: if it cannot handle lines of arbitrary length and the input can be supplied by third parties, it becomes a security risk.
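
One way to act on the buffer-size warning is to let the "last line" buffer grow on demand while still refusing absurdly long lines. The following is only a minimal sketch: the names linebuf, linebuf_append, linebuf_free and the MAX_LINE cap are hypothetical and not from any library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap on a single line; pick a value that suits your data. */
#define MAX_LINE (1u << 20)

/* Growable byte buffer holding the current (possibly incomplete) line. */
typedef struct {
    char  *data;
    size_t len;   /* bytes in use, not counting the terminating '\0' */
    size_t cap;   /* bytes allocated */
} linebuf;

/* Append n bytes. Returns 0 on success, -1 if the line would exceed
   MAX_LINE or memory runs out; the buffer is left unchanged on failure. */
int linebuf_append(linebuf *b, const char *src, size_t n)
{
    if (n > MAX_LINE || b->len > MAX_LINE - n)
        return -1;                      /* refuse oversized lines */

    size_t need = b->len + n + 1;       /* +1 for the terminating '\0' */
    if (need > b->cap) {
        size_t cap = b->cap ? b->cap : 64;
        while (cap < need)
            cap *= 2;                   /* need is bounded by MAX_LINE + 1 */
        char *p = realloc(b->data, cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = cap;
    }

    memcpy(b->data + b->len, src, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}

/* Release the storage and reset the buffer to its empty state. */
void linebuf_free(linebuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
}
```

A buffer starts out zero-initialized (linebuf b = {0};). When linebuf_append returns -1, the caller can drop the connection or skip the line instead of writing past the end of a fixed-size array.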

Pavel Radzivilovsky 2010-10-13 11:26:02

Answer 2

A:

I think you should copy chunks of characters into another buffer until the latest chunk you wrote contains a newline character. Then you can work on the whole line.

Save whatever follows the '\n' in a temporary buffer and use it as the start of the next line.
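
The chunk-and-carry-over idea could be sketched roughly as below. This is an illustration only: emit_lines, line_cb, line_stats and count_line are made-up names, and the sketch assumes the caller owns one working buffer that each new chunk is appended to.

```c
#include <stddef.h>
#include <string.h>

/* Called once per complete line; the line is NOT NUL-terminated and
   does not include the '\n'. */
typedef void (*line_cb)(const char *line, size_t len, void *ctx);

/* Emit every complete line found in buf[0..len), then move the
   unfinished tail to the front of buf. Returns the number of tail
   bytes kept, so the next chunk can be read into buf + that offset. */
size_t emit_lines(char *buf, size_t len, line_cb cb, void *ctx)
{
    size_t start = 0;
    char *nl;

    while (start < len &&
           (nl = memchr(buf + start, '\n', len - start)) != NULL) {
        size_t line_len = (size_t)(nl - (buf + start));
        cb(buf + start, line_len, ctx);
        start += line_len + 1;          /* skip past the '\n' */
    }

    size_t rest = len - start;
    memmove(buf, buf + start, rest);    /* source and target may overlap */
    return rest;
}

/* Example callback: counts lines and remembers the last line's length. */
typedef struct {
    size_t lines;
    size_t last_len;
} line_stats;

void count_line(const char *line, size_t len, void *ctx)
{
    line_stats *s = ctx;
    (void)line;                         /* content unused in this example */
    s->lines++;
    s->last_len = len;
}
```

The caller reads each new chunk into buf + rest, with at most capacity - rest bytes, and calls emit_lines again with the new total length. If rest ever equals the buffer's capacity, the current line is longer than the buffer, and the caller has to grow the buffer or reject the input.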

Opera 2010-10-13 11:31:23

Answer 3

+1 A:

This would be easy to do using C++'s std::string, but in C it takes some code if you want to do it efficiently (unless you use a dynamic string library).

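char *bz_read_line(BZFILE *input)
{
    size_t offset = 0;
    size_t len = CHUNK;  // arbitrary initial size and growth step
    char *output = xmalloc(len);
    int bzerror = BZ_OK;
    int got_newline = 0;

    // Read one byte at a time so nothing past the end of the line is consumed.
    while (BZ2_bzRead(&bzerror, input, output + offset, 1) == 1) {
        if (output[offset] == '\n') {
            got_newline = 1;
            break;
        }
        offset++;
        if (offset == len) {  // keep room for the terminating '\0'
            len += CHUNK;
            output = xrealloc(output, len);
        }
        if (bzerror != BZ_OK)  // BZ_STREAM_END: that was the last byte
            break;
    }

    // Return NULL on a read error, or at end of stream with no data left.
    if (!got_newline && (bzerror != BZ_STREAM_END || offset == 0)) {
        free(output);
        return NULL;
    }

    output[offset] = '\0';  // overwrites the newline or terminates the last line
    return output;
}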

(Here xmalloc and xrealloc are assumed to be wrappers around malloc and realloc that handle allocation errors internally. Don't forget to free the returned string.)

In a quick test on a 42 KB text file, this is almost an order of magnitude slower than bzcat (0.093 s versus 0.010 s real time):

lars@zygmunt:/tmp$ wc foo
 1193  5841 42868 foo
lars@zygmunt:/tmp$ bzip2 foo
lars@zygmunt:/tmp$ time bzcat foo.bz2 > /dev/null

real    0m0.010s
user    0m0.008s
sys     0m0.000s
lars@zygmunt:/tmp$ time ./a.out < foo.bz2 > /dev/null

real    0m0.093s
user    0m0.044s
sys     0m0.020s

Decide for yourself whether that's acceptable.

larsmans 2010-10-13 11:50:46

I have a bunch of bz2 streams concatenated in one very large file. I'm trying to write a self-contained application to unpack one stream among many. This is very helpful, thanks!

Alex Reynolds 2010-10-13 18:03:28
