So a FILE stream can have both input and output buffers. You can adjust the output buffering using setvbuf() (I am unaware of any way to control the input buffer's size and behavior).
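
For reference, this is roughly how I have been changing the buffering with setvbuf() - just a sketch, with an arbitrary file name and buffer size:

/* Sketch: giving a stream a caller-supplied, fully buffered buffer.
   The file name and buffer size here are arbitrary examples. */
#include <stdio.h>

int main(void) {
    static char buf[8192];
    FILE *f = fopen("out.txt", "w");

    if (f == NULL)
        return 1;

    /* setvbuf() must be called before any other operation on the stream */
    if (setvbuf(f, buf, _IOFBF, sizeof buf) != 0)
        return 1;                  /* the request could not be honored */

    fputs("hello\n", f);           /* lands in buf, written out at fclose() */
    fclose(f);
    return 0;
}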

Also, by default the buffer size is BUFSIZ (not sure if that is a POSIX or a C thing). It is very clear what this means for stdin/stdout/stderr, but what are the defaults for newly opened files? Are they buffered for both input and output? Or perhaps just one?

If it is buffered, does output default to block or line mode?

EDIT: I've done some tests to see how Jonathan Leffler's answer plays out in real-world programs. It seems that if you do a read and then a write, the write causes the unused portion of the input buffer to be dropped entirely. In fact, there are some seeks done to keep things at the right file offsets. I used this simple test program:

/* input file contains "ABCDEFGHIJKLMNOPQRSTUVWXYZ" */
#include <stdio.h>

int main(void) {
    FILE *f = fopen("test.txt", "r+b");
    char ch;

    if (f == NULL)
        return 1;

    /* read one byte, then write - deliberately with no fseek() in between */
    fread(&ch, 1, 1, f);
    fwrite("test", 4, 1, f);

    fclose(f);
    return 0;
}

It resulted in the following system calls:

read(3, "ABCDEFGHIJKLMNOPQRSTUVWXYZ\n", 4096) = 27 // attempt to read 4096 chars, got 27
lseek(3, -26, SEEK_CUR)                 = 1        // at this point I've done my write already, so forget the 26 chars I never asked for and seek to where I should be if we really just read one character...
write(3, "test", 4)                     = 4        // and write my test
close(3)                                = 0

While these are clearly implementation details, I found them very interesting as an illustration of how the standard library can be implemented. Thanks, Jonathan, for your insightful answer.

+2  A: 

A single file stream has a single buffer. If the file is used for both input and output, then you have to ensure that you do appropriate operations (fseek() or equivalents) between the read and write (or write and read) operations.
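
For example, something along these lines is what I mean (a sketch only; it assumes test.txt exists and holds at least one byte):

/* Sketch: a positioning call such as fseek() is needed when switching
   between reading and writing on the same stream. */
#include <stdio.h>

int main(void) {
    FILE *f = fopen("test.txt", "r+b");
    char ch;

    if (f == NULL || fread(&ch, 1, 1, f) != 1)
        return 1;

    fseek(f, 0L, SEEK_CUR);      /* read -> write: seek, even to the same spot */
    fwrite("test", 4, 1, f);

    fseek(f, 0L, SEEK_CUR);      /* write -> read: fflush() or a seek */
    fread(&ch, 1, 1, f);

    fclose(f);
    return 0;
}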

The buffering behaviour of the standard channels is platform dependent.

Typically, stdout is line buffered when the output goes to a terminal. However, if stdout is going to a file or pipe rather than to a terminal, it usually switches to full buffering.
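
You can see the difference with a toy program like this (a sketch; it uses the POSIX sleep() purely for the demonstration):

/* Sketch: run it on a terminal and again piped through `cat`.
   On a terminal "before the pause" usually shows up immediately (line
   buffered); through a pipe both lines usually appear only at exit
   (fully buffered). */
#include <stdio.h>
#include <unistd.h>     /* sleep() - POSIX, only used for the demonstration */

int main(void) {
    printf("before the pause\n");
    sleep(5);
    printf("after the pause\n");
    return 0;
}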

Typically, stderr is either line buffered or unbuffered, to ensure that error messages get seen (for example, even if the program is about to crash).
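
A small sketch of why that matters (note that whether abort() flushes stdio buffers is implementation-defined; on common implementations it does not):

/* Sketch: the stdout message has no newline and may still be sitting in a
   buffer when abort() kills the process; the stderr message is written
   out promptly because stderr is not fully buffered. */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    printf("made it to step 3");               /* may never reach the screen or file */
    fprintf(stderr, "fatal: giving up\n");     /* shows up right away */
    abort();                                   /* buffered stdout data is typically lost */
}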

Typically, stdin is line buffered; this means you get a chance to edit your input (backspacing over errors, etc). You would seldom adjust this. Again, if the input is coming from a file (or pipe), the behaviour might be different.

Newly opened files will generally be fully buffered. A particular implementation might change that to line buffering if the device is a terminal.
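
One rough way to observe the full buffering on a typical implementation (a sketch only, not guaranteed behaviour; demo.txt is just an example name):

/* Sketch: output to a freshly opened file is typically fully buffered, so a
   second stream reading the same file sees nothing until the first stream
   is flushed. */
#include <stdio.h>

int main(void) {
    FILE *out = fopen("demo.txt", "w");
    FILE *in;

    if (out == NULL)
        return 1;

    fputs("buffered text", out);        /* sits in out's buffer for now */

    in = fopen("demo.txt", "r");
    if (in != NULL) {
        printf("before fflush: %s\n",
               fgetc(in) == EOF ? "file still empty" : "data already on disk");
        fclose(in);
    }

    fflush(out);                        /* now the data actually reaches the file */
    fclose(out);
    return 0;
}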

Your premise - that there are two buffers - is incorrect.


Section 7.19.3 of C99 says:

At program startup, three text streams are predefined and need not be opened explicitly — standard input (for reading conventional input), standard output (for writing conventional output), and standard error (for writing diagnostic output). As initially opened, the standard error stream is not fully buffered; the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device.

So, as originally stated, stderr is either unbuffered or line buffered (it is not fully buffered).

Jonathan Leffler
Are two buffers not allowed? It would strike me as significantly simpler to implement if there were simply two buffers; after all, the standard says that the default buffers are allocated on first use. I say this because, while it is clear that an input operation could flush any buffered output characters (the buffer, now empty, could be used for input...), the reverse is more complex to deal with.
Evan Teran
I've also found that the standard does say that `stderr` is by default not fully buffered.
Evan Teran
Two buffers would be significantly more complex (apart from requiring twice as much memory). For example, if you read from the input buffer, do the fseek() back to where you just started reading, and then write(), you have to ensure that you copy the data from the (hypothetical) output buffer over the (hypothetical) input buffer.
Jonathan Leffler
Well, C++ addresses that by having I/O buffers "tied" to each other, in which case a read will automatically flush any pending writes. The reverse situation is where it gets more complicated. For example, if you do a read (asking for 1 byte, but 100 get buffered) and then you do a write... where does it write? Does it simply ditch the other 99 read bytes from the input buffer?
Evan Teran
@Evan: read 1 byte (buffering 99 as yet unread bytes); then fseek() to the current position (fseek(fp, 0L, SEEK_CUR);), then write 10 bytes; then the first 10 bytes of the buffered 99 have been trampled, but the new data can be read. Note that you can't do this with a terminal; you can't seek on a terminal, and yet you must seek to switch between read and write operations. So, you'd be firmly into the territory of 'undefined behaviour'. But with a disk file, the behaviour is well defined. (After the write, you must fseek() before fread(), but the written data is there to be read.)
Jonathan Leffler
Very interesting, thank you for your well-informed answer. One thing I've noticed in my tests is that "read 1 byte, then write a few" has very different behavior in C++ with iostreams than it does in C with FILE streams. Clearly a side effect of iostreams actually having two buffers!
Evan Teran