ansaurus

Question

InputStreamReader buffering issue

Answer 1

+2 A:

Why don't you use 2 InputStreams? One for reading the header and another for the body.

The second InputStream should skip the header bytes.
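A minimal sketch of this two-stream idea, under stated assumptions: the header is a single ASCII line that names the body charset and ends with '\n' (a made-up format for illustration), and the data lives in a file that can be opened twice. The class and method names are hypothetical. Because ASCII uses one byte per character, the header's byte length is its character count plus one for the terminator.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class TwoStreamsSketch {

    /** First stream: read the ASCII header line (without its terminator). */
    static String readHeaderLine(Path file) throws IOException {
        try (Reader r = new InputStreamReader(Files.newInputStream(file), StandardCharsets.US_ASCII)) {
            StringBuilder header = new StringBuilder();
            int c;
            // The reader may buffer past the header; that is harmless here
            // because this stream is thrown away afterwards.
            while ((c = r.read()) != -1 && c != '\n') {
                header.append((char) c);
            }
            return header.toString();
        }
    }

    /** Second stream: skip headerBytes bytes, then decode the rest with bodyCharset. */
    static String readBody(Path file, long headerBytes, Charset bodyCharset) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            long remaining = headerBytes;
            while (remaining > 0) {
                long skipped = in.skip(remaining); // may skip fewer bytes than asked
                if (skipped <= 0) {
                    if (in.read() == -1) {
                        throw new EOFException("file ended inside the header");
                    }
                    skipped = 1; // the single byte consumed by read()
                }
                remaining -= skipped;
            }
            Reader reader = new InputStreamReader(in, bodyCharset);
            StringBuilder body = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                body.append((char) c);
            }
            return body.toString();
        }
    }
}
```

Usage would be along the lines of `readBody(file, header.length() + 1, Charset.forName(header))` after calling `readHeaderLine(file)`.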

bruno conde 2010-04-13 17:02:55

Thanks, I think I'll have to do this.

Mike Q 2010-04-14 12:19:28

Answer 2

A:

My first thought is to close the stream and reopen it, using InputStream#skip to skip past the header before giving the stream to the new InputStreamReader.

If you really, really don't want to reopen the file, you could use file descriptors to get more than one stream to the file, although you may have to use channels to have multiple positions within the file (since you can't assume you can reset the position with reset, it may not be supported).
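As a rough illustration of the channel variant, the sketch below keeps one FileChannel open and repositions it after the header has been parsed, so the file is not reopened. It assumes the same made-up format as a simple stand-in: one ASCII header line naming the body charset, ending in '\n'. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChannelPositionSketch {

    /** Returns {header, body}, both read through a single FileChannel. */
    static String[] readHeaderAndBody(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // First pass: this reader may pull body bytes into its buffer,
            // which moves the channel position past the header.
            Reader headerReader =
                    new InputStreamReader(Channels.newInputStream(channel), StandardCharsets.US_ASCII);
            String header = readUntil(headerReader, '\n');

            // ASCII is one byte per char, so the body starts right after the '\n'.
            channel.position(header.length() + 1L);

            // Second pass: a fresh reader starts at the repositioned channel.
            Reader bodyReader =
                    new InputStreamReader(Channels.newInputStream(channel), Charset.forName(header));
            String body = readUntil(bodyReader, -1);
            return new String[] {header, body};
        }
    }

    /** Reads chars until stopChar or end of stream; stopChar is not returned. */
    private static String readUntil(Reader reader, int stopChar) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1 && c != stopChar) {
            sb.append((char) c);
        }
        return sb.toString();
    }
}
```

The two readers are used strictly one after the other; the first is simply abandoned rather than closed, since closing it would close the shared channel.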

T.J. Crowder 2010-04-13 17:03:20

If you create multiple `FileInputStream`s with the same `FileDescriptor`, then they will behave as if they are the same stream.

Tom Hawtin - tackline 2010-04-13 17:09:02

@Tom: Yeah, I was assuming he would use them in series, not in parallel, and that he would reset the position between using one and using the other. But you can't assume you can reset the position... (I don't think they'll behave like the *same stream*, I think it would be worse than that; they'd just share actual file position. Data caching within the individual instances could in theory make that really, really messy if you tried to use them in parallel.)

T.J. Crowder 2010-04-13 17:16:32

Answer 3

A:

I suggest rereading the stream from the start with a new InputStreamReader. This assumes InputStream.mark is supported, so that you can mark the start and reset to it after parsing the header.
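A possible shape for the mark/reset approach, as a sketch only. It assumes a made-up header format (one ASCII line naming the body charset, ending in '\n') and an arbitrary read limit that must cover the header plus whatever the first reader buffers ahead. Streams that do not support mark are wrapped in a BufferedInputStream. The class name and READ_LIMIT value are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MarkResetSketch {

    // Assumed upper bound on header size plus the first reader's read-ahead.
    private static final int READ_LIMIT = 64 * 1024;

    /** Returns {header, body} using mark/reset instead of reopening the source. */
    static String[] readHeaderAndBody(InputStream raw) throws IOException {
        InputStream in = raw.markSupported() ? raw : new BufferedInputStream(raw);
        in.mark(READ_LIMIT);

        // First pass: the reader may consume bytes beyond the header.
        Reader headerReader = new InputStreamReader(in, StandardCharsets.US_ASCII);
        String header = readUntil(headerReader, '\n');

        // Rewind, then step over the header bytes (its chars plus the '\n').
        in.reset();
        for (int i = 0; i <= header.length(); i++) {
            if (in.read() == -1) {
                throw new EOFException("stream ended inside the header");
            }
        }

        // Second pass: decode the body with the charset named in the header.
        Reader bodyReader = new InputStreamReader(in, Charset.forName(header));
        String body = readUntil(bodyReader, -1);
        return new String[] {header, body};
    }

    /** Reads chars until stopChar or end of stream; stopChar is not returned. */
    private static String readUntil(Reader reader, int stopChar) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1 && c != stopChar) {
            sb.append((char) c);
        }
        return sb.toString();
    }
}
```

If more than the read limit is consumed before reset, the mark may be invalidated and reset can throw an IOException, so the limit has to be chosen with the real header size in mind.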

Tom Hawtin - tackline 2010-04-13 17:06:02

Answer 4

+2 A:

Here is the pseudo code.

1. Use the InputStream directly; do not wrap a Reader around it.
2. Read the bytes containing the header and store them in a ByteArrayOutputStream.
3. Create a ByteArrayInputStream from that ByteArrayOutputStream and decode the header, this time wrapping the ByteArrayInputStream in a Reader with the ASCII charset.
4. Compute the length of the non-ASCII input and read that number of bytes into another ByteArrayOutputStream.
5. Create another ByteArrayInputStream from the second ByteArrayOutputStream and wrap it in a Reader with the charset from the header.
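The steps above could be sketched roughly as follows. Two simplifications are assumptions of this sketch, not part of the answer: the header is taken to end at the first '\n' byte and to contain just the body charset name, and the body is read to the end of the stream instead of to a computed length. The class name is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class BufferedHeaderSketch {

    /** Returns {header, body}; both parts are buffered in memory before decoding. */
    static String[] readHeaderAndBody(InputStream in) throws IOException {
        // Copy the raw header bytes, up to the assumed '\n' terminator.
        ByteArrayOutputStream headerBytes = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            headerBytes.write(b);
        }

        // Decode the header as ASCII through a Reader over the copied bytes.
        String header = readAll(new InputStreamReader(
                new ByteArrayInputStream(headerBytes.toByteArray()), StandardCharsets.US_ASCII));

        // Copy the body bytes (here: everything that is left in the stream).
        ByteArrayOutputStream bodyBytes = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            bodyBytes.write(chunk, 0, n);
        }

        // Decode the body with the charset named in the header.
        String body = readAll(new InputStreamReader(
                new ByteArrayInputStream(bodyBytes.toByteArray()), Charset.forName(header)));
        return new String[] {header, body};
    }

    private static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }
}
```

This version holds the whole body in memory, so it only fits cases where the content is small enough for that.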

Alexander Pogrebnyak 2010-04-13 17:06:31

Thanks for your suggestion. Unfortunately the header is not fixed length, either in binary or character terms, so I do need to parse it through a Charset decoder to figure out its structure and therefore its length. I also need to avoid reading the entire content into an internal buffer.

Mike Q 2010-04-13 21:24:17

Answer 5

A:

It's even easier:

As you said, your header is always in ASCII. So read the header directly from the InputStream, and when you're done with it, create the Reader with the correct encoding and read from it:

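import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

private Reader reader;
private InputStream stream;

public void read() throws IOException {
    int c;
    // Read the ASCII header byte by byte so nothing past it is consumed.
    while ((c = stream.read()) != -1) {
        // Read encoding: parse c here and update headerFullyRead and
        // encoding (placeholders for your own header-parsing state).
        if ( headerFullyRead ) {
            reader = new InputStreamReader( stream, encoding );
            break;
        }
    }
    if (reader == null) {
        return; // stream ended before the header was complete
    }
    while ((c = reader.read()) != -1) {
        // Handle rest of file
    }
}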

derBiggi 2010-06-29 08:43:27

Thanks. Eventually I went with another solution which was to write an InputStreamReaderUnbuffered which does exactly the same as InputStreamReader but has no internal buffer so you never read too much. See my edit.

Mike Q 2010-06-29 16:30:24
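For readers wondering what an unbuffered reader might look like, here is one hypothetical sketch. It is not the InputStreamReaderUnbuffered mentioned above, whose code is not shown in this thread. The idea is to feed a CharsetDecoder one byte at a time and stop pulling bytes as soon as a character comes out, so the underlying stream is never read further than the characters returned so far require. The class name and buffer sizes are assumptions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

final class UnbufferedReaderSketch extends Reader {
    private final InputStream in;
    private final CharsetDecoder decoder;
    // Bytes of a not yet complete character (kept in "write" mode between calls).
    private final ByteBuffer pending = ByteBuffer.allocate(16);
    // Chars produced by the last decode step (kept in "read" mode between calls).
    private final CharBuffer decoded = CharBuffer.allocate(16);
    private boolean eof;

    UnbufferedReaderSketch(InputStream in, Charset charset) {
        this.in = in;
        this.decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        decoded.flip(); // start with nothing to drain
    }

    @Override
    public int read() throws IOException {
        while (!decoded.hasRemaining()) {
            if (eof) {
                return -1;
            }
            decoded.clear();
            int b = in.read(); // exactly one byte per decode step
            if (b == -1) {
                eof = true;
                pending.flip();
                decoder.decode(pending, decoded, true);
                decoder.flush(decoded);
            } else {
                pending.put((byte) b).flip();
                decoder.decode(pending, decoded, false);
            }
            pending.compact(); // keep any bytes the decoder did not consume
            decoded.flip();
        }
        return decoded.get();
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int c = read();
        if (c == -1) {
            return -1;
        }
        cbuf[off] = (char) c; // at most one char per call, to avoid reading ahead
        return 1;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

With a reader like this, the header can be decoded with one charset and a second reader with the body charset can then be created on the same stream, since no body bytes have been consumed. Reading a byte at a time is slow, so wrapping the source in a BufferedInputStream underneath is a reasonable companion.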
