views: 325

answers: 5

Does it make sense to always wrap an InputStream in a BufferedInputStream when I don't know whether the given InputStream is already buffered? For example:

InputStream is = API.getFromSomewhere();
if (!(is instanceof BufferedInputStream))
  return new BufferedInputStream(is);
return is;
+2  A: 

I would not do that; I would leave it at the highest abstraction level possible. If you are not going to use the mark and reset capabilities of a BufferedInputStream, why bother wrapping it?

If a consumer needs it, it is better to wrap it there.
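
For example, a consumer that actually needs mark/reset could do the wrapping itself. A minimal sketch (the Consumer class and consume method are hypothetical, just to illustrate the idea):

import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

public class Consumer {
    // Hypothetical consumer of an API that hands back a plain InputStream.
    static void consume(InputStream raw) throws IOException {
        // Wrap only here, at the point where mark/reset is actually needed.
        InputStream in = raw.markSupported() ? raw : new BufferedInputStream(raw);
        in.mark(16);               // remember the current position
        int firstByte = in.read(); // peek at the first byte
        in.reset();                // rewind so later reads see the whole stream
        // ... decide how to parse based on firstByte, then keep reading from 'in'
    }
}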

Peter Tillemans
This seems to imply that mark and reset are the only useful things `BufferedInputStream` adds over a plain `InputStream`. This might be true in an API sense, but as others have said, `BufferedInputStream` takes care of buffering reads for you. Reading byte-at-a-time from a bare `FileInputStream` is 40x slower than reading from one wrapped in a `BufferedInputStream`. That said, return the `InputStream` and keep your method signature as such. Users can wrap if they wish.
jasonmp85
I agree that from a performance standpoint it is better to wrap it in 99.9% of the cases. It does, however, relieve the consumer of the responsibility to think about how to use the InputStream. This kind of assumption about the consumer limits reusability.
Peter Tillemans
@Peter: I think that in rather more than 0.1% of the cases the consumer will not read one byte at a time and instead will itself use some sort of buffer, in which case the BufferedInputStream is useless overhead.
Michael Borgwardt
@Michael, I think Peter's point is that it will be faster than a byte-by-byte read 99% of the time, not that 99% of the time it would be used as a byte-by-byte read.
Yishai
A: 

It also depends on how you are going to read from the InputStream. If you are going to read it a character/byte at a time (i.e. read()), then the BufferedInputStream will reduce your overhead by quietly doing bulk reads on your behalf. If you are going to read it into a 4k or 8k byte/char array a block at a time, then the BufferedInputStream probably won't benefit you.
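
A minimal sketch of the two read styles (the file name data.bin is just a placeholder):

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadStyles {
    public static void main(String[] args) throws IOException {
        // Byte-at-a-time reads: buffering helps, because without it every
        // read() goes straight through to the underlying file or socket.
        try (InputStream in = new BufferedInputStream(new FileInputStream("data.bin"))) {
            int b;
            while ((b = in.read()) != -1) {
                // process one byte
            }
        }

        // Block reads into your own array: an extra BufferedInputStream
        // layer mostly just adds a redundant copy.
        try (InputStream in = new FileInputStream("data.bin")) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                // process buf[0..n)
            }
        }
    }
}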

jowierun
A: 

Of course it depends on how the Stream is used. Sometimes buffering is not desired (e.g. if you want to read single bytes immediately after they arrive - using a buffered Stream would cause the read to block until the buffer is full, which in turn may cause a deadlock if some other application waits for a response). So as in the previous answer, buffering should only be used if you really know that all possible consumers can deal with it.

PapaNappa
'A buffered Stream would cause the read to block until the buffer is full'. No, it won't. It will block until some data is available. 'Until the buffer is full' is obviously incorrect - consider the behaviour at end of stream, where most probably only a partial buffer will be available before the EOS.
EJP
+2  A: 

Does it make sense to always wrap an InputStream in a BufferedInputStream when I don't know whether the given InputStream is already buffered?

No.

It makes sense if you are likely to perform lots of small reads (one byte or a few bytes at a time), or if you want to use some of the higher-level functionality offered by the buffered stream classes; for example, BufferedReader's readLine method.

However, if you are only going to perform large block reads using the read(byte[]) and / or read(byte[], int, int) methods, wrapping the InputStream in a BufferedInputStream does not help. (And in response to @Peter Tillemans's comment, the block read use cases definitely represent more than 0.1% of uses of InputStream classes!!)
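
For the readLine case, the usual route is to go through a Reader rather than BufferedInputStream itself. A minimal sketch (the printLines helper is just for illustration):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class LineReading {
    // Reads text line by line from any InputStream.
    static void printLines(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line);
        }
    }
}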

Stephen C
+1  A: 

You may not always need buffering, so in that sense the answer is no: in some cases it is just overhead.

There is another reason it is "No", and it can be more serious. BufferedInputStream (or BufferedReader) can cause unpredictable failures when used with a network socket on which you have also enabled a timeout. The timeout can occur while reading a packet. You would then no longer be able to access the data that was transferred up to that point, even if you knew that some non-zero number of bytes had arrived (java.net.SocketTimeoutException is a subclass of java.io.InterruptedIOException, so it has the bytesTransferred field available).

If you are wondering how a socket timeout could occur while reading, think of calling the read(byte[]) method when the packet containing the message has been split and one of the partial packets is delayed beyond the timeout (or the remaining portion of the timeout). This can happen more frequently when the stream is wrapped again in something that implements java.io.DataInput (any of the multi-byte reads, such as readLong() or readFully(), or the BufferedReader.readLine() method).

Note that java.io.DataInputStream is also a poor candidate for socket streams that have a timeout, since it doesn't behave well with timeout exceptions either.
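
A sketch of the kind of setup being described, assuming a socket read timeout (the host and port are placeholders, and whether BufferedInputStream itself actually loses data here is disputed in the comments below):

import java.io.DataInputStream;
import java.io.IOException;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class TimeoutSketch {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket("example.com", 7777)) {
            socket.setSoTimeout(2000); // read timeout in milliseconds
            DataInputStream data = new DataInputStream(socket.getInputStream());
            byte[] message = new byte[1024];
            try {
                data.readFully(message); // may time out part-way through the message
            } catch (SocketTimeoutException e) {
                // bytesTransferred reports how many bytes had been read when the
                // timeout hit; with buffering wrappers in between, the concern is
                // that partially read data may be stranded in their buffers.
                System.err.println("Timed out after " + e.bytesTransferred + " bytes");
            }
        }
    }
}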

Kevin Brock
As regards BufferedInputStream and BufferedReader this is an urban myth. If you get a read timeout, (i) you are reading, ergo the internal buffer was empty, otherwise you wouldn't be reading; (ii) no data arrived within the timeout period. Ergo no data is lost. Try it.
EJP
@EJP: You got me really thinking about this, yet I still think this can be a problem. When the buffered stream really needs to do I/O (fill the buffer), that is the point at which you can get a timeout exception, and the internal variables that track how many bytes are in the buffer would not be updated. I have tried to test this but, though I can replicate timeout exceptions, I cannot yet seem to replicate a situation where `bytesTransferred` is non-zero. Until then, I cannot prove this one way or the other. [I have had lost data with DataInputStream and a timeout.]
Kevin Brock
@EJP: Perhaps, then, reading a byte array from the socket never results in a partially read buffer due to a timeout, and so `bytesTransferred` is never non-zero (otherwise BufferedInputStream would fail). This may also be a case where the JVM implementation on different platforms/vendors produces different results - I've just been testing on Windows 7 with Sun/Oracle Java.
Kevin Brock
But BufferedInputStream doesn't 'fill the buffer'. See the Javadoc. It reads whatever there is to be read, like any other read does, and returns that length. Specifically, it never blocks twice. If any data arrives within the timeout period, there is no timeout. Conversely, if there is a timeout, no data arrived. So there is nothing to be lost. As concerns DataInputStream the problem is real. As concerns BufferedInputStream, no.
EJP