How do buffered streams work behind the scenes, how do they actually differ from unbuffered streams, and what is the real advantage of using them?

Another question: DataInputStream is also byte-based, yet it has a readLine() method. What is the point of that?

+1  A: 

From the BufferedInputStream javadoc:

A BufferedInputStream adds functionality to another input stream-namely, the ability to buffer the input and to support the mark and reset methods. When the BufferedInputStream is created, an internal buffer array is created. As bytes from the stream are read or skipped, the internal buffer is refilled as necessary from the contained input stream, many bytes at a time. The mark operation remembers a point in the input stream and the reset operation causes all the bytes read since the most recent mark operation to be reread before new bytes are taken from the contained input stream.

Internally a buffer array is used: instead of reading bytes individually from the underlying input stream, enough bytes are read to fill the buffer. This generally results in faster performance, as fewer reads are required on the underlying input stream.

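BufferedOutputStream works the other way around: written bytes accumulate in its internal buffer and are passed to the underlying output stream in one larger write when the buffer fills up or flush() is called.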
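The effect of the internal buffer on reads can be made visible with a small sketch. It assumes an in-memory source and a hypothetical CountingInputStream wrapper (not part of the JDK) that counts how many read calls reach the underlying stream. IOException is wrapped as unchecked only to keep the helpers short to call.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ReadCountDemo {

    // Hypothetical helper: counts every read call that reaches the wrapped stream.
    static class CountingInputStream extends FilterInputStream {
        int calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return in.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return in.read(b, off, len);
        }
    }

    // Reads one byte at a time until end of stream.
    private static void drain(InputStream in) throws IOException {
        while (in.read() != -1) {
            // discard the byte; only the call count matters here
        }
    }

    // Number of calls on the underlying stream when reading it directly.
    static int unbufferedCalls(int size) {
        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        try {
            drain(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }

    // Number of calls on the underlying stream when reading through a buffer.
    static int bufferedCalls(int size, int bufferSize) {
        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        try {
            drain(new BufferedInputStream(source, bufferSize));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered: " + unbufferedCalls(10_000));
        System.out.println("buffered:   " + bufferedCalls(10_000, 1024));
    }
}
```

For 10,000 bytes the unbuffered loop makes one call per byte plus the final end-of-stream call, while the buffered loop only touches the underlying stream when its 1024-byte buffer needs refilling. With a file or socket as the source, each of those calls may be a system call, which is where the saving comes from.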

mark() and reset() could be used as follows:

1 BufferedInputStream bis = new BufferedInputStream(is);
2 byte[] b = new byte[4];
3 bis.read(b); // read 4 bytes into b
4 bis.mark(10); // mark the stream at the current position - we can read 10 bytes before the mark point becomes invalid
5 bis.read(b); // read another 4 bytes into b
6 bis.reset(); // resets the position in the stream back to when mark was called
7 bis.read(b); // re-read the same 4 bytes as line 5 into b

To explain mark/reset some more...

The BufferedInputStream internally remembers the current position in the buffer. As you read bytes, the position is incremented. A call to mark(10) saves the current position. Subsequent calls to read continue to increment the current position, but a call to reset sets the current position back to the value it had when mark was called.

The argument to mark specifies how many bytes you can read after calling mark while the mark is still guaranteed to be valid. Once the mark has been invalidated, you can no longer call reset to return to it.

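For example, if mark(2) had been used in line 4 above, reset() on line 6 could throw an IOException, because more than 2 bytes were read after the mark and it is no longer guaranteed to be valid. In practice BufferedInputStream only discards the mark when it has to refill its buffer, so reset() may still succeed in that case, but you should not rely on it.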
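Here is the same sequence as a self-contained sketch you can run. It assumes an in-memory ByteArrayInputStream as the source, and the class and method names are made up for illustration. IOException is wrapped as unchecked only to keep the helper short to call.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class MarkResetDemo {

    // Returns the three 4-byte chunks seen by the read calls before the mark,
    // after the mark, and after the reset.
    static String[] readMarkReset(String data) {
        byte[] source = data.getBytes(StandardCharsets.US_ASCII);
        byte[] b = new byte[4];
        String[] seen = new String[3];
        try (BufferedInputStream bis = new BufferedInputStream(new ByteArrayInputStream(source))) {
            // read(b) may return fewer bytes than b.length in general,
            // so the returned count is used when building each string.
            int n = bis.read(b);
            seen[0] = new String(b, 0, n, StandardCharsets.US_ASCII);

            bis.mark(10); // the mark stays valid for at least the next 10 bytes

            n = bis.read(b);
            seen[1] = new String(b, 0, n, StandardCharsets.US_ASCII);

            bis.reset(); // back to the position saved by mark

            n = bis.read(b);
            seen[2] = new String(b, 0, n, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        // prints: ABCD EFGH EFGH
        System.out.println(String.join(" ", readMarkReset("ABCDEFGHIJKL")));
    }
}
```

The second and third chunks are identical because reset() moved the position back to where mark(10) was called.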

Mark
Mark, could you give an example of mark and reset? Thanks.
i2ijeya
I still don't get mark and reset. What does bis.mark(10) do?
i2ijeya
I explained mark/reset some more. You could also look at the source to BufferedInputStream to see for yourself what happens when these methods are called.
Mark
Thanks Mark, now I really understand what it means. Thanks.
i2ijeya
+2  A: 

Buffered streams write or read data in larger chunks by – nomen est omen – buffering. Depending on the underlying stream, this can increase performance dramatically.

From java.io.BufferedOutputStream's Javadocs:

By setting up such an output stream, an application can write bytes to the underlying output stream without necessarily causing a call to the underlying system for each byte written.
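A rough sketch of what that means in practice: the bytes stay in the buffer until it fills up or flush() is called, and then reach the underlying stream in a single write. CountingOutputStream is a hypothetical helper written for this example, not a JDK class, and IOException is wrapped as unchecked only to keep the helper short to call.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class WriteCountDemo {

    // Hypothetical helper: counts write calls and collects the bytes in memory.
    static class CountingOutputStream extends OutputStream {
        int calls = 0;
        final ByteArrayOutputStream sink = new ByteArrayOutputStream();

        @Override
        public void write(int b) {
            calls++;
            sink.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
            sink.write(b, off, len);
        }
    }

    // Writes `count` single bytes through a buffer of `bufferSize` bytes.
    // Returns { underlying write calls, bytes delivered before flush, bytes delivered after flush }.
    static int[] writeBuffered(int count, int bufferSize) {
        CountingOutputStream target = new CountingOutputStream();
        BufferedOutputStream out = new BufferedOutputStream(target, bufferSize);
        try {
            for (int i = 0; i < count; i++) {
                out.write(i); // lands in the buffer, not in the target
            }
            int deliveredBeforeFlush = target.sink.size();
            out.flush(); // pushes the buffered bytes to the target
            return new int[] { target.calls, deliveredBeforeFlush, target.sink.size() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = writeBuffered(100, 1024);
        System.out.println("underlying write calls: " + r[0]);
        System.out.println("bytes delivered before flush: " + r[1]);
        System.out.println("bytes delivered after flush: " + r[2]);
    }
}
```

With 100 bytes and a 1024-byte buffer, nothing reaches the target until flush(), and then all 100 bytes arrive in one call. Without the buffer the same loop would make 100 separate calls.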

Henning
+1  A: 

With unbuffered I/O, each read or write request is passed directly to the operating system. Java's buffered I/O streams read and write data to their own memory buffer (usually a byte array). Calls to the operating system are only made when the buffer is empty (when doing reads) or the buffer is full (when doing writes). Because written data can sit in the buffer for a while, it is sometimes a good idea to flush the buffer manually at critical points in your application.

Since operating system API calls may result in disk access, network activity and the like, they can be quite expensive. Using buffers to batch the native operating system I/O into larger chunks often significantly improves performance.

Tendayi Mawushe
Thanks. Do you have any idea about the DataInputStream question I added at the end?
i2ijeya
There is very little difference with regard to buffering. Input stream types like DataInputStream which allow reading line by line are still affected by the same buffering concerns. What line-oriented input streams give you is detection of the line terminators (\n, \r, or \r\n) and splitting of the input into lines, so you don't have to do this yourself.
Tendayi Mawushe
+3  A: 

Buffered Readers/Writers/InputStreams/OutputStreams read from and write to the OS in large chunks for optimization. In the case of writers and output streams, the data is buffered in memory until enough has been collected to write out a big chunk. In the case of readers and input streams, a large chunk is read from disk/network/... into the buffer, and all reads are served from that buffer until it is empty and a new chunk is read in.

DataInputStream is indeed byte-based. Its readLine method is deprecated because it does not convert bytes to characters properly; BufferedReader.readLine() is the recommended replacement. Internally, readLine reads from disk/network/... byte by byte until it has collected a complete line. So this stream can be sped up by using a BufferedInputStream as its source, so that the bytes for the line are read from the in-memory buffer instead of directly from disk.
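As a sketch of both points: the first helper wraps the raw stream in a BufferedInputStream before handing it to DataInputStream, and the second reads lines through a BufferedReader, which is what the DataInputStream Javadoc suggests instead of the deprecated readLine(). The class and method names, and the choice of UTF-8, are assumptions made for this example. IOException is wrapped as unchecked only to keep the helpers short to call.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LineReadDemo {

    // Binary data: DataInputStream pulls its bytes from the in-memory buffer
    // of the BufferedInputStream instead of hitting the raw stream per byte.
    static int[] readTwoInts(InputStream raw) {
        try (DataInputStream in = new DataInputStream(new BufferedInputStream(raw))) {
            return new int[] { in.readInt(), in.readInt() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Text data: BufferedReader buffers the decoded characters and splits
    // them into lines on \n, \r, or \r\n.
    static List<String> readLines(InputStream raw) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(raw, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] text = "first\nsecond\r\nthird".getBytes(StandardCharsets.UTF_8);
        // prints: [first, second, third]
        System.out.println(readLines(new ByteArrayInputStream(text)));

        byte[] binary = { 0, 0, 0, 1, 0, 0, 0, 2 };
        int[] ints = readTwoInts(new ByteArrayInputStream(binary));
        // prints: 1 2
        System.out.println(ints[0] + " " + ints[1]);
    }
}
```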

AJK
Thanks AJK. Good explanation.
i2ijeya