I am writing an OutputStream and just noticed this in the OutputStream class:

   public abstract void write(int b) throws IOException;

This call writes one byte to the stream, so why does it take an int as its argument?

+3  A: 

According to the Javadoc for OutputStream, the 24 high-order bits of the argument are ignored by this method. I think the method exists for compatibility reasons: you don't need to convert to byte first and can simply pass an integer.
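A minimal sketch of the "24 high-order bits are ignored" behaviour, using `ByteArrayOutputStream` as a convenient in-memory stream. The class name `HighOrderBitsDemo` and the helper `storedByte` are made up for illustration only.

```java
import java.io.ByteArrayOutputStream;

class HighOrderBitsDemo {
    // Writes one int to an in-memory stream and returns the single byte
    // that was actually stored, as an unsigned value in the range 0..255.
    static int storedByte(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(value);                    // only the low 8 bits are kept
        return out.toByteArray()[0] & 0xFF;  // mask to undo sign extension
    }

    public static void main(String[] args) {
        System.out.println(storedByte(0x41));        // 65
        System.out.println(storedByte(0x12345641));  // 65, high bits dropped
        System.out.println(storedByte(-1));          // 255
    }
}
```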

regards

Atmocreations
compatibility with what, though?
skaffman
well, compatibility might be the wrong word for it... let's rather call it simplicity or programmer-friendly ;o)
Atmocreations
The discontinuity between `write(int)` and `write(byte[])` is quite striking, though, especially when you see that the default implementation of `write(byte[])` just calls `write(int)` in a loop.
skaffman
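The point about the default `write(byte[])` can be checked with a small sketch. `CountingStream` is a hypothetical subclass that implements only the abstract `write(int)` and counts how often it is called; everything else is inherited from `OutputStream`.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class CountingStreamDemo {
    // Hypothetical stream: implements only write(int) and counts the calls.
    static class CountingStream extends OutputStream {
        int singleByteCalls = 0;

        @Override
        public void write(int b) {
            singleByteCalls++;
        }
    }

    // Returns how many write(int) calls one inherited write(byte[]) triggers.
    static int callsFor(byte[] data) {
        CountingStream out = new CountingStream();
        try {
            out.write(data);  // inherited default implementation
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.singleByteCalls;
    }

    public static void main(String[] args) {
        System.out.println(callsFor(new byte[] {1, 2, 3}));  // 3
    }
}
```

Real subclasses usually override `write(byte[], int, int)` as well, precisely to avoid this byte-at-a-time loop.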
Also, there is no literal suffix for byte: 5 is an int, 5L is a long, but there is no 5b or anything similar.
aperkins
Yup, that's another thing. But as long as the code does exactly what the documentation says, everything's fine, right?
Atmocreations
I think Atmocreations is suggesting that `write(int)` is intended to save users from having to do explicit casting. Personally I think I'd rather have the contract be `write(byte)` and not have to read the docs to know that 3/4 of the bits are ignored. *shrug*
Grant Wagner
@Grant: fully agree. no one said it was good the way it is. :o( regards
Atmocreations
Whatever the original reason for the design, the point is moot. The `write(int)` method is so widely used that changing it would cause massive disruption. The API ain't broken enough to justify this.
Stephen C
+9  A: 

So you can signal EOF:

"Notice that read() returns an int value. If the input is a stream of bytes, why doesn't read() return a byte value? Using a int as a return type allows read() to use -1 to indicate that it has reached the end of the stream."

http://java.sun.com/docs/books/tutorial/essential/io/bytestreams.html
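A short sketch of what the tutorial quote describes, assuming an in-memory `ByteArrayInputStream` as the source. The data deliberately contains the byte 0xFF, which `read()` reports as 255, so it cannot be confused with the -1 end-of-stream marker. `ReadEofDemo` and `readAll` are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;

class ReadEofDemo {
    // Reads every byte until read() returns -1 and collects the int values.
    static List<Integer> readAll(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        List<Integer> values = new ArrayList<>();
        int b;
        while ((b = in.read()) != -1) {  // -1 means end of stream, not data
            values.add(b);               // data bytes arrive as 0..255
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(readAll(new byte[] {0x01, (byte) 0xFF}));  // [1, 255]
    }
}
```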

mhm, good point
Atmocreations
So if you write(-1), what happens? Does it close the stream? :-)
Ken
Nope. You end up writing strange Unicode characters to the output stream. But if there were a receiving stream for your write(-1), I do believe it would signal end of stream to that stream. There are three possibilities for streaming. The first two transfer a specified number of bytes: write(byte[] b, int off, int len) and write(byte[] b). The third option lets you transfer an undetermined number of bytes by using int. That's why int is used.
In other words, `write(int)` is there for symmetry with `read()`. That's not so bad, when you think about it. Good answer.
skaffman
I don't think so. For one, OutputStream has a close() method, and any OutputStream you use together with an InputStream will likely be written by you.
wds
I have to agree with a PhD (wds) ;-) My understanding of streams is starting to shatter. But I *think* close() closes a stream, whereas -1 simply says "this is the end of the stream. I should be closed." Closing is implicit with -1; the stream *should* close. But you should always be explicit with close() in a finally.
I'm Googleless (I tried) on this. It would be nice to prove/disprove my assumptions with a link.
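For anyone who wants to test the assumptions in these comments, here is a small experiment with the in-memory `ByteArrayOutputStream` and `ByteArrayInputStream`. It only shows how this particular pair behaves, but the general `write(int)` contract (low 8 bits written, the rest ignored) suggests other streams should match. `WriteMinusOneDemo` and `roundTrip` are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

class WriteMinusOneDemo {
    // Writes -1 and then 7, reads the bytes back, and reads once more
    // past the end to see the real end-of-stream marker.
    static int[] roundTrip() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(-1);  // stored as the single byte 0xFF
        out.write(7);   // the stream is still open and accepts more data

        ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
        return new int[] {in.read(), in.read(), in.read()};
    }

    public static void main(String[] args) {
        int[] r = roundTrip();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);  // 255 7 -1
    }
}
```

In this experiment `write(-1)` neither closes the stream nor signals end of stream to the reader: the reader sees 255, and -1 only appears once the data has run out.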
+1  A: 

The Java IOStream classes have been a part of Java since 1.0. These classes only deal with 8-bit data. My guess is that the interface was designed like this so that the one write(int b) method could be called with int, short, byte, and char values, all of which are promoted to int. In fact, since most JVMs run on 32-bit machines, the int primitive is the most efficient type to deal with. The JVM is free to store types such as byte using 32 bits anyway. Interestingly, byte[] really is stored as a sequence of 8-bit bytes. This makes sense, since an array could be quite large. For single primitive values such as int or byte, however, the space ultimately occupied at runtime doesn't really matter as long as the behavior is consistent with the spec.
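A minimal sketch of the promotion described above: byte, short, char, and int arguments all reach the same `write(int)` method without a cast. The class name `PromotionDemo` and the sample values are chosen purely for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class PromotionDemo {
    // Passes four different primitive types to write(int) with no casts.
    static byte[] writeMixed() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte b = 0x41;     // 'A'
        short s = 0x0142;  // low byte 0x42 is 'B', the high byte is dropped
        char c = 'C';
        int i = 0x44;      // 'D'
        out.write(b);      // byte  -> int (widening)
        out.write(s);      // short -> int (widening)
        out.write(c);      // char  -> int (widening)
        out.write(i);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(new String(writeMixed(), StandardCharsets.US_ASCII));  // ABCD
    }
}
```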

More background:

http://www.java-samples.com/showtutorial.php?tutorialid=260

The assumption for the IOStream classes is that the caller only really cares about the lowest 8 bits of data, even when passing in an int. This is fine as long as the caller knows it is really dealing with bytes, but it becomes a problem when the underlying data is really text in some other character encoding, such as multi-byte Unicode. This is why the Reader and Writer classes were introduced way back in Java 1.1. If you care about text data and performance, the IOStream classes are faster, but the Reader classes are more portable.
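To make the text problem concrete, here is a sketch that sends the euro sign (U+20AC) through a raw stream and through an `OutputStreamWriter`. The choice of UTF-8 and the names `TextVsBytesDemo`, `viaStream`, and `viaWriter` are assumptions for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class TextVsBytesDemo {
    // Raw stream: the char is promoted to int and only the low 8 bits survive.
    static byte[] viaStream(char c) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(c);
        return out.toByteArray();
    }

    // Writer: the char is encoded with an explicit charset (UTF-8 here).
    static byte[] viaWriter(char c) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write(c);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(viaStream('\u20AC').length);  // 1 (just 0xAC)
        System.out.println(viaWriter('\u20AC').length);  // 3 (0xE2 0x82 0xAC)
    }
}
```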

Gary
+4  A: 

Actually I've been working with bytes a bit lately and they can be annoying. They up-convert to ints at the slightest provocation and there is no suffix to turn a literal into a byte--for instance, 8L will give you a long value 8, but for byte you have to say (byte)8

On top of that, they will (pretty much) always be stored internally as ints unless you are using an array (and maybe even then.. not sure).

I think they just pretty much assume that the only reason to use a byte is i/o where you actually need 8 bits, but internally they expect you to always use ints.

By the way, a byte can perform worse since it always has to be masked...

At least I remember reading that years ago, could have changed by now.

As an example answer for your specific question, if a function (f) took a byte, and you had two bytes (b1 and b2), then:

    f(b1 & b2)

wouldn't work, because b1 & b2 would be up-converted to an int, and the int couldn't be down-converted automatically (loss of precision). So you would have to code:

    f( (byte)(b1 & b2) )

Which would get irritating.

And don't bother asking WHY b1 & b2 up-converts--I've been cussing at that a bit lately myself!
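A compilable sketch of the situation above. The function `f` is the hypothetical one from this answer (given a return value here only so the result can be inspected), and `BytePromotionDemo` is an illustrative name. Compound assignment is included because it carries an implicit narrowing cast, which sometimes saves the explicit one.

```java
class BytePromotionDemo {
    // Hypothetical function from the answer: it accepts only a byte.
    static byte f(byte b) {
        return b;
    }

    static byte andBytes(byte b1, byte b2) {
        // f(b1 & b2) would not compile: the operands are promoted,
        // so the expression b1 & b2 has type int.
        return f((byte) (b1 & b2));
    }

    static byte andInPlace(byte b1, byte b2) {
        b1 &= b2;  // compound assignment narrows back to byte implicitly
        return b1;
    }

    public static void main(String[] args) {
        System.out.println(andBytes((byte) 0x0F, (byte) 0x3C));    // 12
        System.out.println(andInPlace((byte) 0x0F, (byte) 0x3C));  // 12
    }
}
```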

Bill K
<begin rant> amen - byte manipulation in Java is full of such potholes (generally the type that are almost impossible to catch at design time). and why in the world can't the compiler figure out that new byte[]{0x01, 0x02} is an array of bytes? Why do I have to write new byte[]{(byte)0x01, (byte)0x02}? <end rant>
Kevin Day
You only have to cast values that are larger than 0x7F, because bytes are signed. There's nothing irritating about it. It's much better than having a bunch of unsigned/signed char types. Use an IDE; it will check for type safety and do the cast for you. A mask operation is one instruction, so it does not affect performance.
tulskiy
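The cast rule from the last comment can be sketched as follows. Constant values that fit in the byte range (-128 to 127) are accepted in an array initializer as-is; only larger constants such as 0x80 or 0xFF need the cast. `ByteLiteralDemo` is an illustrative name.

```java
class ByteLiteralDemo {
    // Compiles without casts: every constant fits in the range of byte.
    static byte[] small() {
        return new byte[] {0x01, 0x02, 0x7F};
    }

    // Constants above 0x7F do not fit in a signed byte and need a cast.
    static byte[] large() {
        return new byte[] {(byte) 0x80, (byte) 0xFF};
    }

    public static void main(String[] args) {
        System.out.println(small()[2]);  // 127
        System.out.println(large()[0]);  // -128
        System.out.println(large()[1]);  // -1
    }
}
```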
A: 

Maybe it's because bytes are signed in Java, and files store bytes as unsigned values. That is why read() returns an int: to give 255 instead of -1 for 0xFF. Same with write(int): you cannot store 0xFF as 255 in a byte.
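A tiny sketch of the signed/unsigned mismatch, assuming Java 8 or later for `Byte.toUnsignedInt`. `UnsignedByteDemo` and `toUnsigned` are illustrative names; the mask is the classic way to recover the 0..255 value.

```java
class UnsignedByteDemo {
    // Converts a signed byte (-128..127) to its unsigned value (0..255).
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFF;
        System.out.println(b);                       // -1
        System.out.println(toUnsigned(b));           // 255
        System.out.println(Byte.toUnsignedInt(b));   // 255
    }
}
```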

tulskiy