I'm reading a binary file like this:

InputStream in = new FileInputStream( file );
byte[] buffer = new byte[1024];
while( in.read( buffer ) > -1 ) {

   int a = // ??? 
}

What I want to do is read up to 4 bytes and create an int value from them, but I don't know how to do it.

I kind of feel like I have to grab 4 bytes at a time and perform some "byte" operations (like >>, <<, & 0xFF and stuff like that) to create the new int.

What's the idiom for this?

EDIT

Oops, this turns out to be a bit more complex (to explain).

What I'm trying to do is, read a file ( may be ascii, binary, it doesn't matter ) and extract the integers it may have.

For instance suppose the binary content ( in base 2 ) :

00000000 00000000 00000000 00000001
00000000 00000000 00000000 00000010

The integer representation should be 1 , 2 right? :- / 1 for the first 32 bits, and 2 for the remaining 32 bits.

11111111 11111111 11111111 11111111

Would be -1

and

01111111 11111111 11111111 11111111

Would be Integer.MAX_VALUE ( 2147483647 )

+8  A: 

You should put it into a function like this:

public static int toInt(byte[] bytes, int offset) {
  int ret = 0;
  for (int i=0; i<4 && i+offset<bytes.length; i++) {
    ret <<= 8;
    ret |= bytes[i + offset] & 0xFF;  // index from offset; mask to avoid sign extension
  }
  return ret;
}

Example:

byte[] bytes = new byte[]{-2, -4, -8, -16};
System.out.println(Integer.toBinaryString(toInt(bytes, 0)));

Output:

11111110111111001111100011110000

This takes care of running out of bytes and correctly handling negative byte values.
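To see why the `& 0xFF` mask matters, here is a small sketch comparing the loop with and without it. The class name `SignExtensionDemo` and both method names are made up for illustration; only the masking idea comes from the function above.

```java
public class SignExtensionDemo {
    // Combines 4 bytes big-endian, masking each byte to its unsigned value.
    static int withMask(byte[] b) {
        int ret = 0;
        for (int i = 0; i < 4; i++) {
            ret = (ret << 8) | (b[i] & 0xFF);
        }
        return ret;
    }

    // Same loop without the mask: a negative byte is sign-extended to
    // 32 bits, and its leading 1 bits overwrite the bytes already shifted in.
    static int withoutMask(byte[] b) {
        int ret = 0;
        for (int i = 0; i < 4; i++) {
            ret = (ret << 8) | b[i];
        }
        return ret;
    }

    public static void main(String[] args) {
        byte[] bytes = {0, 0, 1, -1}; // 0x00 0x00 0x01 0xFF
        System.out.println(withMask(bytes));    // 511
        System.out.println(withoutMask(bytes)); // -1
    }
}
```

Without the mask, any byte of 0x80 or above wipes out everything that was shifted in before it.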

I'm unaware of a standard function for doing this.

Issues to consider:

  1. Endianness: different CPU architectures put the bytes that make up an int in different orders. Depending on how you come up with the byte array to begin with, you may have to worry about this;

  2. Buffering: if you grab 1024 bytes at a time and start a sequence at element 1022, you will hit the end of the buffer before you get 4 bytes. It's probably better to use some form of buffered input stream that does the buffering automatically, so you can just use readByte() repeatedly and not worry about it otherwise; and

  3. Trailing Buffer: the end of the input may be an uneven number of bytes (specifically, not a multiple of 4) depending on the source. But if you create the input to begin with and being a multiple of 4 is "guaranteed" (or at least a precondition), you may not need to concern yourself with it.

To elaborate further on the point about buffering, consider BufferedInputStream:

InputStream in = new BufferedInputStream(new FileInputStream(file), 1024);

Now you have an InputStream that automatically buffers 1024 bytes at a time, which is a lot less awkward to deal with. This way you can happily read 4 bytes at a time and not worry about too much I/O.

Secondly you can also use DataInputStream:

// declare as DataInputStream: readByte() and readInt() are not on InputStream
DataInputStream in = new DataInputStream(new BufferedInputStream(
                         new FileInputStream(file), 1024));
byte b = in.readByte();

or even:

int i = in.readInt();

and not worry about constructing ints at all.
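As a rough sketch of how that could look for a whole input, the following reads ints until the stream runs out. The class and method names (`ReadAllInts`, `readAllInts`) are invented for this example, and a `ByteArrayInputStream` stands in for the `FileInputStream` so the snippet runs without a file. It assumes the data is big-endian, which is what `readInt()` expects.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class ReadAllInts {
    // Reads big-endian ints until the stream ends. readInt() throws
    // EOFException when fewer than 4 bytes remain, so a trailing
    // partial int is silently dropped here.
    static List<Integer> readAllInts(InputStream source) throws IOException {
        List<Integer> values = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new BufferedInputStream(source, 1024))) {
            while (true) {
                values.add(in.readInt());
            }
        } catch (EOFException endOfInput) {
            // Normal termination: no more complete ints.
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Three complete ints followed by one stray byte.
        byte[] data = {0, 0, 0, 1,  0, 0, 0, 2,  -1, -1, -1, -1,  127};
        System.out.println(readAllInts(new ByteArrayInputStream(data))); // [1, 2, -1]
    }
}
```

The buffered stream handles both the "Buffering" and "Trailing Buffer" concerns: no int is ever split across a buffer boundary from the caller's point of view, and leftover bytes simply end the loop.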

cletus
+1, Watch out for endianness!
Carl Norum
I just have to consider the fact my array might not read exact `% 4` bytes right?
OscarRyz
If the array's length is not %4, then you can pad the remaining bytes with 0. (Since x | 0 := x and 0 << n := 0).
Pindatjuh
Isn't DataInputStream or using RandomAccessFile easier? This way you can just do in.readInt().
Taylor Leese
@Taylor L, yep, but I think you'd read many more times
OscarRyz
@Oscar - Why would you read "more times"?
Taylor Leese
Because there are up to 256 integers in 1024 bytes, and reading one at a time would hit the disk 256x more times, wouldn't it?
OscarRyz
@Oscar: I think the point is that some of the Java IO classes will do this buffering automatically for you.
cletus
Like BufferedInputStream? :)
OscarRyz
@Oscar - That depends on how you set up your stream. There's no reason you couldn't read the entire file into a BufferedInputStream and then wrap that with a DataInputStream and call readInt() in a loop. This would prevent what you are talking about.
Taylor Leese
Chris Dodd
@Chris: you're quite right about the negative values. Fixed.
cletus
It's better to use the standard library to handle byte -> int conversions than to hand code them. Java even provides a library for doing this with different endianness, see java.nio.ByteBuffer.
Kevin Brock
+2  A: 

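Try something like this:

// mask each byte with 0xFF so negative bytes are treated as 0..255
a = buffer[3] & 0xFF;
a = a*256 + (buffer[2] & 0xFF);
a = a*256 + (buffer[1] & 0xFF);
a = a*256 + (buffer[0] & 0xFF);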

This assumes that the lowest byte comes first. If the highest byte comes first, you have to swap the indices (go from 0 to 3).

Basically, for each byte you want to add, you first multiply a by 256 (which equals a shift to the left by 8 bits) and then add the new byte.
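The same lowest-byte-first idea can also be written with shifts. This is only a sketch: `LittleEndianInt` and `fromLittleEndian` are names chosen for the example, and the caller is assumed to guarantee that at least 4 bytes are available from `offset`.

```java
public class LittleEndianInt {
    // Builds an int from 4 bytes starting at offset, lowest byte first.
    // Each byte is masked with 0xFF so negative bytes count as 0..255.
    static int fromLittleEndian(byte[] buffer, int offset) {
        return  (buffer[offset]     & 0xFF)
             | ((buffer[offset + 1] & 0xFF) << 8)
             | ((buffer[offset + 2] & 0xFF) << 16)
             | ((buffer[offset + 3] & 0xFF) << 24);
    }

    public static void main(String[] args) {
        byte[] buffer = {0x78, 0x56, 0x34, 0x12};
        System.out.println(Integer.toHexString(fromLittleEndian(buffer, 0))); // 12345678
    }
}
```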

stmax
+1 except you should use << instead of multiplication
Andrey
Although I conceptually agree with Andrey, I'd hope any decent compiler would figure that out and fix it for you. However, << IS clearer for this purpose.
Bill K
@Andrey: to be fair, the Java compiler will probably translate `x * 256` into `x << 8` automatically.
cletus
depends on quality of compiler :)
Andrey
+4  A: 

The easiest way is:

RandomAccessFile in = new RandomAccessFile("filename", "r"); 
int i = in.readInt();

-- or --

DataInputStream in = new DataInputStream(new BufferedInputStream(
    new FileInputStream("filename"))); 
int i = in.readInt();
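Keep in mind that readInt() always treats the four bytes as big-endian. If the file happens to be little-endian, one possible workaround is to swap the bytes afterwards with Integer.reverseBytes. A minimal sketch, where the class `ReverseBytesDemo` and the helper `readLittleEndianInt` are hypothetical names and a byte array stands in for the file:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class ReverseBytesDemo {
    // Reads one int with readInt() (big-endian) and swaps the byte order,
    // which yields the value a little-endian writer intended.
    static int readLittleEndianInt(DataInputStream in) throws IOException {
        return Integer.reverseBytes(in.readInt());
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 0, 0, 0}; // the value 1 stored lowest byte first
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            System.out.println(readLittleEndianInt(in)); // 1
        }
    }
}
```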
Taylor Leese
assuming that his binary file contains big endian signed ints. otherwise it'll fail. horribly. :)
stmax
+1  A: 
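int a = 0;
for (int i = 0; i < buffer.length; i++)
{
   // mask with 0xFF so negative bytes are not sign-extended
   a = (a << 8) | (buffer[i] & 0xFF);
   if (i % 4 == 3)
   {
      //a is ready (4 bytes have been consumed)
      a = 0;
   }       
}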
Andrey
+8  A: 

ByteBuffer has this capability, and is able to work with both little and big endian integers.

Consider this example:


//  read the file into a byte array
File file = new File("file.bin");
FileInputStream fis = new FileInputStream(file);
byte [] arr = new byte[(int)file.length()];
fis.read(arr);

//  create a byte buffer and wrap the array
ByteBuffer bb = ByteBuffer.wrap(arr);

//  if the file uses little endian as opposed to network
//  (big endian, Java's native) format,
//  then set the byte order of the ByteBuffer
if(use_little_endian)
    bb.order(ByteOrder.LITTLE_ENDIAN);

//  read your integers using ByteBuffer's getInt().
//  four bytes converted into an integer!
System.out.println(bb.getInt());
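To pull out every integer rather than just the first, getInt() can be called in a loop while at least 4 bytes remain. A sketch along those lines, with `ByteBufferInts` and `decode` as made-up names and a hard-coded array in place of the file contents:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

public class ByteBufferInts {
    // Decodes every complete 4-byte group in arr using the given byte order.
    // Trailing bytes that do not form a whole int are ignored.
    static List<Integer> decode(byte[] arr, ByteOrder order) {
        ByteBuffer bb = ByteBuffer.wrap(arr).order(order);
        List<Integer> values = new ArrayList<>();
        while (bb.remaining() >= Integer.BYTES) {
            values.add(bb.getInt());
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] arr = {0, 0, 0, 1,  0, 0, 0, 2,  127, -1, -1, -1};
        System.out.println(decode(arr, ByteOrder.BIG_ENDIAN)); // [1, 2, 2147483647]
    }
}
```

Checking remaining() before each call avoids the BufferUnderflowException that getInt() throws when fewer than 4 bytes are left.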

Hope this helps.

Tom
+1 See also http://stackoverflow.com/questions/2211927/converting-long64-to-byte512-in-java
trashgod
+1 I like this answer much better than mine.
Taylor Leese
+3  A: 

Just see how DataInputStream.readInt() is implemented:

    int ch1 = in.read();
    int ch2 = in.read();
    int ch3 = in.read();
    int ch4 = in.read();
    if ((ch1 | ch2 | ch3 | ch4) < 0)
        throw new EOFException();
    return ((ch1 << 24) + (ch2 << 16) + (ch3 << 8) + (ch4 << 0));
Santhosh Kumar T
+4  A: 

If you have them already in a byte[] array, you can use:

int result = ByteBuffer.wrap(bytes).getInt();

source: here

just for your info! saludos!

iEisenhower