I am implementing the BitTorrent protocol in Java, following this spec. In the messages section, all messages are fixed length except two of them. One of them is the only variable-length message after the handshake, so I can check for the others and assume it's a piece message when none of them match. But for the following message:

bitfield: <len=0001+X><id=5><bitfield>

The bitfield message may only be sent immediately after the handshaking sequence is completed, and before any other messages are sent. It is optional, and need not be sent if a client has no pieces.

The bitfield message is variable length, where X is the length of the bitfield. The payload is a bitfield representing the pieces that have been successfully downloaded. The high bit in the first byte corresponds to piece index 0. Bits that are cleared indicated a missing piece, and set bits indicate a valid and available piece. Spare bits at the end are set to zero.

A bitfield of the wrong length is considered an error. Clients should drop the connection if they receive bitfields that are not of the correct size, or if the bitfield has any of the spare bits set.

I can't come up with a way to parse it if I do not know the length; how am I supposed to locate id in a stream of bytes?

Edit: The payload of the bitfield message holds a 0 or 1 for each piece in the torrent file, so the length of the message changes with the size of the torrent content. So I don't think I can assume that the number of pieces will always fit in a 5 byte number.

+1  A: 

I've not read the spec in detail but without either explicitly knowing the length of a variable length field or some termination delimiter, I don't see how you can process it either. Does the bitfield=<len=0001+X> not perhaps indicate that you will be told of the (variable) length up-front?

oxbow_lakes
+3  A: 

The id field will always be the 5th byte of a message, after the four bytes for the len field. You can do something like the following:

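DataInputStream stream;

// ...

int length = stream.readInt();              // four-byte big-endian length prefix

if (length > 0) {                           // a keep-alive has length 0 and no id
    byte   id      = stream.readByte();     // message id, the 5th byte
    byte[] payload = new byte[length - 1];  // the length counts the id byte too

    stream.readFully(payload);
}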

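That should work for any message with an id, actually, since they all have the same len+id header. The one exception is the keep-alive, which is just a four-byte length of zero with no id.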
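To make the framing idea a bit more concrete, here is a minimal, self-contained sketch of a reader built on the same len+id layout. The class name PeerMessageReader, the Message holder, the use of -1 as the id for a keep-alive, and the MAX_LENGTH sanity cap are all my own assumptions for illustration; none of them come from the spec.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class PeerMessageReader {

    /** Hypothetical holder for one framed message; id is -1 for a keep-alive. */
    public static final class Message {
        public final int id;
        public final byte[] payload;

        Message(int id, byte[] payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    // Assumed sanity cap so a corrupt prefix cannot trigger a huge allocation.
    static final int MAX_LENGTH = 1 << 20;

    public static Message read(DataInputStream in) throws IOException {
        int length = in.readInt();               // 4-byte big-endian length prefix
        if (length < 0 || length > MAX_LENGTH) {
            throw new IOException("Unreasonable message length: " + length);
        }
        if (length == 0) {
            return new Message(-1, new byte[0]); // keep-alive: no id, no payload
        }
        int id = in.readUnsignedByte();          // the 5th byte of the message
        byte[] payload = new byte[length - 1];   // the length counts the id byte
        in.readFully(payload);                   // blocks until the payload is read
        return new Message(id, payload);
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = {
            0, 0, 0, 0,                          // keep-alive
            0, 0, 0, 3, 5, (byte) 0xFF, (byte) 0x80, // bitfield, 2 payload bytes
            0, 0, 0, 1, 1                        // id 1, no payload
        };
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
        for (int i = 0; i < 3; i++) {
            Message m = read(in);
            System.out.println("id=" + m.id + " payloadLength=" + m.payload.length);
        }
    }
}
```

Because every message starts with its own length, the reader never has to guess where the bitfield ends: it reads exactly length - 1 payload bytes and the next readInt() lands on the next message's prefix.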

Edit: "So I don't think I can assume that the number of pieces will always fit in a 5 byte number."

A four-byte length field can describe a message of up to 2^32-1 bytes, and with 8 bits per byte that gives you room for roughly 34 billion pieces (one of those bytes is the id, so the exact figure is 34,359,738,352). That should be plenty! :-)
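Once the payload is in hand, the checks the spec asks for (correct size, no spare bits set) can be sketched roughly as below. This assumes you already know the piece count from the torrent file; the class and method names (BitfieldCheck, expectedLength, isValid, hasPiece) are hypothetical and only here for illustration.

```java
public class BitfieldCheck {

    /** Bytes needed to hold numPieces bits, rounded up to a whole byte. */
    static int expectedLength(int numPieces) {
        return (numPieces + 7) / 8;
    }

    /** True if the payload has the right size and none of the spare bits are set. */
    static boolean isValid(byte[] bitfield, int numPieces) {
        if (bitfield.length != expectedLength(numPieces)) {
            return false;
        }
        int spare = bitfield.length * 8 - numPieces; // unused low bits in the last byte
        if (spare == 0) {
            return true;
        }
        int mask = (1 << spare) - 1;                 // e.g. 6 spare bits -> 0x3F
        return (bitfield[bitfield.length - 1] & mask) == 0;
    }

    /** Piece index 0 is the high bit of the first byte. */
    static boolean hasPiece(byte[] bitfield, int index) {
        return (bitfield[index / 8] & (0x80 >> (index % 8))) != 0;
    }

    public static void main(String[] args) {
        byte[] tenPieces = {(byte) 0xFF, (byte) 0xC0}; // all 10 pieces present
        System.out.println(expectedLength(10));        // 2
        System.out.println(isValid(tenPieces, 10));    // true
        System.out.println(hasPiece(tenPieces, 9));    // true
    }
}
```

If isValid returns false, the spec text quoted in the question says the connection should be dropped.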

John Kugelman
+2  A: 

I can't come up with a way to parse it if I do not know the length;

Judging from the description, the length is given in the first 4 bytes of the message.

how am I supposed to locate id in a stream of bytes?

It looks as though the id is the 5th byte in each message, right after the length field. So you just have to look at the first 5 bytes after you're finished parsing the previous message.

Michael Borgwardt
+2  A: 

Earlier in the spec you referenced, I read: 'The length prefix is a four byte big-endian value.' I take that to mean: read the next four bytes, convert them to an int, and that is your length. If you are unfamiliar with the bytes-to-int conversion process, I've used something similar to this.
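For anyone who wants to see the conversion spelled out, a minimal sketch of turning four big-endian bytes into an int might look like the following. The class and method names (BigEndian, toInt) are made up for this example, and this is not necessarily the same code as the one referred to above.

```java
import java.nio.ByteBuffer;

public class BigEndian {

    /** Combines four bytes starting at offset, most significant byte first. */
    static int toInt(byte[] b, int offset) {
        return ((b[offset]     & 0xFF) << 24)
             | ((b[offset + 1] & 0xFF) << 16)
             | ((b[offset + 2] & 0xFF) << 8)
             |  (b[offset + 3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] prefix = {0, 0, 1, 5};                 // 0x00000105
        System.out.println(toInt(prefix, 0));         // 261
        // ByteBuffer is big-endian by default, so it gives the same value.
        System.out.println(ByteBuffer.wrap(prefix).getInt()); // 261
    }
}
```

The & 0xFF masks matter because Java bytes are signed; without them a byte such as 0x80 would sign-extend and corrupt the higher bits.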

necrobious
From what I understand reading the spec, that is true for all messages except bitfield and piece, for the reasons I added to the question.
Hamza Yerlikaya