I have a couple of questions regarding the BitTorrent Peer Wire Protocol. I am trying to implement it in Java using this spec.

The Peer Wire Protocol section says that all integers are four-byte big-endian values. AFAIK Java uses big endian. Does that mean that if I want to send a choke message

choke:<len=0001><id=0>

Do I just write 1 followed by 0 to the socket?

As for my second question: when requesting a piece, do I treat multiple files as one big continuous file, or as individual files? The piece length won't align with the file boundaries, so one piece index can contain both the end of one file and the beginning of another.

As for my last question: when I open a connection to a peer and send my handshake, do I just keep requesting pieces, or do I request and then wait a while to see whether the peer requests something from me? How does the conversation take place? I have mostly done HTTP-style network programming, where I ask for something and wait for a response. But if I keep requesting pieces, how am I going to send pieces?

+2  A: 

Question 1

Sticking to simple methods, if you are using stream-based I/O then use DataInputStream and DataOutputStream when writing primitive types (e.g., byte, int, long, etc.):

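import java.io.DataOutputStream;
import java.net.Socket;

Socket s; // assume this is already connected
DataOutputStream out = new DataOutputStream( s.getOutputStream() );
out.writeInt( 1 );  // 4-byte big-endian length prefix: one byte follows
out.writeByte( 0 ); // 1-byte message id (0 = choke)
out.flush(); // optional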

If you are using non-blocking I/O (e.g., classes from the java.nio package) then use ByteBuffers:

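import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

Socket s; // assume this is already connected
SocketChannel c = s.getChannel(); // null unless the socket was created through a channel
ByteBuffer buf = ByteBuffer.allocate(5); // 4-byte length prefix + 1-byte id
buf.putInt( 1 ).put( (byte) 0 );
buf.flip();
c.write( buf ); // assuming channel is writable :)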

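Each of these methods will take care of byte ordering issues on your behalf: DataOutputStream always writes the high byte first, and a new ByteBuffer defaults to big-endian order.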
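
To check the framing without a live socket, here is a minimal sketch that builds a message into a byte array and prints it as hex. The class and method names (PeerMessageEncoder, encode, toHex) are made up for this illustration and are not part of any library.

```java
import java.nio.ByteBuffer;

final class PeerMessageEncoder {
    // Builds <length prefix><id><payload>; the length counts the id byte plus the payload.
    static byte[] encode(int id, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 1 + payload.length); // big-endian by default
        buf.putInt(1 + payload.length); // 4-byte length prefix
        buf.put((byte) id);             // 1-byte message id
        buf.put(payload);               // empty for choke
        return buf.array();
    }

    // Formats bytes as space-separated two-digit hex values.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // choke: <len=0001><id=0>
        System.out.println(toHex(encode(0, new byte[0]))); // prints: 00 00 00 01 00
    }
}
```

So a choke message is five bytes on the wire: four for the length and one for the id.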

Question 2

(Note that usually you are transferring blocks, which are fragments of pieces, on the wire. I'll gloss over that here :) )

When sending/receiving pieces, it's best to think of the files (or file) as continuous, like you said. The .torrent file contains information on file boundaries, in the info dictionary. In the multi-file case, each file has a path and a length; the single-file case has an advisory name and a length. Since you know the piece size, number of pieces and total content length (all from the .torrent file), you can put pieces "in the right place" as you receive them.

A simple thing to do is create a single file equal to the size of the torrent. When you receive a piece, write it to the correct byte offset within this single file (sometimes called a ".downloading" file). For instance, consider a torrent consisting of two files:

a/b/file1.txt [100 bytes]
a/b/file2.txt [200 bytes]

piece size (pz) = 50 bytes
total size (tz) = 100+200 = 300 bytes
number pieces (np) = 300/50 = 6
file = my_torrent.downloading

Assume we number pieces and byte offsets starting with zero. Say you receive all of piece 1. At what (start) byte offset does it go in my_torrent.downloading? It goes at (1*pz) = (1*50) = 50. Where does piece 0 go? At (0*pz) = (0*50) = 0. And so on...
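
The offset arithmetic above could be sketched like this. PieceStore and its methods are hypothetical names chosen for the example; the only assumption beyond the text is that the last piece may be shorter than the others when the total size is not a multiple of the piece size.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

final class PieceStore {
    // Byte offset of a piece within the single .downloading file.
    static long pieceOffset(int index, int pieceLength) {
        return (long) index * pieceLength; // widen first to avoid int overflow on large torrents
    }

    // Size of a piece; only the last one can be shorter than pieceLength.
    static int pieceSize(int index, int pieceLength, long totalLength) {
        long remaining = totalLength - pieceOffset(index, pieceLength);
        return (int) Math.min(pieceLength, remaining);
    }

    // Writes a complete piece at its offset inside the .downloading file.
    static void writePiece(String path, int index, int pieceLength, byte[] data) {
        try (RandomAccessFile file = new RandomAccessFile(path, "rw")) {
            file.seek(pieceOffset(index, pieceLength));
            file.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With pz = 50, pieceOffset(1, 50) gives 50 and pieceOffset(0, 50) gives 0, matching the numbers above.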

I'll bet that now you can figure out how you turn this .downloading file into the "real" content inside your torrent.
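
If you want a hint, one possible approach is to map a byte range of the big continuous file onto (file, offset, length) segments by walking the file lengths in order. This is only a sketch: FileLayout, Segment and segmentsFor are invented names, and the file lengths are assumed to be in the same order as they appear in the .torrent.

```java
import java.util.ArrayList;
import java.util.List;

final class FileLayout {
    // One contiguous run of bytes inside a single file of the torrent.
    static final class Segment {
        final int fileIndex;
        final long fileOffset;
        final long length;

        Segment(int fileIndex, long fileOffset, long length) {
            this.fileIndex = fileIndex;
            this.fileOffset = fileOffset;
            this.length = length;
        }
    }

    // Splits the range [start, start + length) of the continuous view into per-file segments.
    static List<Segment> segmentsFor(long start, long length, long[] fileLengths) {
        List<Segment> out = new ArrayList<>();
        long end = start + length;
        long fileStart = 0; // offset of the current file within the continuous view
        for (int i = 0; i < fileLengths.length && start < end; i++) {
            long fileEnd = fileStart + fileLengths[i];
            if (start < fileEnd) {
                long chunk = Math.min(end, fileEnd) - start;
                out.add(new Segment(i, start - fileStart, chunk));
                start += chunk; // continue with whatever spills into the next file
            }
            fileStart = fileEnd;
        }
        return out;
    }
}
```

In the two-file example above the boundary at byte 100 happens to fall on a piece boundary, so every piece lands in exactly one file. With files of, say, 120 and 180 bytes, piece 2 (bytes 100 to 149) would split into 20 bytes at the end of the first file and 30 bytes at the start of the second.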

Question 3

When participating in a BitTorrent swarm, you are uploading and downloading pieces to and from multiple peers simultaneously. Think about that one for a second. At the same time you are requesting a piece from some peer, another peer might be doing the same from you. Quite different from the semantics of HTTP, as you already pointed out. So, to speak directly to your question, other peers will ask you for the data they are interested in. :)
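
In practice this means each connection handles whatever message arrives next instead of waiting for the reply to one specific request. Below is a rough sketch of decoding incoming frames from a ByteBuffer, as you might do with non-blocking I/O. PeerMessageDecoder, Message, tryDecode and the KEEP_ALIVE marker are names invented for this example; the ids in the comment (6 = request, 7 = piece) are the ones the spec assigns.

```java
import java.nio.ByteBuffer;

final class PeerMessageDecoder {
    static final int KEEP_ALIVE = -1; // marker for a zero-length message, which has no id

    static final class Message {
        final int id;
        final byte[] payload;

        Message(int id, byte[] payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    // Returns the next complete message, or null (leaving buf untouched) if more bytes are needed.
    static Message tryDecode(ByteBuffer buf) {
        if (buf.remaining() < 4) return null;
        int start = buf.position();
        int length = buf.getInt(); // 4-byte big-endian length prefix
        if (length < 0) throw new IllegalArgumentException("negative length prefix");
        if (buf.remaining() < length) {
            buf.position(start); // incomplete frame: rewind and wait for more data
            return null;
        }
        if (length == 0) return new Message(KEEP_ALIVE, new byte[0]);
        int id = buf.get() & 0xff; // e.g. 6 = request (peer wants data), 7 = piece (data for us)
        byte[] payload = new byte[length - 1];
        buf.get(payload);
        return new Message(id, payload);
    }
}
```

A read loop would call tryDecode repeatedly and dispatch on the id, answering request messages from the peer and storing piece messages, in whatever order they show up.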

Just to make sure: before you request a piece from a peer, check that the peer has the piece you want (look at its bitfield and have messages) and that you've respected the proper choking/interested behavior. Given that, what you normally want to do is request data from your list of known peers (that the tracker or DHT told you about) in rarest-first order. The spec talks about this, and there are A LOT of optimizations and politeness considerations here. (Tit-for-tat behavior, for instance.) You might notice that the spec doesn't spell a lot of this out. That's because a lot of the secret sauce of BitTorrent clients lies in this part of the implementation. :)
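
As a very simplified illustration of rarest-first, the sketch below counts how many peers have each piece and picks the least common one that we still need. RarestFirst, hasPiece and pick are hypothetical names. It assumes the bitfield layout from the spec (the high bit of the first byte is piece 0) and ignores choking state, ties and everything else a real client would weigh.

```java
import java.util.List;

final class RarestFirst {
    // Bitfield layout: the high bit of byte 0 is piece 0, the next bit is piece 1, and so on.
    static boolean hasPiece(byte[] bitfield, int index) {
        return (bitfield[index / 8] & (0x80 >> (index % 8))) != 0;
    }

    // Returns the index of the rarest piece we lack that at least one peer has, or -1 if none.
    static int pick(byte[] mine, List<byte[]> peers, int pieceCount) {
        int best = -1;
        int bestCount = Integer.MAX_VALUE;
        for (int i = 0; i < pieceCount; i++) {
            if (hasPiece(mine, i)) continue; // already downloaded
            int count = 0;
            for (byte[] peer : peers) {
                if (hasPiece(peer, i)) count++;
            }
            if (count > 0 && count < bestCount) { // ties go to the lowest index in this sketch
                best = i;
                bestCount = count;
            }
        }
        return best;
    }
}
```

A real client would typically break ties randomly so that every peer in the swarm doesn't chase the same piece at once.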

I hope this helps you some!

JLR
note that according to the spec the id field is a single byte, not an int.
Michael Borgwardt
right, but per his question... :)
JLR