This post is a short summary of the BitTorrent protocol in dialog format.

While implementing the BitTorrent protocol, I’ve had to learn (and am still learning) quite a bit about networks and how computers communicate, in particular how peers in a distributed network talk to each other one-to-one. Implementing BitTorrent, at least as far as my road map envisions, involves at minimum three things:

  1. Parsing a Torrent file (and figuring out the Pieces needed).
  2. Communicating with a Tracker (and figuring out who the Peers are).
  3. Downloading and Uploading with peers.

    Byte Size Communication

The communication part, where I download data from peers and upload data to them (leech and seed, so to speak), is done with a series of byte messages read from and written to sockets. This post is intended to depict, very simply, a typical conversation between two sockets (a me and a you), which is something I had trouble envisioning when I began writing my client.

First, however, a brief review of bytes is in order. A byte is made up of eight bits, so its minimum value is 00000000, or 0, and its maximum is 11111111, or 255. An important part of the protocol is understanding a four-byte big-endian number, that is, a number made of four bytes with the most significant byte leftmost (just like us humans read digits).

That is to say, 0 0 0 1, as a four-byte big-endian number, is 1. And 0 0 1 0 is 256, or in Go:

buffer := make([]byte, 4)
binary.BigEndian.PutUint32(buffer, uint32(256))
// [0 0 1 0]

And 16 KiB, or 16384 bytes, is:

binary.BigEndian.PutUint32(buffer, uint32(16384))
// [0 0 64 0]

And now, the script (of course, consult the unofficially official specs for more clarity):

A Dialog:

Nota Bene

  • All messages (except the handshake) are prefixed with the length of the message to come, written as a four-byte big-endian number.
  • That length is 1 for the message id plus n for however many bytes of payload come with it.
  • The first byte that follows the length is the message id, a value from 0 through 9.
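The framing described in these notes can be sketched in a few lines of Go. This is only an illustration under my own assumptions: frameMessage is a hypothetical helper name, not something taken from the spec or from my client.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// frameMessage is a hypothetical helper. It writes the four-byte
// big-endian length (1 for the id plus the payload size), then the
// message id, then the payload.
func frameMessage(id byte, payload []byte) []byte {
	buf := make([]byte, 4+1+len(payload))
	binary.BigEndian.PutUint32(buf[0:4], uint32(1+len(payload)))
	buf[4] = id
	copy(buf[5:], payload)
	return buf
}

func main() {
	// "interested" has id 2 and no payload.
	fmt.Println(frameMessage(2, nil)) // [0 0 0 1 2]

	// "have" has id 4 and a four-byte piece index (78 here).
	fmt.Println(frameMessage(4, []byte{0, 0, 0, 78})) // [0 0 0 5 4 0 0 0 78]
}
```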

After hearing on the grape-vine, a Tracker, about a fellow torrentor, I ask you, a stranger:

Me: Hi! I speak BitTorrent… I’m looking for this file here, and my name is -FL2016-.

handshake: <protocol string length><protocol string><blank bytes><file hash><peer_id>
'19' 'BitTorrent protocol' '        ' 'sha1 hash' 'random bytes'

You: Hi! I (also) speak BitTorrent… I’ve got this file here, and my name is -YU1969-.

handshake: <protocol string length><protocol string><blank bytes><file hash><peer_id>
'19' 'BitTorrent protocol' '        ' 'sha1 hash' 'random bytes'
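Laid out in Go, the handshake might look like the sketch below. It assumes the usual sizes: 8 blank (reserved) bytes, a 20-byte SHA-1 hash, and a 20-byte peer id. buildHandshake is a hypothetical helper, and both the hashed string and the digits after -FL2016- are placeholders standing in for the real info hash and the random part of the peer id.

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

const protocol = "BitTorrent protocol"

// buildHandshake is a hypothetical helper that lays out the handshake:
// 1 length byte + 19 protocol bytes + 8 blank bytes + 20 hash bytes +
// 20 peer id bytes, 68 bytes in total.
func buildHandshake(infoHash, peerID [20]byte) []byte {
	buf := make([]byte, 0, 1+len(protocol)+8+20+20)
	buf = append(buf, byte(len(protocol)))
	buf = append(buf, protocol...)
	buf = append(buf, make([]byte, 8)...) // blank (reserved) bytes
	buf = append(buf, infoHash[:]...)
	buf = append(buf, peerID[:]...)
	return buf
}

func main() {
	// Placeholder hash: a real client takes this from the torrent file.
	infoHash := sha1.Sum([]byte("placeholder for the real torrent data"))

	// Placeholder peer id: the digits stand in for random bytes.
	var peerID [20]byte
	copy(peerID[:], "-FL2016-123456789012")

	hs := buildHandshake(infoHash, peerID)
	fmt.Println(len(hs), hs[0], string(hs[1:20])) // 68 19 BitTorrent protocol
}
```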

You: So I’ve got almost all of the file you want…

bitfield: <len=0001+X><id=5><bitfield>
'0 0 0 7' '5' '240' '255' '255' '255' '239' '255'
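Reading that bitfield is a matter of bit twiddling: the highest bit of the first byte stands for piece 0, the next bit for piece 1, and so on. A minimal sketch, with hasPiece as a hypothetical helper name:

```go
package main

import "fmt"

// hasPiece reports whether the bit for the given piece index is set.
// Out-of-range indexes are treated as "not available".
func hasPiece(bitfield []byte, index int) bool {
	if index < 0 || index/8 >= len(bitfield) {
		return false
	}
	// The high bit of each byte corresponds to the lowest piece index.
	shift := uint(7 - index%8)
	return (bitfield[index/8]>>shift)&1 == 1
}

func main() {
	// The bitfield payload from the message above.
	bitfield := []byte{240, 255, 255, 255, 239, 255}

	fmt.Println(hasPiece(bitfield, 0))  // true  (240 is 11110000)
	fmt.Println(hasPiece(bitfield, 4))  // false
	fmt.Println(hasPiece(bitfield, 24)) // true
	fmt.Println(hasPiece(bitfield, 35)) // false (239 is 11101111)
}
```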

You: And I also have this.

have: <len=0005><id=4><piece index>
'0 0 0 5' '4' '0 0 0 78'

Me: I’m interested in what you have (but I’m still choked).

interested: <len=0001><id=2>
'0 0 0 1' '2'

You: Alright! You’re unchoked now, go ahead and ask for what you need.

unchoke: <len=0001><id=1>
'0 0 0 1' '1'

Me: I see you’ve got piece 24, can I have the first part of it? I like to receive 16 KiB at a time…

request: <len=0013><id=6><index><offset><length>
'0 0 0 13' '6' '0 0 0 24' '0 0 0 0' '0 0 64 0'
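As a rough sketch, that request could be assembled like this. buildRequest and blockSize are names I made up for the example; the byte layout follows the message above.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const blockSize = 16384 // 16 KiB

// buildRequest is a hypothetical helper that writes a request message:
// length 13, id 6, then piece index, offset, and block length, each as
// a four-byte big-endian number.
func buildRequest(index, offset, length uint32) []byte {
	buf := make([]byte, 4+13)
	binary.BigEndian.PutUint32(buf[0:4], 13)
	buf[4] = 6
	binary.BigEndian.PutUint32(buf[5:9], index)
	binary.BigEndian.PutUint32(buf[9:13], offset)
	binary.BigEndian.PutUint32(buf[13:17], length)
	return buf
}

func main() {
	// First 16 KiB of piece 24.
	fmt.Println(buildRequest(24, 0, blockSize))
	// [0 0 0 13 6 0 0 0 24 0 0 0 0 0 0 64 0]

	// Second 16 KiB of piece 24 (offset 16384).
	fmt.Println(buildRequest(24, blockSize, blockSize))
	// [0 0 0 13 6 0 0 0 24 0 0 64 0 0 0 64 0]
}
```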

You: Sure, here you go, part one of piece 24.

piece: <len=0009+X><id=7><index><offset><block>
'0 0 64 9' '7' '0 0 0 24' '0 0 0 0' '23 255 233 65 13 5 76 111' // ... 16KiB of data

Me: I see you’ve got piece 24, can I have the second part (offset 16KiB) of it? I like to receive 16 KiB at a time…

request: <len=0013><id=6><index><offset><length>
'0 0 0 13' '6' '0 0 0 24' '0 0 64 0' '0 0 64 0'

You: Sure, here you go, part two (offset 16KiB) of piece 24.

piece: <len=0009+X><id=7><index><offset><block>
'0 0 64 9' '7' '0 0 0 24' '0 0 64 0' '3 35 45 95 88 233 200 108' // ... 16KiB of data
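On the receiving side, the payload of a piece message (everything after the id byte) splits into an index, an offset, and the block itself. A small sketch, where parsePiece is a hypothetical helper and the four data bytes stand in for a full 16 KiB block:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// parsePiece splits the payload of a piece message (the bytes after
// the id) into piece index, offset within the piece, and block data.
func parsePiece(payload []byte) (index, offset uint32, block []byte, err error) {
	if len(payload) < 8 {
		return 0, 0, nil, errors.New("piece payload shorter than 8 bytes")
	}
	index = binary.BigEndian.Uint32(payload[0:4])
	offset = binary.BigEndian.Uint32(payload[4:8])
	return index, offset, payload[8:], nil
}

func main() {
	// Piece 24, offset 16384, followed by a (shortened) block.
	payload := []byte{0, 0, 0, 24, 0, 0, 64, 0, 3, 35, 45, 95}

	index, offset, block, err := parsePiece(payload)
	fmt.Println(index, offset, block, err) // 24 16384 [3 35 45 95] <nil>
}
```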

You: You know, you’re asking for a bit more than I can spare, hold up.

choke: <len=0001><id=0>
'0 0 0 1' '0'

// silence for a few seconds

Me: I know we haven’t spoken in a while, but stick around, please!

keep-alive: <len=0000>
'0 0 0 0'

Me: Oh, turns out I’ve got everything you’ve got, don’t worry about it.

not interested: <len=0001><id=3>
'0 0 0 1' '3'

Conclusion

There is much I gloss over, but when squinting, this is how the torrent protocol works: a client asks another client for some data, and they ask with a procedure (protocol) of byte messages like above. Now, for security’s sake, after the blocks (the chunks of a piece, each at its own offset) arrive and make up a whole piece, the client checks that piece against the SHA-1 hash provided by the torrent file and only then writes it to disk.

My implementation is not complete, and of course, quite a few Recursers have implemented this before me. Some of those very Recursers have been helping me, namely Krace Kumar. Thank you!