From my understanding, XMPP is based on an always-on connection in which you have no immediate indication of when an XML message ends.

This means you have to evaluate the stream as it arrives. It probably also means you have to deal with asynchronous connections, since a read on the socket can block in the middle of an XML message, either because the message is long or because the connection is slow.

I would appreciate one source per answer, so we can vote them up and see which is the favourite.

A: 

Igniterealtime.org provides an open-source XMPP server and client written in Java.

Looking at source code usually works well for me, but in this case I need information on dealing with the linear, streaming nature of the data, not a full-grown implementation. Thanks anyway.
Gustavo Carreno
A: 

ejabberd is written in Erlang. I don't know the details of the ejabberd implementation, but one advantage of Erlang is its very inexpensive threads. I'd speculate that they start one per XMPP connection. In Erlang terminology these are called processes, but they are not operating-system processes with protected address spaces; they are lightweight threads scheduled in user space by the Erlang runtime.

DGentry
+1  A: 

Do you want to deal with multiple connections at once? Good asynchronous socket processing is a must in that case, to avoid needing one thread per connection.
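As a rough illustration of serving several connections from one thread, here is a minimal Python sketch using asyncio. The names pump and demo are made up for this example, and in-memory StreamReader objects stand in for real client sockets so the sketch runs without a network; a real server would get its readers from asyncio.start_server.

```python
import asyncio

async def pump(name, reader, log):
    """Read one connection until EOF, recording each chunk as it arrives."""
    while True:
        chunk = await reader.read(4096)  # yields to other connections while waiting
        if not chunk:                    # b"" means the peer closed the connection
            break
        log.append((name, chunk))

async def demo():
    # Assumption for the demo: in-memory readers replace two client sockets.
    a, b = asyncio.StreamReader(), asyncio.StreamReader()
    log = []
    tasks = [asyncio.create_task(pump("a", a, log)),
             asyncio.create_task(pump("b", b, log))]
    a.feed_data(b"<message><bo")            # connection "a" stalls mid-stanza
    await asyncio.sleep(0)                  # let the reader tasks run
    b.feed_data(b"<presence/>")             # "b" is still served meanwhile
    await asyncio.sleep(0)
    a.feed_data(b"dy>hi</body></message>")  # the rest of "a" finally arrives
    a.feed_eof()
    b.feed_eof()
    await asyncio.gather(*tasks)
    return log

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Neither connection blocks the other: while "a" is waiting for the rest of its stanza, the single thread keeps reading from "b". Each chunk would normally be handed to that connection's own incremental XML parser.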

Otherwise, you just need an XML parser that can deal with a chunk of bytes at a time. Expat is the canonical example; if you're in Java, try XP. These types of XML parsers fire events as soon as complete tokens are available, and buffer partial input until the rest arrives.
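To make the chunk-at-a-time idea concrete, here is a small sketch using Python's built-in Expat binding (xml.parsers.expat). The function name parse_in_chunks and the sample data are invented for illustration. Newer Expat builds may delay reparsing of incomplete tokens; the sketch turns that off where the Python build exposes SetReparseDeferralEnabled, and skips the call otherwise.

```python
import xml.parsers.expat

def parse_in_chunks(chunks):
    """Feed byte chunks to Expat and collect start/end tag events in order."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    # Only present on recent Python builds, hence the guard.
    if hasattr(parser, "SetReparseDeferralEnabled"):
        parser.SetReparseDeferralEnabled(False)
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    for chunk in chunks:
        # False = "not the final chunk": the stream stays open, so the
        # parser keeps any incomplete token buffered for the next call.
        parser.Parse(chunk, False)
    return events

# The chunk boundaries deliberately fall in the middle of tags.
events = parse_in_chunks([b"<stream><mess", b"age><body>hi</bo", b"dy></message>"])
```

Note that the parser is never told the document has ended; the root element simply stays open, which is the situation on a live XMPP connection.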

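Now, to address your assertion that there is no notification when a stanza ends: that's not really true. The important thing is not to process the XML stream as if it were a sequence of documents. The whole stream is one long document whose root is stream:stream, each stanza is a direct child of that root, and the stanza's end tag tells you it is complete. Use the following pseudo-code: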

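stanza = null
while parser has more:
  switch on token type:
     START_TAG:
       if tag is the stream:stream root:
         handle stream start    # the root is never stored as an element
       else:
         elem = create element from parser state
         if stanza is not null:
           add elem as child of stanza    # this also sets elem's parent
         stanza = elem
     END_TAG:
       if stanza is null:
         handle stream end      # this closes stream:stream itself
       else:
         parent = parent of stanza
         if parent is null:
           fire OnStanza event  # a top-level child of the stream just closed
         stanza = parent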

This approach should work with an event-based or a pull parser. It only requires holding on to one pointer's worth of state. Obviously, you'll also need to handle attributes, character data, and entity references (&amp; and the like), and to special-case the stream:stream tag, but this should get you started.
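For a runnable version of the idea, here is a sketch in Python on top of xml.parsers.expat and xml.etree.ElementTree. The class name StanzaReader and its methods are hypothetical, not from any XMPP library. It assumes the first start tag is the stream root, and it keeps a stack of open elements instead of a single pointer, because ElementTree elements do not store a reference to their parent.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat

class StanzaReader:
    """Hypothetical helper: turns a chunked XMPP-like byte stream into stanzas."""

    def __init__(self, on_stanza):
        self.on_stanza = on_stanza
        self.in_stream = False  # True once the stream root has opened
        self.open = []          # open elements of the stanza being built
        self.parser = xml.parsers.expat.ParserCreate()
        # Only present on recent Python builds, hence the guard.
        if hasattr(self.parser, "SetReparseDeferralEnabled"):
            self.parser.SetReparseDeferralEnabled(False)
        self.parser.StartElementHandler = self._start
        self.parser.EndElementHandler = self._end
        self.parser.CharacterDataHandler = self._text

    def feed(self, data):
        self.parser.Parse(data, False)  # never final: the stream stays open

    def _start(self, name, attrs):
        if not self.in_stream:
            self.in_stream = True       # stream root: note it, never store it
            return
        elem = ET.Element(name, attrs)
        if self.open:
            self.open[-1].append(elem)  # nested element inside a stanza
        self.open.append(elem)

    def _text(self, data):
        if not self.open:
            return                      # whitespace between stanzas
        elem = self.open[-1]
        if len(elem):                   # text after a child is that child's tail
            elem[-1].tail = (elem[-1].tail or "") + data
        else:
            elem.text = (elem.text or "") + data

    def _end(self, name):
        if not self.open:
            self.in_stream = False      # the stream root itself was closed
            return
        elem = self.open.pop()
        if not self.open:               # a direct child of the root just closed
            self.on_stanza(elem)

stanzas = []
reader = StanzaReader(stanzas.append)
for chunk in (b"<stream:stream><message to='a@example.org'><bo",
              b"dy>hi &amp; bye</body></message><pres",
              b"ence/>"):
    reader.feed(chunk)
```

The callback fires once per complete top-level child of the stream, regardless of where the chunk boundaries fall. Because the root is never stored, finished stanzas can be garbage-collected instead of piling up under one ever-growing tree.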

Joe Hildebrand