ansaurus

Question

How to correctly pass XML strings between a custom TCP client / server?

Answer 1

A:

Using a zero byte as the terminator is the right approach. As far as I know, it should not break anything with respect to Unicode or other encodings, and it definitely gives you more flexibility than any length byte/long.
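A minimal sketch of zero-byte framing on the receiving side, assuming the XML is sent as UTF-8 (UTF-16 would put zero bytes inside the payload). The function name `split_nul_frames` and the sample messages are made up for illustration.

```python
def split_nul_frames(buffer: bytes) -> tuple[list[bytes], bytes]:
    """Split buffered TCP data into complete NUL-terminated messages.

    Returns (complete_messages, leftover), where leftover is the
    trailing partial message that has not seen its terminator yet.
    """
    *complete, leftover = buffer.split(b"\x00")
    return complete, leftover


# Simulate two recv() calls that cut the second message in the middle.
pending = b""
received = []
for chunk in (b'<get id="DATA1"/>\x00<get id="DA', b'TA2"/>\x00'):
    pending += chunk
    messages, pending = split_nul_frames(pending)
    received.extend(messages)
```

The leftover bytes are carried into the next read, so a message split across TCP segments is still reassembled correctly.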

Niko 2009-09-13 18:45:31

Answer 2

+5 A:

Your examples aren't actually well-formed XML, which may be part of your problem. If you're going to the trouble of using XML, you may as well use well-formed XML, which has rules for node termination, i.e.:

<data id="DATA1" val="..." />

or

You can then use a SAX parser for the stream, which will give you events as nodes and attributes are parsed.
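As a rough illustration of feeding a stream to a SAX parser, here is a sketch using Python's standard `xml.sax` module. The handler class `EventLogger` and the chunk boundaries are assumptions for the example; the chunks stand in for successive socket reads.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Records start/end events and notes when the root element closes."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.events = []
        self.root_closed = False

    def startElement(self, name, attrs):
        self.depth += 1
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.depth -= 1
        self.events.append(("end", name))
        if self.depth == 0:
            self.root_closed = True  # one complete message has arrived


handler = EventLogger()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)

# Chunks stand in for successive socket reads; boundaries are arbitrary.
for chunk in (b'<multi><data id="DA', b'TA1"/><data id="DATA2"/>', b"</multi>"):
    parser.feed(chunk)
```

Because an XML document has exactly one root element, a fresh parser would be needed for each message on a long-lived connection.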

I would then implement your two types of commands like this:

// individual commands
<get id="data_1"/>

// multiple commands
<multi>
  <data id="DATA1"/>
  <data id="DATA2"/>
  ...
</multi>
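One way to make sure such commands are always well-formed is to build them with an XML library instead of string concatenation. The sketch below uses Python's `xml.etree.ElementTree`; the helper names `build_get` and `build_multi` are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_get(data_id: str) -> bytes:
    """Serialize a single <get id="..."/> command."""
    return ET.tostring(ET.Element("get", {"id": data_id}))


def build_multi(data_ids: list[str]) -> bytes:
    """Serialize a <multi> wrapper holding one <data> child per id."""
    root = ET.Element("multi")
    for data_id in data_ids:
        ET.SubElement(root, "data", {"id": data_id})
    return ET.tostring(root)


single = build_get("data_1")
batch = build_multi(["DATA1", "DATA2"])

# Round-trip through the parser to confirm both are well-formed.
parsed = ET.fromstring(batch)
ids = [child.get("id") for child in parsed]
```

The serializer also takes care of escaping attribute values, which hand-built strings tend to get wrong.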

Greg Campbell 2009-09-13 19:12:49

+1 but to be pedantic, you should have written "well-formed XML". "valid XML" means that the XML conforms to a schema, which is something very different: http://en.wikipedia.org/wiki/XML#Well-formedness_and_error-handling

Wim Coenen 2009-09-13 22:38:11

Good point - I'll change it.

Greg Campbell 2009-09-14 02:01:55

I agree with this - the most logical way to do this is to extend your XML schema, so that a complete request is delimited by `<request req_id="NNNN"></request>` and a reply by `<reply req_id="NNNN"></reply>`.

caf 2009-09-16 05:55:17

Answer 3

A:

There are three ways I can think of:

1- Describe the length out of band: this could be a little like an HTTP header, with a CR-delimited length in ASCII, after which all following bytes are counted against that length.
2- Null-terminate the string. The NUL character cannot appear inside an XML document, so it is unambiguous as a terminator.
3- Terminate the node with CR or LF, so that a line-based protocol can read the XML.
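The first option, an ASCII length followed by CR and then the body, could look roughly like this. The function names and the choice of UTF-8 are assumptions for the sketch, not part of any agreed protocol.

```python
def encode_with_header(xml_text: str) -> bytes:
    """Prefix the UTF-8 payload with its byte length in ASCII plus CR."""
    payload = xml_text.encode("utf-8")
    return str(len(payload)).encode("ascii") + b"\r" + payload


def decode_with_header(buffer: bytes):
    """Return (message, rest), or (None, buffer) if the frame is incomplete."""
    header, sep, rest = buffer.partition(b"\r")
    if not sep:
        return None, buffer  # length line not finished yet
    length = int(header.decode("ascii"))
    if len(rest) < length:
        return None, buffer  # body not fully received yet
    return rest[:length].decode("utf-8"), rest[length:]


stream = encode_with_header('<get id="DATA1"/>') + encode_with_header('<get id="DATA2"/>')
first, remaining = decode_with_header(stream)
second, remaining = decode_with_header(remaining)
```

Since the receiver counts bytes rather than scanning for a delimiter, the body may contain any characters, including CR and LF.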

As mentioned elsewhere, make sure your XML conforms to standards so that either side can be swapped out and then old code won't have to be tweaked to conform.

quamrana 2009-09-13 20:56:32

Answer 4

+1 A:

I see two options that make a lot of sense, that I've used before:

1- Just send it, and don't terminate the XML. If the XML is well-formed, it has exactly one root node, so you don't have to terminate it: the client can parse until it sees that it has a complete XML document.

2- Use "Pascal"-style strings. I find this really easy, since the read can be done all at once, and it makes all the rest of the problems nonexistent. Basically, prepend your 'string' document with an integer giving the number of bytes to be sent. I do this particularly when dealing with TCP, since I can then fetch what I call "packets", or groups of complete data, all at once.
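A sketch of the length-prefixed ("Pascal"-style) option over a real socket. The 4-byte big-endian header is an assumed wire format, and the helper names are made up; `socket.socketpair()` stands in for an actual client/server connection.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # assumed format: 4-byte unsigned, network byte order


def send_message(sock: socket.socket, xml_text: str) -> None:
    """Send the byte count first, then the UTF-8 encoded document."""
    payload = xml_text.encode("utf-8")
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Keep calling recv() until exactly `count` bytes have arrived."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def recv_message(sock: socket.socket) -> str:
    """Read one length-prefixed document from the socket."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length).decode("utf-8")


# A connected socket pair stands in for the real client/server connection.
client, server = socket.socketpair()
with client, server:
    send_message(client, '<get id="DATA1"/>')
    send_message(client, "<multi><data id='DATA1'/></multi>")
    first = recv_message(server)
    second = recv_message(server)
```

The loop in `recv_exact` matters because a single `recv()` call on a TCP socket may return fewer bytes than requested.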

Erich 2009-09-13 21:16:32

Answer 5

A:

I like the idea of simple CRLF delimiting; it seems simplest. From the link provided, would this work? (with CRLF == the two bytes 13 and 10)

Send:

   <GET ID="DATA1" />CRLF

Reply:

   <ID="DATA1" VAL="3" />CRLF
   <ID="DATA1" VAL="2" />CRLF
   <ID="DATA1" VAL="1" />CRLF
   ...

As answer 2 mentioned, an XML reply with multiple lines may occur. Might this cause problems with a CRLF at each line, rather than the end of the response? Can't CRLF naturally occur within a multi-line XML string?

Reply:

   <multi>CRLF
     <data id="DATA1"/>CRLF
     <data id="DATA2"/>CRLF
   </multi>CRLF

Brian 2009-09-14 15:15:14

Ok, from the XML spec it looks like line endings must only be LF, and if CRLF or CR are found they are converted to just LF: http://www.w3.org/TR/REC-xml/#sec-line-ends

So using CRLF as the XML string packet delimiter looks like it should work. I will give that a try. Thanks for your help.
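A sketch of how this could work in practice, assuming the sender normalizes line breaks inside each document to LF before appending the CRLF delimiter, and that both sides use UTF-8. The function names are hypothetical. Without that sender-side normalization, a CRLF inside a multi-line document would be mistaken for the end of the message.

```python
DELIMITER = b"\r\n"


def frame_with_crlf(xml_text: str) -> bytes:
    """Normalize line breaks to LF, then terminate the message with CRLF."""
    normalized = xml_text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.encode("utf-8") + DELIMITER


def split_crlf_frames(buffer: bytes):
    """Return (complete_messages, leftover) for CRLF-terminated data."""
    *complete, leftover = buffer.split(DELIMITER)
    return [m.decode("utf-8") for m in complete], leftover


multi_line = '<multi>\r\n  <data id="DATA1"/>\r\n  <data id="DATA2"/>\r\n</multi>'
wire = frame_with_crlf('<get id="DATA1"/>') + frame_with_crlf(multi_line)
messages, leftover = split_crlf_frames(wire)
```

After normalization no CR byte remains in the payload, so the two-byte CRLF sequence can only ever mark a message boundary.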

Brian 2009-09-14 16:15:48
