I have two separate apps: a client (in C#) and a server (in C++). They need to exchange data in the form of "structs", and about 1 MB of data a minute is sent from server to client.

Which is better to use: XML or my own binary format?

With XML:

  • Translating XML to a struct using a parser would be slow, I believe? (This is the "good" option, but it means: load the parser, load the XML, parse.)
  • The other option is parsing XML with regex (bad!)

With Binary:

  • compact data sizes;
  • no need for meta information like tags;
  • but the format cannot be changed easily to accommodate new structs or new members in existing structs in future;
  • no conversion from text (XML) to binary (struct) is necessary, so it is faster to receive and "assemble" into a struct.

Any pointers? Should I not be considering binary at all? I'm a bit confused about which approach to take.

+2  A: 

If you have .NET applications at both ends, use Windows Communication Foundation. This will allow you to defer the decision until deployment time, as it supports both binary and XML serialization.

Mark Seemann
Hi, I updated the question. The client is in C# and the server is in C++.
Liao
There is a Windows native library for WCF now. Look for "Windows Web Services"
Nemanja Trifunovic
@Nemanja: does Windows Web Services support binary XML?
John Saunders
+1  A: 

A good point for XML would be interoperability. Do you have other clients that also access your server?

Before you use your own binary format or do regex on XML: have you considered the serialization namespace in .NET? There are binary formatters, SOAP formatters, and there is also XML serialization.

flq
+6  A: 

1MB of data per minute is pretty tiny if you've got a reasonable network connection.

There are other choices between binary and XML - other human-readable text serialization formats, such as JSON.

When it comes to binary, you don't have to have versioning problems - technologies like Protocol Buffers (I'm biased: I work for Google and I've ported PB to C#) are explicitly designed with backward and forward compatibility in mind. There are other binary formats to consider as well, such as Thrift.

If you're worried about performance though, you should really measure it. I'm pretty sure my phone could parse 1MB of XML sufficiently quickly for it not to be a problem in this case... basically work out what you're most concerned about, in terms of:

  • Simplicity of code
  • Interoperability
  • Performance in terms of CPU
  • Network traffic
  • Backward/forward compatibility
  • Human readability of on-the-wire format

It's all a balancing act - but you're the one who has to decide how much weight to give each of those factors.
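As a rough illustration of "measure it", here is a minimal sketch in Python (chosen only for brevity; the same idea applies in C# or C++). It builds about 1 MB of XML and times a single parse. The record shape and the helper names `build_sample_xml` and `time_parse` are made up for this example, and the timing will depend on your machine and parser.

```python
import time
import xml.etree.ElementTree as ET

def build_sample_xml(target_bytes=1_000_000):
    """Build roughly target_bytes of XML made of small, struct-like records."""
    record = '<reading id="{0}"><x>1.5</x><y>2.5</y><label>sensor</label></reading>'
    parts = ["<readings>"]
    size = 0
    count = 0
    while size < target_bytes:
        item = record.format(count)
        parts.append(item)
        size += len(item)
        count += 1
    parts.append("</readings>")
    return "".join(parts), count

def time_parse(xml_text):
    """Return (seconds taken, number of records parsed)."""
    start = time.perf_counter()
    root = ET.fromstring(xml_text)
    elapsed = time.perf_counter() - start
    return elapsed, len(root)  # len(root) counts the direct child elements

if __name__ == "__main__":
    xml_text, count = build_sample_xml()
    elapsed, parsed = time_parse(xml_text)
    print(f"parsed {parsed} records ({len(xml_text)} bytes) in {elapsed:.3f} s")
```

Running the same kind of measurement against your real messages and your real parser is what actually settles the performance question.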

Jon Skeet
A: 

text/xml

  • Human readable
  • Easier to debug
  • Bandwidth can be saved by compressing
  • Tags document the data they contain

binary

  • Compact
  • Easy to parse (if fixed size fields are used, just overlay a struct)
  • Difficult to debug (hex editors are a pain)
  • Needs a separate document to understand what the data is.

Both forms are extensible and can be upgraded to newer versions provided you insert a type and version field at the beginning of the datagram.
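For the binary case, a type and version field at the front of each datagram might look like the sketch below. It is written in Python for brevity, and the header layout, the message type numbers, and the payload layouts are all hypothetical, not something taken from the question.

```python
import struct

# Hypothetical header: 2-byte message type, 2-byte version,
# 4-byte payload length, all little-endian with no padding.
HEADER = struct.Struct("<HHI")

# Hypothetical payload layouts keyed by (type, version).
# Version 2 of message type 1 adds a trailing float field.
PAYLOADS = {
    (1, 1): struct.Struct("<if"),
    (1, 2): struct.Struct("<iff"),
}

def encode(msg_type, version, *fields):
    """Pack the fields and prefix them with the type/version/length header."""
    body = PAYLOADS[(msg_type, version)].pack(*fields)
    return HEADER.pack(msg_type, version, len(body)) + body

def decode(datagram):
    """Read the header first, then pick the payload layout it announces."""
    msg_type, version, length = HEADER.unpack_from(datagram, 0)
    layout = PAYLOADS.get((msg_type, version))
    if layout is None:
        raise ValueError(f"unsupported message type {msg_type} version {version}")
    body = datagram[HEADER.size:HEADER.size + length]
    return msg_type, version, layout.unpack(body)
```

Because the receiver reads the header before touching the payload, an old client can reject (or skip, using the length field) a version it does not understand instead of misreading the bytes.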

doron
+1  A: 

Another advantage of XML is that you can extend the data you are sending by adding an element, and you won't have to alter the receiver's code to cope with the extra data until you are ready to.
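A small sketch of that behaviour, in Python for brevity: the receiver looks up only the elements it knows by name, so a newer message with an extra element still parses. The `<point>` message and the `read_point` function are invented for this example.

```python
import xml.etree.ElementTree as ET

def read_point(xml_text):
    """Receiver that only knows about <x> and <y>; other elements are ignored."""
    root = ET.fromstring(xml_text)
    return float(root.findtext("x")), float(root.findtext("y"))

# The sender later adds <z>; the receiver above does not need to change.
old_message = "<point><x>1.0</x><y>2.0</y></point>"
new_message = "<point><x>1.0</x><y>2.0</y><z>3.0</z></point>"
```

This only holds if the receiver reads elements by name rather than by position, and does not validate against a strict schema that forbids unknown elements.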

Also, even minimal (fast) compression of XML can dramatically reduce the wire load.
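To get a feel for this, the sketch below (Python, for brevity) compresses some repetitive XML with zlib at its fastest level. The record shape and the `make_xml` helper are made up for the example, and the ratio you get on real data will differ.

```python
import zlib

def make_xml(n):
    """Build n small, repetitive records, similar in spirit to serialized structs."""
    rows = "".join(
        f'<reading id="{i}"><x>1.5</x><y>2.5</y></reading>' for i in range(n)
    )
    return ("<readings>" + rows + "</readings>").encode("utf-8")

raw = make_xml(1000)
packed = zlib.compress(raw, 1)  # level 1 = fastest, least aggressive compression

if __name__ == "__main__":
    print(f"{len(raw)} bytes raw, {len(packed)} bytes compressed")
```

The repeated tag names are exactly what a general-purpose compressor removes well, which is why much of the wordiness of XML disappears on the wire.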

Adrian
A: 

Use Google Protocol Buffers.

Anon
A: 

You did not say whether they are on the same machine or not. I assume not.

In that case there is another downside to binary: you cannot simply dump the structs on the wire, because you could have endianness and sizeof issues.
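The sketch below shows the problem using Python's `struct` module (Python only for brevity; the record layout is hypothetical). The same three fields produce different bytes depending on byte order, and the native layout may also contain padding that the explicit layouts do not.

```python
import struct

# Hypothetical record: a 1-byte flag, a 4-byte int, and an 8-byte double.
fields = (1, 1000, 2.5)

native = struct.pack("@bid", *fields)  # platform byte order, sizes, and padding
little = struct.pack("<bid", *fields)  # fixed little-endian, standard sizes, no padding
big = struct.pack(">bid", *fields)     # fixed big-endian, standard sizes, no padding

if __name__ == "__main__":
    print("native:", len(native), "bytes", native.hex())
    print("little:", len(little), "bytes", little.hex())
    print("big:   ", len(big), "bytes", big.hex())
```

A wire format needs one of the explicit layouts agreed on by both sides. Sending the native in-memory layout only works if client and server happen to share byte order, type sizes, and struct padding.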

XML is very wordy; YAML or JSON are much smaller.

pm100
A: 

Don't forget that what most people think of as XML is XML serialized as text. It can be serialized to binary instead. This is what the netTcpBinding and other such bindings do in WCF. The XML infoset is output as binary, not as text. It's still XML, just in binary.

John Saunders