Key factors are:
- What capabilities do your clients have?
(e.g. Can you pick an XML parser off the shelf without ruling out most of them for performance reasons? Can you compress the packets on the fly?)
- How complex is your data ("flat" or deeply structured)?
- Do you need high-frequency updates? Partial updates?
In my experience:
A simple text protocol (which would qualify as a DSL) with an interface of
string RunCommand(string commandAndParams)
// e.g. RunCommand("version") returns "1.23"
makes many aspects easier: debugging, logging and tracing, extension of protocol, etc. Having a simple terminal / console for the device is invaluable in tracking down problems, running tests etc.
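To make that interface concrete, here is a minimal sketch in Python of how such a command dispatcher could look. The "version" command and its "1.23" reply come from the example above; the "echo" command, the "ERR ..." replies and all function names are made up for illustration.

```python
# Hypothetical command handlers. Each takes the list of parameters
# and returns the reply as a plain string.
def _cmd_version(args):
    return "1.23"

def _cmd_echo(args):
    # Illustrative only: sends the parameters straight back.
    return " ".join(args)

COMMANDS = {
    "version": _cmd_version,
    "echo": _cmd_echo,
}

def run_command(command_and_params: str) -> str:
    """Split the line on whitespace and dispatch on the first word."""
    parts = command_and_params.split()
    if not parts:
        return "ERR empty command"
    name, args = parts[0].lower(), parts[1:]
    handler = COMMANDS.get(name)
    if handler is None:
        return "ERR unknown command: " + name
    return handler(args)
```

Because requests and replies are plain strings, the same function can sit behind a serial console, a socket or a unit test, and every exchange can be logged as-is.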
Let's discuss the limitations in detail, as a point of reference for the other formats:
- The client needs to run a micro parser. That's not as complex as it might sound (the core of my "micro parser library" is 10 functions with about 200 lines of code in total), but the client must be capable of basic string processing.
- A badly written parser is a big attack surface. If the devices are critical/sensitive, or are expected to run in a hostile environment, the implementation requires the utmost care. (That's true for other protocols, too, but a quickly hacked text parser is easy to get wrong.)
- Overhead. Can be limited by a mixed text/binary protocol, or by base64 for binary payloads (which has an overhead of about 33%, or roughly 37% with MIME-style line breaks).
- Latency. With typical network latency, you will not want to issue many small commands one at a time; some way of batching requests and their replies helps.
- Encoding. If you have to transfer strings that aren't representable in ASCII, and can't use something like UTF-8 for that on both ends, the advantage of a text-based protocol drops rapidly.
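As a quick check on the overhead point above, the cost of base64 can be measured directly. This is a small sketch; the function name and the 300-byte payload are made up for illustration. Plain base64 turns every 3 bytes into 4 characters, and line-wrapped variants such as MIME add a little more on top.

```python
import base64

def base64_overhead(payload: bytes) -> float:
    """Return the relative size increase of plain (unwrapped) base64."""
    if not payload:
        raise ValueError("payload must not be empty")
    encoded = base64.b64encode(payload)
    return (len(encoded) - len(payload)) / len(payload)

# 300 bytes of binary data become 400 base64 characters.
overhead = base64_overhead(bytes(300))
print(f"{overhead:.1%}")
```

For payloads whose length is not a multiple of 3, padding pushes the figure slightly higher, which matters mostly for very short messages.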
I'd use a binary protocol only if it is required by the device, the device's processing capabilities are insanely low (say, USB controllers with 256 bytes of RAM), or your bandwidth is severely limited. Most of the protocols I've worked with are binary, and it's a pain.
Google Protocol Buffers (protobuf) is an approach to make a binary protocol somewhat easier. It's a good choice if you can run the libraries on both ends and have enough freedom to define the format.
CSV is a way to pack a lot of data into an easily parsed format, so that's an extension of the text format. It's very limited in structure, though. I'd use that only if you know your data fits.
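To illustrate how little parsing a CSV reply needs when the data is flat, here is a short sketch using Python's standard csv module. The column names ("sensor", "value") and the sample payload are invented for the example.

```python
import csv
import io

def parse_readings(payload: str) -> list:
    """Parse a CSV reply with a header line into a list of dicts."""
    reader = csv.DictReader(io.StringIO(payload))
    return [dict(row) for row in reader]

# Hypothetical reply from a device: one header line, then one row per reading.
reply = "sensor,value\ntemp,23.5\nhum,40\n"
for row in parse_readings(reply):
    print(row["sensor"], row["value"])
```

All values arrive as strings, and there is no natural place for nested data, which is the structural limit mentioned above.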
XML/YAML/... I'd use only if processing power isn't an issue, bandwidth either isn't an issue or you can compress on the fly, and the data has a very complex structure. JSON seems to be a little lighter on overhead and parser requirements, so it might be a good compromise.
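For a rough feel of the overhead difference, the same small record can be serialized both ways with Python's standard library. The record and its field names are made up, and a single tiny message only hints at the trend; real numbers depend on your data and on how the XML is laid out (attributes vs. elements).

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical device record, used only for the size comparison.
record = {"id": 7, "name": "pump", "rpm": 1200}

def to_json(rec: dict) -> str:
    # Compact separators avoid the spaces json.dumps inserts by default.
    return json.dumps(rec, separators=(",", ":"))

def to_xml(rec: dict) -> str:
    # One child element per field under a <device> root.
    root = ET.Element("device")
    for key, value in rec.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")

print(len(to_json(record)), to_json(record))
print(len(to_xml(record)), to_xml(record))
```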