A: 

"Cross-platform support (Python, Java, C#, C++, Ruby, Perl)"

Too bad this criterion is first. Most languages exist precisely to express fundamental data structures and processing differently. That's what makes multiple languages a "problem": they're all different.

A single representation that works well across many languages is generally impossible. You end up compromising on richness of the representation, performance, or ambiguity.

JSON meets the remaining criteria nicely. Messages are compact and parse quickly (unlike XML), and nesting is handled well. Changing structure without breaking code is always iffy: if you remove something, old code will break, and if you change something that was required, old code will break. If you're only adding things, however, JSON copes well, because old code can simply ignore keys it doesn't know about.
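To make the "adding is safe, removing is not" point concrete, here is a minimal sketch. The message fields (`id`, `body`, `priority`) and the `handle` function are made up for illustration; they are not from any real protocol.

```python
import json

def handle(raw):
    """Hypothetical 'old' receiver: it reads only the keys it knows about."""
    msg = json.loads(raw)
    # Any extra keys in msg are never looked at, so they cannot break this code.
    return msg["id"], msg["body"]

# A newer sender added a "priority" field; the old receiver still works.
new_message = json.dumps({"id": 7, "body": "hello", "priority": "high"})
result = handle(new_message)

# A sender that removed the required "body" field breaks the old receiver.
removed_field_breaks = False
try:
    handle(json.dumps({"id": 7}))
except KeyError:
    removed_field_breaks = True
```

If a field may legitimately be absent, the receiver can use `msg.get("body")` instead, but that is a decision about the message contract, not something JSON does for you.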

I like human-readable. It helps with a lot of debugging and trouble-shooting.

The subtlety of Python tuples turning into lists isn't an interesting problem. The receiving application already knows the structure it is receiving, and can convert it back (if it matters).
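A small sketch of that round trip, using the standard-library `json` module (the sample value is made up):

```python
import json

point = ("x", (1, 2))          # nested tuples on the sending side
wire = json.dumps(point)       # JSON only has arrays, so tuples become arrays
received = json.loads(wire)    # arrives as nested lists

# The receiver knows the expected shape and restores it explicitly.
label, coords = received
restored = (label, tuple(coords))
```

Here `received` is `["x", [1, 2]]`, and `restored` compares equal to the original `point`.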


Edit on performance.

Timings for parsing the XML and JSON documents from http://developers.de/blogs/damir_dobric/archive/2008/12/27/performance-comparison-soap-vs-json-wcf-implementation.aspx:

xmlParse 0.326 jsonParse 0.255

JSON appears to be noticeably faster for the same content (roughly 22% less time in this run). I used the simplejson and ElementTree modules with Python 2.5.2.
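For anyone who wants to repeat this kind of measurement, here is a rough sketch of how it could be set up. It is not the script used above: it uses the standard-library `json` module rather than simplejson, and the sample records are made up rather than taken from the linked article, so the numbers will differ.

```python
import json
import timeit
import xml.etree.ElementTree as ET

# Made-up sample data; NOT the documents from the linked article.
records = [{"id": i, "name": "item%d" % i} for i in range(100)]

# The same content expressed once as JSON and once as XML.
json_doc = json.dumps(records)
xml_doc = "<items>%s</items>" % "".join(
    '<item id="%d" name="%s"/>' % (r["id"], r["name"]) for r in records
)

def json_parse():
    return json.loads(json_doc)

def xml_parse():
    return ET.fromstring(xml_doc)

# Total seconds for 200 parses of each document.
json_seconds = timeit.timeit(json_parse, number=200)
xml_seconds = timeit.timeit(xml_parse, number=200)
print("xmlParse %.3f jsonParse %.3f" % (xml_seconds, json_seconds))
```

The result depends heavily on the parser implementations and on the shape of the documents, which is probably why the benchmarks linked in the comments disagree.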

S.Lott
-1 Sorry, but XML is typically faster, e.g. http://dev.robertmao.com/2007/10/01/json-vs-xml-parsing-performance/ (I will remove the downvote when you update). People just prefer JSON because it's easier to deal with in JavaScript.
cletus
Also http://developers.de/blogs/damir_dobric/archive/2008/12/27/performance-comparison-soap-vs-json-wcf-implementation.aspx
cletus
Interesting. Here, JSON's faster: http://www.crossedconnections.org/w/index.php/2006/06/20/json-vs-xml-performance-update/
S.Lott
A little outdated though (FF 1.5, IE6), and there's really not much in it.
cletus
+2  A: 

One major consideration is "do you want to have to specify each structure definition"?

If you are OK with that, then you could take a look at:

  1. Protocol Buffers - http://code.google.com/apis/protocolbuffers/docs/overview.html
  2. Thrift - http://developers.facebook.com/thrift/ (more geared toward services)

Both of these solutions require supporting files to define each data structure.


If you would prefer not to incur the developer overhead of pre-defining each structure, then take a look at:

  1. JSON (via Python's cjson and PHP's native json). Both are very fast if you don't need to transmit binary content (such as images).
  2. YAML (YAML Ain't Markup Language) @ http://www.yaml.org/. Also very fast if you get the right library.

However, I believe that both of these have had issues with transporting binary content, which is why they were ruled out for our usage. Note: YAML may have good binary support, but you will have to check the client libraries; see http://yaml.org/type/binary.html
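For what it's worth, the usual workaround for binary content in JSON is to base64-encode it into a string field. A minimal sketch follows; the field names `name` and `data` are made up for illustration.

```python
import base64
import json

payload = bytes(bytearray(range(256)))  # arbitrary binary data, e.g. image bytes

# Sender: base64 turns the bytes into ASCII text that JSON can carry.
wire = json.dumps({
    "name": "blob",
    "data": base64.b64encode(payload).decode("ascii"),
})

# Receiver: decode the text field back into the original bytes.
message = json.loads(wire)
restored = base64.b64decode(message["data"])
```

The cost is size and readability: base64 emits 4 characters for every 3 bytes, so the 256-byte payload here becomes a 344-character string.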


At our company, we rolled our own library (Extruct) for cross-language serialization with binary support. We currently have (decently) fast implementations in Python and PHP, although the output isn't very human-readable because all strings are base64-encoded (for binary support). Eventually we will port them to C and use a more standard encoding.

Dynamic languages like PHP and Python get really slow if you have too many iterations in a loop or have to look at each character. C on the other hand shines at such operations.

If you'd like to see the implementation of Extruct, please let me know. (contact info at http://blog.gahooa.com/ under "About Me")

gahooa
+1  A: 

I tried several methods and settled on compressed JSON as the best balance between speed and memory footprint. Python's native pickle module is slightly faster, but the resulting data can't be read by non-Python clients.

I'm seeing 3:1 compression, so all the data fits in memcache and the app gets sub-10ms response times, including page rendering.
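A rough sketch of the compressed-JSON idea, assuming zlib as the compressor (any compressor available on all client platforms would do) and made-up sample rows; the ratio you get depends entirely on your data.

```python
import json
import zlib

# Made-up, repetitive sample data; real ratios depend on the payload.
rows = [{"user_id": i, "status": "active", "tags": ["a", "b"]} for i in range(500)]

raw = json.dumps(rows).encode("utf-8")
packed = zlib.compress(raw)   # this is what would be stored in the cache

# Reading back: decompress first, then parse the JSON.
restored = json.loads(zlib.decompress(packed).decode("utf-8"))
ratio = len(raw) / float(len(packed))
print("compression ratio: %.1f:1" % ratio)
```

Any client with a zlib binding and a JSON parser can read `packed`, which is the point of choosing this over pickle.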

Here's a comparison of JSON, Thrift, Protocol Buffers and YAML, with and without compression:

http://bouncybouncy.net/ramblings/posts/more_on_json_vs_thrift_and_protocol_buffers/

Looks like this test got the same results I did with compressed JSON. Since I don't need to pre-define each structure, this seems like the fastest and smallest cross-platform answer.

mb
A: 

You might be interested in this link:

http://kbyanc.blogspot.com/2007/07/python-serializer-benchmarks.html

An alternative: MessagePack seems to be the fastest serializer out there. It may be worth a try.

GrosBedo