The use case is long term serialization of complex object graphs in a textual format.

A: 

If you use YAML, you might have slightly fewer bytes going across the wire. Probably not enough to be significant, but in some situations it could be relevant. I also think YAML is easier to read and work with in a text editor, but that is subjective, and it wouldn't matter if editing by hand is not a normal usage scenario.

If bandwidth is not a factor and you won't be working with the YAML/XML in a text editor often, I think it doesn't really matter which you use.

John Conrad
+7  A: 

Short answer:

If you expect humans to create or read the document (configuration files, reports, etc.), then consider YAML; otherwise choose XML (for machine-to-machine communication).

Long answer:

1) Length

XML and YAML documents are approximately the same length. A good XML library can skip all whitespace between elements, whereas YAML requires it: a complex YAML document contains a lot of indentation spaces (do not use tabs!).

2) Network failure

A prefix of a YAML document is often itself a valid document, so if a YAML document arrives incomplete, the parser often has no way to detect it. An XML parser will always check that a document is at least well-formed.

3) Language support

Only a few major programming languages have proper support for YAML (does Visual Basic have a YAML parser?).

4) General knowledge

You do not need to explain to a developer (even a junior one) what XML is. YAML is not that widely used yet.

5) Schema

With XML, the producer and consumer can agree on a schema to establish a reliable data exchange format.

6) Syntax

XML is very rich: namespaces, entities, attributes

7) External dependencies

Java and Python have XML support in their standard libraries. YAML (for these two languages) requires an external dependency.

8) Maturity

The XML specification is older and rock solid. YAML is still under construction: version 1.1 contains inconsistencies (there is even a wiki to maintain the list of mistakes!).

9) XSLT

If you need to transform an XML document into another format (XML, HTML, YAML, PDF), you can use XSLT, while for YAML you have to write a program.
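To illustrate points 2 and 7, here is a minimal sketch that uses only Python's standard library. The `order` record and the helper names `to_xml` and `is_well_formed` are made up for this example. It writes a small document with `xml.etree.ElementTree`, then shows that a copy cut off midway is rejected by the parser. The YAML side is not shown, since it would need a third-party parser.

```python
import xml.etree.ElementTree as ET


def to_xml(order: dict) -> str:
    """Serialize a small, hypothetical order record to an XML string."""
    root = ET.Element("order", id=str(order["id"]))
    for name, qty in order["items"]:
        ET.SubElement(root, "item", name=name, qty=str(qty))
    return ET.tostring(root, encoding="unicode")


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


doc = to_xml({"id": 7, "items": [("bolt", 4), ("nut", 8)]})
truncated = doc[: len(doc) // 2]  # simulate a transfer that was cut off

print(is_well_formed(doc))
print(is_well_formed(truncated))
```

The truncated copy fails because the root element is never closed, which is the kind of check a well-formedness-only parse already gives you without any schema.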

A: 

I agree: YAML is more readable and seems like a good fit for, say, configuration files that developers read and write. But there is little benefit for machine-to-machine communication. Also, for textual markup (XML's traditional forte, as in XHTML or DocBook), XML is better.

Specifically for object serialization, I can't think of a good reason to use YAML.

In fact, I would suggest considering JSON instead: it is based on an object model (or struct model, since there is no behavior), rather than a hierarchical (XML) or relational (SQL) model. Because of this, it is a slightly more natural fit for object data. But XML works just fine as well, and there are many good tools for it.
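As a rough illustration of that fit, the sketch below round-trips a small nested object through JSON using Python's standard library. The `Person` and `Address` classes are invented for the example; the point is only that fields, nested objects, and lists map directly onto JSON objects and arrays.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Address:
    city: str
    zip_code: str


@dataclass
class Person:
    name: str
    address: Address
    tags: list = field(default_factory=list)


person = Person("Ada", Address("London", "N1"), ["admin"])

# asdict() converts nested dataclasses into plain dicts and lists,
# which is exactly the shape json.dumps() knows how to write.
text = json.dumps(asdict(person), sort_keys=True)
restored = json.loads(text)

print(text)
print(restored["address"]["city"])
```

Note that `restored` is a plain dictionary, not a `Person`: JSON carries the data (the "struct"), not the class or its behavior.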

One last thing: the terms "long-term" and "object serialization" do not mix. The latter implies close coupling: when your objects change, so does the serialized form. Actual object serialization should not be used for storing data; data binding/mapping is more appropriate. But it may be that you are using "serialization" in the sense of "storing/restoring data using convenient object wrappers"; if so, that's fine.
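One way to read "data binding/mapping" is an explicit, hand-written mapping between the stored record and the object, so the stored format does not change every time the class does. The sketch below assumes a hypothetical `Account` class and a made-up `version` field; it is one possible shape for this, not a prescribed one.

```python
import json


class Account:
    def __init__(self, owner, balance=0, currency="USD"):
        self.owner = owner
        self.balance = balance
        self.currency = currency


def to_record(account):
    """Map an Account to a plain, versioned record (hypothetical layout)."""
    return {
        "version": 2,
        "owner": account.owner,
        "balance": account.balance,
        "currency": account.currency,
    }


def from_record(record):
    """Rebuild an Account, tolerating older records that lack newer fields."""
    return Account(
        owner=record["owner"],
        balance=record.get("balance", 0),
        # Assumption: version 1 records were written before "currency" existed.
        currency=record.get("currency", "USD"),
    )


old_text = '{"version": 1, "owner": "ada", "balance": 10}'
account = from_record(json.loads(old_text))
new_text = json.dumps(to_record(account), sort_keys=True)

print(account.currency)
print(new_text)
```

Because the mapping is explicit, data written by an older version of the program can still be loaded after the class gains a field.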

StaxMan