Sometimes it feels like XML has been used just because it was fashionable.
Some strengths:
- You can validate XML data against XSD
- You can easily provide a contract (as an XSD) to other parties that create or consume the XML data, without having to describe the format in prose
- You can represent one-to-many relations at multiple levels of nesting
- XML is arguably more readable than CSV
- XML is natively supported by the .NET Framework
To name a few off the top of my head.
You can have a much more complex hierarchy and structure with XML than with CSV. It offers a lot more flexibility.
CSV is useful when you just have a series of values that relate to some piece of information and you know you will always store a value for each field.
XML has the benefit of having self-describing data (tags) and having hierarchy - which gives you a lot more flexibility in the way that you store the data.
CSV is more lightweight if you want to move things about, since it's normally two or more times smaller than the equivalent XML.
XML is standardized and won't be hit by the different CSV dialects (line endings, delimiters, encodings) that different OSes produce.
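A rough way to check the size claim on your own data is to serialize the same records both ways and compare byte counts. This is only a sketch: the records and tag names below are made up, and the ratio you get depends heavily on how long your tag names are relative to your values.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical records; real ratios depend on tag names and value lengths.
records = [
    {"id": "1", "name": "Ann", "city": "Oslo"},
    {"id": "2", "name": "Bob", "city": "Paris"},
    {"id": "3", "name": "Cy", "city": "Rome"},
]

# CSV: one header row, then one line per record.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "name", "city"], lineterminator="\n")
writer.writeheader()
writer.writerows(records)
csv_size = len(buf.getvalue().encode("utf-8"))

# XML: every value is wrapped in an opening and a closing tag.
root = ET.Element("records")
for rec in records:
    row = ET.SubElement(root, "record")
    for key, value in rec.items():
        ET.SubElement(row, key).text = value
xml_size = len(ET.tostring(root, encoding="utf-8"))

ratio = xml_size / csv_size
print(f"CSV: {csv_size} bytes, XML: {xml_size} bytes, ratio: {ratio:.1f}x")
```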
.csv files are good when your data is strictly tabular and you know its structure. As soon as you start having relationships between different levels of your data, XML tends to work better because relationships can be made obvious (even without schemas) just by nesting.
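As an illustration of the nesting point, here is a small sketch using Python's standard library. The customer/order/item document is invented for the example. In the XML the parent-child relationship is implicit in the structure, while the flattened CSV has to repeat the parent keys on every row.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical data: one customer with two orders, each with its own items.
XML_DOC = """
<customer name="Acme">
  <order id="1">
    <item sku="A-1" qty="2"/>
    <item sku="B-7" qty="1"/>
  </order>
  <order id="2">
    <item sku="C-3" qty="5"/>
  </order>
</customer>
"""

root = ET.fromstring(XML_DOC)

# The nesting itself says which items belong to which order.
items_by_order = {
    order.get("id"): [item.get("sku") for item in order.findall("item")]
    for order in root.findall("order")
}

# The same data as CSV: parent keys are repeated on every row.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(["customer", "order_id", "sku", "qty"])
for order in root.findall("order"):
    for item in order.findall("item"):
        writer.writerow(
            [root.get("name"), order.get("id"), item.get("sku"), item.get("qty")]
        )
flat_csv = buf.getvalue()
```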
Of course it is fashionable and buzz-worthy sometimes. It all depends on your application. I prefer config files in XML because they are easy to parse. Whereas, I use CSV files for DataGridView or database dumps.
This Daily WTF article, "XML vs CSV The Choice is Obvious", will help you make your decision ;)
It is structured, human-readable, easier to edit, and has validation, parsability, transformability, typing, namespaces, and powerful libraries behind it. Those are just some of the many reasons.
Above all else though it is standard.
In addition to the other answers, XML allows you to specify which character set the document is in.
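A minimal sketch of the character-set point, using Python's standard library with an invented one-element document. The XML bytes carry their own encoding declaration, so the parser decodes them correctly. The CSV bytes carry nothing, so the reader has to know (or guess) the encoding.

```python
import xml.etree.ElementTree as ET

# The declaration names the encoding; the bytes really are ISO-8859-1.
raw_xml = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><name>Jos\u00e9</name>'
).encode("iso-8859-1")

# The parser reads the declaration and decodes the bytes accordingly.
name = ET.fromstring(raw_xml).text

# The CSV equivalent is just bytes; the encoding must be agreed out of band.
raw_csv = "name\nJos\u00e9\n".encode("iso-8859-1")
wrong_guess = raw_csv.decode("utf-8", errors="replace")  # mangles the accent
right_guess = raw_csv.decode("iso-8859-1")
```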
XML provides a way of tagging your data with metadata (provided by the tag names and attribute names), whereas CSV does not. Couple this with the ability to define structured hierarchies and it makes XML easier to understand when provided with just the data, whereas CSV would require an accompanying tool or document to describe how each value is interpreted.
You can easily traverse through XML data even when you have complex data.
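To make the traversal claim concrete, here is a sketch with `xml.etree.ElementTree`, which supports a limited subset of XPath. The library/shelf/book document is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, three levels deep.
doc = ET.fromstring(
    "<library>"
    "<shelf topic='data'>"
    "<book year='2001'><title>XML Basics</title></book>"
    "<book year='2010'><title>CSV Pitfalls</title></book>"
    "</shelf>"
    "<shelf topic='web'>"
    "<book year='2010'><title>JSON Notes</title></book>"
    "</shelf>"
    "</library>"
)

# Every title anywhere in the tree, in document order.
all_titles = [t.text for t in doc.iter("title")]

# Only books from 2010, selected with an attribute predicate.
titles_2010 = [b.findtext("title") for b in doc.findall(".//book[@year='2010']")]

# Titles on one particular shelf, selected by path.
web_titles = [t.text for t in doc.findall("./shelf[@topic='web']/book/title")]
```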
And again one more for XML: The X in XML stands for Extensible (I know, not really mnemonic :-P). That means that, with the help of the XML namespace mechanism, you can take any two XML languages you like and combine them in the same document. Given that there is only one CSV 'language' (not counting the myriad delimiter styles), XML can handle quite a lot of complexity, and in a modular way.
This, however, is where CSV has the advantage: if you really have tabular data, XML syntax is most often overkill.
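The namespace mechanism described above can be sketched like this. The two namespace URIs and the invoice/address vocabularies are hypothetical. The point is only that both vocabularies define an element called `number` and the two do not clash.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs for two independent vocabularies.
NS = {
    "inv": "http://example.com/ns/invoice",
    "addr": "http://example.com/ns/address",
}

doc = ET.fromstring(
    '<inv:invoice xmlns:inv="http://example.com/ns/invoice" '
    'xmlns:addr="http://example.com/ns/address">'
    "<inv:number>42</inv:number>"
    "<addr:address><addr:city>Oslo</addr:city></addr:address>"
    # Same local name "number", different vocabulary, no clash.
    "<addr:number>7B</addr:number>"
    "</inv:invoice>"
)

# Lookups are qualified by namespace, so each "number" is unambiguous.
invoice_number = doc.findtext("inv:number", namespaces=NS)
house_number = doc.findtext("addr:number", namespaces=NS)
city = doc.findtext("addr:address/addr:city", namespaces=NS)
```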
I don't have enough reputation to comment on the relevant answer, but someone suggested compressing the XML as a way to gain size parity with CSV formats. While this is true, XML compression can sometimes come back to bite you. If you are transferring XML data from point to point and it fails, it's nice to be able to read the XML and figure out what went wrong. If the XML is compressed and the transfer fails, it's sometimes not possible to decompress it and examine the contents. In other words, compressing XML cancels out the human-readability advantage it has.
I have found one of the greatest advantages of XML to be the parsing functionality and the strict validation that comes out of the box with most XML libraries. The insistence on well-formedness and the easy-to-understand error messages (xyz not closed in line x, column y) are a real help compared to hunting for broken values, or unknown behaviour, caused by an error in a CSV file.
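Here is a small sketch of that difference using Python's standard library. The inputs are invented. The XML parser refuses the malformed document and reports where it gave up, while the `csv` reader accepts a row with a missing field without complaint.

```python
import csv
import io
import xml.etree.ElementTree as ET

# <age> is never closed, so the document is not well-formed.
broken_xml = "<rows>\n  <row><name>Ann</name><age>31</row>\n</rows>"

try:
    ET.fromstring(broken_xml)
    xml_error = None
except ET.ParseError as exc:
    # exc.position is a (line, column) tuple pointing at the problem.
    xml_error = (str(exc), exc.position)

# The second data row is missing a field, but parsing still succeeds.
broken_csv = "name,age,city\nAnn,31,Oslo\nBob,Paris\n"
rows = list(csv.reader(io.StringIO(broken_csv)))
field_counts = [len(row) for row in rows]

print(xml_error)
print(field_counts)
```

With CSV, catching the short row is left to your own code, for example by checking each row's length against the header.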
I've also found that some CSV generators/parsers have a lot of difficulty with general text data. Long text strings with a lot of carriage returns, commas, quotation marks, and so on just make life really difficult when it comes to manipulating a CSV.
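To show what goes wrong, here is a sketch contrasting a naive join/split with Python's `csv` module on an invented value that contains a comma, quotes, and a line break. The naive approach is what hand-rolled generators tend to do. The `csv` module quotes the field and doubles the embedded quotes.

```python
import csv
import io

tricky = 'She said "hi, there",\nthen left.'

# Naive approach: join with commas, then split on newlines and commas.
naive_line = ",".join(["1", tricky])
naive_fields = naive_line.split("\n")[0].split(",")  # record is torn apart

# csv module: the field is quoted and embedded quotes are doubled.
buf = io.StringIO(newline="")
csv.writer(buf).writerow(["1", tricky])
encoded = buf.getvalue()

# Reading it back recovers the original two fields, line break included.
decoded = next(csv.reader(io.StringIO(encoded, newline="")))
```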
SSMS likes to truncate csv for fun.
I would say use XML (and/or JSON) because someday you or someone (with a short temper and a large gun collection) may have to go find an error in the CSV data.
So yes, I'm saying readability: don't forget to think of the other guy! He may be thinking about you.
1. There are existing parsers and emitters for it in every language and database
2. They deal with encoding for me
3. They deal with escaping for me
That's all that matters to me.
Sure, there's a semi-standard way to do escaping in CSV (i.e., "the way Excel does it"), and it's not exactly hard to write yourself, but it does take some time. And then you've got to implicitly agree on a character encoding out-of-band. But then, because it's so simple, people try to write it themselves, and invariably screw up either #2 or #3.
JSON also meets #2 and #3 and is getting close to satisfying #1. It's also arguably simpler, at least for non-document files. Not surprisingly, I find myself using it more and more, internally and externally.
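As a sketch of what "deal with encoding and escaping for me" looks like in practice, here is a round trip through Python's standard XML and JSON emitters. The sample text is made up to include characters that need escaping in both formats.

```python
import json
import xml.etree.ElementTree as ET

text = 'Tom & "Jerry" <cat@example.com>, caf\u00e9'

# XML: the emitter escapes markup characters and encodes to UTF-8 bytes.
elem = ET.Element("note", {"author": 'A "B" & C'})
elem.text = text
xml_bytes = ET.tostring(elem, encoding="utf-8")
xml_back = ET.fromstring(xml_bytes)

# JSON: the emitter escapes quotes and non-ASCII characters the same way.
json_bytes = json.dumps({"note": text}).encode("utf-8")
json_back = json.loads(json_bytes)
```

In both cases the caller never writes an escape sequence by hand, which is exactly where hand-rolled CSV writers tend to slip.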