views:

336

answers:

9

I'm curious: I've been developing pretty powerful websites/web apps and I've never learnt XML. Even odder, I've never really felt the need to. It's not like cURL or prepared statements where, before knowing what they did and how they worked, I had a feeling 'there's got to be an easier way to do this!' or 'there's got to be something designed for this!'.

Currently I work with MySQL and JSON and I don't have this feeling of 'I need to learn that' (XML), this must be wrong!

I'm really interested to hear some compelling arguments for XML, and to learn about things it can do better than JSON or MySQL (or some other aspect of web dev) and when I should be using it!

A: 

With PHP, as with most dynamic languages, it's best relegated to interoperability purposes. XML is faster to rewrite than Java, but PHP is faster to rewrite than XML.

Ignacio Vazquez-Abrams
XML is a means of structuring data, Java and PHP are programming languages, they can't be compared.
David Dorward
+1  A: 

I use XML mainly for config files or as a transport format. However, if you are familiar with JSON or YAML, they might be just as good for you, so there is no real need to learn XML.

Tapdingo
+10  A: 

XML is useful for storing heterogeneous tree structures, in situations where general purpose tools can be applied to them and some redundancy is desirable. If you are doing modern web development, there is a good chance you are producing XHTML rather than HTML, and are producing RSS or Atom, so you should already be using it. The most common RDF formats use it.

JSON is a bit easier to work with for data on the web, but hasn't got the same feature set - you can't have attributes in JSON so there is no implicit difference between data and meta-data, and you don't have processing instructions or the ability to create entities for repeated chunks of text. On the other hand, many uses of XML don't use those features either. SQL databases have a fixed schema, and do not represent trees well.
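To make the attribute point concrete, here is a minimal sketch in Python using only the standard library. The `note` document and its `id`, `lang` and `body` names are made up for illustration and are not from any real format. It only shows that an XML parser hands back attributes and element content through separate channels, while the JSON equivalent is one flat object.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical document: "id" and "lang" are metadata, the body text is the data.
xml_doc = '<note id="42" lang="en"><body>Call me</body></note>'
note = ET.fromstring(xml_doc)

# Attributes come back as their own mapping...
xml_metadata = dict(note.attrib)
# ...while element content is reached separately.
xml_data = note.find("body").text

# The JSON equivalent is a single flat object; which keys count as
# "metadata" is purely a convention between producer and consumer.
json_doc = '{"id": "42", "lang": "en", "body": "Call me"}'
record = json.loads(json_doc)
```

Whether that built-in split is worth having is exactly what the comments below argue about.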

Mostly XML is used for interoperability.

Pete Kirkham
If he's doing web development, there's a good chance he's producing invalid `XHTML` sent as `text/html`. IMHO :)
Ionuț G. Stan
Yes, but that's not an excuse for not learning it.
Pete Kirkham
*"you can't have attributes in JSON"* How about: `{ "firstAttribute": "value1", "secondAttribute": "value2", "content": [ /*blah*/] }`? And those attributes can themselves be objects, whereas in XML they are limited to strings.
Daniel Earwicker
@Daniel - your example would be represented in XML as `<data><firstAttribute>value1</firstAttribute><secondAttribute>value2</secondAttribute><content>...</content></data>`, _not_, as Pete said: `<data firstAttribute="value1" secondAttribute="value2"><content>...</content></data>`. True, the difference may just be semantic, but there's certainly an argument for a delineation of data from meta-data.
K Prime
@K Prime - but I called them `firstAttribute` etc for the simple reason that they represent anything from XML that would normally be represented by attributes. If metadata is worth supporting, then why give it a crippled facility that can't support structure but only string values?
Daniel Earwicker
@Daniel Earwicker There is no 'string' in XML. The built-in types for attributes are CDATA, NMTOKEN, NMTOKENS, roughly character data, enumerated value, or list of enumerated values. But those fields are not metadata, they are data. You can always use the name of the field to signify they are attributes, but that is an interpretation rather than the difference between text, structural markup (element tags) and metadata (attributes).
Pete Kirkham
@Pete Kirkham - maybe I was unclear. An attribute value cannot make use of the recursive structural features available elsewhere in XML, via nested elements (or to put it briefly, it's "just a string" as understood by anyone familiar with programming languages). Why is structure assumed to be less useful for metadata than it is for ordinary data? In any case, there are many standard uses of XML attributes where your claim that they are for metadata seems quite dubious, e.g. in XHTML, everything about the `img` is specified in attributes.
Daniel Earwicker
@Daniel Earwicker You must be the only person on the planet complaining that XML is not complex enough! Traditionally, the use cases for attributes (in SGML, HTML etc.) didn't require complex types, so they aren't represented, just character data, enumerated values, or lists of enumerated values. I'd have preferred other parts getting left out (DTDs), and the entity mechanism separated from the document type mechanism, but there you go.
Pete Kirkham
@Pete Kirkham - again I must have been unclear. I'm not suggesting that attributes should be made more complex so they are as powerful as the rest of XML, I'm suggesting that the same mechanisms can and should serve for both metadata and data, because there's no advantage in having a special way to represent metadata if it is so severely limited. (I think there is one obvious missing piece of "necessary complexity" in self-describing XML: records with named members and lists with ordered items are not distinguishable. But there's no point trying to fix that in XML.)
Daniel Earwicker
@Daniel Earwicker given the said mechanism is so obviously useful, and the distinction between character data which is presented to a user agent and metadata (such as the list of enumerated values which correspond to CSS classes) which is not presented is valid in the primary use case of web markup, I disagree.
Pete Kirkham
@Pete Kirkham - "In the primary use case of web markup", I completely agree with you. But in the first case you mention in your answer, "for storing heterogeneous tree structures", the SGML model of text+markup just isn't helpful. It doesn't fit that problem. And unfortunately that is the most popular use case for XML specifically, much more prevalent in the real world than XHTML.
Daniel Earwicker
@Daniel Earwicker I was quite careful to say 'XML is useful for ...' not 'XML is particularly suited for...' or 'XML was designed for...' . A swiss-army knife is useful, but certainly not optimal.
Pete Kirkham
+5  A: 

JSON is very lightweight which makes it better suited for passing data around to the front end.

XML has descriptive tags that (I personally find) make it easier to read in a raw format. If I wanted to have any sort of settings file that is loaded in from my program, I would have it in an XML file format.

That's my idea of it anyway, but I'm sure there are much more in-depth reasons for choosing one over the other, which I am not experienced enough to list :)

However, I did find a couple of sites that make some good points.

http://ajaxian.com/archives/json-vs-xml-the-debate (Some good points in the comments)

http://webignition.net/articles/xml-vs-yaml-vs-json-a-study-to-find-answers/

Kohan
What do you mean `xml does not support arrays`?!?!?
klausbyskov
My mistake, edited.
Kohan
thanks kohan, some interesting articles there
Haroldo
@Kohan, shame you got rid of the part about arrays. It is true that XML does not support arrays (as in: it provides no built-in specific feature to make them self-describing). Instead, XML (ab)uses the same syntax for both arrays with ordered unnamed members and record-like objects with named members. The number one reason XML needed schemas was to get over this limitation!
Daniel Earwicker
This was my first attempt at answering a question for someone and I panicked (I feel bad just asking all the time, but hey, I'm learning). That's some good information though. Thanks for clearing it up for me.
Kohan
+3  A: 

I use XML for translations of web site labels, tags etc, or non-repetitive content. For this kind of thing, it's a life saver.
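As a rough illustration of that kind of use (not necessarily how the setup above works), here is a minimal Python sketch. The `labels`/`label`/`text` layout, the `key` and `lang` attributes, and the `load_labels` helper are all assumptions invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical label file; the element and attribute names are invented.
LABELS_XML = """
<labels>
  <label key="greeting">
    <text lang="en">Welcome</text>
    <text lang="fr">Bienvenue</text>
  </label>
  <label key="logout">
    <text lang="en">Log out</text>
    <text lang="fr">Se déconnecter</text>
  </label>
</labels>
"""

def load_labels(xml_text, lang):
    """Return a {key: translated text} dict for one language."""
    root = ET.fromstring(xml_text)
    result = {}
    for label in root.findall("label"):
        for text in label.findall("text"):
            if text.get("lang") == lang:
                result[label.get("key")] = text.text
    return result

french = load_labels(LABELS_XML, "fr")
```

A template would then look up `french["greeting"]` instead of hard-coding the string.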

danp
A: 

"Knowing XML" can mean a couple of different things.

The first is understanding the basic syntax. It is a prerequisite for writing XHTML, SVG, Atom, RSS, and a host of other languages which are XML applications.

The second builds upon the first and is an understanding of how to develop your own XML applications, i.e. custom data storage or exchange formats. JSON can fulfill a similar role and has some advantages (such as being able to implicitly represent an array of data: `{ "bar": ["foo"] }`. In XML, given `<bar><foo/></bar>`, a parser would have to know to convert the contents of `<bar>` into an array for the programming language if you want to treat it as a simple data structure) and disadvantages (XML lets you have optional things in any order with less effort).
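The array point can be sketched in a few lines of Python with the standard library. This is only an illustration; the `as_list` and `as_record` names are made up here. JSON parsing yields a list directly, whereas the XML tree carries no hint about whether `<bar>` is a list or a record, so the calling code has to pick.

```python
import json
import xml.etree.ElementTree as ET

# JSON: the array is part of the syntax, so the parser returns a list.
data = json.loads('{"bar": ["foo"]}')
json_is_list = isinstance(data["bar"], list)

# XML: the parser only returns child elements of <bar>.
root = ET.fromstring("<bar><foo/></bar>")

# Nothing in the markup says whether <bar> is a list with one item or a
# record with one field named "foo"; both readings are equally valid.
as_list = [child.tag for child in root]
as_record = {child.tag: child.text for child in root}
```

In practice a schema, or plain convention in the consuming code, settles which reading is intended.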

David Dorward
A: 

XML is the only solution for data interchange and is good for nothing else.

So, you have to learn XML only if you are going to parse or supply an RSS feed. It's no rocket science though, as it is the same kind of markup language as HTML, with some stricter rules.

Ol' good article from ol' good Joel to sort things out

Col. Shrapnel
Absolutism is always wrong.
Tom
+3  A: 

One of the advantages of XML over other serialization formats is the number of tools available. The other is the ability to formalize the description of your data (XML Schema).

The availability of tools lets you use XML editors, transformers, visualizers, ... For example, where I work, we have the communication team using an XML editor to edit content and metadata. They are not technical enough to write JSON by hand (or XML), but it is very easy to give them a template with a nice generic frontend to edit the needed documents.

Having a way to describe the format (XSD, DTD, Relax NG, ...) means that you can also automagically validate your documents. It also serves as a pretty good documentation of what is allowed and what is not in your documents.

Guillaume
+3  A: 

XML is simply for storing messages in a structured way that is (ostensibly) application agnostic. This is all it is. Said another way, XML offers a way to preserve semantics (meaning) of data when communicating between different applications. It's also popular as a configuration format since (1) a config file is just a message between different application sessions* and (2) almost every language has mature, standard XML libraries.

*you can also think of this as just a degenerate case of communicating between applications.

Rodrick Chapman