tags:

views:

4234

answers:

4

I've seen a fair share of ungainly XML->JSON code on the web, and having interacted with Stack's users for a bit, I'm convinced that this crowd can help more than the first few pages of Google results can.

So, we're parsing a weather feed, and we need to populate weather widgets on a multitude of web sites. We're looking now into Python-based solutions.

This public weather.com RSS feed is a good example of what we'd be parsing (our actual weather.com feed contains additional information because of a partnership w/them).

In a nutshell, how should we convert XML to JSON using Python?

A: 

Well, probably the simplest way is just parse the XML into dictionaries and then serialize that with simplejson.

dguaraglia
+8  A: 

There is no "one-to-one" mapping between XML and JSON, so converting one to the other necessarily requires some understanding of what you want to do with the results.

That being said, Python's standard library has several modules for parsing XML (including DOM, SAX, and ElementTree). And as of the latest release, Python 2.6, support for converting Python data structures to and from JSON is included in the json module.

So the infrastructure is there.

Dan
+2  A: 

While built in libs for xml parsing are quite good I am partial to lxml.

But for parsing RSS feed I'd recommend Universal Feed Parser. (can also parse Atom) Its main advantage is that it can digest even most malformed feeds.

Python 2.6 already includes Json parser, but newer version with improved speed is available as simplejson

Whit this tools building your app shouldn't be that difficult.

Luka Marinko
A: 

You may want to have a look at http://designtheory.org/library/extrep/designdb-1.0.pdf. This project starts off with an XML to JSON conversion of a large library of XML files. There was much research done in the conversion, and the most simple intuitive XML -> JSON mapping was produced (it is described early in the document). In summary, convert everything to a JSON object, and put repeating blocks as a list of objects.

objects meaning key, value pairs (dictionary in python, hashmap in java, object in javascript, ...)

There is no mapping back to XML to get an identical document, the reason is, it is unknown whether a key, value pair was an attribute or an value, therefore that information is lost. (Subjective text follows) If you ask me, attributes are a hack to start; then again they worked well for HTML.

pykler