I have the following JSON string coming from an external input source:

{value: "82363549923gnyh49c9djl239pjm01223", id: 17893}

This is malformed JSON (the keys "id" and "value" must be in quotes), but I need to parse it anyway. I have tried simplejson and json-py, and it seems neither can be configured to parse such strings.

I am running Python 2.5 on Google App Engine, so C-based solutions such as python-cjson are not an option.

The input format could be changed to XML or YAML instead of the JSON shown above, but I am using JSON throughout the project, and changing the format in one specific place would not be ideal.

For now I have switched to XML and am parsing the data successfully, but I would welcome any solution that lets me switch back to JSON.

Thanks.

A: 

You could use a string parser to fix it first; a regex could do it, provided that this is as complicated as the JSON will get.
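
As a rough illustration of the regex idea, here is a minimal sketch. The helper name `quote_bare_keys` and the pattern are assumptions made for this example, not part of any library. It uses the standard `json` module (Python 2.6+); on Python 2.5, `simplejson.loads` should be a drop-in replacement. The pattern only looks for a bare identifier that follows `{` or `,` and precedes `:`, so it will corrupt string values that happen to contain text such as `, foo:`.

```python
import json
import re

# Hypothetical pattern: a bare identifier after "{" or "," and before ":".
_BARE_KEY = re.compile(r'([{,]\s*)([A-Za-z_][A-Za-z0-9_]*)(\s*:)')


def quote_bare_keys(text):
    """Wrap unquoted object keys in double quotes (illustrative helper)."""
    return _BARE_KEY.sub(r'\1"\2"\3', text)


raw = '{value: "82363549923gnyh49c9djl239pjm01223", id: 17893}'
fixed = quote_bare_keys(raw)   # keys are now quoted, values untouched
data = json.loads(fixed)       # parses as ordinary JSON
```

This is only safe while the input stays as simple as the sample above; anything with nested quoting needs a real parser.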

davidosomething
This is possible, but I consider this type of solution a bit weird, so for now I am just looking for a JSON parsing library that can process this broken JSON.
Serge Tarkovski
+5  A: 

Since YAML (>=1.2) is a superset of JSON, you can do:

>>> import yaml
>>> s = '{value: "82363549923gnyh49c9djl239pjm01223", id: 17893}'
>>> yaml.safe_load(s)
{'id': 17893, 'value': '82363549923gnyh49c9djl239pjm01223'}
mykhal
Well, python-yaml (PyYAML) is not yet fully 1.2 compliant, but it will handle most cases. To be prepared for problem cases, see http://en.wikipedia.org/wiki/YAML#cite_ref-6
mykhal
mykhal, have you run it on Google App Engine? It seems PyYAML uses C modules and thus cannot be used on GAE.
Serge Tarkovski
PyYAML is much faster when it uses libyaml, but it is also written in pure Python, and you can choose between CLoader and Loader (pure Python). But don't worry, YAML support is already included in App Engine; you can try this in the interactive shell at http://shell.appspot.com/
mykhal
YAML is not a strict superset of JSON, as YAML requires mapping keys to be unique while JSON only recommends unique keys (MUST vs. SHOULD).
Gumbo
One more problem: YAML apparently requires a space after the colon. However, for the most part this works like a charm.
Adam Ernst
+2  A: 

Since that is the format of a Python dict, you can use ast.literal_eval() to turn the string into a dict. But do get them to fix it in the source regardless.

Ignacio Vazquez-Abrams
+1: This is a great suggestion considering that JSON and basic YAML constructs are syntactically identical to Python's basic `dict` syntax.
jathanism
`ast.literal_eval` is a very cool suggestion in general, but in this case, since `value` and `id` have no quotes, I think it doesn't work.
kaizer.se
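
To make the point in the comment above concrete, here is a small sketch of how `ast.literal_eval` behaves on the sample string. Note that `ast.literal_eval` was added in Python 2.6, so it is not available on Python 2.5. The key-quoting regex is an assumption made for this example and only suits input as simple as the sample. Also, `literal_eval` understands Python literals, not JSON ones, so `true`, `false` and `null` would still be rejected.

```python
import ast
import re

raw = '{value: "82363549923gnyh49c9djl239pjm01223", id: 17893}'

# With bare keys, literal_eval sees the names `value` and `id`,
# which are not literals, so it refuses the string.
try:
    ast.literal_eval(raw)
    bare_keys_ok = True
except ValueError:
    bare_keys_ok = False

# Hypothetical pre-processing step: quote the bare keys first,
# after which the text is a valid Python dict literal.
quoted = re.sub(r'([{,]\s*)([A-Za-z_]\w*)(\s*:)', r'\1"\2"\3', raw)
parsed = ast.literal_eval(quoted)
```

So `literal_eval` only helps once the keys are quoted, at which point a regular JSON parser would accept the string as well.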
A: 

Pyparsing includes a JSON parser example; here is the online source. You could modify the definition of memberDef to allow an unquoted string for the member name, and then use it to parse your not-quite-JSON source text.

This page also has info and a link to my article in the August 2008 issue of Python Magazine, which has much more detailed info about this parser. The page shows some sample JSON, and code that accesses the parsed results as if they were a deserialized object.

Paul McGuire