ansaurus

Question

Answer 1

+9 A:

The point of best practices is that in most cases, it's a bad idea to disregard them. If I were you, I'd use a parser to parse JSON into Python. Try simplejson; it was very straightforward for parsing JSON when I last tried it, and it claims to be compatible with Python 2.4.

I disagree that there's little point in distrusting Google. I wouldn't distrust them, but I'd verify the data you get from them. The reason that I'd actually use a JSON parser is right in your question:

My own (brief) investigations suggest that JSON happens to be completely valid Python, with the exception of the "\/" escape sequence — which GC doesn't appear to generate.

What makes you think that Google Code will never generate an escape sequence like that?

Parsing is a solved problem if you use the right tools. If you take shortcuts like this, you'll eventually get bitten by incorrect assumptions, or you'll end up hacking together a parser out of regexes and boolean logic when a parser already exists for your language of choice.
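As a rough sketch of what "use a parser" looks like in practice: the block below tries the standard-library json module (available from Python 2.6 onward, which is an assumption about the environment, since the question mentions an older hosted Python) and falls back to simplejson. The helper name parse_json and the sample payload are made up for illustration.

```python
try:
    import json  # standard library since Python 2.6
except ImportError:
    import simplejson as json  # same loads() interface on older Pythons


def parse_json(text):
    """Parse JSON text with a real parser instead of eval()."""
    return json.loads(text)


# Hypothetical payload using the "\/" escape mentioned in the question.
sample = '{"url": "http:\\/\\/example.com\\/feed", "count": 3}'
data = parse_json(sample)
print(data["url"])
```

The parser decodes "\/" to a plain "/", so the script keeps working if the escape ever starts appearing in the feed.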

James Thompson 2009-07-05 00:40:58

If I weren't running in a hosted environment, I probably *would* be using simplejson. Unfortunately, I don't have a lot of control over my Python environment and I suspect that working out how to add custom packages will take longer than writing the actual script; we're talking 50 lines, tops. Similarly, I *don't* know that GC won't start generating that escape sequence, but if it does, the script will naturally fail safely, it'll be obvious that it's broken, and the fix is easy.

Ben Blank 2009-07-05 00:52:31

If the fix is easy, why not do it the fixed way first?

Matthew Scharley 2009-07-05 01:29:07

Because it involves using a regex to count backslashes. Right now, there aren't any regexes in the script, and I'll leave it that way if I can. :-)

Ben Blank 2009-07-05 01:35:24

Here's a simplejson port in pure Python, intended for a situation similar to yours: http://aaronland.info/python/s60-simplejson/s60-simplejson.py . I've never used it, but I suspect it's a better idea than calling eval().

llimllib 2009-07-05 02:04:50

The port looks nice. Might be worth running the unit tests from the original simplejson on it to see if they pass: http://simplejson.googlecode.com/svn/tags/simplejson-2.0.9/simplejson/tests/

Kiv 2009-07-05 13:12:10

Answer 2

+9 A:

If you're comfortable with your script working fine for a while, and then randomly failing on some obscure edge case, I would go with eval.

If it's important that your code be robust, I would take the time to add simplejson. You don't need the C portion for speedups, so it really shouldn't be hard to dump a few .py files into a directory somewhere.
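One way to "dump a few .py files into a directory" might look like the sketch below. The folder name vendor and the helper add_vendor_dir are hypothetical; the only assumption is that a copy of the pure-Python simplejson package has been placed inside that folder.

```python
import os
import sys


def add_vendor_dir(directory):
    """Put a directory of pure-Python packages at the front of the import path."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory


# "vendor" is a hypothetical folder, resolved relative to the current
# working directory, that would hold a copy of the simplejson package.
vendor_dir = add_vendor_dir("vendor")

try:
    import simplejson as json
except ImportError:
    import json  # fall back to the standard library where it exists
```

No install step or C compiler is involved; the import either finds the copied package or falls through to whatever the environment provides.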

As an example of something that might bite you, JSON uses Unicode and simplejson returns Unicode, whereas eval returns str:

>>> simplejson.loads('{"a":1, "b":2}')
{u'a': 1, u'b': 2}
>>> eval('{"a":1, "b":2}')
{'a': 1, 'b': 2}

Edit: a better example of where eval() behaves differently:

>>> simplejson.loads('{"X": "\uabcd"}')
{u'X': u'\uabcd'}
>>> eval('{"X": "\uabcd"}')
{'X': '\\uabcd'}
>>> simplejson.loads('{"X": "\uabcd"}') == eval('{"X": "\uabcd"}')
False

Edit 2: SilentGhost pointed out yet another problem today: eval doesn't map the JSON literals true, false, and null to True, False, and None.

>>> simplejson.loads('[false, true, null]')
[False, True, None]
>>> eval('[false, true, null]')
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'false' is not defined
>>>
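For readers on a newer interpreter, here is a comparable sketch. It assumes Python 3 and uses the standard-library json module in place of simplejson, and ast.literal_eval as a stand-in for a "safer eval"; neither substitution comes from the sessions above. The surrogate-pair string is an extra illustrative case.

```python
import ast
import json

text = '[false, true, null]'

# A JSON parser maps the JSON literals onto Python's own constants.
parsed = json.loads(text)
print(parsed)

# ast.literal_eval only accepts Python literals, so the lowercase JSON
# names are rejected rather than looked up as variables.
try:
    ast.literal_eval(text)
    literal_eval_ok = True
except ValueError:
    literal_eval_ok = False
print(literal_eval_ok)

# A \u-escaped surrogate pair is one character in JSON but two in Python.
pair = '"\\ud83d\\ude00"'
print(len(json.loads(pair)), len(ast.literal_eval(pair)))
```

Even the restricted evaluator disagrees with JSON here, which suggests the mismatch comes from the two grammars rather than from eval specifically.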

Kiv 2009-07-05 01:28:43

+1 for "failing on some obscure edge case". You make a good point about Unicode, too. I'm pretty sure it isn't relevant for my particular use-case, but that's something I hadn't considered before.

Ben Blank 2009-07-05 01:38:33

Interesting; I could've sworn I tried \uXXXX escape codes and had them work. Testing them now, they fail just as you show here. I must have been hallucinating. ^.^

Ben Blank 2009-07-05 01:58:43

Also, if you do use the C speedups, simplejson can return a mix of str and unicode, and there seems to be no interest in fixing that: http://code.google.com/p/simplejson/issues/detail?id=40

Gregg Lind 2009-07-09 21:17:50

Wow, that's actually pretty freaky.

Kiv 2009-07-09 23:30:42

Answer 3

+1 A:

evaling JSON is a bit like trying to run XML through a C++ compiler.

eval is meant to evaluate Python code. Although there are some syntactical similarities, JSON isn't Python code. Heck, not only is it not Python code, it's not code to begin with. Therefore, even if you can get away with it for your use-case, I'd argue that it's a bad idea conceptually. Python is an apple, JSON is orange-flavored soda.
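A small sketch of the code-versus-data distinction, assuming the standard-library json module; the input string is a made-up example that is valid Python but not valid JSON.

```python
import json

# Not JSON: a Python expression containing a function call.
text = '[1, 2, len("abc")]'

# eval() treats the text as code and runs the call.
evaluated = eval(text)

# A JSON parser treats the same text as data and rejects it.
try:
    json.loads(text)
    rejected = False
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    rejected = True

print(evaluated, rejected)
```

The call here is harmless, but eval would run any other expression in the input just as readily.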

Jason Baker 2009-07-05 01:40:50

You just made me want to run XML through a C++ compiler and see if I can get it to compile. Oh templates.

Kiv 2009-07-05 02:05:54

Heh... I wonder if there is a way? 'twould be interesting.

Jason Baker 2009-07-05 13:46:09

Running JSON through Python's eval()?
