I'm trying to load JSON back into an object. The "loads" method seems to work without error, but the object doesn't seem to have the properties I expect.

How can I go about examining/inspecting the object that I have? (This is web-based code.)

  results = {"Subscriber": {"firstname": "Neal", "lastname": "Walters"}}
  subscriber = json.loads(results)


  for item in inspect.getmembers(subscriber): 
     self.response.out.write("<BR>Item")
     for subitem in item: 
         self.response.out.write("<BR>&nbsp;SubItem=" + subitem)

The attempt above returned this:

   Item
     SubItem=__class__

I don't think it matters, but for context: The JSON is actually coming from a urlfetch in Google App Engine to a rest web service created using this utility: http://code.google.com/p/appengine-rest-server. The data is being retrieved from a datastore with this definition:

class Subscriber(db.Model):
    firstname    = db.StringProperty()
    lastname     = db.StringProperty()

Thanks, Neal

Update #1: Basically I'm trying to deserialize JSON back into an object. In theory it was serialized from an object, and I want to now get it back into an object. Maybe the better question is how to do that?

Update #2: I was trying to abstract a complex program down to a few lines of code, so I made a few mistakes in "pseudo-coding" it for purposes of posting here.

Here's a better code sample, now taken out of the website so that I can run it on a PC.

results = '{"Subscriber": {"firstname": "Neal", "lastname": "Walters"}}'
subscriber = json.loads(results)
for key, value in subscriber.items():
    print " %s: %s" %(key, value)

The above runs, but what it displays doesn't look any more structured than the JSON string itself. It displays this: Subscriber: {u'lastname': u'Walters', u'firstname': u'Neal'}

I have more of a Microsoft background, so when I hear serialize/deserialize, I think going from an object to a string, and from a string back to an object. So if I serialize to JSON, and then deserialize, what do I get, a dictionary, a list, or an object? Actually, I'm getting the JSON from a REST webmethod, that is on my behalf serializing my object for me.

Ideally I want a subscriber object that matches my Subscriber class above, and ideally, I don't want to write one-off custom code (i.e. code that would be specific to "Subscriber"), because I would like to do the same thing with dozens of other classes. If I have to write some custom code, I will need to do it generically so it will work with any class.

Update #3: This is to explain more of why I think this is a needed tool. I'm writing a huge app, probably on Google App Engine (GAE). We are leaning toward a REST architecture for several reasons, but one is that our web GUI should access the data store via a REST web layer. (I'm a lot more used to SOAP, so switching to REST is a small challenge in itself.) So one of the classic ways of getting and updating data is through a business or data tier. By using the REST utility mentioned above, I have the choice of XML or JSON. I'm hoping to do a small working prototype of both before we develop the huge app. Then, suppose we have a successful app, and GAE doubles its prices. Then we can rewrite just the data tier, and take our Python/Django user tier (web code), and run it on Amazon or somewhere else.

If I'm going to do all that, why would I want everything to be dictionary objects? Wouldn't I want the power of a full-blown class structure? One of the next tricks is a sort of object-relational mapping (ORM), so that we don't necessarily expose our exact data tables, but more of a logical layer.

We also want to expose a RESTful API to paying users, who might be using any language. For them, they can use XML or JSON, and they wouldn't use the serialize routine discussed here.

A: 

My guess is that loads is returning a dictionary. To iterate over its content, use something like:

for key, value in subscriber.items():
    self.response.out.write("%s: %s" %(key, value))
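
To make the nesting concrete, here is a minimal sketch (written for Python 3, using the JSON string from the question's second update) that reaches into the inner dictionary rather than printing it as one blob. The variable names `data` and `fields` are only illustrative.

```python
import json

results = '{"Subscriber": {"firstname": "Neal", "lastname": "Walters"}}'
data = json.loads(results)

# The top level is a dict with a single key, "Subscriber",
# whose value is itself a dict mapping field names to values.
fields = data["Subscriber"]
print(type(data).__name__, type(fields).__name__)
print(fields["firstname"], fields["lastname"])

# Walk both levels instead of printing the inner dict as one string.
for kind, attrs in data.items():
    for name, value in sorted(attrs.items()):
        print("%s.%s = %s" % (kind, name, value))
```

So the values are all there; they are just reached with `["key"]` lookups rather than attribute access.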
David Wolever
+3  A: 

results in your snippet is a dict, not a string, so the json.loads would raise an exception. If that is fixed, each subitem in the inner loop is then a tuple, so trying to add it to a string as you are doing would raise another exception. I guess you've simplified your code, but the two type errors should already show that you simplified it too much (and incorrectly). Why not use an (equally simplified) working snippet, and the actual string you want to json.loads instead of one that can't possibly reproduce your problem? That course of action would make it much easier to help you.

Beyond peering at the actual string, and showing some obvious information such as type(subscriber), it's hard to offer much more help based on that clearly-broken code and such insufficient information:-(.

Edit: in "update2", the OP says

It displays this: Subscriber: {u'lastname': u'Walters', u'firstname': u'Neal'}

...and what else could it possibly display, pray?! You're printing the key as string, then the value as string -- the key is a string, and the value is another dict, so of course it's "stringified" (and all strings in JSON are Unicode -- just like in C# or Java, and you say you come from a MSFT background, so why does this surprise you at all?!). str(somedict), identically to repr(somedict), shows the repr of keys and values (with braces around it all and colons and commas as appropriate separators).

JSON, a completely language-independent serialization format though originally centered on Javascript, has absolutely no idea of what classes (if any) you expect to see instances of (of course it doesn't, and it's just absurd to think it possibly could: how could it possibly be language-independent if it hard-coded the very concept of "class", a concept which so many languages, including Javascript, don't even have?!) -- so it uses (in Python terms) strings, numbers, lists, and dicts (four very basic data types that any semi-decent modern language can be expected to have, at least in some library if not embedded in the language proper!), plus the constants true, false and null. When you json.loads a string, you'll always get some nested combination of the four datatypes above and those constants (all strings will be unicode, and numbers will be ints or floats depending on how they are written, BTW;-).
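
A quick way to check this for yourself is to decode a small document and print the Python type of each value. The sketch below assumes Python 3 and the standard `json` module; the sample data (`age`, `ratio`, `tags` and so on) is made up purely for illustration.

```python
import json

# Made-up sample covering each kind of JSON value.
sample = ('{"name": "Neal", "age": 42, "ratio": 0.5, "tags": ["a", "b"], '
          '"active": true, "manager": null, "address": {"city": "Dallas"}}')
decoded = json.loads(sample)

# Print the Python type that each JSON value was decoded into.
for key in sorted(decoded):
    print("%-8s %s" % (key, type(decoded[key]).__name__))
```

Nothing in that list is an instance of a user-defined class, which is the whole point: the decoder only ever builds the built-in types.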

If you have no idea (and don't want to encode by some arbitrary convention or other) what class's instances are being serialized, but absolutely must have class instances back (not just dicts etc) when you deserialize, JSON per se can't help you -- that metainformation cannot possibly be present in the JSON-serialized string itself.

If you're OK with the four fundamental types, and just want to see some printed results that you consider "prettier" than the default Python string printing of the fundamental types in question, you'll have to code your own recursive pretty-printing function depending on your subjective definition of "pretty" (I doubt you'd like Python's own pprint standard library module any more than you like your current results;-).
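
Before writing a custom printer, one inexpensive option (a sketch, not a recommendation of any particular layout) is to re-serialize the decoded structure with `json.dumps` and its `indent` and `sort_keys` arguments, which at least shows the nesting one level per line. Python 3 is assumed here.

```python
import json

results = '{"Subscriber": {"firstname": "Neal", "lastname": "Walters"}}'
subscriber = json.loads(results)

# indent puts each nesting level on its own indented lines;
# sort_keys makes the output order predictable.
pretty = json.dumps(subscriber, indent=2, sort_keys=True)
print(pretty)
```

Whether that counts as "pretty" is still subjective, but it costs one line and makes the dict-inside-a-dict shape obvious.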

Alex Martelli
Please see update2 in my original post. thanks!
NealWalters
@Neal, sure, but I have a hard time understanding what else you expect -- editing my answer accordingly.
Alex Martelli
It just seems obvious to me that if you start with an object, and serialize/deserialize, you would want to end up with the same object you started with, not a dictionary representation of that object. Would I be better off using XML to serialize? The REST server that I'm experimenting with can return JSON or XML. Yes, JSON is language independent, but SimpleJson is Python specific, right? Seems like its .load would inspect and recreate the object (if there was a way to pass the classname). What I'm trying to do is so easy and obvious in C#, but I'm still learning the ins and outs of Python.
NealWalters
My C# analogy: http://biztalk-training.com/articles.php?article_id=8 and the print-out in update2 was following David's sample code from the other answer.
NealWalters
I'm curious about this as well. It seems that my answer does what he is asking for.
aaronasterling
@Neal, REST can serve XML or JSON, but is independent from such details as class names and structures and also the language being used to query **AND** to serve the data. Your C# example depends on serializer and de-serializer being in the same language (or languages with equivalent abilities -- e.g., Python and C++ have multiple inheritance, Java and C# don't, so it would be impossible to interoperate between the former two and the latter two!!!) **and** both knowing all details about the classes being encoded and decoded -- all so **incredibly** incompatible with REST as to boggle the mind.
Alex Martelli
Thanks, I added more of my architecture/reasoning in update3 of original message. Do you think I'm crazy and barking up the wrong tree? What other alternatives are there for such an enterprise level undertaking?
NealWalters
@Neal, I do think you're coming at it with the wrong attitude -- much like people who can't stand to think of (e.g.) an SQL underlying data layer and absolutely, compulsively need to pull it back into "objects" before they stop to breathe. The _good_ alternative in each case is accepting the storage and/or serialization layer for what it does offer you, neither more nor less, and absolutely **not** trying to force it into your OOP preferences prematurely. But, hey, took me decades to learn that, and plenty of mistakes akin to yours -- experience will be your teacher just as it was mine!-)
Alex Martelli
+1  A: 

json only encodes strings, floats, integers, booleans, null, javascript objects (python dicts) and lists.

You have to create a function to turn the returned dictionary into a class instance and then pass it to json.loads using the object_hook keyword argument along with the json string. Here's some code that fleshes it out:

import json

class Subscriber(object):
    firstname = None
    lastname = None


class Post(object):
    author = None
    title = None


def decode_from_dict(cls,vals):
    obj = cls()
    for key, val in vals.items():
        setattr(obj, key, val)
    return obj


SERIALIZABLE_CLASSES = {'Subscriber': Subscriber,
                        'Post': Post}

def decode_object(d):
    for field in d:
        if field in SERIALIZABLE_CLASSES:
            cls = SERIALIZABLE_CLASSES[field]
            return decode_from_dict(cls, d[field])
    return d


# Each object is wrapped in a one-key dict naming its class.
results = '''[{"Subscriber": {"firstname": "Neal", "lastname": "Walters"}},
              {"Post": {"author": {"Subscriber": {"firstname": "Neal",
                                                  "lastname": "Walters"}},
                        "title": "Decoding JSON Objects"}}]'''
result = json.loads(results, object_hook=decode_object)
print(result)
print(result[1].author)

This will handle any class that can be instantiated without arguments to the constructor and for which setattr will work.

Also, this uses json. I have no experience with simplejson so YMMV but I hear that they are identical.

Note that although the values for the two subscriber objects are identical, the resulting objects are not. This could be fixed by memoizing the decode_from_dict function.

aaronasterling
Looks interesting, but if the class had 10-30 or more properties, I would almost need a code generator to make the __init__ method. That's kind of why I was thinking along the lines of "inspection". I'm using the db.Model class of Google App Engine (not sure that makes any difference), but each class is basically a database table/row, each of which will often contain 20-50 attributes/columns. I'll try to walk through your code more tomorrow, kind of burned out tonight. Thanks.
NealWalters
Alex Martelli
@Neal, I updated it to work with the situation that you are describing. You will just have to create a dictionary entry for each class that you want to be able to serialize/deserialize as json.
aaronasterling
@Alex If you look at the code, there are only 13 lines that are needed for this to work, and it uses a pretty clean format for encoding the objects. I would hardly consider this complicated, and building one protocol on top of another is a fairly well broken-in practice. If I needed to decode that json, 13 lines of code wouldn't stop me.
aaronasterling
@Aaron, the problem's not the 13 lines -- it's with the **mutual knowledge needed** (and language commonalities needed) by the agent doing the encoding, and the agent doing the decoding. Such proprietary metadata, requiring common knowledge by both parties beyond JSON and REST standards, is the antithesis of the core concepts on which REST and JSON founded their success and growing mindshare (and the common crufty layering of this kind in XML and SOAP is arguably a reason why JSON and REST, lighter and **free** from such "thick glue", are often preferred over them nowadays).
Alex Martelli
Aaron - I really like your code and appreciate it! I'll have to tweak it, because my real app obviously has more than lastname, firstname, and it already choked on a datetime: "dateAdded": "2010-09-14T03:47:24.992000" ... but I can deal with that. I'll update my original question with more of the purpose of why I'm doing this.
NealWalters
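
For the datetime value mentioned in the last comment, one possible approach (a sketch only, assuming Python 3 and that every timestamp looks like the single sample "2010-09-14T03:47:24.992000") is an object_hook that tries to parse each string value and leaves it untouched if parsing fails. The names `ISO_FORMAT` and `parse_datetimes` are hypothetical.

```python
import json
from datetime import datetime

# Format inferred from the one sample value in the comment above (an assumption).
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"

def parse_datetimes(d):
    # object_hook: replace string values that match ISO_FORMAT with datetimes.
    for key, value in list(d.items()):
        if isinstance(value, str):
            try:
                d[key] = datetime.strptime(value, ISO_FORMAT)
            except ValueError:
                pass  # not a timestamp in this format; keep the string
    return d

results = ('{"Subscriber": {"firstname": "Neal", '
           '"dateAdded": "2010-09-14T03:47:24.992000"}}')
record = json.loads(results, object_hook=parse_datetimes)
print(repr(record["Subscriber"]["dateAdded"]))
```

To combine this with the class-building hook from the answer, the two hooks would have to be chained inside a single function, since json.loads accepts only one object_hook.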