views:

2720

answers:

5

I have nested dictionaries:

{'key0': {'attrs': {'entity': 'p', 'hash': '34nj3h43b4n3', 'id': '4130'},
          u'key1': {'attrs': {'entity': 'r',
                              'hash': '34njasd3h43b4n3',
                              'id': '4130-1'},
                    u'key2': {'attrs': {'entity': 'c',
                                        'hash': '34njasd3h43bdsfsd4n3',
                                        'id': '4130-1-1'}}},
          u'key3': {'attrs': {'entity': 'r',
                              'hash': '34njasasasd3h43b4n3',
                              'id': '4130-2'},
                    u'key4': {'attrs': {'entity': 'c',
                                        'hash': '34njawersd3h43bdsfsd4n3',
                                        'id': '4130-2-1'}},
                    u'key5': {'attrs': {'entity': 'c',
                                        'hash': '34njawersd3h43bdsfsd4n3',
                                        'id': '4130-2-2'}}}},
 'someohterthing': 'someothervalue',
 'something': 'somevalue'}

given an id - one of all the ids like 4130 to 4130-2-2.
whats the easiest way to navigate to the correct dictionary?

Like if the given id is 4130-2-1 then it should reach the dictionary with key=key5

non xml approaches please.

Edit(1): The nesting is between 1 to 4 levels, but I know the nesting before I parse.

Edit(2): Fixed the code.

Edit(3):Fixed code again for string values of ids. Please excuse for the confusion created. This is final I hope :)

A: 

Well, if you have to do it only a few times, you can just use nested dict.iteritems() to find what you are looking for.

If you plan to do it several times, performances will quickly becomes an issue. In that case you could :

  • change the way you data is returned to you to something more suitable.

  • if you can't, convert the data once the fly to a dict between id and keys (using iteritems). Then use it.

e-satis
the idea when we created this structure was to access it through keys - like - key1, key2, etc. Now i stumbled upon a requirement to access thru ids. The second bullet point is a nice suggestion though, will try that.
JV
+4  A: 

If you want to solve the problem in a general way, no matter how many level of nesting you have in your dict, then create a recursive function which will traverse the tree:

def traverse_tree(dictionary, id=None):
    for key, value in dictionary.items():
        if key == 'id':
            if value == id:
                print dictionary
        else:
             traverse_tree(value, id)
    return

>>> traverse_tree({1: {'id': 2}, 2: {'id': 3}}, id=2)
{'id': 2}
Mapad
This doesn't work when I try it on my machine.
PEZ
I fixed the example code in question please take a re-look
JV
I have voted you up, don't know how to select 2 answers otherwise I would have selected this one also. :)
JV
A: 

Mapad has the right solution, but - hey, I fixed the dictionaries syntax:

d = {   u'somekey'      : 'somevalue',
        u'someotherkey' : 'someothervalue',
        u'key1' : {
            'attrs' : {
                'entity' : 'Post',
                'hash'   : '63e1ca180b86850ce51a6eb13bd2654a',
                'id'     : '4130'
            },
        },
        u'key2' : {
            'attrs' : {
                'entity' : 'Rev',
                'hash'   : '34176d03995d23251a8ce2c45e11f4ae',
                'id'     : '4130-1'
            },
        },
        u'key3' : {
            'attrs' : {
                'entity' : 'Comment',
                'hash'   : '34176d03995d23251a8ce2c45e11f4er',
                'id'     : '4130-1-1'
            },
        },
        u'key4' : {
            'attrs' : {
                'entity' : 'Rev',
                'hash'   : '34176d03995d23251a8ce2c45e11f4qe',
                'id'     : '4130-2'
            },
        },
        u'key5' : {
            'attrs' : {
                'entity' : 'Comment',
                'hash'   : '34176d03995d23251a8ce2c45e11f4er',
                'id'     : '4130-2-1'
            },
        },
        u'key6' : {
            'attrs' : {
                'entity' : 'Comment',
                'hash'   : '34176d03995d23251a8ce2c4er1f4er',
                'id'     : '4130-2-2'
            },
        },
    }
Tim
+5  A: 

Your structure is unpleasantly irregular. Here's a version with a Visitor function that traverses the attrs sub-dictionaries.

def walkDict( aDict, visitor, path=() ):
    for  k in aDict:
        if k == 'attrs':
            visitor( path, aDict[k] )
        elif type(aDict[k]) != dict:
            pass
        else:
            walkDict( aDict[k], visitor, path+(k,) )

def printMe( path, element ):
    print path, element

def filterFor( path, element ):
    if element['id'] == '4130-2-2':
        print path, element

You'd use it like this.

walkDict( myDict, filterFor )

This can be turned into a generator instead of a Visitor; it would yield path, aDict[k] instead of invoking the visitor function.

You'd use it in a for loop.

for path, attrDict in walkDictIter( aDict ):
    # process attrDict...
S.Lott
I have a huge collection of these, if you can suggest a better structure with arbitrary level support, ease of insertion and retrieval, that will be great. By the time you figure that structure I will try your solution. Thanks.
JV
@JV: The internal "attrs" dictionaries are ill-advised. Those a candidates for being objects of some defined class, not just anonymous dictionaries.
S.Lott
+1 for using Visitor
JV
+6  A: 

This kind of problem is often better solved with proper class definitions, not generic dictionaries.

class ProperObject( object ):
    """A proper class definition for each "attr" dictionary."""
    def __init__( self, path, attrDict ):
        self.path= path
        self.__dict__.update( attrDict )
    def __str__( self ):
        return "path %r, entity %r, hash %r, id %r" % (
            self.path, self.entity, self.hash, self.id )

masterDict= {} 
def builder( path, element ):
    masterDict[path]= ProperObject( path, element )

# Use the Visitor to build ProperObjects for each "attr"
walkDict( myDict, builder )

# Now that we have a simple dictionary of Proper Objects, things are simple
for k,v in masterDict.items():
    if v.id == '4130-2-2':
        print v

Also, now that you have Proper Object definitions, you can do the following

# Create an "index" of your ProperObjects
import collections
byId= collections.defaultdict(list)
for k in masterDict:
    byId[masterDict[k].id].append( masterDict[k] )

# Look up a particular item in the index
print map( str, byId['4130-2-2'] )
S.Lott
If you do a lot of lookups, the cost to transform to Objects and then to an index on 'id' is amortized over the lookups. Building the objects is O(n). Building the index is O(n) and can be done as the objects are being built. Lookup on id is O(1).
S.Lott