views:

944

answers:

6

Does python have any sort of built in functionality of notifying what dictionary elements changed upon dict update? For example I am looking for some functionality like this:

>>> a = {'a':'hamburger', 'b':'fries', 'c':'coke'}
>>> b = {'b':'fries', 'c':'pepsi', 'd':'ice cream'}
>>> a.diff(b)
{'c':'pepsi', 'd':'ice cream'}
>>> a.update(b)
>>> a
{'a':'hamburger', 'b':'fries', 'c':'pepsi', 'd':'ice cream'}

I am looking to get a dictionary of the changed values as shown in the result of a.diff(b)

A: 

Not built in, but you could iterate on the keys of the dict and do comparisons. Could be slow though.

Better solution is probably to build a more complex datastructure and use a dictionary as the underlying representation.

Max
+2  A: 

No, it doesn't. But it's not hard to write a dictionary diff function:

def diff(a, b):
  diff = {}
  for key in b.keys():
    if (not a.has_key(key)) or (a.has_key(key) and a[key] != b[key]):
      diff[key] = b[key]
  return diff
Can Berk Güder
+7  A: 

No, but you can subclass dict to provide notification on change.

class ObservableDict( dict ):
    def __init__( self, *args, **kw ):
        self.observers= []
        super( ObservableDict, self ).__init__( *args, **kw )
    def observe( self, observer ):
        self.observers.append( observer )
    def __setitem__( self, key, value ):
        for o in self.observers:
            o.notify( self, key, self[key], value )
        super( ObservableDict, self ).__setitem__( key, value )
    def update( self, anotherDict ):
        for k in anotherDict:
            self[k]= anotherDict[k]

class Watcher( object ):
    def notify( self, observable, key, old, new ):
        print "Change to ", observable, "at", key

w= Watcher()
a= ObservableDict( {'a':'hamburger', 'b':'fries', 'c':'coke'} )
a.observe( w )
b = {'b':'fries', 'c':'pepsi'}
a.update( b )

Note that the superclass Watcher defined here doesn't check to see if there was a "real" change or not; it simply notes that there was a change.

S.Lott
Why does ObservableDict override the update method? Does the built in update method not call __setitem__?
adam
It didn't appear to in my quick test. Perhaps I missed something.
S.Lott
+1  A: 
def diff_update(dict_to_update, updater):
    changes=dict((k,updater[k]) for k in filter(lambda k:(k not in dict_to_update or updater[k] != dict_to_update[k]), updater.iterkeys()))
    dict_to_update.update(updater)
    return changes

a = {'a':'hamburger', 'b':'fries', 'c':'coke'}
b = {'b':'fries', 'c':'pepsi'}
>>> print diff_update(a, b)
{'c': 'pepsi'}
>>> print a
{'a': 'hamburger', 'c': 'pepsi', 'b': 'fries'}
AKX
I like how this function handles both updating and returning the changes. However if I had to come back to this in 6 months to fix a bug, I probably would wish I had used a simple for loop instead of a lambda.
adam
+1  A: 

A simple diffing function is easy to write. Depending on how often you need it, it may be faster than the more elegant ObservableDict by S.Lott.

def dict_diff(a, b):
    """Return differences from dictionaries a to b.

    Return a tuple of three dicts: (removed, added, changed).
    'removed' has all keys and values removed from a. 'added' has
    all keys and values that were added to b. 'changed' has all
    keys and their values in b that are different from the corresponding
    key in a.

    """

    removed = dict()
    added = dict()
    changed = dict()

    for key, value in a.iteritems():
        if key not in b:
            removed[key] = value
        elif b[key] != value:
            changed[key] = b[key]
    for key, value in b.iteritems():
        if key not in a:
            added[key] = value
    return removed, added, changed

if __name__ == "__main__":
    print dict_diff({'foo': 1, 'bar': 2, 'yo': 4 }, 
                    {'foo': 0, 'foobar': 3, 'yo': 4 })
Lars Wirzenius
After-the-fact difference comparison has to be slower than a per-update log of differences. Collecting differences as the updates occur doesn't require a the after-the-fact loop.
S.Lott
Keeping track of changes as you make updates has cost. If you only need a diff rarely, it can be cheaper to compute it rarely than accumulate cost during every change.
Lars Wirzenius
+1  A: 

one year later

I like the following solution:

>>> def dictdiff(d1, d2):                                              
        return dict(set(d2.iteritems()) - set(d1.iteritems()))
... 
>>> a = {'a':'hamburger', 'b':'fries', 'c':'coke'}
>>> b = {'b':'fries', 'c':'pepsi', 'd':'ice cream'}
>>> dictdiff(a, b)
{'c': 'pepsi', 'd': 'ice cream'}
remosu