ansaurus

Question

Fast comparison between two Python dictionaries

Answer 1

+3 A:

Not sure whether it's "fast" or not, but normally one can do this:

dicta = {"a":1,"b":2,"c":3,"d":4}
dictb = {"a":1,"d":2}
for key in dicta:
    if key not in dictb:
        print(key)

ghostdog74 2009-07-22 13:48:39

You have to swap `dicta` and `dictb` since he wants to know those keys of `dictb` that are not in `dicta`.

Gumbo 2009-07-22 13:51:45

Thanks. I am bad with English :)

ghostdog74 2009-07-22 14:07:11

Answer 2

+8 A:

As Alex Martelli wrote, if you simply want to check if any key in B is not in A, any(True for k in dictB if k not in dictA) would be the way to go.

To find the keys that are missing:

diff = set(dictB)-set(dictA) #sets

C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA = dict(zip(range(1000),range(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=set(dictB)-set(dictA)"
10000 loops, best of 3: 107 usec per loop

diff = [ k for k in dictB if k not in dictA ] #lc

C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA = dict(zip(range(1000),range(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=[ k for k in dictB if k not in dictA ]"
10000 loops, best of 3: 95.9 usec per loop

So those two solutions are pretty much the same speed.
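
The measurements above can also be reproduced from a script rather than from the command line. What follows is a minimal sketch using the standard `timeit` module on the same two dictionaries. The function names are made up for illustration, and the third variant assumes Python 3, where key views support set operations directly. Absolute timings depend on the machine and Python version, so none are shown here.

```python
import timeit

# Same test data as in the command-line runs above.
dictA = dict(zip(range(1000), range(1000)))
dictB = dict(zip(range(0, 2000, 2), range(1000)))

def diff_sets():
    # Build two sets, then subtract.
    return set(dictB) - set(dictA)

def diff_listcomp():
    # One pass over dictB with a hash lookup in dictA per key.
    return [k for k in dictB if k not in dictA]

def diff_keyviews():
    # Python 3 only (assumption): key views support set difference.
    return dictB.keys() - dictA.keys()

for func in (diff_sets, diff_listcomp, diff_keyviews):
    loops = 1000
    seconds = timeit.timeit(func, number=loops)
    print("%-15s %.1f usec per loop" % (func.__name__, seconds / loops * 1e6))
```

All three return the same keys; only the container type differs (a set for the first and third, a list for the second).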

THC4k 2009-07-22 13:53:29

Answer 3

A:

You can use set operations on the keys:

diff = set(dictb.keys()) - set(dicta.keys())

Here is a class to find all the possibilities: what was added, what was removed, which key-value pairs have changed, and which key-value pairs are unchanged.

class DictDiffer(object):
    """
    Calculate the difference between two dictionaries as:
    (1) items added
    (2) items removed
    (3) keys same in both but changed values
    (4) keys same in both and unchanged values
    """
    def __init__(self, current_dict, past_dict):
        self.current_dict, self.past_dict = current_dict, past_dict
        self.set_current, self.set_past = set(current_dict.keys()), set(past_dict.keys())
        self.intersect = self.set_current.intersection(self.set_past)

    def added(self):
        return self.set_current - self.intersect

    def removed(self):
        return self.set_past - self.intersect

    def changed(self):
        return set(o for o in self.intersect if self.past_dict[o] != self.current_dict[o])

    def unchanged(self):
        return set(o for o in self.intersect if self.past_dict[o] == self.current_dict[o])
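
To see what the four categories look like on a small input, here is a sketch of the same set logic written as a single function instead of the class. The name `diff_dicts` and the sample dictionaries are made up for illustration.

```python
def diff_dicts(current, past):
    """Return (added, removed, changed, unchanged) as sets of keys."""
    current_keys, past_keys = set(current), set(past)
    shared = current_keys & past_keys
    added = current_keys - past_keys        # only in current
    removed = past_keys - current_keys      # only in past
    changed = {k for k in shared if current[k] != past[k]}
    unchanged = shared - changed
    return added, removed, changed, unchanged

# Hypothetical sample data.
past = {"a": 1, "b": 2, "c": 3}
current = {"a": 1, "b": 20, "d": 4}

added, removed, changed, unchanged = diff_dicts(current, past)
print(added)      # {'d'}
print(removed)    # {'c'}
print(changed)    # {'b'}
print(unchanged)  # {'a'}
```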

hughdbrown 2009-07-22 14:11:51

Answer 4

+1 A:

If you really mean exactly what you say (that you only need to find out IF "there are any keys" in B and not in A, not WHICH ONES those might be, if any), the fastest way should be:

if any(True for k in dictB if k not in dictA): ...

If you actually need to find out WHICH KEYS, if any, are in B and not in A, and not just "IF" there are such keys, then existing answers are quite appropriate (but I do suggest more precision in future questions if that's indeed what you mean;-).

Alex Martelli 2009-07-22 16:39:16

This will not work if there's a key in B that's not in A and it evaluates to False. For example: `a = {}; b = {'': 'sample'}; any(k for k in b if k not in a)`

Steve Losh 2009-07-22 16:51:38

any(True for k in b if k not in a)

THC4k 2009-07-22 16:58:05

**@THC4k** Yep, that's likely the best way.

Steve Losh 2009-07-22 17:05:02

Good points @Steve and @thc4k, thanks - editing the answer to fix my bug now.

Alex Martelli 2009-07-23 00:23:22

Answer 5

A:

Here's a way that will work, allows for keys that evaluate to False, and still uses a generator expression to fall out early if possible. It's not exceptionally pretty though.

any(map(lambda x: True, (k for k in b if k not in a)))

EDIT:

THC4k posted a reply to my comment on another answer. Here's a better, prettier way to do the above:

any(True for k in b if k not in a)

Not sure how that never crossed my mind...
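
A small sketch of why the two forms differ, using the falsy-key example from the comments. The helper `record` and the dictionary `big` are made up for illustration; they only serve to show that `any` stops at the first missing key.

```python
a = {}
b = {"": "sample"}  # the only key missing from a is falsy

# Yields the keys themselves, so any() only ever sees "" and returns False.
buggy = any(k for k in b if k not in a)

# Yields True for each missing key, so any() returns True at the first one.
fixed = any(True for k in b if k not in a)

print(buggy)  # False
print(fixed)  # True

# Short-circuit check: count how many keys are actually examined.
visited = []

def record(k):
    visited.append(k)
    return True

big = dict.fromkeys(range(1000))
any(record(k) for k in big if k not in a)
print(len(visited))  # 1
```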

Steve Losh 2009-07-22 16:56:13

Answer 6

A:

Thanks for your responses.

Apologies for not stating my question properly. My scenario is like this: I have a dictA which can be the same as dictB, or may have some keys missing compared to dictB, or else the values of some keys might be different, in which case they have to be set to that of dictA key's value.

The problem is that the dictionary has no fixed structure and can have keys whose values are dicts of dicts....

Say

dictA={'key1':a, 'key2':b, 'key3':{'key11':cc, 'key12':dd}, 'key4':{'key111':{....}}}
dictB={'key1':a, 'key2':newb, 'key3':{'key11':cc, 'key12':newdd, 'key13':ee}.......

So the value of 'key2' has to be reset to the new value, and 'key13' has to be added inside the nested dict. A key's value does not have a fixed format: it can be a simple value, a dict, or a dict of dicts....
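
For a nested case like this, one possible approach is a recursive walk over both dictionaries. The following is only a sketch under assumptions: it treats dictB as the source of the new values and updates dictA in place, which is one reading of the description above. The function name `sync_nested` and the sample values are made up for illustration.

```python
import copy

def sync_nested(target, source):
    """Copy missing keys and differing values from source into target.

    Recurses where both sides hold a dict. Keys present only in target
    are left untouched. Returns the key paths that were added or changed.
    """
    touched = []
    for key, new_value in source.items():
        old_value = target.get(key)
        if key in target and isinstance(old_value, dict) and isinstance(new_value, dict):
            # Both sides are dicts: descend instead of replacing wholesale.
            for path in sync_nested(old_value, new_value):
                touched.append((key,) + path)
        elif key not in target or old_value != new_value:
            # Deep copy so target does not share nested dicts with source.
            target[key] = copy.deepcopy(new_value)
            touched.append((key,))
    return touched

# Hypothetical sample data modeled on the example above.
dictA = {"key1": "a", "key2": "b", "key3": {"key11": "cc", "key12": "dd"}}
dictB = {"key1": "a", "key2": "newb",
         "key3": {"key11": "cc", "key12": "newdd", "key13": "ee"}}

changes = sync_nested(dictA, dictB)
print(changes)         # [('key2',), ('key3', 'key12'), ('key3', 'key13')]
print(dictA == dictB)  # True
```

Returning the list of touched paths makes it easy to check whether anything differed at all: an empty list means dictA already contained everything in dictB.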

2009-07-23 11:29:08
