I have 2 lists of dictionaries and want to return the items which have the same id but a different title, e.g.

list1 = [{'id': 1, 'title': 'title1'}, {'id': 2, 'title': 'title2'}, {'id': 3, 'title': 'title3'}]

list2 = [{'id': 1, 'title': 'title1'}, {'id': 2, 'title': 'title3'}, {'id': 3, 'title': 'title4'}]

This should return [{'id': 2, 'title': 'title2'}, {'id': 3, 'title': 'title3'}], since those titles in list1 differ from the ones in list2.

A: 

Different dictionaries are equal if their contents are equal. So you can just do:

result = []
for i in list1:
    if i not in list2:
        result.append(i)
Ivo
Just keep in mind that the lookup `if i not in list2` runs in linear time, so this becomes expensive when `list2` is large. When repeated lookups are necessary, consider using a dictionary or a set instead. Since dictionaries can't be hashed, you can't just turn `list2` into a set, but marr75's answer shows how you can get around this.
gotgenes
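One way to act on this comment, sketched under the assumption that every item has exactly the keys `id` and `title`: a dictionary is not hashable, but an `(id, title)` tuple is, so `list2` can be reduced to a set of such tuples. The name `seen` is made up for illustration.

```python
list1 = [{'id': 1, 'title': 'title1'}, {'id': 2, 'title': 'title2'}, {'id': 3, 'title': 'title3'}]
list2 = [{'id': 1, 'title': 'title1'}, {'id': 2, 'title': 'title3'}, {'id': 3, 'title': 'title4'}]

# Build the set once; each membership test afterwards is constant time on average.
seen = set((d['id'], d['title']) for d in list2)

# Keep every item of list1 whose (id, title) pair does not occur in list2.
result = [d for d in list1 if (d['id'], d['title']) not in seen]
print(result)
```

Like the `not in list2` loop, this also keeps items whose id does not appear in `list2` at all.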
+2  A: 

I propose that you refactor your design to not be a list of dictionaries, but 2 dictionaries of id: title pairs. The algorithm is trivial at that point and the performance is better.

Code example (edited to reflect SilentGhost's correct assertion):

titles1 = {1: "title1", 2: "title2", 3: "title3"}
titles2 = {1: "title1", 2: "not_title2", 3: "title3"}
mismatched = []
for id, title in titles1.items():
    # verify the key is in titles2, then compare title to titles2[id]
    if id in titles2 and titles2[id] != title:
        mismatched.append(id)

Code example to convert list of dictionary to dictionary with id as key:

titles1 = dict([(x["id"], x) for x in list1])
marr75
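Applied to the lists from the question, the idea might look like the sketch below. The function name `changed_titles` is hypothetical, and the code assumes each id occurs at most once in `list2`.

```python
def changed_titles(list1, list2):
    """Return the items of list1 whose id also occurs in list2 with a different title."""
    # Map id -> title for list2 so each lookup is a single dictionary access.
    titles2 = dict((d['id'], d['title']) for d in list2)
    return [d for d in list1
            if d['id'] in titles2 and titles2[d['id']] != d['title']]


list1 = [{'id': 1, 'title': 'title1'}, {'id': 2, 'title': 'title2'}, {'id': 3, 'title': 'title3'}]
list2 = [{'id': 1, 'title': 'title1'}, {'id': 2, 'title': 'title3'}, {'id': 3, 'title': 'title4'}]

print(changed_titles(list1, list2))
```

Items whose id is missing from `list2` are skipped here, which matches the "same id but different title" wording of the question.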
Thanks for the advice. I'm new to Python, so could you tell me how I can convert my list of dictionaries to dictionaries of id: title pairs? Thanks
John
I'd like to note that this would also work when the values in each dictionary are more complex objects than a string. Additionally, if you don't intend any side effects on either dictionary in your checking loop, you can construct a new list of mismatched objects in a functional style, which I recommend.
marr75
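A minimal sketch of that functional style, assuming both dictionaries map an id to a whole record. The names `records1`, `records2` and `mismatched` are illustrative, and the `year` field is invented sample data to show a value richer than a single string. Neither dictionary is modified; a new list is built instead.

```python
# Made-up sample records with an extra field besides the title.
list1 = [{'id': 1, 'title': 'title1', 'year': 2001}, {'id': 2, 'title': 'title2', 'year': 2002}]
list2 = [{'id': 1, 'title': 'title1', 'year': 2001}, {'id': 2, 'title': 'title2', 'year': 1999}]

# Index each list by id, keeping the full record as the value.
records1 = dict((x['id'], x) for x in list1)
records2 = dict((x['id'], x) for x in list2)

# Collect the records of list1 that have a counterpart in list2 with different contents.
mismatched = [rec for key, rec in records1.items()
              if key in records2 and records2[key] != rec]
print(mismatched)
```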
there is no need for `.keys()`
SilentGhost
Just change the way you construct the dictionaries to the example. Are you asking how you can CONVERT your list of dictionaries into a dictionary of integers and strings at runtime?
marr75
oh, dear. is correct syntax too much to ask? it's one of many recent replies where people cannot get iteration of dicts and their content right. and I'll be blamed in the end.
SilentGhost
The dictionaries are created from a MySQL db using the Python library MySQLdb, so I'm not sure I can change the way they are created. So what is the best way to convert them into the structure you suggest?
John
`dict1=dict((d['id'], d['title']) for d in list1)` (or `dict1={d['id']: d['title'] for d in list1}` in Python 3.1)
Tim Pietzcker
@John I gave you an example of how to create a dictionary out of the lists you have. It's almost exactly the same as Tim's, except that my example constructs a dictionary of dictionaries (which would work for more complicated data), so you will need to access the 'title' element of each item in titles1.
marr75
@SilentGhost thanks for the good catch. I'm at work on a machine meant for .NET development; while I do have Python installed, I don't have the environment set up to quickly and easily check my examples. Thanks again.
marr75
actually, this seems to work since Python 2.7...
Tim Pietzcker
+1  A: 
[dc for dc in list1 if dc['id'] in [d["id"] for d in list2] and dc not in list2]
dekomote
This scales poorly. For one, you have an inner loop (`[d["id"] for d in list2]`) that runs for every iteration of the outer loop (`dc for dc in list1`). You should have at least cached that inner loop list. Secondly, list lookup is linear time, whereas dictionary or set lookup is amortized constant time. The answer by marr75 is on the right track.
gotgenes
If he worries about scaling, he should really rethink the whole structure. I just pointed out a solution that works with his current data structure.
dekomote
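For reference, the caching suggested in the first comment could be sketched as follows, keeping the current data structure; `ids2` is an illustrative name. This removes only the repeated inner loop. The `dc not in list2` test is still a linear scan of `list2`.

```python
list1 = [{'id': 1, 'title': 'title1'}, {'id': 2, 'title': 'title2'}, {'id': 3, 'title': 'title3'}]
list2 = [{'id': 1, 'title': 'title1'}, {'id': 2, 'title': 'title3'}, {'id': 3, 'title': 'title4'}]

# Collect the ids of list2 once instead of rebuilding the list for every item of list1.
ids2 = set(d['id'] for d in list2)

result = [dc for dc in list1 if dc['id'] in ids2 and dc not in list2]
print(result)
```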
A: 

If you refactor your data structure (assuming id is unique within one dictionary), a comparison can be implemented more efficiently, namely in O(n), because dictionary lookups are O(1) on average. Example:

#!/usr/bin/env python

d1 = {
    1 : {"title" : "title1"},
    2 : {"title" : "title2"},
    3 : {"title" : "title3"},
}

d2 = {
    1 : {"title" : "title1"},
    2 : {"title" : "title3"},
    3 : {"title" : "title4"},
}

for key, value in d1.items():
    if value != d2[key]:
        print("@ %s values differ: %s vs %s" % (key, d1[key], d2[key]))

# @ 2 values differ: {'title': 'title2'} vs {'title': 'title3'}
# @ 3 values differ: {'title': 'title3'} vs {'title': 'title4'}

Or shorter:

print([(k, (d1[k], d2[k])) for k in d1 if d2[k] != d1[k]])
# [(2, ({'title': 'title2'}, {'title': 'title3'})),
#  (3, ({'title': 'title3'}, {'title': 'title4'}))]
The MYYN
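Note that `d2[key]` raises a `KeyError` when an id from `d1` is missing in `d2`. A hedged variant that simply skips such ids, assuming that is the desired behaviour, could look like this (the sample data is altered so that id 3 is missing from `d2`, and `differing` is an illustrative name):

```python
d1 = {1: {"title": "title1"}, 2: {"title": "title2"}, 3: {"title": "title3"}}
d2 = {1: {"title": "title1"}, 2: {"title": "title3"}}  # id 3 deliberately missing

# Compare only ids present in both dictionaries; sorted() makes the order explicit.
differing = [(k, (d1[k], d2[k])) for k in sorted(d1) if k in d2 and d1[k] != d2[k]]
print(differing)
```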