views:

197

answers:

5

How can I determine if any of the list elements are a key to a dict? The straight forward way is,

for i in myList:
   if i in myDict:
      return True
return False

but is there a faster / more concise way?

+17  A: 
#!python
any(x in MyDict for x in MyList)
set(MyList)&set(MyDict)
Ronny
+1 for clarity, efficiency (of the first version only!), and simplicity.
Alex Martelli
btw, 'set' version is faster than 'any' version if key from `MyDict' is not at the beginning of `MyList` http://stackoverflow.com/questions/1737778/dict-has-key-from-list/1737862#1737862
J.F. Sebastian
the `set` version is less clear than the `any` version tough...
João Portela
+1  A: 

Assuming you are talking about python, an alternate method of doing this would be:

return len([x for x in myList if x in myDict]) > 0
Pablo Santa Cruz
Two problems: this unnecessarily builds a list and looks for all occurrences without short-circuiting.
hughdbrown
+3  A: 

In addition to any(item in my_dict for item in my_list) from @Ronny's answer:

any(map(my_dict.__contains__, my_list)) # Python 3.x

Or:

from itertools import imap
any(imap(my_dict.__contains__, my_list)) # Python 2.x

Measure relative performance

Cases to consider:

  1. Item from the start of list is in in dictionary.
  2. Item from the end of list is in dictionary.
  3. No items from the list are in dictionary.

Functions to compare (see main.py):

def mgag_loop(myDict, myList):
    for i in myList:
        if i in myDict:
            return True
    return False

def ronny_any(myDict, myList):
    return any(x in myDict for x in myList)

def ronny_set(myDict, myList):
    return set(myDict) & set(myList)

def pablo_len(myDict, myList):
    return len([x for x in myList if x in myDict]) > 0

def jfs_map(my_dict, my_list):
    return any(map(my_dict.__contains__, my_list))

def jfs_imap(my_dict, my_list):
    return any(imap(my_dict.__contains__, my_list))

Results: mgag_loop() is the fastest in all cases.

1. Item from the start of list is in in dictionary.

def args_key_at_start(n):
    'Make args for comparison functions "key at start" case.'
    d, lst = args_no_key(n)
    lst.insert(0, n//2)
    assert (n//2) in d and lst[0] == (n//2)
    return (d, lst)

key at start

2. Item from the end of list is in dictionary.

def args_key_at_end(n):
    'Make args for comparison functions "key at end" case.'
    d, lst = args_no_key(n)
    lst.append(n//2)
    assert (n//2) in d and lst[-1] == (n//2)
    return (d, lst)

key at end

3. No items from the list are in dictionary.

def args_no_key(n):
    'Make args for comparison functions "no key" case.'
    d = dict.fromkeys(xrange(n))
    lst = range(n, 2*n+1)
    assert not any(x in d for x in lst)
    return (d, lst)

no key

How to reproduce

Download main.py, make-figures.py, run python main.py (numpy, matplotlib should be installed to create plots).

To change maximum size of input list, number of points to plot supply --maxn, --npoints correspondingly. Example:

$ python main.py --maxn 65536 --npoints 16
J.F. Sebastian
The key advantage of `any` is that it short-circuits, and `any(map(...` loses that key advantage!
Alex Martelli
Brilliant. I don't know python very well and learned a lot today by reading all the answer. Your summary is awesome. You should be awarded with the answer! :-)
Pablo Santa Cruz
OMG. I will go back and test tomorrow night.
mgag
@Alex: You're right. I've clarified that 'map' variant is suitable for Python 3.x
J.F. Sebastian
@mgag: Beware that micro-benchmarks do not matter much. Always measure performance of *your* code *before* applying any optimizations.
J.F. Sebastian
if it helps, the list always has between 1-4 elements, and the dict has ~100-150 elements.
mgag
I need to work with python 2.5+. Speed is a issue for customer of my code - but thru other optimizations I am in the good now. But I will go back and test given result from Alex - I may have abandoned my approach too soon and not gone back and tested while I did other optimizations....
mgag
I tested between the two, specifically, len([x for x in nodeList if x in removeNode]) > 0vs def mgag_loop(myDict, myList): for i in myList: if i in myDict: return True return FalseAnd the results are nearly identical for three runs (~80sec for each run), within 1%. The second option was always faster, but the margin is small.
mgag
+1 for the hard work :p
João Portela
+1 for pretty graphics.
bcat
+1  A: 

This was a popular answer to a related question:

>>> if all (k in foo for k in ("foo","bar")):
...     print "They're there!"
...
They're there!

You can adapt it to check if any appears in the dictionary:

>>> if any(k in myDict for k in ("foo","bar")):
...     print "Found one!"
...
Found one!

You can check against a list of keys:

>>> if any(k in myDict for k in myList):
...     print "Found one!"
...
Found one!
hughdbrown
+1  A: 

Thank you all so much. I tested the performance of all the answers and the fastest was

return len([x for x in myList if x in myDict]) > 0

but I didn't try the "set" answer because I didn't see how to turn it into one line.

mgag
Premature optimization is the root of all evil.The any() solution was much more elegant.
Virgil Dupras
FYI, You'll want to use the checkmark to denote your accepted answer rather than post another answer like this.
Joe Holloway
I can't see how `len()` can be faster. What input have you used? I've measured performance for all answers: http://stackoverflow.com/questions/1737778/dict-has-key-from-list/1737862#1737862
J.F. Sebastian