Python's list type has an index() method that takes one parameter and returns the index of the first item in the list matching the parameter. For instance:

>>> some_list = ["apple", "pear", "banana", "grape"]
>>> some_list.index("pear")
1
>>> some_list.index("grape")
3

Is there a graceful (idiomatic) way to extend this to lists of complex objects, like tuples? Ideally, I'd like to be able to do something like this:

>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
>>> tuple_list.getIndexOfTuple(1, 7)
1
>>> tuple_list.getIndexOfTuple(0, "kumquat")
2

getIndexOfTuple() is just a hypothetical method that accepts a sub-index and a value, and then returns the index of the list item with the given value at that sub-index.

Is there some way to achieve that general result, using list comprehensions or lambdas or something "in-line" like that? I think I could write my own class and method, but I don't want to reinvent the wheel if Python already has a way to do it.

+9  A: 

How about this?

>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
>>> [x for x, y in enumerate(tuple_list) if y[1] == 7]
[1]
>>> [x for x, y in enumerate(tuple_list) if y[0] == 'kumquat']
[2]

As pointed out in the comments, this would get all matches. To just get the first one, you can do:

>>> [y[0] for y in tuple_list].index('kumquat')
2
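If building the intermediate list is a concern, a generator expression passed to next() is another one-liner that stops at the first match. This is only a sketch and not part of the original answer; the -1 default is an assumption made for illustration (list.index would raise ValueError instead).

```python
tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]

# next() pulls only the first matching index from the lazy generator,
# so iteration stops as soon as a match is found.
first_kumquat = next(i for i, t in enumerate(tuple_list) if t[0] == "kumquat")
print(first_kumquat)  # 2

# Supplying a default avoids StopIteration when nothing matches.
# The -1 sentinel is an assumption made for this example.
missing = next((i for i, t in enumerate(tuple_list) if t[0] == "mango"), -1)
print(missing)  # -1
```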

The comments include a good discussion of the speed differences between the solutions posted. I may be a little biased, but I would personally stick with a one-liner: the speed difference here is pretty insignificant compared with writing functions and importing modules for this problem. If you plan to do this over a very large number of elements, though, look at the other answers, as they are faster than mine.
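For anyone who wants to check the speed difference themselves, something like the following timeit sketch should do. The function names comprehension_index and early_exit_index are made up for this example, and the numbers will vary by machine and list size, so treat any result as a rough indication only.

```python
import timeit

tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]

def comprehension_index(items, sub_index, value):
    # Builds a full list of the sub-items, then searches it.
    return [t[sub_index] for t in items].index(value)

def early_exit_index(items, sub_index, value):
    # Returns as soon as the first match is seen.
    for pos, t in enumerate(items):
        if t[sub_index] == value:
            return pos
    raise ValueError("value not found")

# Time each hypothetical helper over the same number of runs.
for func in (comprehension_index, early_exit_index):
    seconds = timeit.timeit(lambda: func(tuple_list, 0, "kumquat"), number=100000)
    print(func.__name__, seconds)
```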

Paolo Bergantino
+1 for the speedier approach
Jarret Hardie
Nice solution, but will not produce the desired result: it will not return the index of the first item only, but will iterate over the whole list and return all matches.
van
Good point, van. Fixed.
Paolo Bergantino
Still creates a new list of size N in memory, which isn't necessary. Also runs in O(n) average case, which can be improved to O(n/2). Yes I know that's still O(n) technically.
Triptych
The issue van raises is easily resolved by just picking the first result ([0]) from the list of multiple matches. Interestingly, if I run the same speed test as I did in the comments to my answer with a) Paolo's original enumerate comprehension, b) Paolo's revised comprehension and index, and c) the map/operator/index approach from my answer, option C is the fastest when there's more than one match in tuple_list (i.e. more than one "kumquat"). B is next best. A is slowest. This is fun!
Jarret Hardie
Can you throw Triptych's in that test? :)
Paolo Bergantino
Sure... and we have a winner. I used Triptych's second (super performant) example, which returns as soon as it finds the result, because his first example was essentially the same as mine but with an extra function call. The super-performant version is indeed the fastest.
Jarret Hardie
Though I should repeat that this test is obviously a quick one-off, done under conditions that may not mirror someone's production need, and is hardly well-planned, so take it with a truck-load of salt.
Jarret Hardie
Naturally. Thanks Jarret. I wish I could upvote you more than once. :)
Paolo Bergantino
This is pretty much exactly what I wanted to do. Thanks!
Ryan B. Lynch
+4  A: 

One possibility is to use the itemgetter function from the operator module:

import operator

f = operator.itemgetter(0)
print(list(map(f, tuple_list)).index("cherry"))  # yields 1; list() is needed on Python 3, where map is lazy

The call to itemgetter returns a function that will do the equivalent of foo[0] for anything passed to it. Using map, you then apply that function to each tuple, extracting the info into a new list, on which you then call index as normal.

list(map(f, tuple_list))

is equivalent to:

[f(tuple_list[0]), f(tuple_list[1]), ...etc]

which in turn is equivalent to:

[tuple_list[0][0], tuple_list[1][0], tuple_list[2][0]]

which gives:

["pineapple", "cherry", ...etc]
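The chain of equivalences above can be checked directly. This is a small sketch that assumes Python 3, where map returns a lazy iterator and has to be wrapped in list() before index() can be called; the names via_map and via_indexing are just for illustration.

```python
from operator import itemgetter

tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]

f = itemgetter(0)  # f(t) does the same as t[0]

via_map = list(map(f, tuple_list))         # list() materialises the lazy map on Python 3
via_indexing = [t[0] for t in tuple_list]  # the spelled-out equivalent

print(via_map == via_indexing)  # True
print(via_map.index("cherry"))  # 1
```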
Jarret Hardie
That's neat. I wonder if this or the list comprehension is faster? Either way, +1.
Paolo Bergantino
The problem with this is that you are iterating twice to get the index.
Nadia Alramli
Paolo asks an interesting question... as I think everyone suspects, the list comprehension and enumerate approach is slightly faster... over 100000 runs on my ever-so-scientific test, the enumerate approach was about 10 milliseconds faster.
Jarret Hardie
Cool. Thanks for doing the test.
Paolo Bergantino
I think Paolo and I should blend answers :-) After he edited his answer, I re-ran the speed tests for cases where there is more than one match in the tuple_list... and the operator approach is fastest... see my comment in Paolo's answer.
Jarret Hardie
+1  A: 

You can do this with a list comprehension and index()

>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
>>> [x[0] for x in tuple_list].index("kumquat")
2
>>> [x[1] for x in tuple_list].index(7)
1
Alasdair
+5  A: 

Those list comprehensions are messy after a while.

I like this Pythonic approach:

from operator import itemgetter

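def collect(l, index):
    # list() makes this work on Python 3, where map returns a lazy iterator
    return list(map(itemgetter(index), l))

# And now you can write this:
collect(tuple_list, 0).index("cherry")   # = 1
collect(tuple_list, 1).index(3)          # = 2 (the values are ints, so search for 3, not "3")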

If you need your code to be all super performant:

# Stops iterating through the list as soon as it finds the value
def getIndexOfTuple(l, index, value):
    for pos,t in enumerate(l):
        if t[index] == value:
            return pos

    # Matches behavior of list.index
    raise ValueError("list.index(x): x not in list")

getIndexOfTuple(tuple_list, 0, "cherry")   # = 1
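Because the function raises ValueError like list.index does, callers can handle a missing value the same way they would with a plain list. A short usage sketch, repeating the function so it runs on its own (the "mango" lookup and the found and not_found names are just for illustration):

```python
def getIndexOfTuple(l, index, value):
    # Stops iterating through the list as soon as it finds the value
    for pos, t in enumerate(l):
        if t[index] == value:
            return pos
    # Matches behavior of list.index
    raise ValueError("list.index(x): x not in list")

tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]

found = getIndexOfTuple(tuple_list, 1, 3)
print(found)  # 2

# A value that is absent raises ValueError, just as list.index would.
try:
    getIndexOfTuple(tuple_list, 0, "mango")
except ValueError:
    not_found = True
else:
    not_found = False
print(not_found)  # True
```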
Triptych
+1 as the super performant is indeed the fastest solution posted. I would personally still stick to the one liner as the speed difference at this level is pretty meaningless but it's good to know anyways.
Paolo Bergantino
Thanks. Normally I would use the collect() version - looks so much nicer.
Triptych