Hi,

I have a list of two-item lists and need to search for things in it.

If the list is:

list =[ ['a','b'], ['a','c'], ['b','d'] ]

I can search for a pair easily by doing

['a','b'] in list

Now, is there a way to see if I have a pair in which a string is present in just the second position? I can do this:

for i in range (0, len(list)):
    if list[i][1]==search:
       found=1

But is there a (better) way without the for loop? I don't need to know i or keep the loop going after it's found.

+3  A: 

You're always going to have to have a loop - someone might come along with a clever one-liner that hides the loop within a call to map() or similar, but it's always going to be there.

My preference would always be to have clean and simple code, unless performance is a major factor.

Here's perhaps a more Pythonic version of your code:

data = [['a','b'], ['a','c'], ['b','d']]
search = 'c'
for sublist in data:
    if sublist[1] == search:
        print "Found it!", sublist
        break
# Prints: Found it! ['a', 'c']

It breaks out of the loop as soon as it finds a match.

(You have a typo, by the way, in ['b''d'].)

RichieHindle
At this point in my Python "career" I favor this approach since it is very easy to read. I might come back to try the other ones for performance, though. My list gets quite big at one point. Is there any place where I can compare the performance of the different approaches?
greye
Use the `timeit` module for performance testing of this kind of thing: http://docs.python.org/library/timeit.html
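
As a rough sketch of what such a comparison could look like (Python 3 syntax; the test data, its size, and the function names `with_loop`, `with_any` and `with_in` are made up for illustration and are not from the thread), the match is placed at the very end so every approach has to scan the whole list:

```python
import timeit

# Hypothetical test data: 10,000 pairs, with the only match in the last pair.
data = [['a', str(i)] for i in range(10000)]
search = '9999'

def with_loop():
    # Plain for loop that returns on the first match
    for sublist in data:
        if sublist[1] == search:
            return True
    return False

def with_any():
    # any() over a generator expression
    return any(e[1] == search for e in data)

def with_in():
    # "in" over a generator expression
    return search in (e[1] for e in data)

for func in (with_loop, with_any, with_in):
    # Total seconds for 100 calls of each function
    elapsed = timeit.timeit(func, number=100)
    print(func.__name__, elapsed)
```

The timings depend on the machine, the Python version and where the match sits in the list, so it is worth running this against data shaped like your own.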
RichieHindle
+4  A: 
>>> my_list =[ ['a', 'b'], ['a', 'c'], ['b', 'd'] ]
>>> 'd' in (x[1] for x in my_list)
True

Editing to add:

Both David's answer using any and mine using in will end when they find a match since we're using generator expressions. Here is a test using an infinite generator to show that:

def mygen():
    ''' Infinite generator '''
    while True:
        yield 'xxx'  # Just to include a non-match in the generator
        yield 'd'

print 'd' in (x for x in mygen())     # True
print any('d' == x for x in mygen())  # True
# print 'q' in (x for x in mygen())     # Never ends if uncommented
# print any('q' == x for x in mygen())  # Never ends if uncommented

I just like simply using in instead of both == and any.

Anon
That's what it's supposed to do.
Glenn Maynard
I think any() is clearer, but I guess it's just personal preference. +1 then...
David Zaslavsky
+6  A: 

Nothing against RichieHindle's and Anon's answers, but here's how I'd write it:

data = [['a','b'], ['a','c'], ['b','d']]
search = 'c'
any(e[1] == search for e in data)

Like RichieHindle said, there is a hidden loop in the implementation of any (although I think it breaks out of the loop as soon as it finds a match).
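
One way to check the early exit is to record which elements get examined. This is a small sketch in Python 3 syntax; the `visited` list and the `matches` helper are hypothetical names added only for the demonstration:

```python
data = [['a', 'b'], ['a', 'c'], ['b', 'd']]
search = 'c'
visited = []  # hypothetical helper list: records every element examined

def matches(e):
    # Note the element, then test its second item
    visited.append(e)
    return e[1] == search

found = any(matches(e) for e in data)
print(found)    # True
print(visited)  # [['a', 'b'], ['a', 'c']]
```

The last pair, ['b', 'd'], never shows up in `visited`, because any() stops consuming the generator once it sees a true value.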

David Zaslavsky
+5  A: 
>>> the_list =[ ['a','b'], ['a','c'], ['b','d'] ]
>>> any('c' == x[1] for x in the_list)
True
Brandon E Taylor
+1  A: 
>>> the_list =[ ['a','b'], ['a','c'], ['b','d'] ]
>>> "b" in zip(*the_list)[1]
True

zip() takes a bunch of lists and groups elements together by index, effectively transposing the list-of-lists matrix. The asterisk takes the contents of the_list and sends it to zip as arguments, so you're effectively passing the three lists separately, which is what zip wants. All that remains is to check if "b" (or whatever) is in the list made up of elements with the index you're interested in.
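
To make the transposition visible, here is a short sketch. It assumes Python 3, where zip() returns an iterator that cannot be indexed directly, so it is wrapped in list() first; the name `columns` is only illustrative:

```python
the_list = [['a', 'b'], ['a', 'c'], ['b', 'd']]

# zip(*the_list) is the same as zip(['a', 'b'], ['a', 'c'], ['b', 'd'])
columns = list(zip(*the_list))
print(columns)            # [('a', 'a', 'b'), ('b', 'c', 'd')]

# columns[1] holds every second item
print('b' in columns[1])  # True
```

Note that this builds the whole transposed structure before searching, so it does not stop early at the first match.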

Markus
+1  A: 

Markus has one way to avoid using the word for -- here's another, which should have much better performance for long the_lists...:

import itertools
found = any(itertools.ifilter(lambda x: x[1] == 'b', the_list))
Alex Martelli
Ah, good, Alex is here. ;-) Obviously, the gen exp's use the word 'for' - but if we allow that, interpreting the goal as avoiding the standard for loop structure instead of the word 'for' itself, how do all the answers given compare in terms of performance?
Anon
@Anon, I have no time right now to run the usual -mtimeit thingies (OSCON is on, -), but from previous experience I know that itertools tend to perform like greased lightning. All answers save Markus's stop at the first match so they're all equally fast in this sense.
Alex Martelli
NP at all. Thanks. ;-)
Anon
+1  A: 

Nothing wrong with using a gen exp, but if the goal is to inline the loop...

>>> import itertools, operator
>>> 'b' in itertools.imap(operator.itemgetter(1), the_list)
True

Should be the fastest as well.
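
For readers on Python 3, where itertools.imap no longer exists, a rough equivalent is the built-in map(), which is lazy there. This sketch makes no claim about relative speed; the names `second_items` and `remaining` are illustrative only:

```python
import operator

the_list = [['a', 'b'], ['a', 'c'], ['b', 'd']]

# In Python 3 the built-in map() returns a lazy iterator
second_items = map(operator.itemgetter(1), the_list)
found = 'b' in second_items
print(found)      # True

# The search stopped at the first match, so the rest is still unconsumed
remaining = list(second_items)
print(remaining)  # ['c', 'd']
```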

Coady
+2  A: 

The above all look good, but do you want to keep the result? If so, you can use the following:

result = [element for element in data if element[1] == search]

then a simple

len(result)

lets you know if anything was found (and now you can do stuff with the results)

Of course, this does not handle elements with fewer than two items, which raise an IndexError at element[1]. You should check for that unless you know every element always has at least two items (and in that case, should you be using tuples, which are immutable?).
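
A minimal sketch of such a check, in Python 3 syntax, is to test the length before indexing. The sample data with short elements is made up for illustration:

```python
# Hypothetical data containing elements that are too short to have a second item
data = [['a', 'b'], ['a'], [], ['b', 'd']]
search = 'd'

# len(element) > 1 is evaluated first, so element[1] is never reached for short elements
result = [element for element in data
          if len(element) > 1 and element[1] == search]

print(result)        # [['b', 'd']]
print(bool(result))  # True
```

Short elements are silently skipped here; whether that is right, or whether they should raise an error instead, depends on what the data is supposed to look like.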

if you know all items are a set length you can also do:

any(second == search for _, second in data)

or for len(data[0]) == 4:

any(second == search for _, second, _, _ in data)

...and I would recommend using

for element in data:
   ...

instead of

for i in range(len(data)):
   ...

(This is for future use, unless you want to save or use 'i'. Also, just so you know, the '0' in range(0, len(list)) is not required; you only need the full syntax when starting at a non-zero value.)

Terence Honles