tags:

views:

141

answers:

5

I have a list of strings. I have a function that given a string returns 0 or 1. How can I delete all strings in the list for which the function returns 0?

+6  A: 
[x for x in lst if fn(x) != 0]

This is a "list comprehension", one of Python's nicest pieces of syntactical sugar that often takes lines of code in other languages and additional variable declarations, etc.

See: http://docs.python.org/tutorial/datastructures.html#list-comprehensions

dkamins
Note that this creates a new list; it doesn't mutate an existing list.
intuited
`if not fn(x)` is more idiomatic and works correctly with any value Python considers true/false (your function happens to return 0 or 1 but there is no reason to hardcode this assumption).
Beni Cherniavsky-Paskin
+1  A: 

edit: see the bottom for the best answer.

If you need to mutate an existing list, for example because you have another reference to it somewhere else, you'll need to actually remove the values from the list.

I'm not aware of any such function in Python, but something like this would work (untested code):

def cull_list(lst, pred):
    """Removes all values from ``lst`` which for which ``pred(v)`` is false."""
    def remove_all(v):
        """Remove all instances of ``v`` from ``lst``"""
        try:
             while True:
                 lst.remove(v)
        except ValueError:
             pass

    values = set(lst)

    for v in values:
        if not pred(v):
            remove_all(v)

A probably more-efficient alternative that may look a bit too much like C code for some people's taste:

def efficient_cull_list(lst, pred):
    end = len(lst)
    i = 0
    while i < end:
        if not pred(lst[i]):
            del lst[i]
            end -= 1
        else:
            i += 1

edit...: as Aaron pointed out in the comments, this can be done much more cleanly with something like

def reversed_cull_list(lst, pred):
    for i in range(len(lst) - 1, -1, -1):
        if not pred(lst[i]):
            del lst[i]

...edit

The trick with these routines is that using a function like enumerate, as suggested by (an) other responder(s), will not take into account the fact that elements of the list have been removed. The only way (that I know of) to do that is to just track the index manually instead of allowing python to do the iteration. There's bound to be a speed compromise there, so it may end up being better just to do something like

lst[:] = (v for v in lst if pred(v))

Actually, now that I think of it, this is by far the most sensible way to do an 'in-place' filter on a list. The generator's values are iterated before filling lst's elements with them, so there are no index conflict issues. If you want to make this more explicit just do

lst[:] = [v for v in lst if pred(v)]

I don't think it will make much difference in this case, in terms of efficiency.

Either of these last two approaches will, if I understand correctly how they actually work, make an extra copy of the list, so one of the bona fide in-place solutions mentioned above would be better if you're dealing with some "huge tracts of land."

intuited
`enumerate` does seem to adjust indexes appropriately. I didn't think that it did until I tried it. Write a simple loop over a list from 1 to 20 that removes every item that is an even multiple of 3. Print the list and index every time that you do a removal. You will remove indexes 2, 4, 5, 6, 10, and 12.
D.Shawley
you can make the loop simpler by looping over it in reverse. Then when you delete an element, you don't have to decrement (or even have) `end`. Also, no branch in the while.
aaronasterling
@D.Shawley: Either you're confused or I am. I wrote some [doctests](http://gist.github.com/618883) to test the various presented solutions to this question; maybe I got something wrong with that?
intuited
@Aaron: ah, very good point. I added a reversed-order indexed version.
intuited
@D.Shawley: repeat your experiment with a predicate that specifies every item that is NOT an even multiple of 3.
John Machin
Yup. `reversed` is required. The generator solution is probably the nicest except that it doesn't modify the list in-place.
D.Shawley
+4  A: 

I would use a generator expression over a list comprehension to avoid a potentially large, intermediate list.

result = (x for x in l if f(x))
# print it, or something
print list(result)

Like a list comprehension, this will not modify your original list, in place.

Nick Presta
A: 
>>> s = [1, 2, 3, 4, 5, 6]
>>> def f(x):             
...     if x<=2: return 0
...     else: return 1   
>>> for n,x in enumerate(s):
...     if f(x) == 0: s[n]=None
>>> s=filter(None,s)
>>> s
[3, 4, 5, 6]
ghostdog74
This doesn't work: for the list `[1, 2, 3, 4, 5, 6]` and `f = lambda v: v <= 2` I get `[1, 2, 4, 6]`.
intuited