tags:

views:

213

answers:

9

Howdy, codeboys and codegirls!

I have came across a simple problem with seemingly easy solution. But being a Python neophyte I feel that there is a better approach somewhere.

Say you have a list of mixed strings. There are two basic types of strings in the sack - ones with "=" in them (a=potato) and ones without (Lady Jane). What you need is to sort them into two lists.

The obvious approach is to:

for arg in arguments:
   if '=' in arg:
       equal.append(arg)
   else:
       plain.append(arg)

Is there any other, more elegant way into it? Something like:

equal = [arg for arg in arguments if '=' in arg]

but to sort into multiple lists?

And what if you have more than one type of data?

+4  A: 

Try

for arg in arguments:
    lst = equal if '=' in arg else plain
    lst.append(arg)

or (holy ugly)

for arg in arguments:
    (equal if '=' in arg else plain).append(arg)

A third option: Create a class which offers append() and which sorts into several lists.

Aaron Digulla
I find more readable the form `list = ('=' in arg) and equal or plain`.
giorgian
you're shadowing built-in
SilentGhost
I think "a if cond else b" is very readable English; I just don't like to put too much into a single line.
Aaron Digulla
and it should be `else plain`.
SilentGhost
@giorgian, what a truly **horrible** idea you have there: when `equal` is empty (i.e. at the start) your "more readable" form ALWAYS gives `plain`, so nothing will ever be appended to `equal` -- so much for "more readable" (eek!-).
Alex Martelli
+4  A: 

You can use itertools.groupby() for this:

import itertools
f = lambda x: '=' in x
groups = itertools.groupby(sorted(data, key=f), key=f)
for k, g in groups:
    print k, list(g)
unbeknown
I thought he asked for elegant
Steg
The sorting makes it nlog(n) while a simpler O(n) solution exists.
Rafał Dowgird
+1  A: 

Another approach is to use the filter function, although it's not the most efficient solution.
Example:

>>> l = ['a=s','aa','bb','=', 'a+b']
>>> l2 = filter(lambda s: '=' in s, l)
>>> l3 = filter(lambda s: '+' in s, l)
>>> l2
['a=s', '=']
>>> l3
['a+b']
Nick D
+2  A: 
def which_list(s):
    if "=" in s: 
        return 1
    return 0

lists = [[], []]

for arg in arguments:
    lists[which_list(arg)].append(arg)

plain, equal = lists

If you have more types of data, add an if clause to which_list, and initialize lists to more empty lists.

Ned Batchelder
which_list can be a lot simpler - see my answer. Otherwise, our answers are pretty close.
Paul McGuire
Indeed, it can be, but the OP asked about the case of more than two possibilities, which I wanted to deal with in the initial design.
Ned Batchelder
In that case, I think a dictionary of lists would be clearer. The list indices are meaningless.
Roberto Bonvallet
Yes, a dict of lists is better!
Ned Batchelder
+3  A: 

I would just go for two list comprehensions. While that does incur some overhead (two loops on the list), it is more Pythonic to use a list comprehension than to use a for. It's also (in my mind) much more readable than using all sorts of really cool tricks, but that less people know about.

Edan Maor
+1  A: 

I put this together, and then see that Ned Batchelder was already on this same tack. I chose to package the splitting method instead of the list chooser, though, and to just use the implicit 0/1 values for False and True.

def split_on_condition(source, condition):
    ret = [],[]
    for s in source:
        ret[condition(s)].append(s)
    return ret

src = "z=1;q=2;lady jane;y=a;lucy in the sky".split(';')

plain,equal = split_on_condition(src, lambda s:'=' in s)
Paul McGuire
+1  A: 

Your approach is the best one. For sorting just into two lists it can't get clearer than that. If you want it to be a one-liner, encapsulate it in a function:

def classify(arguments):
    equal, plain = [], []
    for arg in arguments:
        if '=' in arg:
            equal.append(arg)
        else:
            plain.append(arg)
    return equal, plain


equal, plain = classify(lst)
Roberto Bonvallet
+2  A: 

I would go for Edan's approach, e.g.

equal = [arg for arg in arguments if '=' in arg]
plain = [arg for arg in arguments if '=' not in arg]
Kimvais
+2  A: 

Hi,

I read somewhere here that you might be interested in a solution that will work for more than two identifiers (equals sign and space).

The following solution just requires you update the uniques set with anything you would like to match, the results are placed in a dictionary of lists with the identifier as the key.

uniques = set('= ')
matches = dict((key, []) for key in uniques)

for arg in args:
    key = set(arg) & uniques
    try:
        matches[key.pop()].append(arg)
    except KeyError:
        # code to handle where arg does not contain = or ' '.

Now the above code assumes that you will only have a single match for your identifier in your arg. I.e that you don't have an arg that looks like this 'John= equalspace'. You will have to also think about how you would like to treat cases that don't match anything in the set (KeyError occurs.)

Simon Edwards