views:

248

answers:

7

Is there a pythonic preferred way to do this that I would do in C++:


for s in str:
    if r = regex.match(s):
        print r.groups()

I really like that syntax, imo it's a lot cleaner than having temporary variables everywhere. The only other way that's not overly complex is


for s in str:
    r = regex.match(s)
    if r:
        print r.groups()

I guess I'm complaining about a pretty pedantic issue. I just miss the former syntax.

+1  A: 

There's a recipe to make an assignment expression but it's very hacky. Your first option doesn't compile so your second option is the way to go.

## {{{ http://code.activestate.com/recipes/202234/ (r2)
import sys
def set(**kw):
    assert len(kw)==1

    a = sys._getframe(1)
    a.f_locals.update(kw)
    return kw.values()[0]

#
# sample
#

A=range(10)

while set(x=A.pop()):
    print x
## end of http://code.activestate.com/recipes/202234/ }}}

As you can see, production code shouldn't touch this hack with a ten foot, double bagged stick.

aaronasterling
+1  A: 

This might be an overly simplistic answer, but would you consider this:

for s in str:
    if regex.match(s):
        print regex.match(s).groups()
inspectorG4dget
For performance, you never want to repeat the function call.The extra line to assign the temporary might look clunky, but it has a purpose.
smci
+1 because sometimes simplicity is worth it; but it's not just performance you're losing here, you're also violating DRY - code like this is harder to refactor and more bug prone in the long run - e.g., consider (evil) side effects or code bloat that might eventually hide the symmetry.
Eamon Nerbonne
+1  A: 

There is no pythonic way to do something that is not pythonic. It's that way for a reason, because 1, allowing statements in the conditional part of an if statement would make the grammar pretty ugly, for instance, if you allowed assignment statements in if conditions, why not also allow if statements? how would you actually write that? C like languages don't have this problem, because they don't have assignment statements. They make do with just assignment expressions and expression statements.

the second reason is because of the way

if foo = bar:
    pass

looks very similar to

if foo == bar:
    pass

even if you are clever enough to type the correct one, and even if most of the members on your team are sharp enough to notice it, are you sure that the one you are looking at now is exactly what is supposed to be there? it's not unreasonable for a new dev to see this and just fix it (one way or the other) and now its definitely wrong.

TokenMacGuy
+9  A: 

How about

for r in [regex.match(s) for s in str]:
    if r:
        print r.groups()

or a bit more functional

for r in filter(None, map(regex.match, str)):
    print r.groups()
Ivo van der Wijk
I like the second version. The first version could replace the list comprehension with a generator expression, so if you break out of the loop it will only have done the matches that you needed up to that point, instead of doing them all first then looping through them.
Dave Kirby
I like this. this seems the most straight forward. And list comprehension is my favorite part of python
Falmarri
+2  A: 

Perhaps it's a bit hacky, but using a function object's attributes to store the last result allows you to do something along these lines:

def fn(regex, s):
    fn.match = regex.match(s) # save result
    return fn.match

for s in strings:
    if fn(regex, s):
        print fn.match.groups()

Or more generically:

def cache(value):
    cache.value = value
    return value

for s in strings:
    if cache(regex.match(s)):
        print cache.value.groups()

Note that although the "value" saved can be a collection of a number of things, this approach is limited to holding only one such at a time, so more than one function may be required to handle situations where multiple values need to be saved simultaneously, such as in nested function calls, loops or other threads. So, in accordance with the DRY principle, rather than writing each one, a factory function can help:

def Cache():
    def cache(value):
        cache.value = value
        return value
    return cache

cache1 = Cache()
for s in strings:
    if cache1(regex.match(s)):
        # use another at same time
        cache2 = Cache()
        if cache2(somethingelse) != cache1.value:
            process(cache2.value)
        print cache1.value.groups()
          ...
martineau
what's hacky about that? functions get attributes. use them.
aaronasterling
about it only holding one at a time: nope. `def cache(**kwargs): for key in kwargs: setattr(cache, key, kwargs[key])`
aaronasterling
@AaronMcSmooth. Interesting idea -- not sure how to use that for handling multiple threads though, as each one would need to use a unique key.
martineau
@AaronMcSmooth. Your version doesn't explicitly return a value when it's called which means it defaults to `None`. That's not good because that's a legal value to store and because it doesn't allow the caller to use the value(s) passed in.
martineau
@martineau. problem with code in comments. It should return `any(kwargs.values())` or `all(kwargs.values)`. could be `allcache` or `anycache`. It dosn't solve thread problems but it wasn't meant to.
aaronasterling
@AaronMcSmooth. Nope, that doesn't resolve the issue since the builtins `any()` and `all()` would only be returning whether the named value being set is considered to be logically True or False, not the actual value itself, and so wouldn't work for the general case.
martineau
@martineau. It works because (as I understand it at least) the only place where the return value of `cache` is used is in a context where it's going to be converted to a boolean anyways such as `if cache(i=foo(), j=bar())` or `while cache(i=foo(), j=bar())`.
aaronasterling
@AaronMcSmooth. What you propose is OK if all that was ever wanted was to check if any or all of the value(s) would be considered true in the Pythonian sense. However it wouldn't handle simple cases like `if cache(i=foo()) > 42` properly which C/C++ also supports. Nevertheless, for the boolean-check cases it sounds useful.
martineau
@matineau. good call. so it's three functions, `cache`, `any_cache` and `all_cache`.
aaronasterling
@AaronMcSmooth. At the risk of beating this to death and getting too far off topic, instead of three different functions, you could have a single one with either a positional or reserved keyword argument that tells it what and/or how to determine the return value.
martineau
+1  A: 

Whenever I find that my loop logic is getting complex I do what I would with any other bit of logic: I extract it to a function. In Python it is a lot easier than some other languages to do this cleanly.

So extract the code that just generates the items of interest:

def matching(strings, regex):
    for s in strings:
        r = regex.match(s)
        if r: yield r

and then when you want to use it, the loop itself is as simple as they get:

for r in matching(strings, regex):
    print r.groups()
Duncan
Good programming technique in general, especially in the case where the function returns multiple values. However seems like overkill to me when all you want to do is assign and test a single value.
martineau
Everyone is going to have a different personal limit here. The particular example given is probably borderline: extracting the loop makes it a little cleaner, but leaving it in isn't all that bad really. If you don't think the original code looks messy then leave it as it is. I'm simply pointing out that in the general case this is how I approach loops that look messy.
Duncan
A: 

Yet another answer is to use the "Assign and test" recipe for allowing assigning and testing in a single statement published in O'Reilly Media's July 2002 1st edition of the Python Cookbook and also online at Activestate. It's object-oriented, the crux of which is this:

# from http://code.activestate.com/recipes/66061
class DataHolder:
    def __init__(self, value=None):
        self.value = value
    def set(self, value):
        self.value = value
        return value
    def get(self):
        return self.value

This can optionally be modified slightly by adding the custom __call__() method shown below to provide an alternative way to retrieve instances' values -- which, while less explicit, seems like a completely logical thing for a 'DataHolder' object to do when called, I think.

    def __call__(self):
        return self.value

Allowing your example to be re-written:

r = DataHolder()
for s in strings:
    if r.set(regex.match(s))
        print r.get().groups()
# or
        print r().groups()

As also noted in the original recipe, if you use it a lot, adding the class and/or an instance of it to the __builtin__ module to make it globally available is very tempting despite the potential downsides:

import __builtin__
__builtin__.DataHolder = DataHolder
__builtin__.data = DataHolder()

As I mentioned in my other answer to this question, it must be noted that this approach is limited to holding only one result/value at a time, so more than one instance is required to handle situations where multiple values need to be saved simultaneously, such as in nested function calls, loops or other threads. That doesn't mean you should use it or the other answer, just that more effort will be required.

martineau