ansaurus

Question

Python: item for item until stopterm in item?

Answer 1

+1 A:

Forget this

Leaving the answer, but marking it community. See Steven Huwig's answer for the correct way to do this.

Well, [x for x in enumerable] will run until enumerable doesn't produce data any more; the if-part simply lets you filter along the way.
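To make that concrete, here is a minimal sketch. The sample list and the stop term are made up for illustration; the point is only that the if clause drops matching items and keeps iterating past them.

```python
# Hypothetical sample data, not taken from the original question.
lines = ["alpha", "beta", "STOP", "gamma"]
stopterm = "STOP"

# The if clause filters items out; it does not end the iteration,
# so "gamma" (which comes after the stop term) is still collected.
filtered = [line for line in lines if stopterm not in line]
print(filtered)
```

So a plain comprehension with an if gives "everything except the stop line", not "everything before it".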

What you can do is add a function, and filter your enumerable through it:

def enum_until(source, until_criteria):
    # Yield items until until_criteria(item) is true; that item is not yielded.
    for k in source:
        if until_criteria(k):
            break
        yield k

def enum_while(source, while_criteria):
    # Yield items for as long as while_criteria(item) is true.
    for k in source:
        if not while_criteria(k):
            break
        yield k

l1 = list(enum_until(range(1, 100000), lambda y: y == 100))
l2 = list(enum_while(range(1, 100000), lambda y: y < 100))
print(l1)
print(l2)

Of course, it doesn't look as nice as what you wanted...

Lasse V. Karlsen 2008-12-03 14:20:10

That's a lot of work reimplementing the itertools module in the standard library...

Steven Huwig 2008-12-03 14:36:50

Sure is! NIH in the works!

Lasse V. Karlsen 2008-12-03 14:39:55

I bet you've had to do this for JavaScript, right? I know I have, in the cases where third-party libraries are not permitted...

Steven Huwig 2008-12-03 14:42:50

Answer 2

+4 A:

"I was hoping for a 1 thought->1 Python line mapping." Wouldn't we all love a programming language that somehow mirrored our natural language?

You can achieve that, you just need to define your unique thoughts once. Then you have the 1:1 mapping you were hoping for.

def usefulLines( aFile ):
    for line in aFile:
        yield line
        if line == stopterm:
            break

Is pretty much it.

for line in usefulLines( aFile ):
    # process a line, knowing it is the stopterm line or occurs before it.
    pass

There are more general approaches. The lassevk answers with enum_while and enum_until are generalizations of this simple design pattern.

S.Lott 2008-12-03 14:26:43

Answer 3

+10 A:

import re
from itertools import takewhile
usefullines = takewhile(lambda x: not re.search(stopterm, x), lines)

Or, if stopterm is a plain substring rather than a regular expression:

from itertools import takewhile
usefullines = takewhile(lambda x: stopterm not in x, lines)

Here's a way that keeps the stopterm line:

def useful_lines(lines, stopterm):
    for line in lines:
        if stopterm in line:
            yield line
            break
        yield line

usefullines = useful_lines(lines, stopterm)
# or...
for line in useful_lines(lines, stopterm):
    # ... do stuff
    pass
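For comparison, here is a small sketch that runs both variants on the same input. The sample text and the io.StringIO stand-in for a file are assumptions made for illustration; useful_lines is repeated from above only so the snippet runs on its own.

```python
import io
from itertools import takewhile

def useful_lines(lines, stopterm):
    # Same generator as above: yields the stopterm line, then stops.
    for line in lines:
        if stopterm in line:
            yield line
            break
        yield line

# Hypothetical input; io.StringIO stands in for an open file.
text = "hello\nworld\nhappy\nday\nbye\n"
stopterm = "happy"

# takewhile stops before the matching line.
excluding = list(takewhile(lambda x: stopterm not in x, io.StringIO(text)))
# The generator stops after the matching line.
including = list(useful_lines(io.StringIO(text), stopterm))

print(excluding)
print(including)
```

The only difference between the two results is whether the line containing stopterm is included.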

Steven Huwig 2008-12-03 14:28:58

You can use x.find(stopterm) instead if this is just matching a string

Mapad 2008-12-03 14:31:49

or indeed, stopterm (not) in x as the original question has it.

Steven Huwig 2008-12-03 14:35:02

Wow, didn't know this one. Of course it exists, it's Python. Silly me. +1

e-satis 2008-12-03 14:43:26

itertools, operator, and (C)StringIO are the unsung modules of the standard library... everyone should learn them in my opinion. :)

Steven Huwig 2008-12-03 14:47:45

Wow - I'm just gonna go and delete my re example - thanks for showing me this!

Patrick Harrington 2008-12-03 14:51:56

V. useful, but I'd like the line that includes stopterm, which is admittedly ambiguous in the list syntax version. Can I retrieve lines.next() after using takewhile?

Phil H 2008-12-03 15:01:43

Yes, you can use lines.next() after using takewhile. I'll update my example.

Steven Huwig 2008-12-03 15:25:41

On second thought, you lose the stopterm line entirely using takewhile. You'll probably want to write a generator function like S. Lott's answer, but one that returns the stopterm line as well. How irritating of Python. :-|
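A quick sketch of why the line is lost, using a made-up list of words: takewhile has to pull the failing item out of the iterator in order to test it, and then throws it away.

```python
from itertools import takewhile

# Hypothetical data; iter() makes it a one-shot iterator like a file.
lines = iter(["hello", "world", "happy", "day", "bye"])

taken = list(takewhile(lambda x: "happy" not in x, lines))
# "happy" was consumed by takewhile and discarded, so the next item
# left in the underlying iterator is the one after it.
after = next(lines)

print(taken)
print(after)
```

In Python 2 the last call would be spelled lines.next(), as in the comments above.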

Steven Huwig 2008-12-03 15:38:14

An addition: if you want to start at one line and stop at another, you can do this (with dropwhile also imported from itertools): linesbetween = takewhile(lambda x: x!=stop_at, dropwhile(lambda y: y!=start_at, lines))
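Spelled out as a runnable sketch, with made-up values for lines, start_at and stop_at:

```python
from itertools import dropwhile, takewhile

# Hypothetical data and markers, chosen only for illustration.
lines = ["hello", "world", "happy", "day", "bye"]
start_at, stop_at = "world", "day"

# dropwhile skips everything before start_at; takewhile then stops at stop_at.
linesbetween = list(
    takewhile(lambda x: x != stop_at,
              dropwhile(lambda y: y != start_at, lines))
)
print(linesbetween)
```

Note that the start_at line is included in the result while the stop_at line is not.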

Vincent 2010-02-19 05:21:24

Answer 4

+1 A:

I think it's fine to keep it that way. Sophisticated one-liners are not really pythonic, and since Guido had to put a limit somewhere, I guess this is it...

e-satis 2008-12-03 14:42:23

Answer 5

+2 A:

That itertools solution is neat. I have earlier been amazed by itertools.groupby, one handy tool.

Still, I was tinkering to see whether I could do this without itertools. So here it is. There is one assumption and one drawback, though: the file is not huge, and it makes at least one extra complete pass over the lines.

I created a sample file named "try":

hello
world
happy
day
bye

Once you have read the file and have the lines in a variable named lines:

lines=open('./try').readlines()

then

print([each for each in lines if lines.index(each)<=[lines.index(line) for line in lines if 'happy' in line][0]])

gives the result:

['hello\n', 'world\n', 'happy\n']

and

print([each for each in lines if lines.index(each)<=[lines.index(line) for line in lines if 'day' in line][0]])

gives the result:

['hello\n', 'world\n', 'happy\n', 'day\n']

So you got the last line - the stop term line also included.
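A possible variation on the same no-itertools idea, sketched here as an alternative rather than as the approach above: find the index of the first matching line once, then slice. The list literal stands in for the result of readlines(), and the fallback to the whole list when nothing matches is an assumption of this sketch (the one-liner above raises IndexError in that case, and lines.index can also pick the wrong position when a line is duplicated).

```python
# Stand-in for open('./try').readlines(); hypothetical for this sketch.
lines = ["hello\n", "world\n", "happy\n", "day\n", "bye\n"]
stopterm = "happy"

# Index of the first line containing stopterm; if there is no match,
# fall back to the last index so the whole list is kept.
stop_index = next(
    (i for i, line in enumerate(lines) if stopterm in line),
    len(lines) - 1,
)
useful = lines[:stop_index + 1]
print(useful)
```

This scans the list only up to the first match and still includes the stop term line.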

JV 2008-12-03 15:48:06

Answer 6

A:

I'd go with Steven Huwig's or S.Lott's solutions for real usage, but as a slightly hacky solution, here's one way to obtain this behaviour:

def stop(): raise StopIteration()

usefullines = list(stop() if stopterm in line else line for line in file)

It's slightly abusing the fact that anything that raises StopIteration will abort the current iteration (here the generator expression) and uglier to read than your desired syntax, but will work.
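One caveat for current interpreters: since PEP 479 (the default behaviour from Python 3.7 on), a StopIteration raised inside a generator is converted into a RuntimeError, so the trick above depends on older Python versions. A small sketch that shows which behaviour you get, using a made-up list in place of the file:

```python
def stop():
    raise StopIteration()

# Hypothetical data standing in for the file's lines.
lines = ["hello", "world", "happy", "day"]
stopterm = "happy"

try:
    usefullines = list(stop() if stopterm in line else line for line in lines)
    outcome = "stopped early: %r" % (usefullines,)
except RuntimeError as exc:
    # Python 3.7+: the StopIteration inside the generator expression
    # is turned into RuntimeError by PEP 479.
    outcome = "RuntimeError: %s" % exc

print(outcome)
```

On those versions, takewhile or an explicit generator function is the safer route.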

Brian 2008-12-03 16:57:50
