ansaurus

Question

how to iterate from a specific point in a sequence (Python)

Answer 1

A:

One simpler way (albeit slower) would be to use filter and filter for keys before/after that date. Filter has to process each element in the list as opposed to slicing not needing to.

2010-10-24 01:57:35

Answer 2

+1 A:

yes , you can do like this:

import itertools
from datetime import datetime

data = {
      "2008-11-10 17:53:59":"data",
      "2005-11-10 17:53:59":"data",
}

list_ = data.keys()
new_list = [datetime.strptime(x, "%Y-%m-%d %H:%M:%S") for x in list_]

begin_date = datetime.strptime("2007-11-10 17:53:59", "%Y-%m-%d %H:%M:%S")

for i in itertools.ifilter(lambda x: x > begin_date, new_list):
    print i

singularity 2010-10-24 02:04:49

why not use a generator expression instead of creating a list for `new_list`?

aaronasterling 2010-10-24 02:38:54

@aaronasterling: yes, it will work perfectly and it can also save us some memory usage too when dealing with a large dictionary.

singularity 2010-10-24 02:57:55

Answer 3

A:

You can do

def on_or_after(date):
    from itertools import dropwhile
    sorted_items = sorted(date_dictionary.iteritems())
    def before_date(pair):
        return pair[0] < date
    on_or_after_date = dropwhile(before_date, sorted_items)

which I think is about as efficient as it's going to get if you're just doing one such lookup on each sorted collection. on_or_after_date will iterate (date, value) pairs.

Another option would be to build a dictionary as a separate index into the sorted list:

sorted_items = sorted(date_dictionary.iteritems())
date_index = dict((key, i) for i, key in enumerate(sorted_items.keys()))

and then get the items on or after a date with

def on_or_after(date):
    return sorted_items[date_index[date]:]

This second approach will be faster if you're going to be doing a lot of lookups on the same series of sorted dates (which it sounds like you are).

If you want really speedy slicing of the sorted dates, you might see some improvement by storing it in a tuple instead of a list. I could be wrong about that though.

note the above code is untested, let me know if it doesn't work and you can't sort out why.

intuited 2010-10-24 02:46:10

Answer 4

A:

First off, this question isn't related to dicts. You're operating on a sorted list. You're using the results on a dict, but that's not relevant to the question.

You want the bisect module, which implements binary searching. Starting from your code:

import bisect
mydict = {
      "2001-01-01":"data1",
      "2005-01-02":"data2",
      "2002-01-01":"data3",
      "2004-01-02":"data4",
}

# ['2001-01-01', '2002-01-01', '2004-01-02', '2005-01-02']:
sorted_dates = sorted(mydict)

# Iterates over 2002-01-01, 2004-01-02 and 2005-01-02:
offset = bisect.bisect_left(sorted_dates, "2002-01-01")
for item in sorted_dates[offset:]:
    print item

Glenn Maynard 2010-10-24 03:15:47

(The original question is scattered and confusing and poorly-stated, so I'm not sure if this is what he's actually asking. I read it a couple more times and I'm still not sure. *shrug*)

Glenn Maynard 2010-10-24 03:20:06

Answer 5

+1 A:

If you know for a fact that the items in your sequence are sorted, you can just use a generator expression:

(item for item in seq if item >= 5)

This returns a generator; it doesn't actually traverse the list until you iterate over it, i.e.:

for item in (item for item in seq if item > 5)
    print item

will only traverse seq once.

Using a generator expression like this is pretty much identical to using itertools.ifilter, which produces a generator that iterates over the list returning only values that meet the filter criterion:

>>> import itertools
>>> seq = [1, 2, 3, 4, 5, 6, 7]
>>> list(itertools.ifilter(lambda x: x>=3, seq))
[3, 4, 5, 6, 7]

I'm not sure why (except for backwards compatibility) we need itertools.ifilter anymore now that we have generator expressions, but other methods in itertools are invaluable.

If, for instance, you don't know that your sequence is sorted, and you still want to return everything in the sequence from a known item and beyond, you can't use a generator expression. Instead, use itertools.dropwhile. This produces a generator that iterates over the list skipping values until it finds one that meets the filter criterion:

>>> seq = [1, 2, 4, 3, 5, 6, 7]
>>> list(itertools.dropwhile(lambda x: x != 3, seq))
[3, 5, 6, 7]

As far as searching backwards goes, this will only work if the sequence you're using is actually a sequence (like a list, i.e. something that has an end and can be navigated backwards) and not just any iterable (e.g. a generator that returns the next prime number). To do this, use the reversed function, e.g.:

(item for item in reversed(seq) if item >= 5)

Robert Rossney 2010-10-24 21:21:03

This is the sort of thing I am looking for. Unfortunately, I have no experience using Generators or lambda functions - so I didn't understand most of what you wrote (I'm quite new to Python). I have to admit that most of what I want to do seems to involve generators and/or lambdas - so I guess its time to bite the bullet. I have seen this documentation: http://heather.cs.ucdavis.edu/~matloff/Python/PyIterGen.pdf Is that a good start/introduction or do you have a better resource link?

skyeagle 2010-10-25 00:19:35

On generators, David Beazley to the rescue: http://www.dabeaz.com/generators/. As far as lambda functions go, a lambda function's just a function that's declared inline. You could just as easily write a function called `item_in_range(x)` and write `itertools.ifilter(item_in_range, seq)`.

Robert Rossney 2010-10-25 04:45:34

Answer 6

A:

Given your relevant facts:

>>> import bisect
>>> def find_fwd_iter(S, i):
...     j = bisect.bisect_left(S, i)
...     for k in xrange(j, len(S)):
...         yield S[k]
...
>>> def find_bkwd_iter(S, i):
...     j = bisect.bisect_left(S, i)
...     for k in xrange(j, -1, -1):
...         yield S[k]
...
>>> L = [100, 150, 200, 300, 400]
>>> list(find_fwd_iter(L, 200))
[200, 300, 400]
>>> list(find_bkwd_iter(L, 200))
[200, 150, 100]
>>>

John Machin 2010-10-25 01:13:36

Nice, simple, short, sweet and above all understandable - even to a neophyte like me. Now that's what I call poetry

skyeagle 2010-10-25 10:26:47

ansaurus

tags:

views:

answers:

how to iterate from a specific point in a sequence (Python)

related questions