I tried to make this as efficient as possible.
It uses a generator; those unfamiliar with these beasts are advised to check out their documentation and that of yield expressions.
Basically it creates a generator of values from the subsequence that can be reset by sending it a true value. If the generator is reset, it starts yielding again from the beginning of sub.
Then it just compares successive values of sequence with the values the generator yields, resetting the generator if they don't match.
When the generator runs out of values, i.e. reaches the end of sub without being reset, we've found our match.
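
To make the send/reset mechanics concrete, here is a minimal trace of a generator built the same way. The name resettable is just an illustration (it is not part of the answer's code), and this is only a sketch of the idea:

```python
def resettable(values):
    """Yield items of values; start over when sent a true value."""
    it = iter(values)
    while True:
        try:
            item = next(it)
        except StopIteration:
            return  # reached the end without being reset
        if (yield item):
            it = iter(values)  # a true value was sent: restart


gen = resettable("abc")
trace = [next(gen)]            # prime the generator: 'a'
trace.append(gen.send(False))  # no reset, keep going: 'b'
trace.append(gen.send(True))   # reset, back to the start: 'a'
trace.append(gen.send(False))  # 'b' again
print(trace)  # ['a', 'b', 'a', 'b']
```

The value passed to send() becomes the result of the yield expression inside the generator, which is how the caller signals a reset.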
Since it works for any sequence, you can even use it on strings, in which case it behaves similarly to str.find, except that it returns a (start, end) tuple for a match and False instead of -1 when there is none.
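
For comparison, this is what str.find gives on the strings used in the docstring examples. It is only a sketch; the variable names are mine:

```python
text = "This sequence contains a subsequence."
sub = "subsequence"

start = text.find(sub)                   # index of the first match
missing = "This one doesn't.".find(sub)  # -1 when there is no match

# str.find only returns the start; the inclusive end index follows from it.
end = start + len(sub) - 1
print(start, end, missing)  # 25 35 -1
```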
As a further note: I think that the second value of the returned tuple should, in keeping with Python's half-open slicing convention, normally be one higher, i.e. "string"[0:2] == "st". But the spec says otherwise, so that's how this works.
Which is better depends on whether this is meant to be a general-purpose routine or is implementing some specific goal; in the latter case it might be better to implement a general-purpose routine and then wrap it in a function which twiddles the return value to suit the spec.
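
A rough sketch of that split, assuming a non-empty sub and sequences that support len() and slicing. Both function names here (find_subsequence and find_in_sequence_spec) are hypothetical, and the search uses plain slice comparison rather than the generator approach:

```python
def find_subsequence(sub, sequence):
    """Return a half-open (start, stop) range for the first match, or None."""
    sub = list(sub)
    n = len(sub)
    for start in range(len(sequence) - n + 1):
        if list(sequence[start:start + n]) == sub:
            return (start, start + n)
    return None


def find_in_sequence_spec(sub, sequence):
    """Wrap find_subsequence to return what the spec asks for."""
    match = find_subsequence(sub, sequence)
    if match is None:
        return False
    start, stop = match
    return (start, stop - 1)  # the spec wants an inclusive end index


print(find_subsequence("st", "string"))                    # (0, 2)
print(find_in_sequence_spec([0, 1, 2], [-1, 0, 1, 2]))     # (1, 3)
```

With the half-open form, sequence[start:stop] gives back the match directly, which is why it is the more conventional return value.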

    def reiterator(sub):
        """Yield elements of a sequence, resetting if sent ``True``."""
        it = iter(sub)
        while True:
            try:
                item = next(it)
            except StopIteration:
                # End of sub reached without a reset. Return explicitly:
                # on Python 3.7+ a StopIteration escaping a generator
                # body becomes a RuntimeError (PEP 479).
                return
            if (yield item):
                it = iter(sub)


    def find_in_sequence(sub, sequence):
        """Find a subsequence in a sequence.

        >>> find_in_sequence([2, 1], [-1, 0, 1, 2])
        False
        >>> find_in_sequence([-1, 1, 2], [-1, 0, 1, 2])
        False
        >>> find_in_sequence([0, 1, 2], [-1, 0, 1, 2])
        (1, 3)
        >>> find_in_sequence("subsequence",
        ...                  "This sequence contains a subsequence.")
        (25, 35)
        >>> find_in_sequence("subsequence", "This one doesn't.")
        False

        """
        start = None
        sub_items = reiterator(sub)
        sub_item = next(sub_items)
        for index, item in enumerate(sequence):
            if item == sub_item:
                if start is None:
                    start = index
            else:
                start = None
            try:
                # Sending a true value resets the generator to the start of sub.
                sub_item = sub_items.send(start is None)
            except StopIteration:
                # If the subsequence is depleted, we win!
                return (start, index)
        return False
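
One caveat with resetting on a mismatch: the item that caused the mismatch is never re-tested against the start of sub, so a match that overlaps a failed partial match gets missed. For example, looking for [1, 2] in [1, 1, 2] returns False. If that matters for your data, here is a sketch of a single-pass alternative that keeps a sliding window of the last len(sub) items. The name find_in_sequence_windowed is mine, and treating an empty sub as "not found" is an assumption, not something the spec states:

```python
from collections import deque


def find_in_sequence_windowed(sub, sequence):
    """Return the inclusive (start, end) of the first match, or False."""
    sub = list(sub)
    if not sub:
        return False  # assumption: an empty subsequence is "not found"
    window = deque(maxlen=len(sub))  # holds the most recent len(sub) items
    for index, item in enumerate(sequence):
        window.append(item)  # the oldest item drops out automatically
        if len(window) == len(sub) and list(window) == sub:
            return (index - len(sub) + 1, index)
    return False


print(find_in_sequence_windowed([1, 2], [1, 1, 2]))  # (1, 2)
```

This compares up to len(sub) items at every position, so it gives up some of the efficiency of the generator version in exchange for handling overlaps.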