I am looking for the most Pythonic way to split a list of numbers into smaller lists wherever a number is missing from the sequence. For example, if the initial list was:

seq1 = [1, 2, 3, 4, 6, 7, 8, 9, 10]

the function would yield:

[[1, 2, 3, 4], [6, 7, 8, 9, 10]]

or

seq2 = [1, 2, 4, 5, 6, 8, 9, 10]

would result in:

[[1, 2], [4, 5, 6], [8, 9, 10]]

+17  A: 

From the Python documentation:

>>> # Find runs of consecutive numbers using groupby.  The key to the solution
>>> # is differencing with a range so that consecutive numbers all appear in
>>> # same group.
>>> data = [ 1,  4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
>>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
...     print map(itemgetter(1), g)
...
[1]
[4, 5, 6]
[10]
[15, 16, 17, 18]
[22]
[25, 26, 27, 28]

The groupby() function from the itertools module starts a new group every time the key function changes its return value. The trick is that the key is the position of the element in the list minus the number itself. Within a run of consecutive numbers both increase by one at each step, so the difference stays constant; it changes only when there is a gap in the numbers.
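To make the key concrete, here is a small sketch that just prints the value groupby() would see for each element of the documentation's data. It is written in Python 3 syntax (the snippets in this answer are Python 2), and the variable name keys is only for illustration.

```python
data = [1, 4, 5, 6, 10, 15, 16, 17, 18, 22, 25, 26, 27, 28]

# Pair each number with its position and compute the grouping key i - x.
keys = [i - x for i, x in enumerate(data)]
print(keys)
# [-1, -3, -3, -3, -6, -10, -10, -10, -10, -13, -15, -15, -15, -15]
```

Equal neighbouring keys mark one run of consecutive numbers, which is why groupby() splits the list exactly at the gaps.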

The itemgetter() function is from the operator module; you'll have to import it, along with groupby from the itertools module, for this example to work.

Full example with your data:

>>> from operator import itemgetter
>>> from itertools import groupby
>>> seq2 = [1, 2, 4, 5, 6, 8, 9, 10]
>>> result = []   # avoid naming this "list", which would shadow the built-in
>>> for k, g in groupby(enumerate(seq2), lambda (i,x):i-x):
...     result.append(map(itemgetter(1), g))
...
>>> print result
[[1, 2], [4, 5, 6], [8, 9, 10]]

Or as a list comprehension:

>>> [map(itemgetter(1), g) for k, g in groupby(enumerate(seq2), lambda (i,x):i-x)]
[[1, 2], [4, 5, 6], [8, 9, 10]]
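
The snippets above use Python 2 syntax: tuple-unpacking lambdas such as lambda (i,x) were removed in Python 3, map() returns an iterator there, and print is a function. A rough Python 3 equivalent might look like the sketch below. The function name split_runs is made up for this example, and it assumes the input is a sorted list of integers without duplicates.

```python
from itertools import groupby


def split_runs(seq):
    """Split seq into lists of consecutive integers (hypothetical helper name)."""
    # Python 3 lambdas cannot unpack a tuple argument, so index into the
    # (position, value) pair produced by enumerate() instead.
    return [
        [x for _, x in group]
        for _, group in groupby(enumerate(seq), key=lambda pair: pair[0] - pair[1])
    ]


print(split_runs([1, 2, 3, 4, 6, 7, 8, 9, 10]))  # [[1, 2, 3, 4], [6, 7, 8, 9, 10]]
print(split_runs([1, 2, 4, 5, 6, 8, 9, 10]))     # [[1, 2], [4, 5, 6], [8, 9, 10]]
```

The inner comprehension plays the role of map(itemgetter(1), g): it drops the index and keeps only the number.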
Fabian
I don't think you can get more pythonic than this:)
extraneon
+1. If the documentation says to do it that way, it's probably a good way to do it.
Brian
The whole itertools module is worth a look; there is some nice stuff in there.
Fabian
Each time I encounter itertools I get surprised.
Wayne Werner
Exactly what I was looking for. And what better place to look for it than the Python docs -- hiding in plain sight.
billybandicoot
+3  A: 

Another option, which doesn't need itertools or any other import:

>>> data = [1, 4, 5, 6, 10, 15, 16, 17, 18, 22, 25, 26, 27, 28]
>>> spl = [0]+[i for i in range(1,len(data)) if data[i]-data[i-1]>1]+[None]
>>> [data[b:e] for (b, e) in [(spl[i-1],spl[i]) for i in range(1,len(spl))]]
[[1], [4, 5, 6], [10], [15, 16, 17, 18], [22], [25, 26, 27, 28]]
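
One thing to be aware of: for an empty input list the slicing version above returns [[]] rather than []. If that matters, the same idea (start a new sublist whenever the difference to the previous number is greater than 1) can be sketched as a plain loop. The name split_on_gaps is hypothetical, and like the slicing version it assumes the numbers are already sorted.

```python
def split_on_gaps(data):
    """Hypothetical helper: split sorted numbers wherever neighbours differ by more than 1."""
    runs = []
    for x in data:
        # Extend the current run unless there is a gap after its last element.
        if runs and x - runs[-1][-1] <= 1:
            runs[-1].append(x)
        else:
            runs.append([x])
    return runs


data = [1, 4, 5, 6, 10, 15, 16, 17, 18, 22, 25, 26, 27, 28]
print(split_on_gaps(data))
# [[1], [4, 5, 6], [10], [15, 16, 17, 18], [22], [25, 26, 27, 28]]
```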
muksie