I am looking for a built-in Python function (or mechanism) to split a list into segments of a given length, without mutating the input list. The last segment may be shorter. Here is the code I already have:

>>> def split_list(lst, seg_length):
...     inlist = lst[:]  # work on a copy so the caller's list is not mutated
...     outlist = []
...     
...     while inlist:
...         outlist.append(inlist[0:seg_length])
...         inlist[0:seg_length] = []
...     
...     return outlist
... 
>>> alist = range(10)
>>> split_list(alist, 3)
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
+8  A: 

You can use a list comprehension:

>>> seg_length = 3
>>> a = range(10)
>>> [a[x:x+seg_length] for x in range(0,len(a),seg_length)]
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
OmerGertel
You can also make this a generator, i.e. (a[x:x+seg_length] for x in range(0,len(a),seg_length)), which is more memory-efficient for large sequences because the segments are produced one at a time.
mhawke
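
To make the generator suggestion concrete, here is a minimal sketch written for Python 3 (an assumption; the thread itself uses Python 2). The helper name iter_segments is hypothetical and not from the thread. It only works on inputs that support len() and slicing, such as lists, tuples and strings.

```python
def iter_segments(seq, seg_length):
    """Lazily yield consecutive slices of seq, each at most seg_length long."""
    # Hypothetical helper: seq must support len() and slicing.
    # Slicing copies, so the input sequence is never mutated.
    return (seq[x:x + seg_length] for x in range(0, len(seq), seg_length))


a = list(range(10))
segments = iter_segments(a, 3)
first = next(segments)   # only the first slice has been built at this point
rest = list(segments)    # consume the remaining slices
print(first, rest)
```

Because nothing is sliced until the generator is advanced, only one segment needs to exist at a time when you loop over the result.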
+1  A: 

This does not produce the same output (the last group is padded with a fill value), but I still think the grouper function is helpful:

from itertools import izip_longest
def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n
    return izip_longest(*args, fillvalue=fillvalue)

For Python 2.4 and 2.5, which do not have izip_longest:

from itertools import izip, chain, repeat
def grouper(iterable, n, padvalue=None):
    return izip(*[chain(iterable, repeat(padvalue, n-1))]*n)

Some demo code and output:

alist = range(10)
print list(grouper(alist, 3))

Output: [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, None, None)]
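
For readers on Python 3, where izip_longest was renamed zip_longest, the same idea can be sketched as follows. The grouper_unpadded helper and the _PAD sentinel are hypothetical additions, not part of the answer; they show one possible way to strip the padding so the output matches what the question asks for.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    # n references to the SAME iterator, so each output tuple
    # pulls n consecutive items from the input.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


_PAD = object()  # hypothetical sentinel that cannot collide with real data


def grouper_unpadded(iterable, n):
    # Pad with the sentinel, then drop it again from each group.
    for group in grouper(iterable, n, fillvalue=_PAD):
        yield [item for item in group if item is not _PAD]


padded = list(grouper(range(10), 3))
unpadded = list(grouper_unpadded(range(10), 3))
print(padded)
print(unpadded)
```

A dedicated sentinel object is used rather than None so that None values in the input are not mistaken for padding.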

sunqiang
+2  A: 

How do you need to use the output? If you only need to iterate over it, you are better off creating an iterable, one that yields your groups:

def split_by(sequence, length):
    iterable = iter(sequence)
    def yield_length():
        # Yield up to `length` items; stops early when the input runs out.
        for i in xrange(length):
            yield iterable.next()
    while True:
        res = list(yield_length())
        if not res:
            # Input exhausted: end the generator.
            return
        yield res

Usage example:

>>> alist = range(10)
>>> list(split_by(alist, 3))
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]

If you are only looping over the result, this uses far less memory than building the whole list of groups at once, because only one group is constructed at a time:

>>> for subset in split_by(alist, 3):
...     print subset
...
[0, 1, 2]
[3, 4, 5]
[6, 7, 8]
[9]
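
The generator above is written for Python 2 (xrange, the .next() method, the print statement) and relies on StopIteration escaping from the inner generator's body, which Python 3.7 and later turn into a RuntimeError (PEP 479). A rough Python 3 equivalent, assuming itertools.islice as a replacement for the inner generator (a swapped-in technique, not the answer's original approach), might look like this:

```python
from itertools import islice


def split_by(sequence, length):
    """Yield lists of up to `length` consecutive items from `sequence`."""
    iterator = iter(sequence)
    while True:
        # islice pulls at most `length` items and simply stops
        # when the underlying iterator is exhausted.
        chunk = list(islice(iterator, length))
        if not chunk:
            return  # nothing left: end the generator cleanly
        yield chunk


result = list(split_by(range(10), 3))
print(result)
```

This version accepts any iterable, not just sequences, and still builds only one group at a time. Python 3.12 also added itertools.batched, which does the same job but yields tuples instead of lists.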
Martijn Pieters
missing parenthesis at the end: list(split_by(alist, 3)
mtasic
Good catch, corrected.
Martijn Pieters
+1. A very sensible approach. I will keep this in mind if my input data grows in size.
kjfletch