



I have a Python script that takes a list of integers as input, and I need to work with them four at a time. Unfortunately, I don't have control of the input, or I'd have it passed in as a list of four-element tuples. Currently, I'm iterating over it this way:

for i in xrange(0, len(ints), 4):
    # dummy op for example code
    foo += ints[i] * ints[i + 1] + ints[i + 2] * ints[i + 3]

It looks a lot like "C-think", though, which makes me suspect there's a more pythonic way of dealing with this situation. The list is discarded after iterating, so it needn't be preserved. Perhaps something like this would be better?

while ints:
    foo += ints[0] * ints[1] + ints[2] * ints[3]
    ints[0:4] = []

Still doesn't quite "feel" right, though. :-/

There doesn't seem to be a pretty way to do this. Here is a page that has a number of methods, including:

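def split_seq(seq, size):
    # Splits seq into `size` roughly equal pieces (not pieces of length `size`).
    newseq = []
    splitsize = 1.0/size*len(seq)
    for i in range(size):
        newseq.append(seq[int(round(i*splitsize)):int(round((i+1)*splitsize))])
    return newseq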

In your second method, I would advance to the next group of 4 by doing this:

ints = ints[4:]

However, I haven't done any performance measurement so I don't know which one might be more efficient.
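
One way to check this is a quick `timeit` comparison. The following is only a rough sketch, written for Python 3 (so `range` rather than `xrange`); the function names and the sample data are made up for illustration. Since `ints[4:]` copies the rest of the list on every pass, the reslicing version would be expected to slow down much faster as the list grows, but the numbers depend on the machine and the input size.

```python
import timeit

def sum_by_index(ints):
    """Step through the list by index, four items at a time."""
    total = 0
    for i in range(0, len(ints), 4):
        total += ints[i] * ints[i + 1] + ints[i + 2] * ints[i + 3]
    return total

def sum_by_reslicing(ints):
    """Drop the first four items after each step; each slice copies the remainder."""
    total = 0
    while ints:
        total += ints[0] * ints[1] + ints[2] * ints[3]
        ints = ints[4:]  # rebinds the local name only; the caller's list is untouched
    return total

data = list(range(4000))  # hypothetical sample input, length divisible by 4

# Both approaches must agree before their timings are worth comparing.
assert sum_by_index(data) == sum_by_reslicing(data)

print("index:  ", timeit.timeit(lambda: sum_by_index(data), number=10))
print("reslice:", timeit.timeit(lambda: sum_by_reslicing(data), number=10))
```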

Having said that, I would usually choose the first method. It's not pretty, but that's often a consequence of interfacing with the outside world.

Greg Hewgill
+5  A: 
import itertools
def chunks(iterable,size):
    it = iter(iterable)
    chunk = tuple(itertools.islice(it,size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it,size))

# though this will throw ValueError if the length of ints
# isn't a multiple of four:
for x1,x2,x3,x4 in chunks(ints,4):
    foo += x1 + x2 + x3 + x4

for chunk in chunks(ints,4):
    foo += sum(chunk)

Another way:

import itertools
def chunks2(iterable,size,filler=None):
    it = itertools.chain(iterable,itertools.repeat(filler,size-1))
    chunk = tuple(itertools.islice(it,size))
    while len(chunk) == size:
        yield chunk
        chunk = tuple(itertools.islice(it,size))

# x2, x3 and x4 could get the value 0 if the length is not
# a multiple of 4.
for x1,x2,x3,x4 in chunks2(ints,4,0):
    foo += x1 + x2 + x3 + x4
+1 for using generators, seems like the most "pythonic" out of all suggested solutions
It's rather long and clumsy for something so easy, which isn't very pythonic at all. I prefer S. Lott's version
+11  A: 

I'm a fan of

chunkSize= 4
for i in xrange(0, len(ints), chunkSize):
    chunk = ints[i:i+chunkSize]
    # process chunk of size <= chunkSize
+27  A: 
def chunker(seq, size):
    return (seq[pos:pos + size] for pos in xrange(0, len(seq), size))

Simple. Easy. Fast. Works with any sequence:

text = "I am a very, very helpful text"

for group in chunker(text, 7):
    print repr(group),
# 'I am a ' 'very, v' 'ery hel' 'pful te' 'xt'

print '|'.join(chunker(text, 10))
# I am a ver|y, very he|lpful text

animals = ['cat', 'dog', 'rabbit', 'duck', 'bird', 'cow', 'gnu', 'fish']

for group in chunker(animals, 3):
    print group
# ['cat', 'dog', 'rabbit']
# ['duck', 'bird', 'cow']
# ['gnu', 'fish']
Clear and compact. Very pythonic. :-)
Ben Blank
@Carlos Crasborn's version works for any iterable (not just sequences as the above code); it is concise and probably just as fast or even faster. Though it might be a bit obscure (unclear) for people unfamiliar with `itertools` module.
J.F. Sebastian
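
To illustrate the sequence-versus-iterable point, here is a small sketch in Python 3 (`range` instead of `xrange`). `iter_chunker` is a hypothetical iterator-based counterpart written for this comparison, not the code from any answer here. The slice-based `chunker` needs `len()` and slicing, so it fails on a generator, while the iterator-based one does not.

```python
from itertools import islice

def chunker(seq, size):
    # Slice-based version from the answer above (range instead of xrange).
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

def iter_chunker(iterable, size):
    # Hypothetical iterator-based counterpart: pulls `size` items at a time.
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

squares = (n * n for n in range(7))  # a generator: no len(), no slicing

try:
    chunker(squares, 3)
    slicing_worked = True
except TypeError:
    slicing_worked = False  # len() is undefined for generators

print(slicing_worked)                  # False
print(list(iter_chunker(squares, 3)))  # [[0, 1, 4], [9, 16, 25], [36]]
```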
@J.F. Sebastian — Now that I've gotten the chance to figure out *why* his code works, I feel compelled to change my accepted answer (which I *hate* doing). I love this answer, too, @nosklo, but that izip_longest trick seems tailor-made for my situation.
Ben Blank
Agreed. This is the most generic and pythonic way. Clear and concise. (and works on app engine)
Matt Williamson

If the list is large, the highest-performing way to do this will be to use a generator:

def get_chunk(iterable, chunk_size):
    result = []
    for item in iterable:
        result.append(item)
        if len(result) == chunk_size:
            yield tuple(result)
            result = []
    if len(result) > 0:
        yield tuple(result)

for x in get_chunk([1,2,3,4,5,6,7,8,9,10], 3):
    print x

(1, 2, 3)
(4, 5, 6)
(7, 8, 9)
(10,)
Robert Rossney
(I think that MizardX's itertools suggestion is functionally equivalent to this.)
Robert Rossney
(Actually, on reflection, no I don't. itertools.islice returns an iterator, but it doesn't use an existing one.)
Robert Rossney

If the lists are the same size, you can combine them into lists of 4-tuples with zip(). For example:

# Four lists of four elements each.

l1 = range(0, 4)
l2 = range(4, 8)
l3 = range(8, 12)
l4 = range(12, 16)

for i1, i2, i3, i4 in zip(l1, l2, l3, l4):
    pass  # work with i1, i2, i3, i4 here

Here's what the zip() function produces:

>>> print l1
[0, 1, 2, 3]
>>> print l2
[4, 5, 6, 7]
>>> print l3
[8, 9, 10, 11]
>>> print l4
[12, 13, 14, 15]
>>> print zip(l1, l2, l3, l4)
[(0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15)]

If the lists are large, and you don't want to combine them into a bigger list, use itertools.izip(), which produces an iterator, rather than a list.

from itertools import izip

for i1, i2, i3, i4 in izip(l1, l2, l3, l4):
    pass  # work with i1, i2, i3, i4 here
Brian Clapper
+4  A: 
from itertools import izip_longest

def chunker(iterable, chunksize, filler):
    return izip_longest(*[iter(iterable)]*chunksize, fillvalue=filler)
Pedro Henriques
+1 iterators and conciseness.
A readable way to do it is
J.F. Sebastian
I've removed spaces around '=' in the arguments list (see PEP8).
J.F. Sebastian
+13  A: 

Modified from the recipes section of Python's itertools docs:

from itertools import izip_longest

def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n
    return izip_longest(*args, fillvalue=fillvalue)

In pseudocode, to keep the example terse:

grouper('ABCDEFG', 3, 'x') --> 'ABC' 'DEF' 'Gxx'

Note: izip_longest is new to Python 2.6

I know it is taken literally from documentation but I'd change the order of parameters: `grouper(iterable, chunksize)` and `izip_longest(*args, fillvalue=fillvalue)`
J.F. Sebastian
Very nice! Probably the most compact method here, considering it even combines chunking and padding. Unfortunately, it's pretty opaque. Even having read up on izip_longest, I'm still not sure how this works. :-/
Ben Blank
@J.F. Sebastian: Thanks. That does follow common convention.
Finally got a chance to play around with this in a python session. For those who are as confused as I was, this is feeding the same iterator to izip_longest multiple times, causing it to consume successive values of the same sequence rather than striped values from separate sequences. I love it!
Ben Blank
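
The shared-iterator behaviour described above can be seen directly in a short session. This is a sketch for Python 3, where `izip_longest` is named `zip_longest`; the variable names are only for illustration.

```python
from itertools import zip_longest  # named izip_longest in Python 2

letters = iter("ABCDEFG")

# Three references to the SAME iterator object, not three copies.
args = [letters] * 3
assert args[0] is args[1] is args[2]

# Each output tuple is built by calling next() on every argument in turn,
# so one shared iterator hands out consecutive items.
groups = list(zip_longest(*args, fillvalue="x"))
print(groups)
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]

# With three independent iterators the result is the usual "striped" zip.
striped = list(zip_longest(iter("AB"), iter("CD"), iter("EF")))
print(striped)
# [('A', 'C', 'E'), ('B', 'D', 'F')]
```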
What's the best way to filter back out the fillvalue? ([item for item in items if item is not fillvalue] for items in grouper(iterable))?
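
One possible approach, offered as a sketch rather than a definitive answer: pad with a private sentinel object instead of `None`, so that real values (including `None`) are never mistaken for padding, and strip the sentinel only from the final group. This is Python 3 (`zip_longest`); `_PAD` and `grouper_unpadded` are hypothetical names.

```python
from itertools import zip_longest  # izip_longest in Python 2

_PAD = object()  # hypothetical private sentinel; cannot collide with real data

def grouper_unpadded(iterable, n):
    """Yield tuples of up to n items; the last one may be shorter."""
    args = [iter(iterable)] * n
    for group in zip_longest(*args, fillvalue=_PAD):
        if group[-1] is _PAD:
            # Only the final group can contain padding; strip it.
            group = tuple(item for item in group if item is not _PAD)
        yield group

print(list(grouper_unpadded([1, 2, None, 4, 5], 2)))
# [(1, 2), (None, 4), (5,)]
```

Checking only the last element of each tuple avoids filtering every group, since padding can only appear at the end of the final one.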
+1  A: 

Since nobody's mentioned it yet, here's a zip() solution:

>>> def chunker(iterable, chunksize):
...     return zip(*[iter(iterable)]*chunksize)

It works only if your sequence's length is always divisible by the chunk size or you don't care about a trailing chunk if it isn't.


>>> s = '1234567890'
>>> chunker(s, 3)
[('1', '2', '3'), ('4', '5', '6'), ('7', '8', '9')]
>>> chunker(s, 4)
[('1', '2', '3', '4'), ('5', '6', '7', '8')]
>>> chunker(s, 5)
[('1', '2', '3', '4', '5'), ('6', '7', '8', '9', '0')]

Or using itertools.izip to return an iterator instead of a list:

>>> from itertools import izip
>>> def chunker(iterable, chunksize):
...     return izip(*[iter(iterable)]*chunksize)

Padding can be fixed using @ΤΖΩΤΖΙΟΥ's answer:

>>> from itertools import chain, izip, repeat
>>> def chunker(iterable, chunksize, fillvalue=None):
...     it   = chain(iterable, repeat(fillvalue, chunksize-1))
...     args = [it] * chunksize
...     return izip(*args)
J.F. Sebastian

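Using itertools and the two-argument form of iter, this simple function works for both sequences (tuples, lists) and other iterables, without padding:

import itertools

def grouper(n, iterable):
  """grouper(3, 'ABCDEFG') --> ABC DEF G"""
  # Take the iterator once; slicing a list directly would restart at the
  # beginning on every call and never reach the [] sentinel.
  it = iter(iterable)
  return iter(lambda: list(itertools.islice(it, n)), [])

>>> list(grouper(2, iter([1,2,3,4,5])))
[[1, 2], [3, 4], [5]]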