views:

551

answers:

7

hi all,

what is the best way to divide a list into roughly equal parts? for example, if I have a list with 54 elements and i want to split it into 3 roughly equal parts? I'd like the parts to be as even as possible, hopefully assigning the elements that do not fit in a way that gives the most equal parts. to give a concrete case, if the list has 7 elements and I want to split it into 2 parts, I'd ideally want to get the first 3 elements in one bin, the second should have 4 elements.

so to summarize I'm looking for something like even_split(l, n) that breaks l into roughly n-different parts.

the best I can think of is something that was posted here:

def chunks(l, n):
    """ Yield successive n-sized chunks from l.
    """
    for i in xrange(0, len(l), n):
        yield l[i:i+n]

example:

l = range(54)
chunks_of_three = chunks(l, 3)

but this gives chunks of 3, rather than 3 equal parts. I could simply iterate over this and take the first element of each column, call that part one, then take the second and put it in part two, etc. but that seems inefficient and inelegant. it also breaks the order of the list.

any ideas on this?

thanks.

A: 

Uses grouper():

def chunks(seq, num):
  return grouper(int(math.ceil(len(seq) / float(num))), seq)
Ignacio Vazquez-Abrams
thanks -- is there a way to do this without a filler at all? I.e. have the last list not be the same size as the previous.
Not with this method.
Ignacio Vazquez-Abrams
A: 

Here's one that could work:

def chunkIt(seq, num):
  avg = len(seq) / float(num)
  out = []
  last = 0.0

  while last < len(seq):
    out.append(seq[int(last):int(last + avg)])
    last += avg

  return out

Testing:

>>> chunkIt(range(10), 3)
[[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]
>>> chunkIt(range(11), 3)
[[0, 1, 2], [3, 4, 5, 6], [7, 8, 9, 10]]
>>> chunkIt(range(12), 3)
[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
Max Shawabkeh
+3  A: 

Changing the code to yield n chunks rather than chunks of n:

def chunks(l, n):
    """ Yield n successive chunks from l.
    """
    newn = int(len(l) / n)
    for i in xrange(0, n-1):
        yield l[i*newn:i*newn+newn]
    yield l[n*newn-newn:]

l = range(56)
three_chunks = chunks (l, 3)
print three_chunks.next()
print three_chunks.next()
print three_chunks.next()

which gives:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
[18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]
[36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55]

This will assign the extra elements to the final group which is not perfect but well within your specification of "roughly N equal parts" :-) By that, I mean 56 elements would be better as (19,19,18) whereas this gives (18,18,20).

You can get the more balanced output with the following code:

#!/usr/bin/python
def chunks(l, n):
    """ Yield n successive chunks from l.
    """
    newn = int(1.0 * len(l) / n + 0.5)
    for i in xrange(0, n-1):
        yield l[i*newn:i*newn+newn]
    yield l[n*newn-newn:]

l = range(56)
three_chunks = chunks (l, 3)
print three_chunks.next()
print three_chunks.next()
print three_chunks.next()

which outputs:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
[19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37]
[38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55]
paxdiablo
this gives me a strange result. for p in chunks(range(54), 3): print len(p) returns 18, 18, 51...
Fixed, that, it was the final yield.
paxdiablo
+1  A: 

Here is one that adds None to make the lists equal length

>>> from itertools import izip_longest
>>> def chunks(l, n):
    """ Yield n successive chunks from l. Pads extra spaces with None
    """
    return list(zip(*izip_longest(*[iter(l)]*n)))

>>> l=range(54)

>>> chunks(l,3)
[(0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48, 51), (1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52), (2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, 35, 38, 41, 44, 47, 50, 53)]

>>> chunks(l,4)
[(0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52), (1, 5, 9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53), (2, 6, 10, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50, None), (3, 7, 11, 15, 19, 23, 27, 31, 35, 39, 43, 47, 51, None)]

>>> chunks(l,5)
[(0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50), (1, 6, 11, 16, 21, 26, 31, 36, 41, 46, 51), (2, 7, 12, 17, 22, 27, 32, 37, 42, 47, 52), (3, 8, 13, 18, 23, 28, 33, 38, 43, 48, 53), (4, 9, 14, 19, 24, 29, 34, 39, 44, 49, None)]
gnibbler
+1, you are right! thanks gnibbler.
S.Mark
A: 

You can write it fairly simply as a list generator:

def split(a, n):
    k, m = len(a) / n, len(a) % n
    return (a[i * k + min(i, m):(i + 1) * k + min(i + 1, m)] for i in xrange(n))

Example:

>>> list(split(range(11), 3))
[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10]]
tixxit
A: 

As long as you don't want anything silly like continuous chunks:

>>> def chunkify(lst,n):
...     return [ lst[i::n] for i in xrange(n) ]
... 
>>> chunkify( range(13), 3)
[[0, 3, 6, 9, 12], [1, 4, 7, 10], [2, 5, 8, 11]]
job
I wouldn't say continuous chunks are silly. Perhaps you'd like to keep the chunks sorted (ie. chunk[0] < chunk[1]), for instance.
tixxit
I was kidding. But if you really didn't care, this way with list comprehension is nice and concise.
job
+1  A: 

Have a look at numpy.split:

>>> a = numpy.array([1,2,3,4])
>>> numpy.split(a, 2)
[array([1, 2]), array([3, 4])]
dalloliogm