I have a list of lists (generated with a simple list comprehension):

>>> base_lists = [[a, b] for a in range(1, 3) for b in range(1, 6)]
>>> base_lists
[[1, 1], [1, 2], [1, 3], [1, 4], [1, 5], [2, 1], [2, 2], [2, 3], [2, 4], [2, 5]]

I want to turn this entire list into a tuple containing all of the values in the lists, i.e.:

resulting_tuple = (1,1,1,2,1,3,1,4,1,5,2,1,2,2,2,3,2,4,2,5)

What would be the most effective way to do this? (A way to generate this same tuple with a list comprehension would also be an acceptable answer.) I've looked at answers here and in the Python documentation, but I have been unable to find a suitable one.

EDIT:

Many thanks to all who answered!

+9  A: 
tuple(x for sublist in base_lists for x in sublist)

Edit: note that, with base_lists this short, the genexp (with unlimited memory available) is the slowest of the approaches timed below. Consider the following file tu.py:

base_lists = [[a, b] for a in range(1, 3) for b in range(1, 6)]

def genexp():
  return tuple(x for sublist in base_lists for x in sublist)

def listcomp():
  return tuple([x for sublist in base_lists for x in sublist])

def withsum():
  return tuple(sum(base_lists,[]))

import itertools as it

def withit():
  return tuple(it.chain(*base_lists))

Now:

$ python -mtimeit -s'import tu' 'tu.genexp()'
100000 loops, best of 3: 7.86 usec per loop
$ python -mtimeit -s'import tu' 'tu.withsum()'
100000 loops, best of 3: 5.79 usec per loop
$ python -mtimeit -s'import tu' 'tu.withit()'
100000 loops, best of 3: 5.17 usec per loop
$ python -mtimeit -s'import tu' 'tu.listcomp()'
100000 loops, best of 3: 5.33 usec per loop

When lists are longer (i.e., when performance really matters) things are a bit different. E.g., putting a 100 * on the RHS defining base_lists (base_lists = 100 * [[a, b] for a in range(1, 3) for b in range(1, 6)]):

$ python -mtimeit -s'import tu' 'tu.genexp()'
1000 loops, best of 3: 408 usec per loop
$ python -mtimeit -s'import tu' 'tu.withsum()'
100 loops, best of 3: 5.07 msec per loop
$ python -mtimeit -s'import tu' 'tu.withit()'
10000 loops, best of 3: 148 usec per loop
$ python -mtimeit -s'import tu' 'tu.listcomp()'
1000 loops, best of 3: 278 usec per loop

So for long lists only withsum is a performance disaster -- the others are in the same ballpark, although itertools clearly has the edge, and list comprehensions (when abundant memory is available, as it always will be in microbenchmarks;-) are faster than genexps.

Using 1000 *, genexp slows down by about 10 times (relative to the 100 * case), withit and listcomp by about 12 times, and withsum by about 180 times (withsum is O(N squared), plus it's starting to suffer from serious heap fragmentation at that size).
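For anyone who wants to rerun the comparison without creating tu.py, here is a minimal sketch that uses the timeit module in-process. The make_base_lists and time_all helpers and the scale parameter are assumptions added for illustration, not part of the original file, and absolute numbers will differ by machine and Python version.

```python
import itertools as it
import timeit


def make_base_lists(scale=1):
    # Same comprehension as in tu.py, repeated `scale` times
    # (`scale` is a hypothetical knob standing in for the "100 *" edit).
    return scale * [[a, b] for a in range(1, 3) for b in range(1, 6)]


def genexp(base_lists):
    return tuple(x for sublist in base_lists for x in sublist)


def listcomp(base_lists):
    return tuple([x for sublist in base_lists for x in sublist])


def withsum(base_lists):
    return tuple(sum(base_lists, []))


def withit(base_lists):
    return tuple(it.chain(*base_lists))


def time_all(scale=100, number=200):
    # Returns a dict mapping function name to average seconds per call.
    base_lists = make_base_lists(scale)
    results = {}
    for func in (genexp, listcomp, withsum, withit):
        total = timeit.timeit(lambda: func(base_lists), number=number)
        results[func.__name__] = total / number
    return results


if __name__ == "__main__":
    for name, seconds in sorted(time_all().items(), key=lambda kv: kv[1]):
        print("%-10s %8.1f usec per call" % (name, seconds * 1e6))
```

Passing a larger scale to time_all (e.g. 1000) should make the gap between withsum and the others much more visible.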

Alex Martelli
Does precisely what I need. Thank you Alex!
Sean Vieira
+3  A: 
>>> sum(base_lists,[])
[1, 1, 1, 2, 1, 3, 1, 4, 1, 5, 2, 1, 2, 2, 2, 3, 2, 4, 2, 5]
>>> tuple(sum(base_lists,[]))
(1, 1, 1, 2, 1, 3, 1, 4, 1, 5, 2, 1, 2, 2, 2, 3, 2, 4, 2, 5)
gnibbler
Using `sum` for anything but numbers is a bad idea (as I've often tried to explain on SO -- I was the originator of Python's `sum` so I feel a pang of guilt whenever I see it misused;-). Hint: O(N squared).
Alex Martelli
Ouch, good to know, thanks Alex. I guess there's a good reason why it can't be O(N) :/
gnibbler
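On why it ends up quadratic: sum(base_lists, []) evaluates roughly as (([] + l1) + l2) + l3 and so on, and each + builds a brand-new list, copying everything accumulated so far. A minimal sketch that counts the copied elements, next to an in-place extend for comparison. The helper names are made up for illustration, and the counts model the copying rather than measure it.

```python
def copies_with_sum(base_lists):
    # Models sum(base_lists, []): every `acc + sublist` allocates a new list
    # and copies all len(acc) + len(sublist) elements into it.
    acc = []
    copied = 0
    for sublist in base_lists:
        acc = acc + sublist
        copied += len(acc)
    return acc, copied


def copies_with_extend(base_lists):
    # In-place extend copies only the new sublist's elements at each step.
    acc = []
    copied = 0
    for sublist in base_lists:
        acc.extend(sublist)
        copied += len(sublist)
    return acc, copied
```

With the 10 pairs from the question this model gives 110 copied elements versus 20; with 1000 pairs it gives 1,001,000 versus 2,000.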
+2  A: 

resulting_tuple = tuple(item for l in base_lists for item in l)

Tendayi Mawushe
+4  A: 
from itertools import chain
base_lists = [[a, b] for a in range(1, 3) for b in range(1, 6)]

# parenthesized so it runs on both Python 2 and Python 3
print(tuple(chain(*base_lists)))
CTT
Voted up for using itertools, which is an underestimated module
Daniel Goldberg
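A small variation, in case it is useful: chain.from_iterable (available since Python 2.6) takes the outer list directly, so the sublists do not have to be unpacked into call arguments with *. A sketch using the same base_lists:

```python
from itertools import chain

base_lists = [[a, b] for a in range(1, 3) for b in range(1, 6)]

# from_iterable walks the outer list lazily instead of unpacking it with *
resulting_tuple = tuple(chain.from_iterable(base_lists))
print(resulting_tuple)
```

This matters mostly when the outer iterable is very long or is itself a generator.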
A: 
>>> arr=[]
>>> base_lists = [[a, b] for a in range(1, 3) for b in range(1, 6)]
>>> [ arr.extend(i) for i in base_lists ]
[None, None, None, None, None, None, None, None, None, None]
>>> arr
[1, 1, 1, 2, 1, 3, 1, 4, 1, 5, 2, 1, 2, 2, 2, 3, 2, 4, 2, 5]
>>> tuple(arr)
(1, 1, 1, 2, 1, 3, 1, 4, 1, 5, 2, 1, 2, 2, 2, 3, 2, 4, 2, 5)
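The same idea without the throwaway list of None values: a plain for loop calls extend purely for its side effect. A sketch, where flatten_to_tuple is just an illustrative name:

```python
def flatten_to_tuple(list_of_lists):
    flat = []
    for sublist in list_of_lists:
        flat.extend(sublist)  # grows flat in place; extend itself returns None
    return tuple(flat)


base_lists = [[a, b] for a in range(1, 3) for b in range(1, 6)]
resulting_tuple = flatten_to_tuple(base_lists)
```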