a = [(1,2),(3,1),(4,4),(5,5),(5,5),(5,5),(5,5),(5,5),(5,5),(5,5),(5,5),(5,5),(5,5),(5,5),(5,5),(5,5),(5,5),(5,5)]
# There are a lot of tuples in the list (a 6-digit count).
# I want to split it into rows and columns.
rows = 5
cols = 5

The target data structure is below; rows and cols are the indices into the big list:
[rows, cols, (data)]

I use a loop to do this, but it takes too long when processing a large number of tuples.

processed_data = []
index = 0
for h in range(0, rows - 1):
    for w in range(0, cols - 1):
        li = []
        li = [h, w, a[index]]
        processed_data.append(li)
        index += 1

This operation takes too long. Is there a way to optimize it? Thanks very much!

+2  A: 

It's not at all clear to me what you want, but here's a shot at the same loop in a more optimized manner:

import itertools as it

index = it.count(0) 
processed_data = [[h, w, a[next(index)]] 
                 for h in xrange(0, rows - 1)
                 for w in xrange(0, cols - 1)]

or, since you've already imported itertools,

index = it.count(0)
indices = it.product(xrange(0, rows-1), xrange(0, cols-1))
processed_data = [[h, w, a[next(index)]] for h, w in indices]

These are faster because they use list comprehensions instead of explicit for loops. List comprehensions have their own opcode, LIST_APPEND, which appends directly to the list that's being constructed. In a normal for loop, the virtual machine has to go through the whole process of looking up the append method on the list object and then calling it on every iteration, which is fairly pricey.
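One way to check this on your own machine is to build the same list both ways and time them. The sketch below is not part of the original answer: the function names `with_loop` and `with_comprehension`, the sizes, and the stand-in data are made up for illustration, and it is written for Python 3 (`range` in place of `xrange`). Timings vary by interpreter version, so none are quoted here.

```python
import itertools as it
import timeit

rows, cols = 301, 301                      # hypothetical sizes
a = [(i, i) for i in range(rows * cols)]   # stand-in data

def with_loop():
    # Explicit loop: looks up and calls out.append on every iteration.
    out = []
    index = 0
    for h in range(rows - 1):
        for w in range(cols - 1):
            out.append([h, w, a[index]])
            index += 1
    return out

def with_comprehension():
    # Same result, built by a list comprehension over it.product.
    index = it.count(0)
    return [[h, w, a[next(index)]]
            for h, w in it.product(range(rows - 1), range(cols - 1))]

if __name__ == "__main__":
    print("loop:         ", timeit.timeit(with_loop, number=5))
    print("comprehension:", timeit.timeit(with_comprehension, number=5))
```

Both functions return identical lists, so any difference in the printed times comes from how the list is built rather than from what it contains.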

Also, itertools is implemented in C, so if it's not faster for the same algorithm, then there's a bug in itertools.

aaronasterling
Yes, it's about 3 seconds shorter!
@user469652, how long was it taking before?
aaronasterling
8 seconds before.
+1. Looks good to me
Merlyn Morgan-Graham
A: 

Sounds like you want to split it into evenly-sized chunks.
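For reference, chunking could look roughly like the sketch below. The helper name `chunks`, the sample data, and the row width are hypothetical and not taken from this thread; the code is written for Python 3.

```python
def chunks(seq, size):
    """Yield successive slices of seq, each holding at most `size` items."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

a = [(1, 2), (3, 1), (4, 4), (5, 5), (5, 5), (5, 5)]
cols = 3  # hypothetical row width

grid = list(chunks(a, cols))

# grid[h][w] is the tuple at row h, column w; the indices can be
# recovered while iterating instead of being stored with each item.
for h, row in enumerate(grid):
    for w, item in enumerate(row):
        print(h, w, item)
```

With this layout the row and column are implied by position, so nothing extra is stored per element.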

Ignacio Vazquez-Abrams
No, that can't be right, because in the code given only one element of the old list ends up in each element of the new list.
aaronasterling
@aaronsterling: The only difference is that it doesn't include `h` and `w`. But this isn't necessarily a problem since you can derive them from where you are in the nested structure via `enumerate()`.
Ignacio Vazquez-Abrams
@ignacio, no, the difference is that the generator you linked to returns slices of the list, because the OP of that question wanted chunks of data. The OP of this question wants essentially the same list decorated with additional indices.
aaronasterling
@aaronsterling: The adornments can be derived on the other side. Yes, it means a little more work on the other side, but it means less work on this side.
Ignacio Vazquez-Abrams
+2  A: 

Fine, if you really want the indices that badly...

import itertools
[divmod(i, cols) + (x,) for i, x in itertools.izip(itertools.count(), a)]
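To make the one-liner easier to follow: `divmod(i, cols)` returns `(i // cols, i % cols)`, which is the row and column of flat position `i`. Below is a small Python 3 sketch of the same idea, with `enumerate` swapped in for `izip(count(), a)` and with made-up sample data and row width.

```python
a = [(1, 2), (3, 1), (4, 4), (5, 5), (5, 5), (5, 5)]
cols = 3  # hypothetical row width

# divmod(i, cols) == (i // cols, i % cols), i.e. (row, column).
# Adding the one-element tuple (x,) appends the data item.
processed_data = [divmod(i, cols) + (x,) for i, x in enumerate(a)]

print(processed_data[0])  # (0, 0, (1, 2))
print(processed_data[4])  # (1, 1, (5, 5))
```

Note that this yields tuples rather than lists, and that it covers every element of `a` using the full row width `cols`, whereas the loop in the question stops at `rows - 1` and `cols - 1`.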
Ignacio Vazquez-Abrams