def flattenList(toFlatten):
    final = []
    for el in toFlatten:
        if isinstance(el, list):
            # Recurse into nested sublists and splice their items in.
            final.extend(flattenList(el))
        else:
            final.append(el)
    return final

When I don't know how deeply the lists will nest, this is the only way I can think of to do this.

+1  A: 

This blog post covers a variety of flattening strategies. There's a little more discussion over here.

Callahad
That second link is full of sorely-outdated, originally-awful code.
Mike Graham
+7  A: 
  1. You should avoid typechecking in Python. In this case, this means avoiding arbitrarily-nested structures where you distinguish by type. You can build your own node type which you can traverse by methods other than typechecking, like looking at a specific attribute.

  2. For flattening one level or exactly n levels, look at itertools.chain.from_iterable. (A short sketch follows this list.)

  3. I don't know what you mean by "functional". This code is pretty functional: it uses recursion (not to its credit!) and it doesn't mutate its argument. (Strictly speaking, it does use mutable state for building a list, but that is just how you do it in Python.)

  4. I suppose one more functional attribute would be lazy evaluation. You could implement this thusly:

    def flatten(toFlatten):
        for item in toFlatten:
            if isinstance(item, list): # Ewww, typechecking
                for subitem in flatten(item): # they are considering adding
                    yield subitem             # "yield from" to the language
                                              # to give this pattern syntax
            else:
                yield item
  5. Recursion is very limited in Python (at least, in all its major implementations) and should generally be avoided for arbitrary depth. It is quite possible to rewrite this (and all recursive code) to use iteration, which will make it more scalable (and less functional, which is a good thing in Python, a language not especially suited to FP). An iterative sketch follows below.
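
As a quick illustration of point 2, a sketch (not from the original answer) of one-level flattening with `itertools.chain.from_iterable`:

    from itertools import chain

    one_level = [[1, 2], [3, [4]], [5]]
    # chain.from_iterable flattens exactly one level, lazily:
    # deeper nesting survives, so [4] comes through as a list.
    print(list(chain.from_iterable(one_level)))  # [1, 2, 3, [4], 5]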
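
And a minimal sketch of the iterative rewrite suggested in point 5, using an explicit stack in place of the call stack (the `isinstance` check is kept only to match the original code; `flatten_iterative` is an illustrative name):

    def flatten_iterative(toFlatten):
        # An explicit stack of iterators replaces recursion, so deep
        # nesting cannot hit Python's recursion limit.
        stack = [iter(toFlatten)]
        while stack:
            for item in stack[-1]:
                if isinstance(item, list):
                    stack.append(iter(item))
                    break
                else:
                    yield item
            else:
                # Top iterator exhausted; drop back to its parent.
                stack.pop()

    print(list(flatten_iterative([1, [2, [3, [4]]], 5])))  # [1, 2, 3, 4, 5]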

Mike Graham
@Mike Graham: I want to be able to pass in lists containing lists containing lists, etc., and flatten them all the way down to one single result. I want [1,2,[3,4,5,6], 7,8,[9,10,[11,12,[13,[14,15],16],17],20]] to come out as [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 20]. If I knew in advance how deeply the lists nested, this would suffice: `def mergeLists(seq): return reduce(lambda x,y: x+y, seq)`
Ishpeck
1) Stop wanting that. Define your structure differently. 2) Your `reduce` strategy is multiplicative in big-O order; the linear algorithms are obvious.
Mike Graham
A more general answer: http://stackoverflow.com/questions/2158395/flatten-an-irregular-list-of-lists-in-python/2158532#2158532
J.F. Sebastian
I like that answer less than this bad answer. Typechecking sucks, but if you're using type to indicate data, it should be a *very specific type*. For example, I should be free to make my *own* string-ish type that iterates over length-1 sequences of itself and doesn't subclass basestring. If I *were* going to use type to indicate this information, not only would I limit it to exactly `list`, I'd probably subclass `list` so that I could typecheck for *exactly* what I want.
Mike Graham
It's worth noting nested lists like [1,2,[3,4,5,6], 7,8,[9,10,[11,12,[13,[14,15],16],17],20]] are really trees: Node([Leaf(1),Leaf(2),Node([Leaf(3),Leaf(4)... etc.
sdcvvc
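
A rough sketch of that tree view, and of the attribute-based traversal the answer's point 1 alludes to (`Leaf`, `Node`, and the `flatten` method here are illustrative names, not code from this thread):

    class Leaf(object):
        def __init__(self, value):
            self.value = value

        def flatten(self):
            yield self.value

    class Node(object):
        def __init__(self, children):
            self.children = children

        def flatten(self):
            # Dispatch on a method instead of isinstance checks.
            for child in self.children:
                for value in child.flatten():
                    yield value

    tree = Node([Leaf(1), Leaf(2), Node([Leaf(3), Leaf(4)])])
    print(list(tree.flatten()))  # [1, 2, 3, 4]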
+1  A: 

Here's another option (though there may be something cleaner than type-checking, like testing if something is iterable and hence not an "atom"):

def flatten(lst):
    # An atom (non-list) flattens to a one-element list.
    if not isinstance(lst, list):
        return [lst]
    else:
        # Flatten each child, then concatenate the pieces.
        return reduce(lambda x, y: x + y, [flatten(x) for x in lst], [])

It's based on something scheme-like.
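
A quick usage example (this assumes Python 2, where `reduce` is a builtin):

    print(flatten([1, [2, [3, 4], 5]]))  # [1, 2, 3, 4, 5]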

mitmatt
`try: return reduce(...); except TypeError: return [lst]`
Debilski
That's perfect. Thanks.
Ishpeck
@mitmatt, 1) The function `lambda x, y: x + y` is called `operator.add`. 2) Your algorithm is so unnecessarily O(n * m)ish, when the linear algorithm is more obvious.
Mike Graham
@Debilski, Normally exception handling would be better form, but recursive flattening is tricky. Try thinking about what will happen if you have a string in there!
Mike Graham
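
Concretely, the duck-typed variant Debilski suggested would look like the sketch below, and it never terminates on strings, because iterating a one-character string yields that same string again:

    def flatten_duck(lst):
        # Duck typing: treat anything iterable as a nested sequence.
        try:
            return reduce(lambda x, y: x + y,
                          [flatten_duck(x) for x in lst], [])
        except TypeError:
            return [lst]

    # flatten_duck(['ab'])  # blows the recursion limit: iterating 'ab'
    #                       # yields 'a', and iterating 'a' yields 'a' again.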
@Mike Graham: True. That wouldn’t be nice.
Debilski
@Mike, I'm not sure what your m and n refer to, but the code I provided is linear in the number of atoms in the deep-list argument: each atom is listified exactly once, and the reduce will only call + once for each of its recursively-flattened args (meaning only one append per atom). Calling operator.add means another import statement, but it's certainly a valid alternative!
mitmatt
@mitmatt, I posted an answer that hopefully explains the quadratic nature of your algorithm.
Mike Graham
Easier yet: replace `reduce(lambda x,y:x+y,` with `sum(`.
Debilski
@Mike, your post seems to have a mistake: concatenating linked lists doesn't require copying the elements, but rather just a single pointer operation. Each of those + operations is constant time.
mitmatt
@Debilski, This is still quadratic, which is unavoidable as long as the result is built by repeated concatenation. Also, `sum` defaults to starting the sum with 0, which isn't appropriate in this case; you'd have to pass a `start` param.
Mike Graham
@mitmatt, Python lists **aren't** linked lists. They are mutable array-like objects that copy when you use `+`.
Mike Graham
@Mike Graham: The start param would be there, if following the replacement rule I gave. But I did not mean that it would generally be faster, it was just another option to calling `operator.add`.
Debilski
Though `sum` is faster than `reduce` + `lambda`.
Debilski
@Mike, that's actually implementation-dependent and not specified by the Python language, but I think you'll find that the CPython implementation provides (amortized) constant-time appends: http://wiki.python.org/moin/TimeComplexity
mitmatt
@Debilski, Using `reduce` or `sum` is slow either way, because both are O(n^2) and the right algorithm to use is O(n).
Mike Graham
My mistake: the appropriate function in that reference is probably "extend", which does indeed require more operations. From a data structures perspective, that's not entirely necessary (e.g. [1]), but it does seem to be the case with CPython. [1] http://en.wikipedia.org/wiki/Deque#Implementations
mitmatt
@mitmatt, `collections.deque` is implemented using a doubly-linked list. `list` is implemented using a resizing array, and supports the appropriate operations (like O(1) indexing). In any event, even if list *were* a doubly linked list (which it isn't!), it isn't clear whether the right thing to do with concatenation is copying or not, if it is mutable.
Mike Graham
@mitmatt, any implementation that claims to be Python but doesn't have algorithmic complexities identical to CPython's `list` should have its developers shot.
Mike Graham
+2  A: 

Under the doc for itertools, there's a flatten() function

ghostdog74
Not on that page there isn't.
Andrew Aylett
@Andrew Aylett, It is a recipe, not in the module itself. It's on that page.
Mike Graham
@Mike: Admit it, you edited the documentation after I posted my comment. Not sure how I missed that -- it didn't come up when I searched the page at work :P.
Andrew Aylett
+2  A: 

This answer explains why you do not want to use reduce for this in Python.

Consider the snippet

reduce(operator.add, [[1], [2], [3], [4], [5]])

What work does this have to do?

[1] + [2] => [1, 2]
[1, 2] + [3] => This makes a new list, copying over 1, then 2, then appending 3: [1, 2, 3]
[1, 2, 3] + [4] => This has to copy the 1, 2, and 3 and then put 4 in the new list
[1, 2, 3, 4] + [5] => The length of stuff I have to copy gets bigger each time!

This quadratic behavior is completely avoidable: the original solution (and any number of other solutions) performs none of these intermediate copying steps.
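
A sketch of the linear alternative for this one-level case (extending one result list, so each element is copied exactly once; this assumes Python 2, where `reduce` is a builtin):

    import operator

    lists = [[1], [2], [3], [4], [5]]

    # Quadratic: every `+` copies the whole accumulated list so far.
    quadratic = reduce(operator.add, lists)

    # Linear: one result list; each element is appended exactly once.
    linear = []
    for sub in lists:
        linear.extend(sub)

    assert quadratic == linear  # same result, very different cost at scale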

Mike Graham