I have an iterator with a __len__ method defined. Questions:

If you call list(y) and y has a __len__ method defined, then __len__ is called.

   1) Why?

In my output, you will see that the len(list(y)) is 0 on the first try. If you look at the list output, you will see that on the first call, I receive an empty list, and on the second call I receive the "correct" list.

   2) Why is it returning a list of length zero at all?

   3) Why does the list length correct itself on all subsequent calls?

Also notice that calling "enumerate" is not the issue. Class C does the same thing but using a while loop and calls to next().

Code:

showcalls = False

class A(object):
    _length = None
    def __iter__(self):
        if showcalls:
            print "iter"
        self.i = 0
        return self        
    def next(self):
        if showcalls:
            print "next"
        i = self.i + 1
        self.i = i
        if i > 2:
            raise StopIteration
        else:
            return i

class B(A):
    def __len__(self):
        if showcalls:
            print "len"
        if self._length is None:
            i = 0
            # start=1 so that i ends up as the item count, not the last index
            for i,x in enumerate(self, 1):
                pass
            self._length = i
            return i
        else:
            return self._length

class C(A):
    def __len__(self):
        if showcalls:
            print "len"
        if self._length is None:
            i = 0
            while True:
                try:
                    self.next()
                except StopIteration:
                    self._length = i
                    return i
                else:
                    i += 1
        else:
            return self._length

if __name__ == '__main__':
    a = A()
    print len(list(a)), len(list(a)), len(list(a))
    print
    b = B()
    print len(list(b)), len(list(b)), len(list(b))
    print
    c = C()
    print len(list(c)), len(list(c)), len(list(c))

Output:

2 2 2

0 2 2

0 2 2
+6  A: 

If you call list(y) and y has a __len__ method defined, then __len__ is called. Why?

Because it's faster to build the resulting list with the final length, if known from the start, than to begin with an empty list and append one item at a time. And __len__ is, and must be, 100% guaranteed to be reliable.

IOW, do not implement special methods like __len__ if and when you can't return a reliable value.
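
A quick way to check this yourself is to record which special methods get touched. This is only a sketch: the `Recorder` class and its `calls` list are made up for illustration, and the fact that list() consults `__len__` at all is a CPython presizing detail, not something the language promises.

```python
class Recorder(object):
    """Hypothetical iterable that logs which special methods list() calls."""

    def __init__(self, items):
        self.items = items
        self.calls = []

    def __iter__(self):
        self.calls.append("iter")
        return iter(self.items)

    def __len__(self):
        self.calls.append("len")
        return len(self.items)


r = Recorder([1, 2])
result = list(r)   # builds [1, 2]
print(result)
print(r.calls)     # on CPython this should include "len" as well as "iter"
```

Since `__len__` here just reports the size of the underlying list and touches nothing else, list() gets a trustworthy hint and the result is unaffected.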

As for the second question, your implementations of __len__ are broken because they consume the iterator and don't return it to its pristine state. That leaves no items for the following .next() calls, so the list constructor gets a StopIteration right away and decides that your __len__ was just flaky (it's unfortunately flakier than poor list can guess...!-). On later calls the cached _length is returned without iterating, so the items are still there and the list comes out right.
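
The sequence of events can be replayed by hand. The sketch below uses a hypothetical `Counter` class standing in for B (without the `_length` cache, and with `next = __next__` added so it runs on both Python 2 and 3); it assumes list() asks for the iterator first and the length second, which is what CPython appears to do.

```python
class Counter(object):
    """Hypothetical stand-in for B: the container is its own iterator."""

    def __iter__(self):
        self.i = 0
        return self

    def __next__(self):
        self.i += 1
        if self.i > 2:
            raise StopIteration
        return self.i

    next = __next__  # Python 2 spelling of the same method

    def __len__(self):
        # Counts by iterating over self, which moves the shared cursor self.i
        return sum(1 for _ in self)


c = Counter()
it = iter(c)   # step 1: list() obtains the iterator (cursor reset to 0)
n = len(c)     # step 2: list() asks for the length (cursor runs off the end)

leftover = []  # step 3: list() pulls items with next() until StopIteration
while True:
    try:
        leftover.append(next(it))
    except StopIteration:
        break

print(n)         # 2
print(leftover)  # []
print(list(c))   # []
```

With no cache, every list(c) call comes back empty, because each call runs `__len__` again and exhausts the cursor before the first item is fetched.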

Alex Martelli
So if I set i back equal to zero in my `__len__` function, that should do the trick. Note: This is actually coming from a class that wraps around a file containing multi-line items. So I don't know the number of items in the file until I scan it once (unfortunately there is no header information in the file that tells me this).
PyProg
@PyProg, yep, making sure your `__len__` is "idempotent" (fancy talk for: won't alter the object's state if called repeatedly) is a good rule of thumb!
Alex Martelli
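
One way to get a `__len__` that leaves the object's state alone, different from resetting `self.i` by hand, is to keep no cursor on the object at all and let each `__iter__` call start an independent pass. This is a rough sketch under that assumption; the `Items` class is hypothetical and the hard-coded `(1, 2)` stands in for whatever a real pass over the data (for example, scanning the file) would yield.

```python
class Items(object):
    """Hypothetical container: every __iter__ call starts an independent pass."""

    _length = None

    def __iter__(self):
        # A generator keeps its own position, so nothing is stored on self.
        for i in (1, 2):
            yield i

    def __len__(self):
        if self._length is None:
            # Separate counting pass; iterators already handed out are untouched.
            self._length = sum(1 for _ in self)
        return self._length


items = Items()
lengths = [len(list(items)) for _ in range(3)]
print(lengths)  # prints [2, 2, 2]
```

The count still costs one full scan the first time, but after that it is cached, and calling `__len__` in the middle of an iteration no longer eats any items.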