In Python the interface of an iterable is a subset of the iterator interface. This has the advantage that in many cases they can be treated in the same way. However, there is an important semantic difference between the two, since for an iterable __iter__
returns a new iterator object and not just self
. How can I test that an iterable is really an iterable and not an iterator? Conceptually I understand iterables to be collections, while an iterator only manages the iteration (i.e. keeps track of the position) but is not a collection itself.
The difference is for example important when one wants to loop multiple times. If an iterator is given then the second loop will not work since the iterator was already used up and directly raises StopIteration
.
It is tempting to test for a next
method, but this seems dangerous and somehow wrong. Should I just check that the second loop was empty?
Is there any way to do such a test in a more pythonic way? I know that this sound like a classic case of LBYL against EAFP, so maybe I should just give up? Or am I missing something?
Edit: S.Lott says in his answer below that this is primarily a problem of wanting to do multiple passes over the iterator, and that one should not do this in the first place. However, in my case the data is very large and depending on the situation has to be passed over multiple times for data processing (there is absolutely no way around this).
The iterable is also provided by the user, and for situations where a single pass is enough it will work with an iterator (e.g. created by a generator for simplicity). But it would be nice to safeguard against the case were a user provides only an iterator when multiple passes are needed.
Edit 2:
Actually this is a very nice Example for Abstract Base Classes. The __iter__
methods in an iterator and an iterable have the same name but are sematically different! So hasattr
is useless, but isinstance
provides a clean solution.