When I'm teaching someone programming (just about any language) I introduce for
loops with terminology similar to this code example:
for eachItem in someList:
doSomething(eachItem)
... which, conveniently enough, is syntactically valid Python code.
The Python range()
function simple returns or generates a list of integers from some lower bound (zero, by default) up to (but not including) some upper bound, possibly in increments (steps) of some other number (one, by default).
So range(5)
returns (or possibly generates) a sequence: 0, 1, 2, 3, 4 (up to but not including the upper bound).
A call to range(2,10)
would return: 2, 3, 4, 5, 6, 7, 8, 9
A call to `range(2,12,3) would return: 2, 5, 8, 11
Notice that I said, a couple times, that Python's range()
function returns or generates a sequence. This is a relatively advanced distinction which usually won't be an issue for a novice. In older versions of Python range()
built a list (allocated memory for it and populated with with values) and returned a reference to that list. This could be inefficient for large ranges which might consume quite a bit of memory and for some situations where you might want to iterate over some potentially large range of numbers but were likely to "break
" out of the loop early (after finding some particular item in which you were interested, for example).
Python supports more efficient ways of implementing the same semantics (of doing the same thing) through a programming construct called a generator. Instead of allocating and populating the entire list and return it as a static data structure, Python can instantiate an object with the requisite information (upper and lower bounds and step/increment value) ... and return a reference to that. The object then keeps track of which number it returned most recently and computes the new values until it hits the upper bound (and which point it signals the end of the sequence to the caller using an exception called "StopIteration"). This technique (computing values dynamically rather than all at once, up-front) is referred to as "lazy evaluation."
Other constructs in the language (such as those underlying the for
loop) can then work with that object as though it were a list.
For most cases don't have to know whether your version of Python is using the old implementation of range()
or the newer one based on generators. You can just use it and be happy. If you're working with ranges of millions of items, or creating thousands of different ranges of thousands each, then you might notice a performance penalty for using range()
on an old version of Python. In such cases you could re-think your design and use while
loops, or create objects which implement the "lazy evaluation" semantics of a generator, or use the xrange()
version of range()
if your version of Python includes it, or the range()
function from a version of Python that uses the generators implicitly.
Concepts such as generators, and more general forms of lazy evaluation, permeate Python programming as you go beyond the basics. They are usually things you don't have to know for simple programming tasks but which become significant as you try to work with larger data sets or within tighter constraints (time/performance or memory bounds, for example).