What is the rationale behind the advocated use of for i in xrange(...)-style looping constructs in Python? For simple integer looping, the difference in overheads is substantial. I conducted a simple test using two pieces of code:
File idiomatic.py:
#!/usr/bin/env python
M = 10000
N = 10000
if __name__ == "__main__":
    x, y = 0, 0
    for x in xrange(N):
        for y in xrange(M):
            pass
File cstyle.py:
#!/usr/bin/env python
M = 10000
N = 10000
if __name__ == "__main__":
    x, y = 0, 0
    while x < N:
        while y < M:
            y += 1
        x += 1
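Note, as conclusion 2 below acknowledges, that the inner counter y in cstyle.py is never reset, so the inner loop body runs only during the first outer pass. A small counting helper (hypothetical, not from the original post) makes the difference concrete:

```python
def count_iterations(n, m, reset_inner):
    """Count how many times the inner loop body runs."""
    count = 0
    x = 0
    y = 0
    while x < n:
        if reset_inner:
            y = 0          # a faithful C-style version resets y here
        while y < m:
            y += 1
            count += 1
        x += 1
    return count

# Without the reset, the inner body runs only m times in total,
# not n * m times -- which is why cstyle.py finished so quickly.
```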
Profiling results were as follows:
bash-3.1$ time python cstyle.py
real 0m0.109s
user 0m0.015s
sys 0m0.000s
bash-3.1$ time python idiomatic.py
real 0m4.492s
user 0m0.000s
sys 0m0.031s
I can understand why the Pythonic version is slower -- I imagine it has a lot to do with calling xrange N times; perhaps this could be eliminated if there were a way to rewind a generator. However, with this much difference in execution time, why would one prefer the Pythonic version?
Edit: I conducted the tests again using the code Mr. Martelli provided, and the results were indeed much better.
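For readers without that code to hand: a minimal sketch of the usual fix (an assumption on my part, following conclusion 1 below) is to move the loops into a function so the counters become fast locals. The loop bounds are shrunk here so the sketch runs quickly; the post used 10000.

```python
#!/usr/bin/env python
try:
    xrange              # Python 2
except NameError:
    xrange = range      # shim so the sketch also runs on Python 3

M = 1000                # the post used 10000; smaller here for speed
N = 1000

def main():
    # Inside a function, x and y are local variables; CPython
    # resolves locals much faster than module-level globals.
    for x in xrange(N):
        for y in xrange(M):
            pass

if __name__ == "__main__":
    main()
```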
I thought I'd enumerate the conclusions from the thread here:
1) Lots of code at module scope is a bad idea, even if the code is enclosed in an if __name__ == "__main__": block.
2) Curiously enough, modifying the code that belonged to thebadone to my incorrect version (letting y grow without resetting) produced little difference in performance, even for larger values of M and N.
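Not from the thread, but if you want to re-run the comparison fairly on your own machine: wrap both loop styles in functions, so each gets local-variable lookups and performs the same number of iterations, and time them with timeit:

```python
import timeit

def idiomatic(n):
    for i in range(n):      # xrange on Python 2
        pass

def cstyle(n):
    i = 0
    while i < n:            # same iteration count, manual increment
        i += 1

if __name__ == "__main__":
    for fn in (idiomatic, cstyle):
        secs = timeit.timeit(lambda: fn(10 ** 5), number=100)
        print("%-10s %.3f s" % (fn.__name__, secs))
```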