Python has a flag -O that you can execute the interpreter with. The option will generate "optimized" bytecode (written to .pyo files), and given twice, it will discard docstrings. From Python's man page:

   -O     Turn on basic optimizations.  This changes the  filename  exten‐
          sion  for  compiled  (bytecode)  files from .pyc to .pyo.  Given
          twice, causes docstrings to be discarded.

This option's two major features as I see it are:

  • Strip all assert statements. This trades defense against corrupt program state for speed. But don't you need a ton of assert statements for this to make a difference? Do you have any code where this is worthwhile (and sane)?

  • Strip all docstrings. In what application is memory usage so critical that this is a win? Why not push everything into modules written in C?

What is the use of this option? Does it have a real-world value?

+4  A: 

On stripping assert statements: this is a standard option in the C world, where many people believe part of the definition of ASSERT is that it doesn't run in production code. Whether stripping them out or not makes a difference depends less on how many asserts there are than on how much work those asserts do:

def foo(x):
    assert x in huge_global_computation_to_check_all_possible_x_values()
    # ok, go ahead and use x...

Most asserts are not like that, of course, but it's important to remember that you can do stuff like that.
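To make the point concrete, here is a minimal sketch (Python 3) that emulates running with and without -O inside one process by passing the optimize argument to the built-in compile(). The names expensive_check, load_foo and the counter list are hypothetical and exist only for illustration; the real cost would come from whatever the assertion actually computes.

```python
# Hypothetical illustration: none of these names come from the answer above.
calls = []

def expensive_check(x):
    # Stand-in for a costly validation; records each call so we can count them.
    calls.append(x)
    return True

SOURCE = """
def foo(x):
    assert expensive_check(x)
    return x * 2
"""

def load_foo(optimize):
    # compile()'s optimize argument mirrors the interpreter flags:
    # 0 = no -O, 1 = -O, 2 = -OO.
    namespace = {"expensive_check": expensive_check}
    exec(compile(SOURCE, "<sketch>", "exec", optimize=optimize), namespace)
    return namespace["foo"]

foo_debug = load_foo(0)      # assert is kept
foo_optimized = load_foo(1)  # assert is stripped

foo_debug(21)
calls_after_debug = len(calls)       # the check ran once
foo_optimized(21)
calls_after_optimized = len(calls)   # unchanged: the check never ran

print(calls_after_debug, calls_after_optimized)
```

Both functions return the same result; only the version compiled without optimization pays for the check.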

As for stripping docstrings, it does seem like a quaint holdover from a simpler time, though I guess there are memory-constrained environments where it could make a difference.

Ned Batchelder
history is important, good point. However, I don't want to see toy examples, I want to see what asserts are used in real-world code and if it makes a difference.
kaizer.se
Memory speed is growing far slower than CPU speed, *especially* if you consider that we keep adding processors faster than adding memory bandwidth. So, memory is the new disk and L2 cache is the new memory. And L2 caches are *tiny* (compared to memory), and they actually keep getting smaller. (Core2 has 6144KiB, i7 only 256KiB, for example.) So, counting bytes is actually becoming useful again.
Jörg W Mittag
+2  A: 

You've pretty much figured it out: It does practically nothing at all. You're almost never going to see speed or memory gains, unless you're severely hurting for RAM.

Jay P.
+1  A: 

I imagine that the heaviest users of -O are py2exe, py2app, and similar tools.

I've personally never found a use for -O directly.

Nick Craig-Wood
and *why* does py2exe use it?
kaizer.se
When creating the stand-alone executable, there is no need for docstrings. They only take up space in memory.
gnud
+1  A: 

If you have assertions in frequently called code (e.g. in an inner loop), stripping them can certainly make a difference. Extreme example:

$ python    -c 'import timeit;print timeit.repeat("assert True")'
[0.088717937469482422, 0.088625192642211914, 0.088654994964599609]
$ python -O -c 'import timeit;print timeit.repeat("assert True")'
[0.029736995697021484, 0.029587030410766602, 0.029623985290527344]

In real scenarios, savings will usually be much less.

Stripping the docstrings might reduce the size of your code, and hence your working set.

In many cases, the performance impact will be negligible, but as always with optimizations, the only way to be sure is to measure.
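As a rough way to measure this on your own code, the following Python 3 sketch times the same loop compiled with and without assertion stripping. It uses compile() with optimize=0 and optimize=1 as a stand-in for running the interpreter without and with -O; the function work and the sample data are assumptions made up for the example, and the timings will vary by machine.

```python
import timeit

# Hypothetical workload: an assertion inside a frequently executed loop.
SOURCE = """
def work(values):
    total = 0
    for v in values:
        assert v >= 0, "values must be non-negative"
        total += v
    return total
"""

def build(optimize):
    # optimize=0 keeps asserts (like plain `python`),
    # optimize=1 strips them (like `python -O`).
    namespace = {}
    exec(compile(SOURCE, "<sketch>", "exec", optimize=optimize), namespace)
    return namespace["work"]

values = list(range(1000))
work_plain = build(0)
work_optimized = build(1)

# Take the best of three runs to reduce noise from other processes.
t_plain = min(timeit.repeat(lambda: work_plain(values), number=200, repeat=3))
t_opt = min(timeit.repeat(lambda: work_optimized(values), number=200, repeat=3))

print(f"with asserts:    {t_plain:.4f}s")
print(f"asserts removed: {t_opt:.4f}s")
```

The gap between the two numbers shrinks as the real work in the loop body grows relative to the assertion.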

oefe
this question is about real-world code. btw, this is more practical: `python -mtimeit "" "assert(True)"` (setup in first argument)
kaizer.se
This seems to be a strange example to me. You reduce code that is trivial to code that is nonexistent; that doesn't show much about practical speed gains, I think. A realistic use case would be an operation that makes a lot of assumptions that are expensive to check compared to performing the operation, but that you believe should always be satisfied. For example, if I'm trying to return the roots of a parabola, I could check that b**2 - 4*a*c >= 0 to ensure real roots, if that's what I am interested in. Many useful formulae have lots of constraints.
Mike Graham
Also, `assert` is a statement that is meant to be used like `assert True`, not `assert(True)`. This becomes important when you add the message, as `assert a == b, "Must be true"` is very different from `assert(a == b, "Must be true")`; in particular, the latter always passes, because it asserts a non-empty tuple.
Mike Graham
@kaizer.se: no, stmt is the first argument and setup is the second; in your example, the assert would be in the setup, so -O has no measurable effect
oefe
@Mike: of course it's strange, as are most examples reduced to the extreme. Basically, the optimized version measures the overhead of the timeit loop, and the unoptimized version shows the overhead of the assert itself. Real-life savings may be more or less, depending on what's more expensive: your working code or the assertions. Often, but not always, assertions are relatively trivial, so I would claim that the savings will usually be less. Thanks for the reminder about the parentheses, I removed them!
oefe
@oefe: setup is the first argument, then stmt, on the command-line.
kaizer.se
+2  A: 

I have never encountered a good reason to use -O. I have always assumed its main purpose is in case at some point in the future some meaningful optimization is added.

Mike Graham
The Python devs fell for YAGNI!?
kaizer.se
Well, it does do a couple things, they just aren't typically all that useful.
Mike Graham
+1  A: 

Another use for the -O flag is that the value of the __debug__ builtin variable is set to False.

So, basically, your code can have a lot of "debugging" paths like:

if __debug__:
     # output all your favourite debugging information
     # and then more

which, when running under -O, won't even be included as bytecode in the .pyo file; a poor man's C-ish #ifdef.

Remember that docstrings are being dropped only when the flag is -OO.
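A small Python 3 sketch of both effects, using compile() with its optimize argument (0, 1, 2) to stand in for running the interpreter with no flag, -O, and -OO. The function name report and its strings are made up for illustration.

```python
# Hypothetical source: `report` is an illustrative name only.
SOURCE = '''
def report():
    """Docstring that -OO would discard."""
    if __debug__:
        return "debug path"
    return "optimized path"
'''

def build(optimize):
    # optimize: 0 = no flag, 1 = -O (__debug__ is False),
    # 2 = -OO (__debug__ is False and docstrings are dropped).
    namespace = {}
    exec(compile(SOURCE, "<sketch>", "exec", optimize=optimize), namespace)
    return namespace["report"]

for level in (0, 1, 2):
    fn = build(level)
    print(level, fn(), repr(fn.__doc__))

# Prints:
# 0 debug path 'Docstring that -OO would discard.'
# 1 optimized path 'Docstring that -OO would discard.'
# 2 optimized path None
```

Because __debug__ is resolved at compile time, the choice between the two paths is fixed when the bytecode is generated, not when the function is called.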

ΤΖΩΤΖΙΟΥ
Well, that makes me think of another fundamental problem with -O: generally, it is the developer's role to decide what happens with and without optimization mode, but normally it is the user who decides how the Python interpreter is invoked! This makes your example next to useless, since you can't rely on a specific mode being used.
kaizer.se
Wow. I thought you wanted to know what is the real world use of this option. Thanks for finding my answer next to useless. By the way, if you want someone to justify the choices of Guido and the rest of the Python core team, you shouldn't be asking questions here; finally, you *can* rely on a specific mode being used, the programmer can control whether optimization is used or not; ask a relevant question in SO as to how. I hereby declare your assumptions next to wrong and my time next to lost. Cheers. Sorry for disappointing you.
ΤΖΩΤΖΙΟΥ
There is no reason for me to be disappointed about getting lots of answers to my question -- I like the conversations on Stack Overflow. I mean what I say, but I am talking only about the example you showed. Neither the fact that you showed it nor you yourself are judged negatively at all.
kaizer.se