views: 316
answers: 8
I know this question is kind of stupid; maybe it's just a part of writing code, but it seems defining simple functions can really hurt performance severely... I've tried this simple test:

def make_legal_foo_string(x):
    return "This is a foo string: " + str(x)

def sum_up_to(x):
    return x*(x+1)/2

def foo(x):
    return [make_legal_foo_string(x),sum_up_to(x),x+1]

def bar(x):
    return ''.join([str(foo(x))," -- bar !! "])

It's very good style and makes code clear, but it can be three times as slow as just writing it literally. It's inescapable for functions that can have side effects, but it's actually almost trivial to define some functions that should literally be replaced with lines of code every time they appear, translate the source code into that, and only then compile. The same goes, I think, for magic numbers: it doesn't take a lot of time to read them from memory, but if they're not supposed to be changed, then why not just replace every instance of 'magic' with a literal before the code compiles?
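One way to see the cost for yourself is to time both versions with the standard-library timeit module. This is a rough sketch (the inlined variant and the exact call counts are my own choices, and absolute numbers will vary by machine and interpreter):

```python
import timeit

def make_legal_foo_string(x):
    return "This is a foo string: " + str(x)

def sum_up_to(x):
    return x * (x + 1) / 2

def foo(x):
    return [make_legal_foo_string(x), sum_up_to(x), x + 1]

def bar(x):
    return ''.join([str(foo(x)), " -- bar !! "])

def bar_inlined(x):
    # Same work as bar(), but with every helper pasted in by hand.
    return ''.join([str(["This is a foo string: " + str(x),
                         x * (x + 1) / 2,
                         x + 1]), " -- bar !! "])

print("decomposed:", timeit.timeit("bar(42)", globals=globals(), number=100000))
print("inlined:   ", timeit.timeit("bar_inlined(42)", globals=globals(), number=100000))
```

Both functions compute the same string, so any timing difference is pure call overhead.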

A: 

Figuring out what to make into a function and what to just include inline is something of an art. Many factors (performance, readability, maintainability) feed into the equation.

I actually find your example kind of silly in many ways - a function that just returns its argument? Unless it's an overload that's changing the rules, it's silly. A function to square things? Again, why bother? Your function 'foo' should probably return a string, so that it can be used directly:

''.join([foo(x), " -- bar !! "])

That's probably a more correct level of encapsulation in this example.

As I say, it really depends on the circumstances. Unfortunately, this is the sort of thing that doesn't lend itself well to examples.

Michael Kohne
The examples are very silly, I agree. Still, why call foo every time? I could define foo as a special function that should not be compiled, just literally 'translated' into my code every time it appears! Then I don't have to sacrifice anything.
ooboo
Why call anything? Why make functions at all? You could write everything with cut and paste and have one big ball of code. But that's not very readable. You have to draw lines somewhere, and I'd probably draw one at foo. If performance were a big problem for a given app, then I might do what you suggest. Basically, there ARE NO GOOD RULES. And toy examples like this don't give us anything real to sink our teeth into. It's not a real function, so why does performance matter? There's simply not enough information in the toy for us to make good decisions; they are all just arbitrary.
Michael Kohne
+1  A: 

I don't know how good Python compilers are, but the answer to this question for many languages is that the compiler will optimize calls to small procedures/functions/methods by inlining them. In fact, in some language implementations you generally get better performance by NOT trying to "micro-optimize" the code yourself.

Stephen C
Except in Python :)
chrispy
Python does not do this; its compiler is quite naive.
David Seiler
Sounds like someone ought to fix it then.
Stephen C
Wish I could upvote twice. Decent compilers will inline as needed (or not), so there's no need to worry about it. If you use a compiler/interpreter that cannot do this, then you obviously don't care about performance anyway :-).
sleske
I don't think it's because the compiler is naive; it's because functions are objects that can be changed at runtime.
ooboo
@ooboo: a smart compiler can cope with that kind of thing. For example, a Java JIT compiler can cope with optimization assumptions that are invalidated later on when new classes are loaded dynamically.
Stephen C
@Stephen: so in this scenario, does it mean Java recompiles the whole module if it sees that an inlined function was changed by somebody?
ilya n.
This particular scenario does not arise in Java ... but basically yes. (The Java scenario I'm thinking of is eliminating method dispatching when the compiler knows that a given method is not overridden in any subclass. Then someone dynamically loads a new subclass.)
Stephen C
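As a quick check of the claim in this thread, the standard-library dis module can show that CPython compiles a call to a small function as a call instruction rather than substituting the body at compile time (a sketch; the exact bytecode names vary across CPython versions):

```python
import dis

def square(x):
    return x * x

def caller(x):
    return square(x) + 1

# The disassembly of caller() still contains a CALL-style instruction
# for square(); CPython does not inline the body at compile time.
dis.dis(caller)
```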
+6  A: 

Function call overheads are not big; you won't normally notice them. You only see them in this case because your actual code (x*x) is itself so completely trivial. In any real program that does real work, the amount of time spent in function-calling overhead will be negligibly small.

(Not that I'd really recommend using foo, identity and square in the example, in any case; they're so trivial it's just as readable to have them inline and they don't really encapsulate or abstract anything.)

if they're not supposed to be changed then why not just replace every instance of 'magic' with a literal before the code compiles?

Because programs are written to be easy for you to read and maintain. You could replace constants with their literal values, but it'd make the program harder to work with, for a benefit so tiny you'll probably never even be able to measure it: the height of premature optimisation.

bobince
I don't mean to replace them with literals. I mean something like: M = 42; def f(x): return x + M. This is what the viewer sees. Then, after compiling, it should just replace, literally, every instance of M in the source code with 42, and then compile the new code.
ooboo
Python doesn't have immutable constants; a script might come along and change ‘M’ later. Python is a late-binding language and it would go against the basic way the language works to change it. Having said that, I would *love* to see a scripting language like Python with early binding (more usually associated with functional languages). Not for performance reasons, as the performance increase is minimal, but because it solves some of the difficulties of closures.
bobince
+1  A: 

What you are talking about is the effect of inlining functions for gaining efficiency.

It is certainly true in your Python example that encapsulation hurts performance. But there are some counterexamples:

  1. In Java, defining getters and setters instead of public member variables does not result in performance degradation, as the JIT inlines the getters and setters.

  2. Sometimes calling a function repeatedly may be better than inlining, as the code executed may then fit in the instruction cache. Inlining may cause code explosion...

Pierre
A: 

IMO, this is related to function call costs, which are usually negligible, but not zero. Splitting the code into a lot of very small functions may hurt, especially in interpreted languages where full optimization is not available.

Inlining functions may improve performance, but it may also deteriorate it. See, for example, C++ FQA Lite for explanations ("Inlining can make code faster by eliminating function call overhead, or slower by generating too much code, causing instruction cache misses"). This is not C++ specific. Better to leave optimizations to the compiler/interpreter unless they are really necessary.

By the way, I don't see a huge difference between two versions:

$ python bench.py 
fine-grained function decomposition: 5.46632194519
one-liner: 4.46827578545
$ python --version
Python 2.5.2

I think this result is acceptable. See bench.py in the pastebin.

jetxee
A: 

There is a performance hit for using functions, since there is the overhead of jumping to a new address, pushing registers onto the stack, and returning at the end. This overhead is very small, however, and even in performance-critical systems worrying about such overhead is most likely premature optimization.

Many languages avoid these issues in small, frequently called functions by using inlining, which is essentially what you suggest above.

Python does not do inlining. The closest thing you can do is use macros to replace the function calls.

This sort of performance issue is better served by another language; if you need the sort of speed gained by inlining (mostly marginal, and sometimes detrimental), then you need to consider not using Python for whatever you are working on.

windfinder
+2  A: 

Encapsulation is about one thing and one thing only: Readability. If you're really so worried about performance that you're willing to start stripping out encapsulated logic, you may as well just start coding in assembly.

Encapsulation also assists in debugging and feature adding. Consider the following: let's say you have a simple game and need to add code that depletes the player's health under some circumstances. Easy, yes?

def DamagePlayer(dmg):
    player.health -= dmg

This IS very trivial code, so it's very tempting to simply scatter "player.health -=" everywhere. But what if later you want to add a powerup that halves damage done to the player while active? If the logic is still encapsulated, it's easy:

def DamagePlayer(dmg):
    if player.hasCoolPowerUp:
        player.health -= dmg / 2
    else:
        player.health -= dmg

Now, consider if you had neglected to encapsulate that bit of logic because of its simplicity. Now you're looking at coding the same logic into 50 different places, at least one of which you are almost certain to forget, which leads to weird bugs like: "When player has powerup all damage is halved except when hit by AlienSheep enemies..."

Do you want to have problems with Alien Sheep? I don't think so. :)

In all seriousness, the point I'm trying to make is that encapsulation is a very good thing in the right circumstances. Of course, it's also easy to over-encapsulate, which can be just as problematic. Also, there are situations where the speed really truly does matter (though they are rare), and that extra few clock cycles is worth it. About the only way to find the right balance is practice. Don't shun encapsulation because it's slower, though. The benefits usually far outweigh the costs.
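The snippets above assume a global player object; here is a self-contained sketch of the same idea (the Player class and attribute names are just illustrative, not from any real game API):

```python
class Player:
    def __init__(self, health=100, has_cool_power_up=False):
        self.health = health
        self.has_cool_power_up = has_cool_power_up

def damage_player(player, dmg):
    # One place to change the rule: halve damage while the power-up is active.
    if player.has_cool_power_up:
        player.health -= dmg // 2
    else:
        player.health -= dmg

p = Player()
damage_player(p, 20)        # no power-up: full damage
p.has_cool_power_up = True
damage_player(p, 20)        # power-up active: half damage
print(p.health)             # → 70
```

Because the rule lives in one function, the power-up change touches one place instead of fifty call sites.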

Toji
A: 

There is a good technical reason why what you suggested is impossible. In Python, functions, constants, and everything else are accessible at runtime and can be changed at any time if necessary; they could also be changed by outside modules. This is an explicit promise of Python and one would need some extremely important reasons to break it.

For example, here's the common logging idiom:

# beginning of the file xxx.py
log = lambda *x: None 

def something():
    ...
    log(...)
    ...

(here log does nothing), and then in some other module or at the interactive prompt:

import xxx
xxx.log = print
xxx.something()

As you see, here log is modified by a completely different module --- or by the user --- so that logging now works. That would be impossible if log were optimized away.

Similarly, if an exception were to happen in make_legal_foo_string (this is possible, e.g. if x.__str__() is broken and returns None), you'd be hit with a source quote from the wrong line, and perhaps even from the wrong file, in your scenario.

There are some tools that in fact apply some optimizations to Python code, but I don't think of the kind you suggested.
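The two-module idiom above can be demonstrated in a single script by building the stand-in module in memory with types.ModuleType (this construction is mine, just to make the example runnable without a separate xxx.py file):

```python
import types

# Build a throwaway module in memory to stand in for xxx.py.
xxx = types.ModuleType("xxx")
exec(
    "log = lambda *x: None\n"
    "def something():\n"
    "    log('doing work')\n"
    "    return 'done'\n",
    xxx.__dict__,
)

xxx.something()              # log is still a no-op here

messages = []
xxx.log = messages.append    # rebinding from 'outside', as the answer describes
xxx.something()              # now the call is recorded
print(messages)              # → ['doing work']
```

If the compiler had optimized the log() call away inside something(), the later rebinding could never take effect.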

ilya n.