How much slower can I reasonably expect perform: to be than a literal message send, on average? Should I avoid sending perform: in a loop, similar to the admonishment given to Perl/Python programmers to avoid calling eval("...") (Compiler evaluate: in Smalltalk) in a loop?

I'm concerned mainly with Squeak, but interested in other Smalltalks as well. Also, is the overhead greater with the perform:with: variants? Thank you

+6  A: 

#perform: is not like eval(). The problem with eval() (performance-wise, anyway) is that it has to compile the code you're sending it at runtime, which is a very slow operation. Smalltalk's #perform:, on the other hand, is equivalent to Ruby's send() or Objective-C's performSelector: (in fact, both of these languages were strongly inspired by Smalltalk). Languages like these already look up methods based on their name — #perform: just lets you specify the name at runtime rather than write-time. It doesn't have to parse any syntax or compile anything like eval().
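For example (the receiver and selector here are just illustrative), these two expressions perform the same method lookup; #perform: merely takes the selector as an ordinary Symbol value, while eval-style evaluation has to parse and compile first:

```smalltalk
"Literal send: the selector #reversed is fixed when this code is compiled."
'hello' reversed.             "→ 'olleh'"

"Dynamic send: the selector is a Symbol chosen at runtime."
'hello' perform: #reversed.   "→ 'olleh'"

"Eval-style: the string must be parsed and compiled before it can run."
Compiler evaluate: '''hello'' reversed'.
```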

It will be a little slower (the cost of one extra method call at least), but it isn't like eval(). Also, the perform:with: variants shouldn't show any meaningful difference in speed versus plain perform:. I can't speak with much experience about Squeak specifically, but this is how it generally works.
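To make the multi-argument variants concrete (receivers chosen for illustration), they simply pass the extra arguments along with the selector:

```smalltalk
3 perform: #+ with: 4.                       "same as: 3 + 4            → 7"
3 perform: #between:and: with: 1 with: 5.    "same as: 3 between: 1 and: 5 → true"
Array perform: #with:with: with: 1 with: 2.  "same as: Array with: 1 with: 2"
```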

Chuck
+1  A: 

Here are some numbers from my machine (it is Smalltalk/X, but I guess the numbers are comparable - at least the ratios should be):

The called methods "foo" and "foo:" are no-ops (i.e. they consist of just ^self):

self foo                               ...  3.2 ns
self perform:#foo                      ...  3.3 ns
[self foo] value                       ... 12.5 ns (2 sends and 2 contexts)
[ ] value                              ...  3.1 ns (empty block)
Compiler evaluate:('TestClass foo')    ...  1.15 ms

self foo:123                           ...  3.3 ns
self perform:#foo: with:123            ...  3.6 ns
[self foo:123] value                   ...   15 ns (2 sends and 2 contexts)
[self foo:arg] value:123               ...   23 ns (2 sends and 2 contexts)
Compiler evaluate:('TestClass foo:123') .. 1.16 ms

Notice the big difference between "perform:" and "evaluate:"; evaluate: calls the compiler to parse the string, generate a throw-away method (bytecode), execute it (it is jitted on the first call), and finally discard it. The compiler is actually written to be used mainly by the IDE and to fileIn code from external streams; it has code for error reporting, warning messages, etc. In general, eval is not what you want when performance is critical.
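A sketch of the usual workaround when you genuinely need to evaluate a string many times (assuming the Compiler evaluate: API shown above, and reusing the no-op TestClass foo from the benchmark): compile the string once into a block, then send #value in the loop, which costs about as much as any other send:

```smalltalk
"Slow: re-parses and re-compiles the string on every iteration."
1 to: 1000 do: [:i |
    Compiler evaluate: 'TestClass foo'].

"Fast: compile once into a block, then #value is just a cheap send."
aBlock := Compiler evaluate: '[TestClass foo]'.
1 to: 1000 do: [:i |
    aBlock value].
```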

Timings are from a Dell Vostro; your mileage may vary, but the ratios should not. I tried to get the net execution times by measuring the empty-loop time and subtracting; also, I ran the tests 10 times and took the best times, to eliminate OS/network/disk/email or whatever disturbances. However, I did not take special care to use a load-free machine. The measurement code was (replace the second timesRepeat: argument with the code under test from above):

callFoo2
    "Micro-benchmark: time an empty loop, time the loop with the
     send under test, and report the difference as the net send cost."
    |t1 t2|

    "Cost of the loop machinery alone."
    t1 :=
        TimeDuration toRun:[
            100000000 timesRepeat:[]
        ].

    "Loop plus the send under test."
    t2 :=
        TimeDuration toRun:[
            100000000 timesRepeat:[self foo:123]
        ].

    Transcript showCR:(t2 - t1) printString

EDIT: PS: I forgot to mention: these are the times from within the IDE (i.e. bytecode-jitted execution). Statically compiled code (using the stc-compiler) will generally be a bit faster (20-30%) on these low-level micro benchmarks, due to a better register allocation algorithm.

EDIT: I tried to reproduce these numbers the other day, but got completely different results (8ns for the simple call, but 9ns for the perform). So be very careful with these micro-timings, as they run completely out of the first-level cache (and empty messages even omit the context setup, or get inlined) - they are usually not very representative of the overall performance.

blabla999