A presentation by Mikhael Goikhman from a 2003 Perl conference includes a pair of examples of prime-number-finding scripts. One is threaded, and the other is not. Upon running the scripts (print lines commented out), I got an execution time of 0.011s on the non-threaded one, and 2.343 (!) seconds on the threaded version. What accounts for the stunning difference in times?

I have some experience with threads in Perl and have noticed before that thread creation times can be particularly brutal, but that doesn't seem to be the bottleneck in Goikhman's example.

+6  A: 

How many processors do you have? In general, any calculation-intensive task will be slower when the number of threads exceeds the number of processors. This is because switching between threads (a "context switch") is expensive: the processor has to stop one thread, save its context, and load another thread's context before it can run. And all for what? So thread A can check whether 12321 is divisible by 7 instead of thread B?

If you have 2 procs, I would bet that a version with 2 threads might be the fastest, 4 procs -> use 4 threads, etc.
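For illustration, here is a minimal sketch (not from the answer; the /proc/cpuinfo probe is a Linux-specific assumption) of sizing the worker pool to the machine's core count:

#!/usr/bin/perl
# Spawn one CPU-bound worker per processor so the threads are not just
# context-switching past each other.
use strict;
use warnings;
use threads;

# Linux-specific core-count probe; fall back to 1 if it yields nothing.
chomp(my $cores = `grep -c ^processor /proc/cpuinfo`);
$cores ||= 1;

my @workers = map {
    threads->create(sub {
        my $busy = 0;
        $busy += $_ % 7 for 1 .. 1_000_000;   # stand-in for the real work
        return $busy;
    });
} 1 .. $cores;

$_->join for @workers;
print "ran $cores CPU-bound worker(s)\n";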

RichAmberale
I tried it on a single-core box and a dual quad-core box. Both showed a similarly dramatic gap.
BipedalShark
+9  A: 

I'm a Python guy, not Perl, so I only have a vague idea of what the code is doing. However, always be careful when you see Queues. Python has a thread-safe Queue, and it looks like Perl does too. They're fantastic in that they take care of thread-safety for you, but they typically involve lots of expensive locking and unlocking of the queue, which is probably where all your time is going.
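To make that locking cost concrete, here is a minimal sketch (mine, not Jay P.'s) using Perl's Thread::Queue. Every enqueue()/dequeue() call acquires and releases the queue's internal lock, which is exactly the per-item overhead that dominates when the work per item is tiny:

#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $q = Thread::Queue->new;

my $worker = threads->create(sub {
    my $sum = 0;
    # dequeue() locks the queue for every single item it pulls off.
    while (defined(my $n = $q->dequeue)) {
        $sum += $n;
    }
    return $sum;
});

# enqueue() also locks per call; passing a whole list in one call
# (e.g. $q->enqueue(1 .. 10_000)) would lock only once.
$q->enqueue($_) for 1 .. 10_000;
$q->enqueue(undef);    # sentinel: tell the worker there is no more work

print "sum: ", $worker->join, "\n";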

Jay P.
As an aside, CPython has the "GIL" (Global Interpreter Lock), which essentially makes CPython useless for "threading for performance" (it will NOT scale across cores), although threading in Python can still be used to work around blocking (system) calls. (This excludes thread-aware C extensions that release the GIL, of course.)
pst
+13  A: 

Jay P. is right. Nearly all of the threaded script's time goes to futex (locking) system calls:

~$ strace -c ./threads.pl
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.80    0.116007       10546        11           futex
  0.20    0.000229           6        36           mmap2
  0.00    0.000000           0        31           read
  0.00    0.000000           0        49        13 open
  0.00    0.000000           0        36           close

Compare that with:

~$ strace -c ./no-threads.pl
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 90.62    0.000261         261         1           execve
  9.38    0.000027           0       167           write
  0.00    0.000000           0        12           read
  0.00    0.000000           0        38        13 open
  0.00    0.000000           0        25           close
bish
Thanks for the confirmation. I wish I could accept two answers. ;)
BipedalShark
+1  A: 

It's a bit of a pathological case. The real answer is: before starting to use Perl ithreads, you need to know a bit about how they work. They're notoriously inefficient at some things (sharing data) and good at others (running truly concurrently).

If the chunks of work you hand to the sub-threads were significantly larger relative to the number of times you send data from one thread to another, things would look quite different.

To compare with Python threads, as Jay P. did: as he correctly states, Python threads are cooperative and run on one core only. Perl's ithreads are very different. Each one can run on its own core, but that ability is paid for by having essentially a separate interpreter per thread. That makes communication between threads similar to inter-process communication, including the associated overhead.
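As a rough illustration of that last point (my sketch, not tsee's code): if each thread gets one large slice of the range and only reports back at join() time, the inter-thread communication shrinks to a couple of exchanges per thread instead of one per number:

#!/usr/bin/perl
use strict;
use warnings;
use threads;

# Plain trial-division test; each thread runs it on its own private slice.
sub is_prime {
    my ($n) = @_;
    return 0 if $n < 2;
    for (my $d = 2; $d * $d <= $n; $d++) {
        return 0 if $n % $d == 0;
    }
    return 1;
}

my ($hi, $nthreads) = (50_000, 4);
my $chunk = int($hi / $nthreads) + 1;

my @workers = map {
    my ($from, $to) = ($_ * $chunk + 2, ($_ + 1) * $chunk + 1);
    $to = $hi if $to > $hi;
    # Nothing is shared: each ithread works on its own copy of the bounds.
    threads->create(sub { grep { is_prime($_) } $from .. $to });
} 0 .. $nthreads - 1;

# The only communication back to the main thread happens here, at join().
my @primes = map { $_->join } @workers;
printf "%d primes up to %d\n", scalar @primes, $hi;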

tsee