I have a Perl program where some form of parallelism would really help.

However, I hold quite a lot of data in variables that I don't need at all in that part of the program.

If I use Perl threads, every variable is copied each time I create a new thread. In my case, that hurts a lot.

What should I use to start a new thread without the copying? Or are there better thread implementations that don't copy everything?

+3  A: 

Use the fork(2) system call to take advantage of Copy-on-write.

daxim
This is good. However, I need the child process/thread to return a real number. I have no other way than a pipe, right?
Karel Bílek
@Karel: Returning a number is easy: print the result from the child and read it in the parent (use 'open $fh, "-|"' in the parent). For more complex data structures, serialize the data with JSON or something similar, or return a string naming a file where the result data can be found.
runrig
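
For illustration, the forking open described in the comment above might look roughly like the sketch below. The helper name run_in_child and the @big_data array are made up for the example, and JSON::PP is just one possible serializer (it has shipped with Perl since 5.14). The child sees the parent's data through copy-on-write and only its printed result travels back over the pipe.

```perl
use strict;
use warnings;
use JSON::PP;    # one possible serializer; core since Perl 5.14

# Hypothetical stand-in for the large data the parent already holds.
my @big_data = (1 .. 100_000);

# Hypothetical helper: run $work in a forked child and return
# everything the child printed to its STDOUT.
sub run_in_child {
    my ($work) = @_;

    # Forking open: the parent gets the child's pid and a read handle,
    # the child gets 0 and has its STDOUT connected to that handle.
    my $pid = open(my $fh, '-|');
    die "cannot fork: $!" unless defined $pid;

    if ($pid == 0) {            # child: compute, print, leave
        print $work->();
        exit 0;
    }

    local $/;                   # parent: slurp the whole reply
    my $out = <$fh>;
    close $fh;                  # also waits for the child to exit
    return $out;
}

# A single real number comes back as plain text.
my $mean = run_in_child(sub {
    my $sum = 0;
    $sum += $_ for @big_data;
    return $sum / @big_data;
});

# A structure is serialized in the child and decoded in the parent.
my $stats = decode_json(run_in_child(sub {
    return encode_json({ count => scalar @big_data, max => $big_data[-1] });
}));
```

Nothing in @big_data is copied explicitly here; the operating system duplicates memory pages only if one of the processes writes to them.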
+2  A: 

Really, you just have to avoid ithreads. They're horrible, and unlike every other form of threads on the planet they're more expensive than regular heavyweight processes. My preferred solution is to use an event-based framework like POE or AnyEvent (I use POE) and break out any tasks that can't be made nonblocking into subprocesses using POE::Wheel::Run (or fork_call for AnyEvent). It does take more up-front design work to write an app in that manner, but done right, it will give you some efficient code. From time to time I've also written code that simply uses fork and pipe (or open '-|') and IO::Select and waitpid directly within its own event loop, but you should probably consider that a symptom of my having learned C before perl, and not a recommendation. :)
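
As a rough sketch of the hand-rolled variant mentioned at the end of the paragraph above (fork, pipe, IO::Select and waitpid in a small loop), something like the following could work. The function name run_jobs and its calling convention are invented for this example and are not taken from any module. It assumes each job is a code ref returning a string or number.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: run each code ref in its own child process and
# return the results in the same order as the jobs were given.
sub run_jobs {
    my (@jobs) = @_;
    my $select = IO::Select->new;
    my (%job_for, %buffer);         # keyed by the stringified read handle

    for my $i (0 .. $#jobs) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {            # child: write the result and exit
            close $reader;
            print {$writer} $jobs[$i]->();
            close $writer;
            exit 0;
        }

        close $writer;              # parent keeps only the read end
        $select->add($reader);
        $job_for{$reader} = { index => $i, pid => $pid };
        $buffer{$reader}  = '';
    }

    my @results;
    while ($select->count) {
        for my $fh ($select->can_read) {
            # Append whatever is available to this handle's buffer.
            my $n = sysread($fh, $buffer{$fh}, 4096, length $buffer{$fh});
            die "read failed: $!" unless defined $n;
            next if $n;             # got data, more may follow

            # Zero bytes means EOF: the child has closed its end.
            my $job = delete $job_for{$fh};
            $results[ $job->{index} ] = delete $buffer{$fh};
            $select->remove($fh);
            close $fh;
            waitpid($job->{pid}, 0);    # reap the child
        }
    }
    return @results;
}
```

Called as run_jobs(sub { ... }, sub { ... }), it returns one string per job. An event framework such as POE or AnyEvent takes over this bookkeeping for you.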

A word to the wise, though: if you're running on Windows, then this approach might be almost as bad as using ithreads directly, since Perl makes up for win32's lack of fork() by using ithreads, so you'll pay that same ithread-creation cost (in CPU and memory) on every fork. There isn't really a good solution to that one.

hobbs
I am working on both Mac OS X and Linux, but not on windows.
Karel Bílek
@hobbs => on win32, do you know if `open '-|'` uses ithreads to do its dirty work, or is it done more efficiently?
Eric Strom
@Eric Strom I don't know absolutely for certain (I can find out) but I think it's a pretty safe assumption that it uses the same fork-emulation as everything else.
hobbs
@Karel Bilek that's good news, of a sort :)
hobbs
@Eric: open $fh, '-|' doesn't work on ActiveState perl. Don't know about Strawberry or Cygwin perl.
runrig
I disagree about ithreads being *horrible*. They're certainly not what most people expect of threads, but they're about the only fork-like mechanism that works portably.
tsee
@tsee they're *acceptable* as fork-emulation on top of the win32 process model. They're horrible as *threads* :)
hobbs
Too bad pthreads weren't safe. But 'real' threads in a language such as Perl seem to be impossible to do. Python, Ruby, et al. failed to get it right as well. They just made different trade-offs. By the way, I'm surprised nobody mentioned Coro as an alternative.
tsee
+5  A: 

Like the syntax and ease of threads but not all the fat? Use the amazing forks module! It implements the threads interface using fork and IPC, making it easy to share data between child processes.
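
A minimal sketch of what that can look like, assuming forks and forks::shared are installed from CPAN (they are not core modules). The function parallel_squares and the $counter variable are made up for the example. The calls themselves are the usual threads-style ones (threads->create, join, lock) that forks provides.

```perl
use strict;
use warnings;
use forks;            # load before anything else that might load threads
use forks::shared;    # shared variables across the forked "threads"

# Shared counter, visible to the parent and every child.
my $counter : shared;
$counter = 0;

# Hypothetical helper: square each input in its own forked worker.
sub parallel_squares {
    my @inputs = @_;

    my @workers = map {
        my $n = $_;
        threads->create(sub {
            lock($counter);     # held until this sub returns
            $counter++;
            return $n * $n;
        });
    } @inputs;

    # join blocks until the worker finishes and hands back its result.
    return map { $_->join } @workers;
}
```

Each worker is a separate process created with fork, so the parent's large variables are not copied up front the way they are with ithreads.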

Schwern
You are a godsend. I am not sure how fast it is (I will have to do performance tests) but I like it so far.
Karel Bílek
Just a comment: it really is fast. However, it is UNIX-only (no problem for me, but it could be a problem for somebody else).
Karel Bílek
Windows emulates fork() with something like threads, so you're probably better off just using threads on Windows.
Schwern