Clearly if performance is critical it makes sense to prototype and profile. But all the same, wisdom and advice can be sought on StackOverflow :)
For the handling of highly parallel tasks where inter-task communication is infrequent or suits message-passing, is there a performance disadvantage to using processes (fork() etc) or threads?
Is the context switch between threads cheaper than that between processes? Some processors have single-instruction context-switching don't they? Do the mainstream operating systems better utilise SMP with multiple threads or processes? Is the COW overhead of fork() more expensive than threads if the process never writes to those pages?
And so on. Thanks!