Hi,
I have the following problem.
First my environment, I have two 24-CPU servers to work with and one big job (resampling a large dataset) to share among them. I've setup multicore and (a socket) Snow cluster on each. As a high-level interface I'm using foreach.
What is the optimal sharing of the job? Should I setup a Snow cluster using CPUs from both machines and split the job that way (i.e. use doSNOW for the foreach loop). Or should I use the two servers separately and use multicore on each server (i.e. split the job in two chunks, run them on each server and then stich it back together).
Basically what is an easy way to: 1. Keep communication between servers down (since this is probably the slowest bit). 2. Ensure that the random numbers generated in the servers are not highly correlated.