We're using the MATLAB Parallel Computing Toolbox for one of our projects and we intermittently get the following error:

The client lost connection to lab 2. This might be due to network problems, or the pmode parallel job might have errored. This is causing: java.io.IOException: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full

All of the machines involved are running 32-bit Windows XP SP3 and MATLAB R2009a. The error occurs intermittently, and on different labs.

Any ideas would be much appreciated.

Here's a rough, generic sketch of what the code does:

R = findResource();
matlabpool('open', 'local', R.ClusterSize);
spmd
  % ... some loadlibrary() calls ...
end

parfor i = 1:num
  % ... big calculations ...
end
%clear
matlabpool close