ansaurus

Question

Answer 1

+2 A:

Assuming that you are running an embarrassingly parallel workload (that is, the job and task structure) and that you want the log file updated at the end of each task, I would use the taskFinish callback:

http://www.mathworks.com/access/helpdesk/help/toolbox/distcomp/taskfinish.html

You could also just write to a file somehow in the middle of your task as you would in MATLAB normally, but I think you are asking about callbacks at the end of the task.
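
For the "write to a file in the middle of your task" route, here is a minimal sketch of appending one timestamped line to a log file. The file path and message are made up for illustration, and nothing here stops lines from different workers interleaving if they all share the same file:

```matlab
% Sketch only: append one timestamped line to a log file from inside a task.
logFile = fullfile(tempdir, 'task_log_example.txt');  % hypothetical path
msg = 'task 3 finished';                               % hypothetical message

fid = fopen(logFile, 'at');  % 'a' = append, 't' = text mode
if fid == -1
    error('Could not open %s for appending.', logFile);
end
% Prefix the message with the current date and time
fprintf(fid, '%s  %s\n', datestr(now, 'yyyy-mm-dd HH:MM:SS'), msg);
fclose(fid);  % close right away so the line is flushed to disk
```

Opening and closing the file around each write is slower than keeping it open, but it means the line is on disk even if the task dies later.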

MatlabDoug 2009-07-02 18:20:39

Answer 2

+1 A:

When multiple processes output to a single file, you could run into some potential problems, like messages being overwritten or intermingled. I've had this happen with programs in other languages (like C), and I assume the same problem could arise in MATLAB, but I freely admit I could be wrong about this. Assuming I'm not wrong...

If you want to reliably output data from multiple worker processes to a single log file while the processes are running, one way to do this is to make one process be responsible for all the file operations (i.e. a "master" process). The "master" process would collect messages from the other workers (i.e. "slaves") and output this data to the log file.

Since I don't know what specifically you are having each process do, it's hard to suggest specific code changes to make. Here are some steps and sample code for how you might do this in MATLAB. These code samples assume you are running the same function (process_fcn) on each process:

The "master" process first has to open the file. This code (using the labindex function) should be run at the beginning of process_fcn:
```
if (labindex == 1),
  fid = fopen('log.txt','at');  % Open text file for appending
end
```
While each process is running, you can collect any data that needs to be output to the log file in a variable called data, which stores a string or character array. This data could be error messages captured within a try-catch block or any other data that you would want to be in the log file.
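
As a rough illustration of the try-catch case, the sketch below captures an error as a single line in data. The error identifier and message are invented for the example; the real work of process_fcn would go where the call to error is:

```matlab
data = '';  % nothing to log yet
try
    % Hypothetical failing step; replace with the real work of process_fcn
    error('processFcn:stepFailed', 'Step %d failed.', 2);
catch err
    % Keep the identifier and message together as one line for the log
    data = sprintf('[%s] %s', err.identifier, err.message);
end
```
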
At periodic points in process_fcn (either when major tasks are completed or inside a computation loop), each process checks whether it has data to output (i.e. data is not empty) and sends that data to the "master" process. The "master" process then collects and prints the messages from the other processes, along with any of its own. Here's a sample of how this might be done (using the functions labBarrier, labProbe, labSend, and labReceive):
```
labBarrier;  % All processes are synchronized here
if (labindex == 1),  % This is done by the "master"
  if ~isempty(data),
    fprintf(fid,'%s\n',data);  % Print "master" data
  end
  pause(1);  % Wait a moment for "slaves" to send messages
  while labProbe,  % Loop while messages are available
    data = labReceive;  % Get data from "slaves"
    fprintf(fid,'%s\n',data);
  end
else  % This is done by the "slaves"
  if ~isempty(data),
    labSend(data,1);  % Send data to the "master"
  end
end
data = '';  % Clear data
```
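The call to pause is there to give each "slave" process time to call labSend before the "master" starts looking for sent messages. It is a heuristic rather than a guarantee: a worker that takes longer than the pause to reach labSend would not have its message picked up in that round.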
Finally, the "master" process has to close the file. This code should be run at the end of process_fcn:
```
if (labindex == 1),
  fclose(fid);
end
```
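
If relying on pause feels fragile, one possible variation (a sketch under my own assumptions, not tested on a cluster) is to have every "slave" send exactly one message per round, empty or not, and have the "master" call labReceive once per worker. labReceive blocks until the message from that worker arrives, so no labBarrier or pause is needed. The log path and message text are hypothetical, and for brevity the file is opened and closed inside the round:

```matlab
% Sketch only: run the same code on every worker of a communicating job.
% Requires Parallel Computing Toolbox (labindex, numlabs, labSend, labReceive).
logFile = fullfile(tempdir, 'parallel_log_example.txt');  % hypothetical path
data = sprintf('worker %d: step done', labindex);          % hypothetical message

if labindex == 1  % the "master" does all file operations
    fid = fopen(logFile, 'at');
    if ~isempty(data)
        fprintf(fid, '%s\n', data);  % print "master" data
    end
    for src = 2:numlabs
        msg = labReceive(src);  % blocks until worker src has sent
        if ~isempty(msg)
            fprintf(fid, '%s\n', msg);
        end
    end
    fclose(fid);
else  % the "slaves" always send, even when data is empty
    labSend(data, 1);
end
data = '';  % clear data for the next round
```

The catch is that every worker has to reach this exchange the same number of times; otherwise the "master" waits forever on a worker that never sends.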

gnovice 2009-07-04 00:13:30

I'm not sure about the labBarrier function (because my jobs don't have the same running time). But this gives me a good starting point.

Eolmar 2009-07-06 16:28:16

If your jobs have vastly different running times then synchronizing them for outputting data gets tricky. I'm unsure if labSend is a *blocking* call (process waits for a matching receive) or *non-blocking* call (message is buffered and the process moves on, whether a receive is posted or not). My above code assumes it's blocking... if it's non-blocking it would probably simplify things for you (i.e. no labBarrier or PAUSE would be needed).

gnovice 2009-07-06 16:38:22

Parallel MATLAB and logging
