I get strange behavior with respect to memory with Matlab and the cell2mat() function...

what I would like to do is:

cell_array_outer = cell(1,N);
parfor k = 1:N
  cell_array_inner = cell(1,M);   
  for i = 1:M
    A = do_some_math_and_return_a_sparse_matrix( );
    cell_array_inner{i} = sparse(A); % do sparse() again just to be paranoid
  end
  cell_array_outer{k} = sparse( cell2mat( cell_array_inner ) ); 
end

Giant_Matrix = cell2mat( cell_array_outer ); % DOH! 

But alas, the line marked "DOH" uses an absurd amount of memory, far more than the sizes of the sparse matrices add up to, as if it's building an intermediate structure that is too big.

The following works fine, but double indexing doesn't work with parfor, so I can only use one core:

cell_array_giant = cell(M,N);
for k = 1:N   % cannot use parfor with {i,k} dual indices!

  for i = 1:M
    A = do_some_math_and_return_a_sparse_matrix( );
    cell_array_giant{i,k} = sparse(A); % do sparse() again just to be paranoid
  end
end

cell_array_giant = reshape( cell_array_giant, 1, M * N );
Giant_Matrix = sparse( cell2mat( cell_array_giant ) ); % Ok... but no parfor 

My suspicion is that in the latter case each cell element is much more manageable in size, e.g. a 20,000 x 1 sparse matrix, whereas in the former the "outer" elements are 20,000 x 5,000 and somehow don't fit where Matlab would like to put them as temporary variables, so memory use gets out of control despite their extreme sparsity.

Are there any rules to follow regarding memory use here? Or how can I change my parfor use so that it works in the second case? parfor is fairly new, so there's less material on the web about it than about other core features... it's much more efficient than running 8 copies of Matlab!

A: 

To predict temporary memory use, we'd have to know more about how the Matlab internals work, which I don't.

For your second solution, I think you can use parfor if you put it on the inner loop (I don't get an m-lint warning, at least). If necessary, transpose your problem so that M>N: you usually want parfor to run lots of quick calculations rather than a few long ones, so that less time is wasted with idle workers when the number of iterations isn't divisible by 8 (or however many cores you run).

cell_array_giant = cell(M,N);
for k = 1:N   %# cannot use parfor with {i,k} dual indices!

  parfor i = 1:M %# but you can use it here!
    A = do_some_math_and_return_a_sparse_matrix( );
    cell_array_giant{i,k} = sparse(A); %# do sparse() again just to be paranoid
  end
end
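
If you would rather keep parfor on the outer k-loop, one possible sketch is to fill a temporary column inside each iteration and hand it to the cell array in a single sliced assignment, so that k is the only loop index used on cell_array_giant. The sizes and the make_block stand-in below are hypothetical placeholders for do_some_math_and_return_a_sparse_matrix, and the last line uses plain concatenation rather than cell2mat; I haven't measured how either behaves on matrices of your size.

```matlab
% Hypothetical small sizes and a stand-in generator, for illustration only.
M = 4; N = 3; rows = 6;
make_block = @(i, k) sparse(i, 1, i + 10*k, rows, 1);  % stand-in for the real computation

cell_array_giant = cell(M, N);
parfor k = 1:N
    column = cell(M, 1);               % temporary variable, local to this iteration
    for i = 1:M
        % feval avoids make_block(i,k) being mistaken for variable indexing
        column{i} = feval(make_block, i, k);
    end
    cell_array_giant(:, k) = column;   % sliced assignment: k is the only loop index
end

cell_array_giant = reshape(cell_array_giant, 1, M * N);
Giant_Matrix = [cell_array_giant{:}];  % horizontal concatenation of sparse columns
```

Without the Parallel Computing Toolbox the parfor simply runs serially, which makes it easy to try the pattern on a small case first.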

Also, would it be possible to construct the giant sparse matrix inside the k-loop? This avoids the reshape altogether. Of course, you'd only be able to parfor the M-loop, since otherwise the giant array would be passed to all the workers, and lots of sadness would ensue.
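
A minimal sketch of building the matrix inside the k-loop, assuming every block is a rows x 1 sparse column. The sizes, the make_block stand-in, and the nonzero estimate passed to spalloc are all hypothetical and would need to be replaced with values from your actual problem.

```matlab
% Hypothetical small sizes and a stand-in generator, for illustration only.
M = 4; N = 3; rows = 6;
make_block = @(i, k) sparse(i, 1, i + 10*k, rows, 1);  % stand-in for the real computation

% Reserve room for the expected number of nonzeros (here: one per column).
Giant_Matrix = spalloc(rows, M * N, M * N);

for k = 1:N
    columns = cell(1, M);
    parfor i = 1:M                     % only the inner loop runs in parallel
        columns{i} = feval(make_block, i, k);
    end
    % Write the k-th group of M columns directly into the sparse matrix.
    Giant_Matrix(:, (k - 1) * M + (1:M)) = [columns{:}];
end
```

Giant_Matrix itself never appears inside the parfor body, so it is not sent to the workers; only the small columns cell array is sliced across them.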

Jonas
