I've often found myself doing something like this:

unprocessedData = fetchData();  % returns a vector of structs or objects
processedData = [];             % will be full of structs or objects

for dataIdx = 1 : length(unprocessedData) 
    processedDatum = process(unprocessedData(dataIdx));
    processedData = [processedData; processedDatum];
end

This works, but it isn't optimal: the processedData vector grows inside the loop, and even mlint warns me that I should consider preallocating for speed.

Were the data a vector of int8, I could do this:

% preallocate processed data array to prevent growth in loop
processedData = zeros(length(unprocessedData), 1, 'int8');

and modify the loop to fill vector slots rather than concatenate.
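
For the numeric case, the slot-filling loop might look like the sketch below. The input values and the doubling step are made up for illustration; they stand in for fetchData() and process().

```matlab
% Hypothetical stand-in for fetchData(): a small int8 column vector.
unprocessedData = int8([1; 2; 3; 4]);

% Preallocate the output once, with the same length and type.
processedData = zeros(length(unprocessedData), 1, 'int8');

for dataIdx = 1 : length(unprocessedData)
    % Hypothetical stand-in for process(): double each value.
    % Writing into an existing slot does not resize the vector.
    processedData(dataIdx) = 2 * unprocessedData(dataIdx);
end
```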

Is there a way to preallocate a vector so that it can subsequently hold structs or objects?

Thanks!


Update: inspired by Azim's answer, I've simply reversed the loop order. Processing the last element first forces allocation of the entire vector on the first iteration, as the debugger confirms:

unprocessedData = fetchData();

% Note that processedData isn't declared before the loop - declaring it
% up front breaks things if it will later hold non-numeric data. Instead
% we rely on the fact that a MATLAB loop has no scope of its own, so
% processedData outlives the loop inside which it is first assigned:

for dataIdx = length(unprocessedData) : -1 : 1 
    processedData(dataIdx) = process(unprocessedData(dataIdx));
end

This requires that any objects returned by process() have a valid zero-argument constructor, since on that first write MATLAB has to fill every other element of processedData with default-constructed objects.
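
A minimal sketch of the reversed loop using structs rather than objects, so that it runs without a class definition. The input data and the anonymous process function are hypothetical stand-ins; with structs, the elements not yet written simply get empty fields instead of default-constructed objects.

```matlab
% Hypothetical stand-in for fetchData(): a 1-by-3 struct array.
unprocessedData = struct('value', {1, 2, 3});

% Hypothetical stand-in for process(): returns a struct with one field.
process = @(datum) struct('doubled', 2 * datum.value);

% Make sure no earlier variable of this name exists.
clear processedData

for dataIdx = length(unprocessedData) : -1 : 1
    processedData(dataIdx) = process(unprocessedData(dataIdx));
    if dataIdx == length(unprocessedData)
        % Record the size right after the first write to the last slot.
        sizeAfterFirstWrite = size(processedData);
    end
end
```

Inspecting sizeAfterFirstWrite shows whether the array reached its final size on the first pass through the loop.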

mlint still complains about possible array growth, but I think that's because it can't recognise the reversed loop iteration...

+3  A: 

Of course, you know the fields of the struct processedData and you know its length. So, one way would be the following:

unprocessedData = fetchData();
% create the processed data struct
processedData = struct('field1', [], ...
                       'field2', []);
% create a processedData array with the required length
processedData(length(unprocessedData)) = processedData(1);
for dataIdx = 1:length(unprocessedData)
    processedData(dataIdx) = process(unprocessedData(dataIdx));
end

This assumes that the process function returns a struct with the same fields as the struct processedData.

Azim
+3  A: 

In addition to Azim's answer, another way to do this is using REPMAT:

% Make a single structure element:
processedData = struct('field1',[],'field2',[]);
% Or, alternatively, make a single object:
processedData = object_constructor(...);
% Replicate data:
processedData = repmat(processedData,1,nElements);

where nElements is the number of elements you will have in the structure or object array.
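
As a rough illustration of the struct case, the sketch below replicates a template element and then fills the preallocated slots. The value of nElements and the field contents are invented for the example.

```matlab
nElements = 3;   % hypothetical element count

% Build one template element and replicate it into a 1-by-nElements array.
template = struct('field1', [], 'field2', []);
processedData = repmat(template, 1, nElements);

% Fill the existing slots; the array keeps its size throughout the loop.
for dataIdx = 1 : nElements
    processedData(dataIdx).field1 = dataIdx;
    processedData(dataIdx).field2 = dataIdx^2;
end
```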

BEWARE: If the object you are making is derived from the handle class, I don't think you will be replicating the object itself, just handle references to it. Depending on your implementation, you might have to call the object_constructor nElements times.

gnovice
+1 This is one situation for which repmat is useful.
Azim
+2  A: 

You can pass in a cell array to struct of the appropriate size, e.g.,

processedData = struct( 'field1', cell( nElements, 1 ), 'field2', [] )

This will make a struct array that is the same size as the cell array.
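
A small sketch of how the size is picked up from the cell array, and of how the same call can populate a field from existing data. The element count and the names cell array are assumptions made for the example.

```matlab
nElements = 4;   % hypothetical element count

% The cell array passed for field1 sets the size of the struct array;
% the non-cell value [] is copied into field2 of every element.
processedData = struct('field1', cell(nElements, 1), 'field2', []);
preallocatedSize = size(processedData);

% The same mechanism fills a field from an existing cell array of data,
% one cell per struct element.
names = {'a'; 'b'; 'c'; 'd'};   % hypothetical data
filledData = struct('field1', names, 'field2', []);
```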

ManWithSleeve
+1 This is a good alternative for making structure arrays, especially if you already have cell arrays of data you want to fill the fields with.
gnovice
