ansaurus

Question

Answer 1

+2 A:

That's a difficult problem to vectorize. I can suggest a way to do it using CELLFUN, but I can't guarantee that it will be faster for your problem (you would have to time it yourself on the specific data sets you are using). As discussed in this other SO question, vectorized code isn't always faster than a for loop; which option is best can be very problem-specific. With that disclaimer, here are two solutions for you to try: a CELLFUN version and a modification of your for-loop version that may run faster.

CELLFUN SOLUTION:

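[Y,M] = datevec(allDts);
monthStart = datenum(Y,M,1);  % Start date of each month
[monthStart,sortIndex] = sort(monthStart);  % Sort the start dates
[uniqueStarts,uniqueIndex] = unique(monthStart,'last');  % Last row of each month

% Group sizes are the gaps between successive last-row indices;
% (:) makes the concatenation work for both row and column inputs
valCell = mat2cell(vals(sortIndex,:),diff([0; uniqueIndex(:)]));
% Sum along dimension 1 so a month with a single row is not collapsed to a scalar
newVals = cellfun(@(x) nansum(x,1),valCell,'UniformOutput',false);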

The call to MAT2CELL groups the rows of vals that have the same start date together into cells of a cell array valCell. The variable newVals will be a cell array of length numel(uniqueStarts), where each cell will contain the result of performing nansum on the corresponding cell of valCell.
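To make the grouping concrete, here is a minimal sketch on made-up data (the dates and values below are invented purely for illustration). It assumes nansum from the Statistics Toolbox is available, requests the 'last' occurrence from UNIQUE explicitly, and sums along dimension 1 so single-row months still give one row of column sums. The final line stacks the per-month results into an ordinary matrix.

```matlab
% Toy data (made up for illustration): five dates spanning three months
allDts = datenum([2009 1 5; 2009 1 20; 2009 2 3; 2009 3 1; 2009 3 15]);
vals   = [1 NaN; 2 3; 4 5; NaN 6; 7 8];

[Y,M] = datevec(allDts);
monthStart = datenum(Y,M,1);                             % Start date of each month
[monthStart,sortIndex] = sort(monthStart);               % Sort the start dates
[uniqueStarts,uniqueIndex] = unique(monthStart,'last');  % Last row of each month

% Split the sorted rows into one cell per month (here: 2, 1, and 2 rows)
valCell = mat2cell(vals(sortIndex,:),diff([0; uniqueIndex(:)]));
newVals = cellfun(@(x) nansum(x,1),valCell,'UniformOutput',false);

% Stack the 1-by-nCols results into an nMonths-by-nCols matrix
newValsMatrix = vertcat(newVals{:});   % [3 3; 4 5; 7 14]
```

If you need a matrix rather than a cell array downstream, the vertcat step is one way to get it.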

FOR-LOOP SOLUTION:

[Y,M] = datevec(allDts);
monthStart = datenum(Y,M,1);  % Start date of each month
[monthStart,sortIndex] = sort(monthStart);  % Sort the start dates
[uniqueStarts,uniqueIndex] = unique(monthStart,'last');  % Last row of each month

vals = vals(sortIndex,:);  % Sort the values according to start date
nMonths = numel(uniqueStarts);
uniqueIndex = [0; uniqueIndex(:)];  % Prepend 0 so each month has a start offset
newVals = nan(nMonths,size(vals,2));  % Preallocate
for iMonth = 1:nMonths
  index = (uniqueIndex(iMonth)+1):uniqueIndex(iMonth+1);  % Rows in this month
  newVals(iMonth,:) = nansum(vals(index,:),1);  % Sum down columns, ignoring NaNs
end
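Since which version is faster depends on your data, a rough way to compare them is to time both on a data set of realistic size. The sketch below uses synthetic data; the date range, column count, and NaN fraction are arbitrary assumptions, so substitute your own allDts and vals. It assumes the dates are already sorted and that nansum (Statistics Toolbox) is available.

```matlab
% Synthetic data (hypothetical sizes): daily dates over 10 years, 50 columns
allDts = (datenum(2000,1,1):datenum(2009,12,31))';
vals = rand(numel(allDts),50);
vals(rand(size(vals)) < 0.05) = NaN;   % Sprinkle in some missing values

% Shared setup: the dates are already sorted, so no sort is needed here
[Y,M] = datevec(allDts);
monthStart = datenum(Y,M,1);
[uniqueStarts,uniqueIndex] = unique(monthStart,'last');
edges = [0; uniqueIndex(:)];
nMonths = numel(uniqueStarts);

% CELLFUN version
tic
valCell = mat2cell(vals,diff(edges));
cellResult = cellfun(@(x) nansum(x,1),valCell,'UniformOutput',false);
cellResult = vertcat(cellResult{:});
tCell = toc;

% For-loop version
tic
loopResult = nan(nMonths,size(vals,2));
for iMonth = 1:nMonths
  index = (edges(iMonth)+1):edges(iMonth+1);
  loopResult(iMonth,:) = nansum(vals(index,:),1);
end
tLoop = toc;

fprintf('cellfun: %.4f s, loop: %.4f s\n',tCell,tLoop);
```

A single tic/toc measurement is noisy, so it is worth repeating the comparison a few times before drawing conclusions.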

gnovice 2009-05-14 14:47:56

Thanks. This speeds it up by about 50%!! If I understand the code correctly, this line: valCell = mat2cell(vals,diff([0; uniqueIndex])); is the key - it breaks the values up into cells whose lengths are given by the second argument. (Didn't need the sort - the dates and their associated values are guaranteed to be sorted.)

Marc 2009-05-14 19:35:01

Yup, it sounds like you've got it. The second argument to MAT2CELL is a vector of sizes that the rows of the first argument will be broken into. For example, if the first argument is a 6x3 matrix (called A), and the second argument is [1 2 3], then MAT2CELL will return a 3-element cell array (called B) equal to the following: B = {A(1,:); A(2:3,:); A(4:6,:)}
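That example can be checked directly with a small sketch (the contents of A are arbitrary; any 6-by-3 matrix will do):

```matlab
A = reshape(1:18,6,3);      % Any 6-by-3 matrix will do
B = mat2cell(A,[1 2 3]);    % Split the rows into blocks of 1, 2, and 3 rows

% B is a 3-by-1 cell array holding the row blocks in order
sameBlocks = isequal(B,{A(1,:); A(2:3,:); A(4:6,:)});
```

Note that the sizes in the second argument must add up to the number of rows in A, which is why the diff of the cumulative last-row indices works as the grouping vector.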

gnovice 2009-05-14 19:46:23

Answer 2

A:

If all you need to do is form the sum or mean of the rows of a matrix, where the rows are grouped according to another variable (here, the date), then use my consolidator function. It is designed to do exactly this operation: reducing data based on the values of an indicator series. (Actually, consolidator can also work on n-d data, and with a tolerance, but all you need to do is pass it the month and year information.)

Find consolidator on the File Exchange on MATLAB Central.

woodchips 2009-05-14 21:12:24

matlab Bucketing Algorithm Help