I have a function, ranker, that takes a vector and assigns numerical ranks to its elements in ascending order, with tied values sharing their average rank. For example,
ranker([5 1 3 600]) = [3 1 2 4] or
ranker([42 300 42 42 1 42]) = [3.5 6 3.5 3.5 1 3.5].

I have a matrix, variable_data, and I want to apply ranker to every row of it. This is my current solution, but I feel there should be a way to vectorize it that is at least as fast :p

variable_ranks = nan(size(variable_data));
for i=1:1:numel(nmac_ids)
    variable_ranks(i,:) = ranker(abs(variable_data(i,:)));
end
+1  A: 

If you place the matrix rows into a cell array, you can then apply a function to each cell.

Consider this simple example of applying the SORT function to each row:

a = rand(10,3);
b = cell2mat( cellfun(@sort, num2cell(a,2), 'UniformOutput',false) );
%# same as: b = sort(a,2);

You can even do this:

b = cell2mat( arrayfun(@(i) sort(a(i,:)), 1:size(a,1), 'UniformOutput',false)' );

That said, your version with the for loop is probably faster.

Amro
@Amro, aren't cells inherently slower than arrays in Matlab?
Elpezmuerto
I didn't say this was faster :)
Amro
+1 for providing the general solution, and for remembering `tiedrank`
Jonas
+1  A: 

One way would be to rewrite ranker to take array input:

sizeData = size(variable_data);

[sortedData,almostRanks] = sort(abs(variable_data),2);
[rowIdx,colIdx] = ndgrid(1:sizeData(1),1:sizeData(2));
linIdx = sub2ind(sizeData,rowIdx,almostRanks);
variable_ranks = variable_data;
variable_ranks(linIdx) = colIdx;

%# break ties by finding subsequent equal entries in sorted data
[rr,cc] = find(diff(sortedData,1,2) == 0);
ii = sub2ind(sizeData,rr,cc);
ii2 = sub2ind(sizeData,rr,cc+1);
ii = sub2ind(sizeData,rr,almostRanks(ii));
ii2 = sub2ind(sizeData,rr,almostRanks(ii2));
variable_ranks(ii) = variable_ranks(ii2);
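
The pairwise copy above gives tied entries a shared rank rather than the averaged rank shown in the question's second example. As a rough sketch of how averaged ranks could be computed per row without any toolbox function, something like the following might work. The helper name rowTiedRank is hypothetical, and the sketch assumes a non-empty numeric matrix with no NaN entries.

```matlab
function ranks = rowTiedRank(data)
%ROWTIEDRANK Average ranks along each row (hypothetical helper, no toolbox).
%   Assumes DATA is a non-empty numeric matrix without NaN values.
[nRows, nCols] = size(data);
ranks = zeros(nRows, nCols);
for r = 1:nRows
    [sortedRow, order] = sort(data(r,:));
    % Label each run of equal sorted values with a group number 1, 2, 3, ...
    groupId = cumsum([1, diff(sortedRow) ~= 0]);
    % The mean of the sorted positions within a group is its averaged rank
    meanRank = accumarray(groupId(:), (1:nCols)', [], @mean).';
    % Write the ranks back to the original column positions
    ranks(r, order) = meanRank(groupId);
end
end
```

To mirror the loop in the question, it would be called as variable_ranks = rowTiedRank(abs(variable_data)). It still loops over rows, so it is unlikely to be faster than the original; the point is only to show the averaging of ties.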

EDIT

Instead, you can just use TIEDRANK from TMW (thanks, @Amro):

variable_ranks = tiedrank(variable_data')';
Jonas
I guess I was thinking of a general solution that would apply to any function..
Amro
@Amro: yes, your solution is certainly more general. Mine could be faster (though I don't know what `ranker` looks like)
Jonas
@Jonas: this will not work, because it does not assign tied ranks properly but ranks the ties arbitrarily. Please see my second example.
Elpezmuerto
@Elpezmuerto: I guess I'll have to add a tie-breaker then :)
Jonas
the function you're looking for is TIEDRANK: http://www.mathworks.com/access/helpdesk/help/toolbox/stats/tiedrank.html Ex: `r = tiedrank(M')';` to apply it to the rows of a matrix `M`
Amro
@Amro: Gah, if you only had been a little faster! Thanks anyway. However, tiedrank doesn't work for arrays either.
Jonas
@Elpezmuerto: I've added a version where ties lead to equal maximum rank, i.e. [3 2 2 4] becomes [3 1 1 4]. My solution should be adaptable to what you want, though I really should be doing other things at the moment :)
Jonas
@Jonas: even though the documentation only mentions vectors, TIEDRANK operates on the columns of a matrix if the input is a matrix (undocumented)
Amro
@Amro, good to know for the future, but for this specific problem, we want to operate on the rows, not columns.
Elpezmuerto
@Elpezmuerto: all you have to do is transpose the input matrix, then transpose the results back :)
Amro
+2  A: 

With collaboration from Amro and Jonas

variable_ranks = tiedrank(variable_data')';

Ranker has been replaced by the Matlab function tiedrank from the Statistics Toolbox (sorry to those who don't have it). From its documentation:

[R,TIEADJ] = tiedrank(X) computes the ranks of the values in the vector X. If any X values are tied, tiedrank computes their average rank. The return value TIEADJ is an adjustment for ties required by the nonparametric tests signrank and ranksum, and for the computation of Spearman's rank correlation.

TIEDRANK operates along columns when given a matrix in Matlab 7.9.0 (R2009b), although this behavior is undocumented. Transposing the input turns the rows into columns, which are then ranked; the second transpose restores the orientation of the input. In essence, it is a very classy hack :p
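
As a small illustration of the transpose trick, here is a sketch using the first example vector from the question and a shortened version of the second (trimmed to four elements so the rows have equal length). It assumes the Statistics Toolbox is installed and relies on the column-wise behavior for matrix input described in this thread.

```matlab
% Requires the Statistics Toolbox for tiedrank.
% Row 1 is the first example from the question; row 2 is a shortened
% variant of the second example, chosen only for illustration.
variable_data = [ 5   1  3 600;
                 42 300 42  42];

% tiedrank ranks down columns, so transpose in, rank, and transpose back.
variable_ranks = tiedrank(variable_data')';

% Row 1 has no ties, so its ranks are [3 1 2 4].
% In row 2 the three 42s share ranks 1 to 3 (average 2) and 300 gets 4,
% so its ranks are [2 4 2 2].
disp(variable_ranks)
```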

Elpezmuerto
If you don't have the Statistics Toolbox, there is also the second output argument of `sort`, which returns the sort index.
Matt Mizumi
@Matt: That's what I tried in my answer. However, `sort` won't return the rank.
Jonas