tags:

views:

369

answers:

2

I have the matrix:

a = [ 1 2 3 4;
      2 4 5 6;
      4 6 8 9]

and I want to compare every row with every other two rows one by one. If they share the same key then the result will tell they have a common key.

+2  A: 

Here's one solution (which is generalizable to larger matrices than the sample in the question):

comparisons = nchoosek(1:size(a,1),2);
N = size(comparisons,1);
keys = cell(N,1);
for i = 1:N
  keys{i} = intersect(a(comparisons(i,1),:),a(comparisons(i,2),:));
end

The function NCHOOSEK is used to generate all of the unique combinations of row comparisons. For the matrix a in your question, you will get comparisons = [1 2; 1 3; 2 3], meaning that we will need to compare rows 1 and 2, then 1 and 3, and finally 2 and 3. keys is a cell array that stores the results of each comparison. For each comparison, the function INTERSECT is used to find the common values (i.e. keys). For the matrix a given in the question, you will get keys = {[2 4], 4, [4 6]}.

gnovice
one more question if a is 100*20 matrix, what should we do to compare 100 rows with each other row and get intersection of each row with each other?
gurwinder
@gurwinder: The code above is general enough to handle whatever size `a` may be. If `a` is 100-by-20, you will end up performing 4950 comparisons, so `comparisons` will be a 4950-by-2 matrix and `keys` will be a cell array with 4950 elements (where each cell contains a set of common keys for the respective comparison).
gnovice
+1 good use of the nchoosek function!
Amro
@Amro: Thanks. I was tempted to use my HANKEL solution again (http://stackoverflow.com/questions/1702140/how-do-i-plot-lines-between-all-points-in-a-vector-in-matlab/1702336#1702336), but NCHOOSEK is a bit clearer, and I figured the OP would be working with relatively small matrices.
gnovice
+2  A: 

Using @gnovice's idea of getting all combinations with nchoosek, I propose yet another two solutions:

  • one using ismember (as noted by @loren)
  • the other using bsxfun with the eq function handle

The only difference is that intersect sorts and keeps only the unique common keys.

a = randi(30, [100 20]);
%# a = sort(a,2);

comparisons = nchoosek(1:size(a,1),2);
N = size(comparisons,1);
keys1 = cell(N,1);
keys2 = cell(N,1);
keys3 = cell(N,1);

tic
for i=1:N
    keys1{i} = intersect(a(comparisons(i,1),:),a(comparisons(i,2),:));
end
toc

tic
for i=1:N
    query = a(comparisons(i,1),:);
    set = a(comparisons(i,2),:);
    keys2{i} = query( ismember(query, set) );             %# unique(...)
end
toc


tic
for i=1:N
    query = a(comparisons(i,1),:);
    set = a(comparisons(i,2),:)';
    keys3{i} = query( any(bsxfun(@eq, query, set),1) );   %'# unique(...)
end
toc

... with the following time comparisons:

Elapsed time is 0.713333 seconds.
Elapsed time is 0.289812 seconds.
Elapsed time is 0.135602 seconds.

Note that even by sorting a beforehand and adding a call to unique inside the loops (commented parts), these two methods are still faster than intersect.

Amro

related questions