I want to delete several specific values from a matrix (if they exist). It is highly probable that there are multiple copies of the values in the matrix.

For example, consider an N-by-2 matrix intersections. If the pairs of values [a b] and [c d] exist as rows in that matrix, I want to delete them.

Let's say I want to delete rows like [-2.0 0.5] and [7 7] in the following matrix:

intersections =

   -4.0000    0.5000
   -2.0000    0.5000
    2.0000    3.0000
    4.0000    0.5000
   -2.0000    0.5000

So that after deletion I get:

intersections = 

   -4.0000    0.5000
    2.0000    3.0000
    4.0000    0.5000

What's the most efficient/elegant way to do this?

+8  A: 

Try this one-liner (where A is your intersection matrix and B is the row to remove):

A = [-4.0 0.5;
     -2.0 0.5;
      2.0 3.0;
      4.0 0.5;
     -2.0 0.5];
B = [-2.0 0.5];
A = A(~all(A == repmat(B,size(A,1),1),2),:);

Then just repeat the last line for each new B you want to remove.
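
Repeating that line for several rows can be wrapped in a loop. A minimal sketch, assuming the rows to remove are stacked in a matrix (called toRemove here, a name that is not in the original answer):

```matlab
A = [-4.0 0.5;
     -2.0 0.5;
      2.0 3.0;
      4.0 0.5;
     -2.0 0.5];
toRemove = [-2.0 0.5;   % hypothetical list of rows to delete, one per row
             7.0 7.0];
for k = 1:size(toRemove,1)
    B = toRemove(k,:);                              % current row to remove
    A = A(~all(A == repmat(B,size(A,1),1),2),:);    % keep rows that differ from B
end
```

Rows that never occur in A (such as [7 7] here) simply leave A unchanged on that pass.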

EDIT:

...and here's another option:

A = A((A(:,1) ~= B(1)) | (A(:,2) ~= B(2)),:);

WARNING: The answers here are best used when small floating point errors are not expected (e.g. with integer values). As noted in this follow-up question, using the "==" and "~=" operators on floating point values can cause unwanted results. In such cases, the above options should be modified to compare against a tolerance using relational operators instead of equality operators. For example, the second option I added would be changed to:

tolerance = 0.001;   % Or whatever limit you want to set
A = A((abs(A(:,1)-B(1)) > tolerance) | (abs(A(:,2)-B(2)) > tolerance),:);

Just a quick heads-up! =)


SOME RUDIMENTARY TIMING:

In case anyone was really interested in efficiency, I just did some simple timing for three different ways to get the subindex for the matrix (the two options I've listed above and Fanfan's STRMATCH option):

>> % Timing for option #1 indexing:
>> tic; for i=1:10000, index = ~all(A == repmat(B,size(A,1),1),2); end; toc;
Elapsed time is 0.262648 seconds.
>> % Timing for option #2 indexing:
>> tic; for i=1:10000, index = (A(:,1) ~= B(1)) | (A(:,2) ~= B(2)); end; toc;
Elapsed time is 0.100858 seconds.
>> % Timing for STRMATCH indexing:
>> tic; for i=1:10000, index = strmatch(B,A); end; toc;
Elapsed time is 0.192306 seconds.

As you can see, the STRMATCH option is faster than my first suggestion, but my second suggestion is the fastest of all three. Note however that my options and Fanfan's do slightly different things: my options return logical indices of the rows to keep, and Fanfan's returns linear indices of the rows to remove. That's why the STRMATCH option uses the form:

A(index,:) = [];

while mine use the form:

A = A(index,:);

However, my indices can be negated to use the first form (indexing rows to remove):

A(all(A == repmat(B,size(A,1),1),2),:) = [];    % For option #1
A((A(:,1) == B(1)) & (A(:,2) == B(2)),:) = [];  % For option #2
gnovice
I almost had a vector solution, but it was a little more verbose than yours. Nice one-liner.
Azim
Wow... such an elegant way.
Kamran
+3  A: 

You can also abuse the strmatch function to suit your needs: the following code removes all occurrences of a given row b in a matrix A:

A(strmatch(b, A),:) = [];

If you need to delete more than one row, such as all rows from matrix B, iterate over them:

for b = B'
   A(strmatch(b, A),:) = [];
end
Fanfan
+1 Very sneaky using STRMATCH to do this! I'm not sure why someone downvoted it... I just tried it myself and it appears to work just fine. I did some simple timing and found that using STRMATCH to get a matrix index is about the same speed as my first option above, but my second option is fastest.
gnovice
+3  A: 

The simple solution here is to look to set membership functions, i.e., setdiff, union, and ismember.

A = [-4 0.5; -2 0.5; 2 3; 4 0.5; -2 0.5];

B = [-2 .5;7 7];

See what ismember does with the two arrays. Use the 'rows' option.

ismember(A,B,'rows')

ans =

     0
     1
     0
     0
     1

Since we wish to delete rows of A that are also in B, just do this:

A(ismember(A,B,'rows'),:) = []

A =

   -4          0.5
    2            3
    4          0.5
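
Of the other set functions mentioned, setdiff with the 'rows' option also drops matching rows, but it returns the remaining rows sorted and without repetitions, so it may not be a drop-in replacement when order or repeated rows matter. A small sketch of the difference, using a made-up matrix C (not from the question):

```matlab
C = [4 0.5; -2 0.5; 4 0.5; -4 0.5];     % illustrative data with a repeated row
B = [-2 0.5; 7 7];

keptAll  = C(~ismember(C,B,'rows'),:);  % keeps original order and duplicates
keptUniq = setdiff(C,B,'rows');         % sorted rows, duplicates collapsed
```

Here keptAll still contains both copies of [4 0.5] in their original order, while keptUniq contains each remaining row once, in sorted order.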

Beware that set membership functions look for an EXACT match. Integers or multiples of 1/2, such as those in A, satisfy that requirement: they are exactly represented in floating point arithmetic in MATLAB.

Had these numbers been real floating point numbers, I'd have been more careful. There I'd have used a tolerance on the difference. In that case, I might have computed the interpoint distance matrix between the two sets of numbers, removing a row of A only if it fell within some given distance of one of the rows of B.
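
One way that distance-based approach might look is sketched below. It assumes a Euclidean distance and an arbitrary tolerance tol (both are choices made for illustration, not something specified above), and it avoids toolbox functions by building the distance matrix in a loop:

```matlab
A = [-4 0.5; -2 0.5; 2 3; 4 0.5; -2 0.5];
B = [-2 0.5; 7 7];
tol = 1e-6;                       % assumed tolerance; pick one suited to your data

% D(i,j) is the Euclidean distance between row i of A and row j of B
D = zeros(size(A,1), size(B,1));
for j = 1:size(B,1)
    diffs = A - repmat(B(j,:), size(A,1), 1);
    D(:,j) = sqrt(sum(diffs.^2, 2));
end

A(any(D <= tol, 2),:) = [];       % drop rows of A that lie close to any row of B
```

For large A and B the full distance matrix can get big, in which case processing B one row at a time and deleting as you go may be preferable.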

woodchips
+1 I forgot about how ISMEMBER can operate across rows. Also, welcome to SO! ;)
gnovice
