I have two clusters of data. Each point has x,y coordinates and a value giving its class (1 for class 1, 2 for class 2). I have plotted the data, but I would like to draw a boundary that visually separates the two classes. Which function does this? I tried contour, but it did not help.

+4  A: 

Imagine the following example:

data = rand([50 2]);
labels = randi(2, [50 1]);    
gscatter(data(:,1), data(:,2), labels, 'rb', 'xo')

screenshot1

As you can see, finding the boundary is not a trivial task unless the clusters are easily separable and you already know the equation of the boundary.

One idea is to use the discriminant analysis function classify to find the boundary (you can choose between a linear and a quadratic boundary).
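As a minimal sketch of what classify does, assuming the Statistics Toolbox is installed: the toy training points, group labels, and variable names below are made up for illustration only.

```matlab
%# toy training set (hypothetical): four points near the origin,
%# four points around (3.5, 3.5)
training = [0 0; 0 1; 1 0; 1 1; 3 3; 3 4; 4 3; 4 4];
group    = [1; 1; 1; 1; 2; 2; 2; 2];

%# two query points, one at the centre of each cluster
sample = [0.5 0.5; 3.5 3.5];

%# the fourth argument selects the boundary type: 'linear' or 'quadratic'
predicted = classify(sample, training, group, 'linear');
disp(predicted')
```

Each query point is assigned to the cluster it sits in. Classifying a dense grid of such query points is what makes it possible to show the regions and the boundary between them.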

The following is a complete example to illustrate the procedure, adapted from an old script I had:

NUM_K = 2;                          %# number of classes
numInst = 50;                       %# number of instances

classifierType = 'quadratic';       %# 'quadratic', 'linear'
npoints = 100;
clrLite = [1 0.6 0.6 ; 0.6 1 0.6 ; 0.6 0.6 1];
clrDark = [0.7 0 0 ; 0 0.7 0 ; 0 0 0.7];

%# generate data
data = rand([numInst 2]);           %# random 2D data
labels = randi(NUM_K, [numInst 1]); %# random classes

%# discriminant analysis
%# classify the grid space of these two dimensions
[X Y] = meshgrid( linspace(0,1,npoints) , linspace(0,1,npoints) );
X = X(:); Y = Y(:);
[C,err,P,logp,coeff] = classify([X Y], data, labels, classifierType);

%# find incorrectly classified training data
[CPred err] = classify(data, data, labels, classifierType);
bad = (CPred~=labels);

%# plot grid classification color-coded
image(X, Y, reshape(C,npoints,npoints)), hold on
axis xy, colormap(clrLite)

%# plot data points (correctly and incorrectly classified)
gscatter(data(:,1), data(:,2), labels, clrDark, '.', 20, 'off');

%# mark incorrectly classified data
plot(data(bad,1), data(bad,2), 'kx', 'MarkerSize',10)
axis([0 1 0 1])

%# draw decision boundaries between pairs of clusters
for i=1:NUM_K
    for j=i+1:NUM_K
        if strcmp(coeff(i,j).type, 'quadratic')
            K = coeff(i,j).const;
            L = coeff(i,j).linear;
            Q = coeff(i,j).quadratic;
            %# L has two elements, so sprintf consumes it as two arguments
            f = sprintf('0 = %g + %g*x + %g*y + %g*x.^2 + %g*x.*y + %g*y.^2',...
                K,L,Q(1,1),Q(1,2)+Q(2,1),Q(2,2));
        else
            K = coeff(i,j).const;
            L = coeff(i,j).linear;
            f = sprintf('0 = %g + %g*x + %g*y', K,L(1),L(2));
        end
        h2 = ezplot(f, [0 1 0 1]);
        set(h2, 'Color','k', 'LineWidth',2)
    end
end

xlabel('x-axis'), ylabel('y-axis')
title( sprintf('accuracy = %.2f%%', 100*(1-sum(bad)/numInst)) )
legend( cellstr(num2str((1:NUM_K)','Class %d')) )

hold off
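
For reference, the curve drawn for each pair of classes in the loop is the zero level set of the discriminant function. Going by the documented meaning of the coeff fields, with K = const, L = linear and Q = quadratic, and writing a point as the row vector [x y], the quadratic boundary should satisfy:

```latex
0 = K + \begin{bmatrix} x & y \end{bmatrix} L
      + \begin{bmatrix} x & y \end{bmatrix} Q \begin{bmatrix} x \\ y \end{bmatrix}
  = K + L_1 x + L_2 y + Q_{11} x^2 + (Q_{12} + Q_{21})\,xy + Q_{22} y^2
```

In the linear case the Q term is absent, which leaves the straight line 0 = K + L_1 x + L_2 y. These are the two expressions that the sprintf calls assemble before handing them to ezplot.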

screenshot2
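
Since the question mentions contour: a possible alternative, sketched here under assumptions, is to contour the grid of predicted labels at the level 1.5, halfway between the labels 1 and 2, instead of building an equation string for ezplot. The synthetic two-cluster data and the names Xg, Yg and Cg are made up for this illustration, and the Statistics Toolbox is assumed for classify and gscatter.

```matlab
%# synthetic data (hypothetical): two Gaussian clusters with known labels
rng(0);
data   = [randn(50,2); randn(50,2) + 2];
labels = [ones(50,1); 2*ones(50,1)];

%# classify every node of a grid covering the data
npoints = 100;
[Xg, Yg] = meshgrid( linspace(min(data(:,1)), max(data(:,1)), npoints), ...
                     linspace(min(data(:,2)), max(data(:,2)), npoints) );
C  = classify([Xg(:) Yg(:)], data, labels, 'quadratic');
Cg = reshape(C, size(Xg));

%# plot the points, then trace where the predicted label switches
gscatter(data(:,1), data(:,2), labels, 'rb', 'xo'), hold on
contour(Xg, Yg, Cg, [1.5 1.5], 'k', 'LineWidth', 2);
hold off
```

The traced curve is only as accurate as the grid resolution, so a larger npoints gives a smoother boundary. With more than two classes a single contour level no longer separates every pair of labels, so the per-pair equations from coeff remain the more general route.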

Amro
+1 .... pretty!
Jacob