views:

130

answers:

3

I have a set of vectors. For a vector in that set I like to find the sub set that is closeest to this vector. What algorithm can do this.

+4  A: 

use the cosinus similarity (http://en.wikipedia.org/wiki/Cosine_similarity) among the vectors and then sort them.

excepeiont32
+1 I was going to mention the scalar product only. I didn't think about the vector lengths. Thanks for saving me from ridicule ;)
Mads Elvheim
we don't know what he wants as distance
fa.
+2  A: 

This class of algorithms is called Nearest Neighbor or K Nearest Neighbor.

The cosine similarity as excepeiont says will work if direction of vector is important. If the vector represents a position in a space, then any metric for representing a distance in the space will work.

For example the Euclidean distance: take the square root of the sum of squares difference in each dimension. This will give you a distance for each vector, then sort your set of vectors ascending on this distance.

This process will be O(N) in time. If this is too slow for you, you might want to look at some common K Nearest Neighbour algorithms.

Nick Fortescue
A: 

If your problem relates to large amount of data:

I published a related algorithm on ddj.com, that finds the nearest line to a given point:

Accelerated Search For the Nearest Line

You would have to modify this algorithm by i.e. by converting the given vector to a number of points. This will reduce the number of possible matches drastically. The exact match has then to be checked for each possible match by

  • Find the cutting point of both vectors or
  • Get distance from vector start and end point to the possible match, as described in the article
RED SOFT ADAIR