views:

395

answers:

2

I have to implement k nearest neighbors search for 10 dimensional data in kd-tree.

But problem is that my algorithm is very fast for k=1, but as much as 2000x slower for k>1 (k=2,5,10,20,100)

Is this normal for kd trees, or am I doing something worng?

+1  A: 

Well, it primarily depends on your particular implementation and data set.

A poorly balanced tree will mean you have to search way more data than you need to. Make sure that your tree construction is sane.

It could also depend on how you find the k neighbors. If your algorithm searches the tree for the closest neighbor and stores it, then searches for the second closest and stores it etc, then you're not doing the search very efficiently. Instead keep a running list of the k nearest neighbors and bump points out of the list as you find closer ones traversing the tree. This way you search once, instead of k times.

Either way, it sounds like you're doing this for a course. Try talking to your professor, TAs, or classmates to see if your results are typical.

Andrew
poorly balanced tree was the reason. I have reviewed my tree construction method and it was choosing wrong split dimensions. Thanks for hint :)
Az
A: 

I know this question has been answered, but for more detail on KNN searches with k-d trees, see Bentley (1975:514), Communications of the ACM 18(9), September.

fluffels