tags:

views:

262

answers:

4

I am trying to find the nearest neighbor in recursive depth-first-fashion. There are many elements involve prior to getting to this point, to keep it simple I have only included the section that I am currently having trouble.

My idea is to find nearest neighbor to a given point based on some threshold distance, I decided to separate the process int to two methods. One to really find if the nearest neighbor exists and other method to just call that function over and over recursively.

I have ArrayList with double values that I treat as points... if return 0.0, means I did't find the nearest neighbor (Wondering if I really use Object, might do it once I cleanup the logic).

/**
 * Find the nearest neighbor based on the distance threshold.
 * TODO:
 * @param currentPoint current point in the memory.
 * @param threshold dynamic distance threshold.
 * @return return the neighbor.
 */

private double nearestNeighbor(double currentPoint) {

 HashMap<Double, Double> unsorted = new HashMap<Double, Double>();
 TreeMap<Double, Double> sorted = null; 
 double foundNeighbor = 0.0;

 for (int i = 0; i < bigCluster.length; i++) {
  if (bigCluster[i] != 0.0 && bigCluster[i] != currentPoint) {
   double shortestDistance = Math.abs(currentPoint - bigCluster[i]);
   if (shortestDistance <= this.getDistanceThreshold())
    unsorted.put(shortestDistance, bigCluster[i]);
  }
 }
 if (!unsorted.isEmpty()) {
  sorted = new TreeMap<Double, Double>(unsorted);
  this.setDistanceThreshold(avgDistanceInCluster());
  foundNeighbor = sorted.firstEntry().getValue();
  return foundNeighbor;
 } else {
  return 0.0;
 }
}

And here is my method, that I planned on calling the above in a recursive dfs fashion.

/**
 * Method will check if a point belongs to a cluster based on the dynamic 
 * threshold.
 */
public void isBelongToCluster(double point) {

 for (int i = 0; i < tempList.size(); i++) {

  double aPointInCluster = point;
  cluster.add(aPointInCluster);
  isVisited[i].visited = true;
  double newNeighbor = nearestNeighbor(aPointInCluster);
  if (newNeighbor != 0.0) {
   cluster.add(newNeighbor);
   if (i + 1 != tempList.size() && (isVisited[i].visited != true)) {
    isBelongToCluster(newNeighbor);
   }
  }

 }
 for (int i = 0; i < cluster.size(); i++) {
  if (cluster.get(i) != 0.0)
   System.out.println("whats in the cluster -> " + cluster.get(i));
 }
}

What I am struggling is with the recursive depth first search. In Addition to that my visitor implementation doesn't look right.

This is how I tried to handle the visitor

public class VisitedPoint {

    public double point;
    public boolean visited;

    public VisitedPoint(double pointToCheck) {
     point = pointToCheck;
     visited = false;
    }
}

then I would create a VisitedPoint object, private VisitedPoint[] isVisited; but when I used it in my isBelongTo() method I get a null pointer exception.

Thanks in advance for any help.

A: 

It feels like you are making this more complex that it needs to be.

If I understand things correctly, you have a series of 1 dimensional point data. You are trying to classify these points into groups, where within a group, the distance between any pair of points is less than the 'threshold distance'.

I would start by sorting the 1 dimensional point data. Starting at the first element, I create a cluster object for that element. If the delta between the first element and the second element is smaller than the threshold I add it to the cluster. Now I look at the delta between the second and the third; advancing along the sorted data, and creating a new cluster to classify things into when I find a delta that is larger than the threshold, until I am at the end of the data.

Does that solve the problem or did I miss a key requirement?

Mikeb
Thanks for the quick response, You got the first part right so far i have one dimensional point data. But, should i put the pseudo code of the entire algorithm? Looks like by putting only the two sections might have confuse everyone.
1.1 Starting point P recursive search the nearest neighbor of P → P' 1.2 if found the nearest neighbor P' 1.2.1 Add P' to the cluster Ck 1.2.2 Update the distance threshold calculated from avg distance between every pair of points in the cluster. 1.2.3 recursive call will be made to find the the neighbor of P' 1.3 else 1.3.1 back track to point P
I still don't see a need for recursion. If the data set is sorted, then the only other point that could possibly be closer than P (from P') is the next element.
Mikeb
Here is the complete pseudo code I put to put together before start coding, This is to find clusters with-in the clusters. So, A= {3, 1, 9, 2, 4, 32, 3.2}, lets say I pick 9, I go over find the nearest neighbor moment i find the nearest which is 4 then I want to do recursive call with 4 (DFS).
Sorry I am unable to put entire pseudo code since its longer that 600 characters.
A: 

So far I came up with this solution, I am still debugging to see if I got it correct. But I know stackoverflow have super experts, please let me know if there is a better way to implement this (I am sure there is),

/**
 * Find the nearest neighbor based on the distance threshold.
 * TODO:
 * @param currentPoint current point in the memory.
 * @param threshold dynamic distance threshold.
 * @return return the neighbor.
 */

private double nearestNeighbor(double currentPoint) {

 HashMap<Double, Double> unsorted = new HashMap<Double, Double>();
 TreeMap<Double, Double> sorted = null; 
 double foundNeighbor = 0.0;

 for (int i = 0; i < bigCluster.length; i++) {
  if (bigCluster[i] != 0.0 && bigCluster[i] != currentPoint) {
   double shortestDistance = Math.abs(currentPoint - bigCluster[i]);
   if (shortestDistance <= this.getDistanceThreshold())
    unsorted.put(shortestDistance, bigCluster[i]);
  }
 }
 if (!unsorted.isEmpty()) {
  sorted = new TreeMap<Double, Double>(unsorted);
  this.setDistanceThreshold(avgDistanceInCluster());
  foundNeighbor = sorted.firstEntry().getValue();
  return foundNeighbor;
 } else {
  return 0.0;
 }
} 


/**
 * Method will check if a point belongs to a cluster based on the dynamic 
 * threshold.
 */
public void isBelongToCluster() {


  for (int i=0; i < tempList.size(); i++) {

   double aPointInCluster = tempList.get(i);

   cluster.add(aPointInCluster);
   double newNeighbor = nearestNeighbor(aPointInCluster);
   if ( newNeighbor != 0.0) {
    cluster.add(newNeighbor);
    if (i + 1 > tempList.size() && (visited[i] != true)) {
     isBelongToCluster();
    }
   }

  }

 for (int i=0; i < cluster.size(); i++) {
  if (cluster.get(i) != 0.0)
   System.out.println("whats in the cluster -> " + cluster.get(i)); 
 } 
}
A: 

Look at http://stackoverflow.com/questions/1639755/best-performance-critical-algorithm-for-solving-nearest-neighbor. I hope it helps.

Kaveh Shahbazian
Thanks, It sounds like somewhat similar to what I am trying to do. I will take a look at it.
A: 

Since I cant put the entire pseudo code as a comment I decided to put here, this is what wrote as pseudo code before coding.

 1 Pick sample S points that fits the memory and the distance threshold.
 2 Mark all points as unclustered (by not adding to the cluster)
 3 Set the number of clusters k = 0
3.1 Set threshold to initial value
 4 Randomly choose a point P out of S points
 4.1 Add the point P to the cluster Ck
 4.2 Starting point P recursive search the nearest neighbor of P → P'
 4.3  if found the nearest neighbor P'
 4.3.1 Add P' to the cluster Ck
 4.3.2 Update the distance threshold calculated from avg distance between every pair of points in the cluster. 
 4.3.3 recursive call will be made to find the the neighbor of P'
 4.4 else
 4.4.1 back track to point P 
 5 Check if there are unclustered point if there is move to step First 3.1
 5.1 set number of clusters k = k + 1

Area that I am having trouble is 4.3.3 and thats the code segment I put it earlier.

Thanks in advance.