Hi,
Is there a distance calculation implementation that uses Hadoop MapReduce? I am trying to calculate the distances between a given set of points.
Any pointers to resources would be appreciated.
Edit:
This is a very clever solution. I tried something like your first algorithm and got almost what I was looking for. I am not concerned with optimizing the program at the moment; my problem is that the dist(X,Y) function was not working. Once all the points arrived at the reducer, I was unable to go through them with the Iterator and calculate the distances. Someone on stackoverflow.com told me that the Iterator in Hadoop behaves differently from a normal Java Iterator, but I am not sure about that. If I can find a simple way to go through the Iterator in my dist() function, I can use your second algorithm to optimize.

This is your code, which I am quoting just to make my point clear (N is the number of points):

`map(x,y) { for i in 1:N emit(i, (x,y)) }` //I did exactly this

`reduce (i, X) p1 = X[i] for j in i:N emit(dist(X[i], X[j]))` //here is my problem: I can't get the values out of the Iterator.
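For reference, here is a minimal plain-Java sketch of the reduce step quoted above. It is not actual Hadoop code. It rests on two assumptions about the symptom: that the values handed to a reducer can be walked only once, and that the framework may reuse the same object for every value, so each point has to be copied before it is stored. The class `PairwiseDistance`, the method `distancesFromPoint`, and the `double[]{x, y}` point layout are hypothetical names chosen for illustration. Indices are 0-based here, and the `j == i` pair is skipped because that distance is always zero.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Hadoop: plain Java so it runs without a cluster.
class PairwiseDistance {

    // Euclidean distance between two 2-D points stored as {x, y}.
    static double dist(double[] a, double[] b) {
        double dx = a[0] - b[0];
        double dy = a[1] - b[1];
        return Math.sqrt(dx * dx + dy * dy);
    }

    // Stand-in for the body of reduce(i, X). 'values' is consumed exactly once,
    // and every element is cloned in case the iterator reuses one object.
    static List<Double> distancesFromPoint(int i, Iterator<double[]> values) {
        List<double[]> points = new ArrayList<>();
        while (values.hasNext()) {
            points.add(values.next().clone()); // defensive copy of the point
        }
        // The copied list supports random access, unlike the one-pass iterator.
        List<Double> out = new ArrayList<>();
        for (int j = i + 1; j < points.size(); j++) {
            out.add(dist(points.get(i), points.get(j)));
        }
        return out;
    }
}
```

In a real reducer the loop would run over the framework's values and call emit (or the job's output method) instead of filling a list. The copy step would also use whatever copy mechanism the value type provides, rather than `clone()` on an array.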
Thanks,