views:

285

answers:

4

given a matrix of distances between points is there an algorithm for determining a set of n-dimensional points that has these distances? (or at least minimises the error)

sort of like a n-dimensional version of the turnpike problem.

The best I can come up with is using multidimensional scaling.

+1  A: 

I can't edit the original, because I don't have enough rep, but I've tried to restate the problem here.

The OP has an input NxN matrix of distances. He wants to create an output array, size N, of N-dimensional coordinates representing points, where the distance between each point is stored in the input matrix.

Note that this is not solvable in the general case:

Suppose I have a matrix like this

....A..B..C
A...x..1..2
B......x..0
C.........x

A is 1 unit of distance (say 1 metre) away from B, and A is one metre away from C. But B and C are in the same spot.

In this particular case the minimal sum of errors is 1 metre, and there are an infinite variety of solutions which achieve that result

Airsource Ltd
i agree it's definitely not solvable generally; hence my "minimise the error clause"
matpalm
+1  A: 

There is an algorithm for doing this in Programming Collective Intelligence, p. 49, "Viewing Data in Two Dimensions", which could be adapted for n-dimensions.

Hey -- it's multidimensional scaling -- so I guess you are on the right track.

Lou Franco
+3  A: 

You are on the right track with multi-dimensional scaling (MDS), but MDS is impractical for large datasets, as its time complexity is quadratic in the number of points. You may want to look at FastMap, which has linear time complexity and is better suited to indexing. See:

Christos Faloutsos and King-Ip Lin: "FastMap: a Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets, in Proc. SIGMOD, 1995, doi:10.1145/223784.223812

Vebjorn Ljosa
cheers, has given me a number of ideas.
matpalm
+2  A: 

You can "cheat" and use an iterative numerical method for this. Take all of the points to be in some "random" positions initially, and then loop through them, moving them away from each other proportionally to the required distance. This will prefer some points, but taking an average of the moves before applying them, then applying the average will remove this problem. This is an O(n²) algorithm, but very simple to implement and understand. In the 2-d example below the error is << 10%, though it may not behave so well if the distances given are unrealistic.

C++ Example:

#include <conio.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define DAMPING_FACTOR 0.99f

class point
{
public:
    float x;
    float y;
public:
    point() : x(0), y(0) {}
};

// symmetric matrix with distances
float matrix[5][5] =    {
                            { 0.0f, 4.5f, 1.5f, 2.0f, 4.0f },
                            { 4.5f, 0.0f, 4.0f, 3.0f, 3.5f },
                            { 1.5f, 4.0f, 0.0f, 1.0f, 5.0f },
                            { 2.0f, 3.0f, 1.0f, 0.0f, 4.5f },
                            { 4.0f, 3.5f, 5.0f, 4.5f, 0.0f }
                        };
int main(int argc, char** argv)
{
    point p[5];
    for(unsigned int i = 0; i < 5; ++i)
    {
        p[i].x = (float)(rand()%100)*0.1f;
        p[i].y = (float)(rand()%100)*0.1f;
    }

    // do 1000 iterations
    float dx = 0.0f, dy = 0.0f, d = 0.0f;
    float xmoves[5], ymoves[5];

    for(unsigned int c = 0; c < 1000; ++c)
    {
        for(unsigned int i = 0; i < 5; ++i) xmoves[i] = ymoves[i] = 0.0f;
        // iterate across each point x each point to work out the results of all of the constraints in the matrix
        // collect moves together which are slightly less than enough (DAMPING_FACTOR) to correct half the distance between each pair of points
        for(unsigned int i = 0; i < 5; ++i)
        for(unsigned int j = 0; j < 5; ++j)
        {
            if(i==j) continue;
            dx = p[i].x - p[j].x;
            dy = p[i].y - p[j].y;
            d = sqrt(dx*dx + dy*dy);
            dx /= d;
            dy /= d;
            d = (d - matrix[i][j])*DAMPING_FACTOR*0.5f*0.2f;

            xmoves[i] -= d*dx;
            ymoves[i] -= d*dy;

            xmoves[j] += d*dx;
            ymoves[j] += d*dy;
        }

        // apply all at once
        for(unsigned int i = 0; i < 5; ++i)
        {
            p[i].x += xmoves[i];
            p[i].y += ymoves[i];
        }
    }

    // output results
    printf("Result:\r\n");
    for(unsigned int i = 0; i < 5; ++i)
    {
        for(unsigned int j = 0; j < 5; ++j)
        {
            dx = p[i].x - p[j].x;
            dy = p[i].y - p[j].y;
            printf("%f ", sqrt(dx*dx + dy*dy));
        }
        printf("\r\n");
    }

    printf("\r\nDesired:\r\n");
    for(unsigned int i = 0; i < 5; ++i)
    {
        for(unsigned int j = 0; j < 5; ++j)
        {
            printf("%f ", matrix[i][j]);
        }
        printf("\r\n");
    }

    printf("Absolute difference:\r\n");
    for(unsigned int i = 0; i < 5; ++i)
    {
        for(unsigned int j = 0; j < 5; ++j)
        {
            dx = p[i].x - p[j].x;
            dy = p[i].y - p[j].y;
            printf("%f ", abs(sqrt(dx*dx + dy*dy) - matrix[i][j]));
        }
        printf("\r\n");
    }

    printf("Press any key to continue...");

    while(!_kbhit());

    return 0;
}
jheriko