views:

1821

answers:

6

I have data stored in relational database mysql and PHP. I have a table called "rel" which has two fields:

from_node  |  to_node
=====================
1               2
1               3
2               3

and so on......

How can I calculate the network Diameter of a network. I know it is the longest or shortest path between any two pairs but how do I calculate it? Is there any PHP script or function which can help me to do it?

A: 

In your example you show that each node links to every other node. If this is the case throughout your setup, then the diameter is 1.

If your setup is a line like so in a linear formation:

n=1, n = 2, n = 3, ... n

Then your diameter is (n+1)/3.

If your setup is more irregular, with a series of N number nodes and K number of links, then your diameter is at least logN/LogK

Edit: To clarify, I'm calculating the average shortest distance between pairs of nodes.

n1 - n2 - n3
(n+1)/3 = 4/3

n1-n2 = 1 hop
n2 - n3 = 1 hop
n1- n2 - n3 = 2 hops
(1+1+2)/3 = 4/3
ttony21
This is true, though I'm suspecting his example probably over-simplifies the situation.
Noldorin
In a linear formation, isn't the diameter (n-1)? It is the longest path between any two nodes, so the linear setup surely is of diameter (n-1) as there are that many hops from the first to the last node. A simple loop (where node n connects to node 1) would have a diameter of (n-1)/2, I think. Also, in general, the length of the path between nodes should be given; we're both assuming each hop is of length 1 in the absence of any better information.
Jonathan Leffler
I was basing it off of being the average shortest distance between pairs of nodes. That's what all the sources I found seem to agree on as the definition.
ttony21
Oh - Wikipedia says:"The eccentricity ε of a vertex v is the greatest distance between v and any other vertex.The diameter of a graph is the maximum eccentricity of any vertex in the graph. That is, it is the greatest distance between any two vertices."
Jonathan Leffler
A: 

See the Wikipedia article on graph (network) terms related to distance and diameter. It mentions a few notes on how to find the diameter. This section of the article on connected components of graphs also suggests an algorithm to discover these connected components, which could be adapted very easily to tell you the diameter of the graph. (If there are more than one components then the diameter is infinite I believe.) The algorithm is a simple one based on a bread/depth-first search, so it shouldn't be much trouble to implement, and efficiency shouldn't be a great problem either.

If you're not up to writing this (though I don't think it would take that much effort), I recommend looking for a good network/graph analysis library. There's a few out there, though I'm not sure which ones you'd want to look at using PHP. (You'd probably have to use some sort of interop.)

Hope that helps, anyway.

Noldorin
A: 

I really think you meant you wanted to find the cluster coefficient of a network. Furthermore, you'd like to do it in PHP. I don't know how many good network analysis libraries have been ported to PHP extensions.

However, if you follow the article, it should not be (too) difficult to come up with your own. You don't need to produce pretty graphs, you just have to find the coefficient.

If that's not what you meant, please update / clarify your question.

Tim Post
+1  A: 

Assuming you have a connected graph (otherwise the max distance is infinite) and all your node points are numbers....

Seed a table (from, to, distance) with (from_node, to_node, 1). For each tuple, you must ensure that the value of from_node is always less than the value of to_node

CREATE TABLE hops (
    from_node int not null,
    to_node int not null,
    distance int not null default 0,
    primary key (from_node, to_node, distance)
)

-- first load:
INSERT INTO hops (from_node, to_node)
SELECT from_node, to_node FROM rel;

-- iterative step
INSERT INTO hops (from_node, to_node, distance)
SELECT a.from_node, b.to_node, min(a.distance+b.distance)
FROM hops a, hops b
WHERE a.to_node = b.from_node
  AND a.from_node <> b.from_node  -- not self
  AND a.to_node <> b.to_node      -- still not self
  AND a.from_node <> b.from_node  -- no loops
  AND NOT EXISTS (                -- avoid duplicates
          SELECT * FROM hops c
          WHERE c.from_node = a.from_node
            AND c.to_node = b.to_node
            AND c.distance = a.distance+b.distance)
GROUP BY a.from_node, b.to_node

Execute the insert repeatedly until no rows are inserted. Then select max distance to get your diameter.

EDIT: For a graph with weighted vertices, you would just seed the distance field with the weights rather than using 1.

jmucchiello
A: 

Network is a connected graph. So why don't you try to build some graph representation from your data and perform BFS or DFS on this? You will get exactly that you are looking for.

Artem Barger
A: 

That's simple:

  • Prepare
    • Add a colum named distance
    • Give all nodes the distance of -1
  • First Iteration
    • Pick any node (e.g. the first)
    • give it the distance of 1
    • Now iterate until there are nodes with distance -1
      • UPDATE table SET distance=:i+1 WHERE from_node IN (SELECT to_node FROM table WHERE distance=:i)
  • Second Iteration
    • Pick a node that has the maximum distance (any) - remember it
    • Set all distances back to -1
    • Set your remebered node to 1
    • Call the iteration a second time

This time the maximum distance is the diameter of your graph/network.

MrFox