views:

139

answers:

3

Suppose we have a connected and undirected graph: G=(V,E).

Definition of connected-set: a group of points belonging to V of G forms a valid connected-set iff every point in this group is within T-1 edges away from any other point in the same group, T is the number of points in the group.

Pls note that a connected set is just a connected subgraph of G without the edges but with the points.

And we have an arbitrary function F defined on connected-set, i.e given an arbitrary connected-set CS F(CS) will give us a real value.

Two connected-sets are said disjoint if their union is not a connected set.

For an visual explanation, pls see the graph below: In the graph, the red,black,green point sets are all valid connected-sets, green set is disjoint to red set, but black set is not disjoint to the red one. alt text

Now the question: We want to find a bunch of disjoint connected-sets from G so that: (1)every connected-set has at least K points. (K is a global parameter). (2)the sum of their function values,i.e max(Σ F(CS)) are maximized.

Is there any efficient algorithm to tackle such a problem other than an exhaustive search? Thx!

For example, the graph can be a planar graph in the 2D Euclidean plane, and the function value F of a connected-set CS can be defined as the area of the minimum bounding rectangle of all the points in CS(minimum bounding rectangle is the smallest rectangle enclosing all the points in the CS).

+1  A: 

I doubt there is an efficient algorithm since for a complete graph for instance, you cannot solve the problem without knowing the value of F on every subgraph (except if you have assumptions on F: monotonicity for instance).

Nevertheless, I'd go for a non deterministic algorithm. Try simulated annealing, with transitions being:

  • Remove a point from a set (if it stays connected)
  • Move a point from a set to another (if they stay connected)
  • Remove a set
  • Add a set with one point

Good luck, this seems to be a difficult problem.

Alexandre C.
Thx for you helpful suggestion, i will try it in my implementation.
outlaw
A: 

For such a general F, it is not an easy task to draft an optimized algorithm, far from the brute force approach.
For instance, since we want to find a bunch of CS where F(CS) is maximized, should we assume we want actually to find max(Σ F(CS)) for all CS or the highest F value from all possible CS, max(F(csi))? We don't know for sure.
Also, F being arbitrary, we cannot estimate the probability of having F(cs+p1) > F(cs) => F(cs+p1+p2) > F(cs).

However, we can still discuss it:

It seems we can deduce from the problem that we can treat each CS independently, meaning if n = F(cs1) adding any cs2 (being disjoint from cs1) will have no impact on the n value.

It seems also believable, and this is where we should be able to get some gain, that the calculation of F can be made starting from any point of a CS, and, in general, if CS = cs1+cs2, F(CS) = F(cs1+cs2) = F(cs2+cs1).

Then we want to inject memoization in the algorithm in order to speed up the process when a CS is grown up little by little in order to find max(F(cs)) [considering F general, the dynamic programming approach, for instance starting from a CS made of all points, then reducing it little by little, doesn't seem to have a big interest].

Ideally, we could start with a CS made of a point, extending it by one, checking and storing F values (for each subset). Each test would first check if the F value exists in order not to calculate it ; then repeat the process for another point etc..., find the best subsets that maximize F. For a large number of points, this is a very lengthy experience.

A more reasonable approach would be to try random points and grow the CS up to a given size, then try another area distinct from the bigger CS obtained at the previous stage. One could try to assess the probability explained above, and direct the algorithm in a certain way depending on the result.

But, again due to lack of F properties, we can expect an exponential space need via memoization (like storing F(p1,...,pn), for all subsets). And an exponential complexity.

ring0
A: 

I would use dynamic programming. You can start out rephrasing your problem as a node coloring problem:

  • Your goal is to assign a color to each node. (In other words you are looking for a coloring of the nodes)
  • The available colors are black and white.
  • In order to judge a coloring you have to examine the set of "maximal connected sets of black nodes".
    • A set of black nodes is called connected if the induced subgraph is connected
    • A connected set of black nodes is called maximal none of the nodes in the set has a black neighbor in the original graph that is not contained in the set)
  • Your goal is to find the coloring that maximizes ΣF(CS). (Here you sum over the "maximal connected sets of black nodes")
  • You have some extra constraints are specified in your original post.

Perhaps you could look for an algorithm that does something like the following

  • Pick a node
  • Try to color the chosen node white
    • Look for a coloring of the remaining nodes that maximizes ΣF(CS)
  • Try to color the chosen node black
    • Look for a coloring of the remaining nodes that maximizes ΣF(CS)

Each time you have colored a node white then you can examine whether or not the graph has become "decomposable" (I made up this word. It is not official):

  • A partially colored graph is called "decomposable" if it contains a pair of none-white nodes that are not connected by any path that does not contain a white node.

If your partially colored graph is decomposable then you can split your problem in to two sub-problems.

EDIT: I added an alternative idea and deleted it again. :)

nielsle