views:

117

answers:

1

I have the following problem.

I've got a set of items {a1, a2, a3, ... aN}. Each one of those items can contain another set of items {b1, b2, b3, ... bN}. So the end result looks something like this:

a1
  b4
  b22
  b40
  b11
  b9
a2
  b30
  b23
  b9
  b4
  b11
a3
  b22
  b4
  b60
  b9

As a result of the execution of the algorithm I would like to get the groups of b-type objects that fall under following rules:

  1. If more than one b-type object under an a-type object only exists under that a-type object, they should be grouped.
  2. If more than one b-type object is used in more then one a-type object they also should be grouped.

Example:

b4, b9
b30, b23

b40, b60, b11 and b22 shouldn't be grouped because there are no pairs for them.

I would be writing the implementation of this algorithm in C#, so it would be nice to avoid data structures that don't exist in it, such as binary trees, linked lists, etc. But this is not a requirement; all of these could be implemented too.

Clarification: Sets can contain as many objects as needed, but not more than 1 of each. The rules are that all unique objects of b-type within the same a-type should be grouped (more than 1), and if more than 1 b-type object fall within more than 1 a-type object, they should be grouped. The groups should be as large as possible.

Real life example: Web-pages are a-type and the CSS files used on those pages are b-type. In order to speed up the loading of the pages, you want to have as few requests as possible going to the server, so you combine CSS files, but you don't want to combine files that are used by themselves only on a few pages, since they will be cached and you don't have to re-download them again.

+2  A: 

First, create a map that associates every b-type item with the set of the a-type items that contains that b-type item.

In your example, you'll get:

b4: { a1, a2, a3 }

b9: { a1, a2, a3 }

b11: { a1, a2 }

b22: { a1, a3 }

b23: { a2 }

b30: { a2 }

b40: { a1 }

b60: { a3 }

To create this map, scan all the b-type objects in the a-type objects, and if the b-type object has no association, create an entry in the map for it; however add the a-type object to the set associated to the b-type object.

Then, compare every possible pairs of sets (O(n2)). Merge two b-type objects if they have the same set of a-type objects.

It is implemented as a pair of nested loops over the map (I=1 TO N-1, J=N+1 TO N).

To compare two sets, use a set library and its comparison operator.

If the a-type objects may be represented by small integers, you can represent the set as a bit array, that is quite compact and quick to compare.

carlo.milanesi
Great answer. Thanks! And easy too.. Should've thought of that myself:-)
Ilya Volodin