views:

127

answers:

1

I'm facing tho following problem:
let's say u is a social network user and as such has a list of friends, F(u). a partition is a function F->G, where G is a set of groups such as High-school, university, work, etc'.
I need to come up with algorithm to partite F:

  • the input is F and also F(f) for every f in F (the list of friends for each of u's friends).
  • during it's run the algorithm is allowed to ask u questions (for example "what is the best group for some specific user v?").
  • The amount of question should be kept to minimum (what is minimum is not really a clear number, but I would say 5% of the number of friends seem about right).

obviously the resulting partition would not be optimal, but it should be acceptable as a starting point for later refinements.

any thoughts would be greatly appreciated

edit: no it's not homework. i believe homework would have clearer defined requirements and target function. anyway no, this is actually real world problem i'm facing.

also i may have simplified it a bit, but in reality a user could be part of many groups (so it's more like F->P(G), where P(G) is the power group if G), so a better algorithm would be able to do that.

+2  A: 

The basic idea would be to try to partition them into groups based on which of your friends are friends with each other.

For example, if you are Bob and you know Sally and Larry and Sally and Larry both know each other, they are probably in the same "group". You don't know what that group is yet, but since you all know each other, you probably met at the same place - whether that be work, university, etc.

You could implement this as a directed graph where the nodes are people and the edges are connections. You would then need to group those nodes together based on how well connected they are.

Once you've established the groups, then it's just a matter of querying for a sample from the groups and potentially ambiguous nodes to figure out what the groups actually are.

Sounds like homework, so I won't give away anything else, but that should get you started.

Eric Petroelje
I just found an algorithm to cluster a graph here, very interesting in that it seems quite performing: http://www.internetmathematics.org/volumes/1/4/Flake.pdf
Matthieu M.