views:

409

answers:

3

I'm trying to come up with the largest possible group of friends that would theoretically get along with each other, i.e., each person in the group should know at least 50% of the other people in the group.

I'm trying to come up with an algorithm for this that doesn't take ridiculously long; Facebook's API/cross-server talk is pretty slow as is.

I was thinking I could start with the friend that has the most mutual friends with me first, and then add people to the group one by one. But who would I choose next?

Just interested in the theory, no code is necessary.


Edit: When I said "theory", what I really meant what's the next logical step in plain english :) I was hoping I could code this up in an afternoon, but I guess this is a bit more complicated than I anticipated, and I'm not sure I want to spend weeks delving into heavy graph theory. Nevertheless, maybe someone else will find this interesting.

+2  A: 

MIT did some work on social graphing a while back. Although it used mobile phone data, the clustering algorithms and other systems should still apply, even though they are constructed using different inputs and criteria.

There is more MIT chatter about social graphing going on at the moment. Definitely the place to look for technical pointers on this kind of thing.

Whilst the problem of graph enumeration from a given node to it's edges is NP complete for most useful problems ... the application of the graph traversal and the wealth of information might help you make this more efficient:

  1. For any node (profile) N, you could data-scrape using Google or something to find associated edges out. This means that you can harness a cache of the pages and Googles search technology to mitigate having to traverse the edges yourself.

  2. Social profiles contain tons of meta-data. Developing a statistical analysis method for working out the likelyhood of A knowing B without a direct path might be useful. Afterall friends have a) similar locations and b) similar interests

  3. Other data, seemingly irrelevant can provide a means for locating people likely to know eachother and then you can double check the edges. Things such as chatter on boards about a band or gig, or people mentioning "cat fight" when Kate smacked Mary in the mouth.

The data just needs looking at in the right way, in the same way MIT looked at geographical statistics to determine relationships through phones.

Good Luck

Aiden Bell
Interesting articles :) Thanks. I think using probabilities would be a bit overkill for a tiny app like this, but I can see how that would be useful for something larger. Wouldn't have thought of that. Thanks
Mark
+1  A: 

There is an Algorithm called SCAN-Algorithm with some precalculations the algorithm can cluster a network in a good speed.

You can find informations about the algorithm here: SCAN: A Structural Clustering Algorithm for Networks

Janusz
+1  A: 

This is more "broad", but see if it helps to get ideas.

JG
Haha..that's a pretty neat tool. I gave it a couple of my sites, and it found my stack overflow account, even though none of them link to it (but SO links to them). I'm sure I could convert FB data to the proper format, but can this API be used to solve this particular problem? I dunno... I have to delve in a bit deeper. Thanks for the link.
Mark