How to sample a scale-free graph | ansaurus

Q:

How to sample a scale-free graph

Given a large scale-free graph (a social network graph), what's the best way to sample it so that the sample preserves the important properties of the original?

I have a large graph (Munmun's Twitter dataset, if you know it), but I need a connected sample of that graph with a reasonably large diameter (tl;dr: reasons why on request; a diameter of 10 would be good).

The problem is that any kind of breadth-first search is likely to hit some massively connected nodes. I start such a search, fetching the friends of every node I come across. Inevitably I reach some massively connected nodes and have to fetch all of their friends. This is a problem because I end up with a large number of nodes that are close to each other in the graph. To make programmatic analysis feasible, I have to limit the number of nodes (and edges). But the whole point of this exercise is to find shortest paths between nodes, so I'm generally interested in ALL of a node's neighbours. And that's the problem.

One hack around this is to cap the number of neighbours I keep for any one user. For instance, if I come across @barackobama in my breadth-first search, I accept only some small proportion of his friends and ignore the rest. But would this hacked graph be worth a damn, or am I losing too much information for finding shortest paths?
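To make the hack concrete, here is a minimal sketch of a breadth-first sample with a per-node neighbour cap. It assumes the graph fits in a plain dict mapping each user to a list of friends; the names `capped_bfs_sample`, `neighbours`, `max_nodes` and `max_friends` are made up for illustration and are not from any library or from the dataset.

```python
import random
from collections import deque


def capped_bfs_sample(neighbours, seed, max_nodes, max_friends, rng=None):
    """Breadth-first sample keeping at most `max_friends` neighbours per node.

    `neighbours` is assumed to be a dict: node -> iterable of adjacent nodes.
    Returns (sampled_nodes, sampled_edges).
    """
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    sampled = {seed}
    edges = set()
    queue = deque([seed])

    while queue and len(sampled) < max_nodes:
        node = queue.popleft()
        friends = list(neighbours.get(node, ()))

        # The hack: for a hub, keep only a random subset of its friends.
        if len(friends) > max_friends:
            friends = rng.sample(friends, max_friends)

        for friend in friends:
            if friend not in sampled:
                if len(sampled) >= max_nodes:
                    break  # node budget exhausted
                sampled.add(friend)
                queue.append(friend)
            # Both endpoints are in the sample at this point.
            edges.add((node, friend))

    return sampled, edges
```

Every sampled node is reached through a kept edge, so the sample stays connected to the seed. Dropping edges can only lengthen shortest paths, though, so distances measured in the sample are upper bounds on distances in the full graph.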

Hope that makes sense...

A:

I'm not sure I understand your question correctly. I think your main question is how to compute the shortest path between two nodes in a giant, directed graph. Creating a subsample of the graph seems to be your attempt at an efficient solution. (But I've probably misunderstood you completely.)
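For reference, a minimal sketch of the baseline: an unweighted shortest path on a directed graph via plain breadth-first search. It assumes the graph is a dict mapping each node to the nodes it points to; `shortest_path` and `successors` are illustrative names, and this is not necessarily the approach the linked question recommends.

```python
from collections import deque


def shortest_path(successors, source, target):
    """Return one shortest directed path as a list of nodes, or None.

    `successors` is assumed to be a dict: node -> iterable of out-neighbours.
    """
    if source == target:
        return [source]

    parent = {source: None}  # also serves as the visited set
    queue = deque([source])

    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, ()):
            if nxt in parent:
                continue
            parent[nxt] = node
            if nxt == target:
                # Walk the parent links back to the source.
                path = [nxt]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(nxt)

    return None  # target is not reachable from source
```

In the worst case this visits every node and edge reachable from the source, which is exactly what becomes expensive once the search hits massively connected nodes.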

Perhaps this SO question has some pointers for you: Efficiently finding the shortest path in large graphs

The graphs in that question seem to be significantly smaller, though.

bjoernz 2010-10-29 11:31:27

Thanks... the info on that page is useful...

John Conroy 2010-10-30 13:15:43
