views: 418

answers: 4

Is there a way to crawl all Facebook fan pages and collect some information? For example, crawling fan pages and saving their names, how many fans they have, etc.? Or at least, do you have a hint as to how this could be done?

+1  A: 

Make a web crawler

Ben
How do I make it crawl only Facebook fan pages?
Start by choosing a language; Python is a good choice for crawlers. Then spot a pattern in the fan-page URLs so your crawler can pick them up recursively and access them. Then use regular expressions to extract the data you need. Sounds hard, but it's not.
Ben
The problem is that there is no pattern in the fan-page URLs; you have to discover them by looking at the content itself, which is why I asked the question.
Sure there is; there always is. Maybe not in the URL itself, but in the markup that contains the URLs: the id of the a tag, its class, etc. Remember that you'll be getting HTML, and that's what you'll be parsing.
Ben
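The regex-over-HTML approach Ben describes can be sketched in a few lines of Python. The sample HTML and the `/pages/<name>/<id>` URL pattern below are assumptions for illustration; in practice you would fetch the real page with `urllib` and check what the markup actually looks like.

```python
import re

# Hypothetical snippet of fetched HTML; a real crawler would download
# this with urllib.request.urlopen() first.
html = '''
<a class="result" href="http://www.facebook.com/pages/SomeBand/123456">SomeBand</a>
<a href="http://www.facebook.com/help.php">Help</a>
<a class="result" href="http://www.facebook.com/pages/AnotherBand/789012">AnotherBand</a>
'''

# Fan-page links share a /pages/... path, so a regular expression can
# pick them out of the HTML while skipping unrelated links.
page_links = re.findall(r'href="(http://www\.facebook\.com/pages/[^"]+)"', html)
print(page_links)
```

Each extracted link can then be fetched and parsed the same way, which is the recursive step Ben mentions.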
+1  A: 

Write a crawler.

Stephen
+1  A: 

First, select a page that lists your desired category of pages:

For example: http://www.facebook.com/pages/ or http://www.facebook.com/pages/?browse&ps=93

Then use a crawler to collect all the page links.

Now you can parse each page separately using the extracted links.

You can use Simple HTML DOM for the crawling.
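If you prefer Python's standard library over an HTML DOM library, the link-collection step can be done with `html.parser`. The directory snippet below is a made-up stand-in; in practice you would first download the category listing linked above and feed its HTML to the parser.

```python
from html.parser import HTMLParser

class PageLinkCollector(HTMLParser):
    """Collects hrefs that look like fan-page links."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == 'a':
            href = dict(attrs).get('href', '')
            if '/pages/' in href:
                self.links.append(href)

# Hypothetical fragment of the directory listing page.
directory_html = '''
<a href="/pages/Music/SomeBand/123">SomeBand</a>
<a href="/browse.php">Browse</a>
<a href="/pages/Film/SomeMovie/456">SomeMovie</a>
'''

collector = PageLinkCollector()
collector.feed(directory_html)
print(collector.links)
```

Each collected link is then fetched and parsed separately, as the answer describes.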

NAVEED
A: 

RE: Stephen:

Could you please explain your answer in a little more detail? I'm not the most fluent programmer, but I'd like to be able to export the links to the profiles of all of a band's fans into a CSV file. Could you explain how to locate the "fans" div and view its source? And then, how do you parse out the fans, and parse out the link to the next page?
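For what it's worth, the steps asked about here (pull the fan links out of the "fans" container, write them to CSV, and find the next-page link) could look roughly like the Python sketch below. The class names and URL shapes are invented for illustration; the real markup has to be checked by viewing the page source.

```python
import csv
import io
import re

# Hypothetical markup for the fans box and its pagination link; the
# actual class names and URL parameters on the live page will differ.
fans_html = '''
<div class="fans">
  <a href="http://www.facebook.com/profile.php?id=111">Alice</a>
  <a href="http://www.facebook.com/profile.php?id=222">Bob</a>
</div>
<a class="next" href="/pages/SomeBand/123?fans&start=20">Next</a>
'''

# Extract (profile_url, name) pairs from the fans container.
fans = re.findall(
    r'<a href="(http://www\.facebook\.com/profile\.php\?id=\d+)">([^<]+)</a>',
    fans_html,
)

# Write the fans to CSV (an in-memory buffer here; use open("fans.csv", "w")
# for a real file).
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(['name', 'profile_url'])
for url, name in fans:
    writer.writerow([name, url])

# The next-page link tells the crawler where to continue.
next_match = re.search(r'class="next" href="([^"]+)"', fans_html)
print(buf.getvalue())
print(next_match.group(1))
```

The crawler would then fetch the next-page URL and repeat until no such link is found.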