I have been Googling for some time, but I guess I am using the wrong keywords. Does anyone know the URI for requesting permission from Facebook to crawl their network? The last time I was doing this with Python, someone suggested I look at it, but I can't find that post either.

+1  A: 

Since this is a community behind a login and password, I am not sure how much of it can legally be crawled. Notice that even Google indexes only the user profile pages, not wall posts, photos, etc.

I would suggest posting this question in the Facebook forum. You can also check these resources:

  1. Facebook Developers
  2. Facebook Developers Documentation
  3. Facebook Developers Forum
MovieYoda
+3  A: 

Amazingly enough, that's given in their robots.txt.

The link you're looking for is this one:

http://www.facebook.com/apps/site_scraping_tos.php

If you're not a huge organization already, don't expect to be explicitly whitelisted there. If you're not explicitly whitelisted, you're not allowed to crawl at all, according to the robots.txt and the TOS. You must use the API instead.
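
As a rough illustration of that whitelist pattern, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt content, the `example.com` URLs, the `MyCrawler` user agent, and the `may_fetch` helper are all made up for the example; this is not Facebook's actual file, only the general shape of "named crawlers get their own rules, everyone else is disallowed".

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt (NOT Facebook's real file): one named crawler
# gets its own rules, and every other user agent is disallowed entirely.
SAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /ajax/

User-agent: *
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "MyCrawler/1.0"):
        allowed = may_fetch(SAMPLE_ROBOTS_TXT, agent, "http://www.example.com/profile")
        print(agent, "allowed" if allowed else "disallowed")
```

To check a live site instead of a sample string, `RobotFileParser` also offers `set_url()` and `read()` to download the file first. Note that passing this check only tells you what robots.txt says; it does not grant permission under the TOS.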

Don't even think about pretending to be one of the whitelisted crawlers. Facebook filters by whitelisted IP for each crawler, and anything else that looks at all like crawling gets an instant perma-ban. For a while, even users who simply clicked too fast could occasionally run into this.

Paul McMillan