Facebook's developer principles and policies and the general terms of use seem to forbid automated data collection, but graph.facebook.com/robots.txt seems to allow it:
User-agent: *
Disallow:
Does anybody know how to make sense of this?
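For what it's worth, Python's standard-library robotparser agrees that this file permits everything (a minimal check; the crawler name is made up):

    import urllib.robotparser

    # Parse the permissive robots.txt quoted above.
    rp = urllib.robotparser.RobotFileParser()
    rp.parse([
        "User-agent: *",
        "Disallow:",
    ])

    # An empty Disallow rule allows every path for every agent.
    print(rp.can_fetch("MyCrawler/1.0", "https://graph.facebook.com/anything"))  # True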
They don't want you to scrape their data, but they want Google to index the site.
Terms of Use trump robots.txt. Just because they have not taken measures to prevent you from doing something does not mean you are allowed to do it.
Yes, they could change robots.txt to prevent crawling of graph.facebook.com. However, that would mean that for every company they want to allow access, they'd have to add an exception to that file, which would effectively disclose their business deals.
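For illustration, such an exception-laden robots.txt might look something like this (the partner bot name is hypothetical); anyone who fetches the file can see exactly who gets special treatment:

    User-agent: SomePartnerBot
    Disallow:

    User-agent: *
    Disallow: /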
On the other hand, they could generate that file on the fly and return a different robots.txt to agents that identify themselves as coming from a company they have a private deal with. Not sure it's worth it, though; sometimes establishing a policy is cheaper and more effective than coming up with a technical solution to enforce it.
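A minimal sketch of what that on-the-fly version could look like, assuming a Flask app; the partner user-agent string is hypothetical:

    from flask import Flask, Response, request

    app = Flask(__name__)

    # Hypothetical user-agent substrings for partners with a private deal.
    PARTNER_AGENTS = ("SomePartnerBot",)

    @app.route("/robots.txt")
    def robots():
        agent = request.headers.get("User-Agent", "")
        if any(p in agent for p in PARTNER_AGENTS):
            # Partners get the permissive file.
            body = "User-agent: *\nDisallow:\n"
        else:
            # Everyone else is told to stay out.
            body = "User-agent: *\nDisallow: /\n"
        return Response(body, mimetype="text/plain")

Of course, any crawler can spoof its User-Agent header, which is exactly why a policy in the Terms of Use does the real work here rather than the technical check.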