Are all these types of sites just illegally scraping Google or another search engine?
As far as I can tell, there is no 'legal' way to get this data for a commercial site. The Yahoo! API ( http://developer.yahoo.com/search/siteexplorer/V1/inlinkData.html ) is only for noncommercial use, Yahoo! BOSS does not allow automated queries, etc.
Any ideas?

+1  A: 

For example, if you wanted to find all the links to Google's homepage, you would search Google for

link:http://www.google.com

So if you want to find all the inbound links, you can simply traverse your website's tree and build a URL for each page you find. Then query Google for:

link:URL

And you'll get a collection of all the links Google knows about from other websites into your website.
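
A rough sketch of the "build a URL, then query for link:URL" step might look like the following. It only constructs the query URLs and does not send any requests. The search endpoint, the function names and the example pages are my own assumptions for illustration, not something from a particular tool.

```python
from urllib.parse import urlencode

# Assumed search endpoint; adjust it if you query a different engine.
SEARCH_ENDPOINT = "https://www.google.com/search"


def build_link_query(page_url):
    """Return a search URL for the query 'link:<page_url>'."""
    return SEARCH_ENDPOINT + "?" + urlencode({"q": "link:" + page_url})


def build_link_queries(page_urls):
    """Build one link: query per page, skipping duplicate pages."""
    seen = set()
    queries = []
    for url in page_urls:
        if url not in seen:
            seen.add(url)
            queries.append(build_link_query(url))
    return queries


# Hypothetical pages, standing in for the result of walking your own site.
pages = [
    "http://www.example.com/",
    "http://www.example.com/about.html",
]
for query in build_link_queries(pages):
    print(query)
```

Each printed URL corresponds to one link:URL search. Whether you are allowed to fetch those URLs automatically is exactly the terms-of-use question raised above.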

As for the legality of such harvesting, I'm sure it's not-exactly-legal to make a profit from it, but that's never stopped anyone before, has it?

(So I wouldn't bother wondering whether they did it or not. Just assume they do.)

scraimer