I want to prevent automated HTML scraping of one of our sites while not affecting legitimate spidering (Googlebot, etc.). Is there something that already exists to accomplish this? Am I even using the correct terminology?
EDIT: I'm mainly looking to stop people who would be doing this maliciously, i.e. they aren't going to abide by robots.txt.
EDIT2: What about limiting access by "rate of use", i.e. requiring a CAPTCHA to continue browsing if automation is detected and the traffic isn't coming from a legitimate (Google, Yahoo, MSN, etc.) crawler IP? Something like the sketch below is roughly what I have in mind.
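To make the idea concrete, here is a minimal Python sketch of that kind of check, assuming an in-memory per-IP request log and a forward-confirmed reverse DNS lookup to whitelist search-engine crawlers. The thresholds, helper names, and the exact set of accepted crawler host suffixes are illustrative assumptions, not a finished implementation:

```python
import time
import socket
from collections import defaultdict, deque

WINDOW_SECONDS = 60             # how far back to count requests
MAX_REQUESTS_PER_WINDOW = 120   # hypothetical threshold; tune per site

# Request timestamps per client IP. In-memory only; a real deployment
# would likely use a shared store such as Redis.
_request_log = defaultdict(deque)

def is_legitimate_crawler(ip):
    """Verify a crawler by reverse DNS, then forward-confirm the hostname
    resolves back to the same IP (the method Google and Bing document).
    The accepted suffixes below are an assumption; check each engine's docs."""
    try:
        host, _, _ = socket.gethostbyaddr(ip)
        if not host.endswith((".googlebot.com", ".google.com",
                              ".search.msn.com", ".crawl.yahoo.net")):
            return False
        return ip in socket.gethostbyname_ex(host)[2]
    except (socket.herror, socket.gaierror):
        return False

def should_show_captcha(ip):
    """Return True when this IP's request rate looks automated
    and it isn't a verified search-engine crawler."""
    now = time.time()
    log = _request_log[ip]
    log.append(now)
    # Drop timestamps that have fallen outside the sliding window.
    while log and log[0] < now - WINDOW_SECONDS:
        log.popleft()
    if len(log) <= MAX_REQUESTS_PER_WINDOW:
        return False
    return not is_legitimate_crawler(ip)
```

The calling site (a middleware or filter in front of the page handlers) would redirect to a CAPTCHA page whenever `should_show_captcha(request_ip)` returns True, and let verified crawlers and low-rate visitors through untouched.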