Hello, I have a home page with some links and e-mail addresses. I need to stop scrapers from harvesting the URLs and e-mail addresses from that page. I have used robots.txt, but most bad crawlers won't respect it.
Well, you can always try obfuscating your URLs with JavaScript or images or something. But please don't do that. You'll just anger people with old browsers and blind people who use screen readers. Just use a spam filter to deal with whatever spam reaches your e-mail address.
If you have a content-heavy site and you want to stop people from scraping your content, you might try limiting visitors to ten hits every ten seconds. That'll be enough for most visitors, but it'll significantly decrease the speed of content scrapers. You can tweak this algorithm as you go, and ban the IPs of serious offenders.
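To make the "ten hits every ten seconds" idea concrete, here is a minimal sketch of a per-IP sliding-window limiter. The class name `SlidingWindowLimiter`, its parameters, and the in-memory storage are assumptions for illustration, not something from the answer above. A real site would probably hook this into its web server or framework and keep the counters in shared storage.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_hits requests per IP within window_seconds.

    Hypothetical helper: names and defaults are illustrative only.
    """

    def __init__(self, max_hits=10, window_seconds=10.0, clock=time.monotonic):
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self.hits = defaultdict(deque)  # ip -> timestamps of recent hits

    def allow(self, ip):
        """Return True if this request is within the limit, else False."""
        now = self.clock()
        recent = self.hits[ip]
        # Forget hits that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_hits:
            return False  # over the limit: reject, delay, or log the IP
        recent.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter()
    results = [limiter.allow("203.0.113.7") for _ in range(12)]
    print(results.count(True), "allowed,", results.count(False), "rejected")
```

Rejected requests could get an HTTP 429 response, and an IP that keeps hitting the limit is a candidate for the ban list mentioned above.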
You could encode some links, e.g. write foo@bar.com as character entities in the page source instead of as plain text.
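One way to read the encoding suggestion is as HTML numeric character entities: the browser still shows the normal address, but the page source no longer contains it literally. The sketch below assumes that reading. The function names `encode_entities` and `mailto_link` are made up for illustration. This only stops naive scrapers that search the raw HTML, since any scraper that decodes entities will still see the address.

```python
import html


def encode_entities(text):
    """Replace every character with its decimal HTML entity, e.g. '@' -> '&#64;'."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def mailto_link(address):
    """Build a mailto link whose href and label are both entity-encoded."""
    href = encode_entities("mailto:" + address)
    label = encode_entities(address)
    return f'<a href="{href}">{label}</a>'


if __name__ == "__main__":
    encoded = encode_entities("foo@bar.com")
    print(encoded)
    # Decoding the entities gives back the original address.
    print(html.unescape(encoded))
    print(mailto_link("foo@bar.com"))
```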
This has already been asked. http://stackoverflow.com/questions/396817/protection-from-screen-scraping might help you.