I know that user agents are one indicator, but that's easy to spoof. What other reliable indicators are there that a visitor is really a bot? Inconsistent headers? Whether images/javascript are requested? Thanks!
"Whether images/javascript are requested?" I would go for this one, however Google and others request images and javascript files nowadays.
How about request speed? Bots page through your content a lot faster than humans do.
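A rough sketch of that check in Python; the half-second threshold and the in-memory tracker are assumptions you'd tune for your own site:

```python
import time
from collections import defaultdict, deque

# Keep the last 10 request timestamps per IP (hypothetical in-memory store).
recent_hits = defaultdict(lambda: deque(maxlen=10))

MIN_HUMAN_INTERVAL = 0.5  # seconds between page views; assumption

def looks_like_bot(ip):
    """Flag an IP whose average gap between page requests is faster
    than a plausible human reading speed."""
    hits = recent_hits[ip]
    hits.append(time.time())
    if len(hits) < 5:
        return False  # not enough samples yet
    avg_gap = (hits[-1] - hits[0]) / (len(hits) - 1)
    return avg_gap < MIN_HUMAN_INTERVAL
```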
CVSTrac uses a honeypot page to accomplish this: a page linked somewhere on the site that crawlers will follow but humans generally never click. CVSTrac goes one step further by letting a visitor who lands there prove they are human.
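The idea fits in a few lines of any web framework. Here's a minimal sketch in Python with Flask (not CVSTrac's actual implementation; the trap URL and the in-memory flag set are hypothetical):

```python
from flask import Flask, request

app = Flask(__name__)
flagged_ips = set()  # hypothetical store; persist this in practice

# Link "/trap-page" somewhere humans won't click (e.g. an invisible
# anchor); crawlers that follow every link will land on it.
@app.route("/trap-page")
def honeypot():
    flagged_ips.add(request.remote_addr)
    return "Nothing to see here."

@app.before_request
def challenge_flagged():
    # Flagged visitors get a challenge instead of content; like CVSTrac,
    # you could let them prove they're human here and unflag them.
    if request.remote_addr in flagged_ips and request.path != "/trap-page":
        return "Please confirm you are human.", 403
```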
There are four things we look for (a rough scoring sketch follows the list):
The user-agent string. It is very easy to fake, but crawlers will often use their own distinctive user-agent string anyway.
The speed of page access. If a client requests more than one page every half second or so, that's usually a good indication.
Whether they request just the HTML or the entire page. Some crawlers only ask for the HTML structure and skip images, stylesheets, and scripts. This is usually a good tip-off.
The incoming URL. Bot traffic often arrives with no referrer at all, or from a URL that doesn't match your site's link structure.
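A rough scoring sketch combining the four signals above; every name, weight, and threshold here is an assumption to illustrate the idea, not a tested ruleset:

```python
# Hypothetical combination of the four signals; tune against real traffic.
KNOWN_BOT_AGENTS = ("googlebot", "bingbot", "crawler", "spider")

def bot_score(user_agent, seconds_since_last_hit, fetched_assets, referrer):
    """Return a suspicion score; e.g. treat >= 3 as likely bot."""
    score = 0
    if any(name in user_agent.lower() for name in KNOWN_BOT_AGENTS):
        score += 2  # self-identified crawler
    if seconds_since_last_hit < 0.5:
        score += 1  # paging faster than a human reads
    if not fetched_assets:
        score += 1  # requested HTML only, no images/CSS/JS
    if not referrer:
        score += 1  # no incoming URL
    return score
```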
Take a look at Bad Behavior, a library that employs a wide array of bot-detection techniques.
A reverse captcha of sorts can help as well; you could create a text input field with display: none; in its style attribute (or your stylesheet). If it's posted to, chances are you're dealing with a bot.
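A minimal sketch of that idea in Python with Flask; the decoy field name "website" is an arbitrary choice:

```python
from flask import Flask, request

app = Flask(__name__)

# The "website" field is invisible to humans; form-filling bots that
# populate every input will reveal themselves by filling it in.
FORM = """
<form method="post" action="/submit">
  <input name="email" type="text">
  <input name="website" type="text" style="display: none;">
  <button type="submit">Send</button>
</form>
"""

@app.route("/contact")
def contact():
    return FORM

@app.route("/submit", methods=["POST"])
def submit():
    # A human never sees the decoy field, so any value means a bot.
    if request.form.get("website"):
        return "Rejected.", 400
    return "Thanks!"
```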
Edit: This was actually something I'd seen aggregated in my RSS reader; if I can find the source, I'll link a good example.