I have a User class in my web app that represents the user currently logged in. Every time a user visits a page, a User instance is populated based on authentication data supplied in cookies.

A User instance is created even when the visitor is anonymous, and a corresponding new record is created in the User table in the database.

This approach allows me to save some state info for the current user regardless of their type.

The problem with this approach, however, is Googlebot and other non-human web organisms crawling my pages. Every time a bot starts to walk around the site, thousands of useless records are created in the database, each of them used for only a single page.

Question: what is the best trade-off? How can I support anonymous users and save their state without incurring too much overhead from cookieless bots?

A: 

I think the best strategy here is to manually add an "exception" for bots. You could do either of two things:

A. Do not create a User object for bots (this is the best thing to do if your application's normal flow allows it).
B. Create a single User object for bots and reuse it every time a bot tries to load a page.
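
As a rough illustration of options A and B, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `User` and `UserStore` classes stand in for the asker's User class and User table, and `user_for_request` is a hypothetical helper that takes an already-computed `is_bot` flag.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class User:
    # Hypothetical stand-in for the application's User class.
    user_id: int
    is_bot: bool = False
    state: Dict[str, str] = field(default_factory=dict)


class UserStore:
    """In-memory stand-in for the User table (an assumption for this sketch)."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.rows: Dict[int, User] = {}

    def create(self, is_bot: bool = False) -> User:
        user = User(next(self._ids), is_bot)
        self.rows[user.user_id] = user
        return user


def user_for_request(
    store: UserStore,
    is_bot: bool,
    cookie_user_id: Optional[int] = None,
    shared_bot_user: Optional[User] = None,
) -> Optional[User]:
    if is_bot:
        # Option A: shared_bot_user is None, so bots get no User at all.
        # Option B: every bot request reuses the same pre-created User.
        return shared_bot_user
    if cookie_user_id in store.rows:
        # Returning visitor: reuse the record named in the cookie.
        return store.rows[cookie_user_id]
    # New anonymous visitor: create a record as before.
    return store.create()
```

With option B, the shared record would be created once (for example at startup) and passed in as `shared_bot_user`, so a crawl of any size adds no rows. Option A only works if the rest of the application tolerates a missing User.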

Juriy
Good, but there is no reliable way to detect a bot :)
Andy
Speaking about Googlebot: it has its own user agent, so you can detect it. Since you only want to serve Googlebot publicly available pages, there's no danger in a malicious user setting the same user agent by hand.
Juriy
A: 

Usually you can check the User-Agent header of the request; it will include things like YahooSlurp or GoogleBot or SomeOtherTypeOfBot.
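
A User-Agent check of this kind might look like the following Python sketch. The `BOT_MARKERS` list and the `looks_like_bot` name are illustrative assumptions, not a complete or authoritative crawler list, and treating a missing header as a bot is a policy choice made only for this example.

```python
from typing import Optional

# Lowercase substrings that commonly appear in crawler User-Agent strings.
# Illustrative only; a real deployment would maintain a longer list.
BOT_MARKERS = ("googlebot", "slurp", "bingbot", "msnbot")


def looks_like_bot(user_agent: Optional[str]) -> bool:
    if not user_agent:
        # Assumption: a request with no User-Agent is treated as automated.
        return True
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)
```

The result can feed whatever branch decides whether to create a User record. It only catches crawlers that identify themselves honestly.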

If you're using .NET, there is a property, Page.Request.Browser.Crawler, that should indicate whether the request comes from a bot. I'm not sure if/how this is represented on other platforms.

Be aware, though, that some crawlers have a tendency to hide the fact that they are crawlers (I've seen MSN do this recently) and just send a User-Agent field that looks like a regular browser's. You'd have to filter those by IP range, but that becomes a game of whack-a-mole, so you may end up just living with those cases.
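
For completeness, IP-range filtering could be sketched in Python as below. The networks listed are reserved documentation ranges used purely as placeholders, not real crawler addresses, and `ip_in_crawler_range` is a hypothetical helper name. Keeping such a list current is exactly the whack-a-mole problem described above.

```python
import ipaddress

# Placeholder networks (reserved documentation ranges), NOT real crawler IPs.
KNOWN_CRAWLER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def ip_in_crawler_range(remote_addr: str) -> bool:
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Unparseable address: do not classify it as a crawler.
        return False
    # Membership is False when the address and network versions differ.
    return any(addr in network for network in KNOWN_CRAWLER_NETWORKS)
```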

Mike Mooney