A: 

Detecting bots, spiders, and crawlers isn't an exact science. I have a PHP array that I have used in the past, checking its entries against the user agent. The array contains most, if not all, of the main spiders and crawlers you would want to let visit your site. Would this be of interest to you, or are you looking for something else?
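For illustration, a check of this kind might look like the sketch below. This is not the array mentioned above: the list is a short hypothetical sample of well-known crawler tokens, and the names `$knownBots` and `isKnownBot` are made up for the example.

```php
<?php
// Hypothetical, deliberately short list of crawler user-agent substrings.
// A real list would be longer and would need periodic updating.
$knownBots = ['Googlebot', 'bingbot', 'Slurp', 'DuckDuckBot', 'Baiduspider', 'YandexBot'];

// Returns true if the user agent contains any of the listed substrings.
function isKnownBot(string $userAgent, array $bots): bool
{
    foreach ($bots as $bot) {
        // stripos() performs a case-insensitive substring search.
        if (stripos($userAgent, $bot) !== false) {
            return true;
        }
    }
    return false;
}

// Fall back to an empty string when the header is missing (e.g. on the CLI).
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
$isBot = isKnownBot($userAgent, $knownBots);
```

Keep in mind that the user agent is supplied by the client and can be spoofed, so this only identifies crawlers that announce themselves honestly.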

Lizard
It would be interesting, I'd be happy if you could share.
Geshan
+4  A: 

Storing session data usually requires cookies on the client. If the bot doesn't accept cookies, the session won't persist between requests. (Unless you enable session.use_trans_sid, which will add a session ID querystring to every URL.)

Try doing the user agent check around the code that performs the redirect back to the age verification page, rather than on the age verification page itself.
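A minimal sketch of that idea, assuming the protected pages check `$_SESSION['legalage']` before redirecting. The helper name `needsAgeCheck`, the crawler pattern, and the `/verify.php` path are all hypothetical.

```php
<?php
// Hypothetical helper: decides whether a request should be redirected
// to the age verification page.
// $session is the session array, $userAgent the raw User-Agent header.
function needsAgeCheck(array $session, string $userAgent): bool
{
    // Assumed crawler pattern, for illustration only.
    $isBot = preg_match('/googlebot|bingbot|slurp/i', $userAgent) === 1;
    // Treat a missing or falsy flag as "not verified".
    $verified = !empty($session['legalage']);
    // Only unverified, non-bot visitors get redirected.
    return !$isBot && !$verified;
}

// Usage on a protected page (commented out so the sketch runs standalone):
// session_start();
// if (needsAgeCheck($_SESSION, $_SERVER['HTTP_USER_AGENT'] ?? '')) {
//     header('Location: /verify.php'); // hypothetical path
//     exit;
// }
```

Putting the decision in one function keeps the bot exemption next to the redirect, so the verification page itself needs no special handling.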

As an aside, don't use eregi -- it's deprecated as of PHP 5.3 and was removed in PHP 7. Use the Perl-compatible regular expression functions (preg_match and friends) instead.
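As a rough sketch of the conversion: an `eregi` call becomes a `preg_match` call with pattern delimiters and the `i` modifier. The pattern and the function name `matchesGooglebot` are assumptions for the example, not taken from the question's code.

```php
<?php
// Old style, no longer available in PHP 7+:
//   eregi('googlebot', $userAgent)
// PCRE equivalent: wrap the pattern in delimiters and add the "i"
// modifier to keep the match case-insensitive.
function matchesGooglebot(string $userAgent): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/googlebot/i', $userAgent) === 1;
}
```

Comparing against `1` explicitly avoids treating an error return of `false` as a plain non-match by accident elsewhere in the code.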

Ben James
+1  A: 

One possibility might be to use a JavaScript redirect instead of a header redirect. Most bots won't be able to process it, and normal people will be redirected. However, there will always be a group of people who have JavaScript disabled, but that group would be small and generally over 18 :)

Sabeen Malik
+1  A: 

Your logic is correct. However, I am not sure whether bots accept and send back session cookies. I suggest that on the deeper pages that require $_SESSION['legalage'] to be true, you also add code to skip this check for bots.

FYI, Google Webmaster Tools just added a handy new feature that shows you the actual content sent by the server when Googlebot accesses it. Go ahead and use it!

Salman A