Some pages on my website are rendered differently depending on where the user has been, which I track using PHP sessions.

for example with breadcrumbs:

standard crumb setup:

All Books -> fiction -> Lord Of the Flies

If the visitor has just been on the 'William Golding' page, a session value will have been set to record that this visitor is browsing by author, so I would check:

if ( $_SESSION['browsing by'] == 'author' ):

and the breadcrumbs (for the exact same page as before) would now be:

Authors -> William Golding -> Lord Of the Flies

to summarise:

So 1 page exists for each book, but depending where the user has come from, the page will show different breadcrumbs.
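
A minimal sketch of the setup described above, assuming a hypothetical `breadcrumbsFor()` helper and made-up array keys (`title`, `author`, `category`); only the `'browsing by'` session key comes from the actual site:

```php
<?php
// Hypothetical helper: builds the breadcrumb trail for a single book page.
// $browsingBy would normally be read from $_SESSION['browsing by'].
function breadcrumbsFor(string $browsingBy, array $book): array
{
    if ($browsingBy === 'author') {
        return ['Authors', $book['author'], $book['title']];
    }
    // Default trail when no browsing context has been recorded.
    return ['All Books', $book['category'], $book['title']];
}

// Illustrative data; the array keys are assumptions.
$book = [
    'title'    => 'Lord Of the Flies',
    'author'   => 'William Golding',
    'category' => 'fiction',
];

// On a real page: session_start(); $mode = $_SESSION['browsing by'] ?? 'category';
echo implode(' -> ', breadcrumbsFor('author', $book)), "\n";
echo implode(' -> ', breadcrumbsFor('category', $book)), "\n";
```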

the questions:

  1. Can search engines create my 'browsing by' SESSION?
  2. Would they index the same page multiple times (for each variation)?
A: 

Not really. The search engine may see the different breadcrumbs, but it will only index one version of the page.

Search engines crawl by following every link they find to URLs they haven't crawled yet and reading the content at each URL. A crawler may land on the page again and notice the different content, but it will treat that as a change to the page rather than as something specific to the path it took.

The best way to achieve what you'd like is to put the browsing context in the URL. That way, the crawler will treat each URL as an entirely different page and will crawl it again with the different content.

Eg.

  1. User goes to books page, (http://www.mysite.com/books)
  2. User clicks Lord of the Flies link, (http://www.mysite.com/books/LordOfTheFlies)

vs.

  1. User goes to authors page, (http://www.mysite.com/authors)
  2. User clicks William Golding link, (http://www.mysite.com/authors/WilliamGolding)
  3. User clicks Lord of the Flies link, (http://www.mysite.com/authors/WilliamGolding/LordOfTheFlies)
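
With URLs like these, the page can work out the browsing context from the request path instead of the session. A rough sketch, assuming a hypothetical `trailFromPath()` helper and an illustrative label map (neither is part of your site):

```php
<?php
// Hypothetical helper: derive the breadcrumb trail from the request path,
// e.g. "/authors/WilliamGolding/LordOfTheFlies".
function trailFromPath(string $path): array
{
    // Split on "/" and drop the empty segments left by leading/trailing slashes.
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));

    // Assumed mapping from the first path segment to a display label.
    $labels = ['books' => 'All Books', 'authors' => 'Authors'];

    $trail = [];
    foreach ($segments as $i => $segment) {
        // The first segment is the section; later segments are shown as-is here.
        $trail[] = ($i === 0 && isset($labels[$segment])) ? $labels[$segment] : $segment;
    }
    return $trail;
}

// On a real page the path could come from:
// parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH)
print_r(trailFromPath('/authors/WilliamGolding/LordOfTheFlies'));
```

A real implementation would look up the proper display names (for example "William Golding") rather than echoing the URL slugs.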

You can do this with mod_rewrite, which lets visitors and crawlers use the clean URL:

http://www.mysite.com/authors/WilliamGolding/LordOfTheFlies

while the server internally rewrites it to:

http://www.mysite.com/?authors=WilliamGolding&book=LordOfTheFlies
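
A minimal `.htaccess` sketch of such a rule, assuming mod_rewrite is enabled and the file sits in the site root. It maps the clean `/authors/...` path onto the query-string form; the pattern is an illustration and would need adjusting to your real URL scheme:

```apache
RewriteEngine On

# /authors/WilliamGolding/LordOfTheFlies
#   is served internally as /?authors=WilliamGolding&book=LordOfTheFlies
RewriteRule ^authors/([^/]+)/([^/]+)/?$ /?authors=$1&book=$2 [L,QSA]
```

`QSA` keeps any extra query parameters from the original request, and `L` stops further rule processing once this rule matches.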

AlishahNovin
Yes, the URL will be identical. Are sessions the way most sites store shopping cart info?
Haroldo
Yes and no. Amazon writes your shopping cart to a database. That's why, if you're logged in on multiple computers, you'll always have the same items in your shopping cart. Some sites may use sessions, because it's an easy implementation, but that's more reliable for smaller merchants. Larger merchants have a harder time relying on sessions, because session data is written on the server. If you have multiple servers hosting your site, you'll have to rely on "sticky sessions", which can be a pain. Also, if your site has lots of traffic, you'll have lots of data written to your server.
AlishahNovin
Others may just use cookies, which is a lighter load on the server, because it gets written to the user's computer and not to the server.
AlishahNovin
A: 

To make sure everything is indexed, create a sitemap with links to all pages.
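
As a rough sketch, a sitemap in the sitemaps.org XML format could be generated like this. The `buildSitemap()` helper is hypothetical, and the URLs are just the examples from this thread:

```php
<?php
// Hypothetical helper: build a minimal XML sitemap from a list of absolute URLs.
function buildSitemap(array $urls): string
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($urls as $url) {
        // Escape &, <, > and quotes so the URL is valid inside XML.
        $loc  = htmlspecialchars($url, ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $xml .= '  <url><loc>' . $loc . "</loc></url>\n";
    }
    return $xml . "</urlset>\n";
}

echo buildSitemap([
    'http://www.mysite.com/books/LordOfTheFlies',
    'http://www.mysite.com/authors/WilliamGolding',
]);
```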

I would detect spiders via $_SERVER['HTTP_USER_AGENT'] and treat them differently than regular users, always showing default navigation.
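
Something along these lines, as a sketch only: `isKnownCrawler()` is a made-up helper, and the token list is illustrative rather than exhaustive.

```php
<?php
// Hypothetical helper: true when the User-Agent contains a known crawler token.
// The token list is an assumption and far from complete.
function isKnownCrawler(string $userAgent): bool
{
    $tokens = ['googlebot', 'bingbot', 'slurp', 'duckduckbot'];
    foreach ($tokens as $token) {
        // Case-insensitive substring match.
        if (stripos($userAgent, $token) !== false) {
            return true;
        }
    }
    return false;
}

$ua = $_SERVER['HTTP_USER_AGENT'] ?? '';

// Crawlers always get the default navigation; regular users get the session value.
$browsingBy = isKnownCrawler($ua)
    ? 'category'
    : ($_SESSION['browsing by'] ?? 'category');
```

Keep in mind that the User-Agent header is supplied by the client, so this only identifies crawlers that announce themselves.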

serg
Is $_SERVER['HTTP_USER_AGENT'] a failsafe way of detecting crawlers?
Haroldo
Well, it is a safe way of detecting "good" crawlers like Google and Yahoo, which are not trying to hide themselves. I wouldn't worry much about the others.
serg
A: 

I'm writing this as another answer, because you may prefer it as an alternative solution:

Search engines read the page's markup, so they generally don't distinguish between visible and invisible content. One thing you could consider is always outputting the most verbose breadcrumb list, like:

All Books -> Authors -> William Golding -> Fiction -> Lord Of the Flies

This would then show everything about the book and allow your page to be crawled better. Then you use your visual styles to hide the links that are irrelevant to the current visitor, so in one case you have:

All Books -> Fiction -> Lord Of the Flies

and in the other, you have:

Authors -> William Golding -> Lord Of the Flies

In this way, the links are always there, but users will only see what's relevant while the search engine will see all breadcrumbs, so it will scan and crawl the page properly.
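
A rough sketch of how that could be rendered, assuming a hypothetical `renderCrumbs()` helper, a made-up `modes` key on each crumb, and a stylesheet rule such as `.crumb-hidden { display: none; }`:

```php
<?php
// Hypothetical helper: render every crumb, marking the ones that are not
// relevant to the current browsing mode with a CSS class that hides them.
function renderCrumbs(array $crumbs, string $mode): string
{
    $items = [];
    foreach ($crumbs as $crumb) {
        // A crumb is visible if it applies to all modes or to the current one.
        $visible = in_array('all', $crumb['modes'], true)
            || in_array($mode, $crumb['modes'], true);
        $class   = $visible ? 'crumb' : 'crumb crumb-hidden';
        $items[] = '<li class="' . $class . '">' . htmlspecialchars($crumb['label']) . '</li>';
    }
    return '<ol class="breadcrumbs">' . implode('', $items) . '</ol>';
}

// The full, most verbose trail; 'modes' says when each crumb is shown.
$crumbs = [
    ['label' => 'All Books',         'modes' => ['category']],
    ['label' => 'Authors',           'modes' => ['author']],
    ['label' => 'William Golding',   'modes' => ['author']],
    ['label' => 'Fiction',           'modes' => ['category']],
    ['label' => 'Lord Of the Flies', 'modes' => ['all']],
];

echo renderCrumbs($crumbs, 'author'), "\n";
```

The markup is identical for every visitor except for the class names, so only the styling changes with the session.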

AlishahNovin