I'm building a site that has products, each of which belongs to one or more categories, which can be nested within parent categories. I'd like to have SEO-friendly URLs, which look like this:

  1. mysite.com/category/
  2. mysite.com/category/product
  3. mysite.com/category/sub-category/
  4. mysite.com/category/sub-category/product

My question is: is it safe to depend on the presence of a trailing slash to differentiate between cases 2 and 3? Can I always assume the user wants a category index when a trailing slash is present, and a specific product's page when it is absent?

I'm not worried about implementing this URI scheme; I've already done as much with PHP and mod_rewrite. I'm simply wondering if anybody knows of any objections to this kind of URL routing. Are there any known issues with browsers stripping or adding trailing slashes in the address bar, or with search engines crawling such a site? Any SEO issues or other stumbling blocks that I'm likely to run into?

+2  A: 

In addition to the pitfalls you mentioned, users might edit the URL themselves (by typing a product or category name) and add or remove the trailing "/".

To solve your problem, why not have a special sub-category "all" and instead of "mysite.com/category/product" have "mysite.com/category/all/product"?
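A minimal sketch of how routing could look with such a placeholder, assuming exactly two category levels so that every product sits at depth three. The function name `classify` and the fixed depth are illustrative assumptions, not part of the original setup:

```python
def classify(path):
    """Classify a request path purely by how many segments it has.

    Assumes (hypothetically) that every product lives under a sub-category,
    with "all" used when there is no real one, so depth alone decides.
    """
    # Drop empty segments so leading/trailing slashes make no difference.
    segments = [s for s in path.split("/") if s]
    kinds = {1: "category", 2: "sub-category", 3: "product"}
    return kinds.get(len(segments), "unknown"), segments


print(classify("/category/all/product"))
print(classify("/category/all/product/"))
print(classify("/category/sub-category"))
```

Because the decision depends only on depth, the trailing slash stops carrying any meaning, and either form of a URL resolves to the same resource.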

DVK
True, hadn't considered that. Users will seldom be typing in these URLs however; the target audience is extremely non tech-savvy. I'm still primarily concerned with SEO and browser functionality. I'm also considering the place-holder sub-category, though I'd planned on using 'products': `mysite.com/category/products/product_name`.
meagar
A: 

Never assume the user will do anything BUT the worst case scenario in anything URL related.

Unless you're prepared to do redirects in your code, assume a URI is equally likely to arrive with or without a trailing slash. That is the only way to make sure your code is robust, so that you won't have to worry about this kind of issue.
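As a rough sketch of the redirect approach, the lookup below ignores the trailing slash when resolving a path and then redirects to one canonical form (slash for categories, no slash for products). The `CATEGORIES` and `PRODUCTS` sets and the `resolve` function are hypothetical stand-ins for whatever database lookup the site really uses, and the sketch assumes a product never shares a name with a sibling category:

```python
# Hypothetical stand-ins for a database lookup.
CATEGORIES = {"category", "category/sub-category"}
PRODUCTS = {"category/product", "category/sub-category/product"}


def resolve(path):
    """Return (status, canonical_path) for a request path.

    The trailing slash is ignored for lookup; it only decides whether
    the request already uses the canonical form or needs a redirect.
    """
    key = path.strip("/")
    if key in CATEGORIES:
        canonical = "/" + key + "/"   # categories end with a slash
    elif key in PRODUCTS:
        canonical = "/" + key         # products do not
    else:
        return (404, None)
    if path == canonical:
        return (200, canonical)
    return (301, canonical)           # permanent redirect to canonical form


print(resolve("/category/product/"))
print(resolve("/category/sub-category"))
```

With this in place a mistyped or stripped slash costs one redirect instead of producing the wrong page or a 404.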

Jason M
+2  A: 

To me, it seems very unnatural that http://product/ and http://product would represent two entirely different resources. It is confusing, and it makes your URLs less hackable, since it is difficult to tell when a trailing slash should be present or not.

Also, RFC 3986, Uniform Resource Identifier (URI): Generic Syntax, has a note on Protocol-Based Normalization in section 6.2.4 that addresses this particular situation with regard to non-human visitors of your site, such as search engines and web spiders:

Substantial effort to reduce the incidence of false negatives is often cost-effective for web spiders. Therefore, they implement even more aggressive techniques in URI comparison. For example, if they observe that a URI such as

http://example.com/data

redirects to a URI differing only in the trailing slash

http://example.com/data/

they will likely regard the two as equivalent in the future. (...)

bzlm
A: 

One way to differentiate would be to make sure product pages have an extension, but category and sub-category pages do not. That is:

  1. mysite.com/category/
  2. mysite.com/category/product.html
  3. mysite.com/category/sub-category/
  4. mysite.com/category/sub-category/product.html

That makes it unambiguous.
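A short sketch of that check, assuming `.html` is the product suffix; the function name `classify_by_extension` is made up for illustration:

```python
def classify_by_extension(path, product_suffix=".html"):
    """Treat a path as a product only if its last segment has the suffix."""
    # Ignore any trailing slash so it cannot change the outcome.
    last = path.rstrip("/").rsplit("/", 1)[-1]
    if last.endswith(product_suffix) and last != product_suffix:
        return "product"
    return "category"


print(classify_by_extension("/category/product.html"))
print(classify_by_extension("/category/sub-category/"))
```

Here the extension, not the slash, carries the distinction, so a stray or missing trailing slash no longer changes which resource is served.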

Andrew B