What are the implications (SEO-wise) of having the same resource at many different URLs? I've seen some websites that practically never show a 404 page. Any wrong URL path will simply render the homepage. Other sites, for example, redirect http://example.com/path/ to http://example.com/path - (no trailing slash) or vice versa in order to avoid duplicate URLs.

Is this a good practice and why (not)?

+1  A: 

The largest implication of having the same resource at many different URLs is that your search ranking (notably on Google; I'm not sure how SEO works for other search engines) will be diluted/fragmented. Instead of the resource ranking higher as a single result, the multiple URLs will each rank lower, even though they all point to the same resource.

It's generally good practice to normalize your URLs for SEO. The issue most website administrators have with supporting normalized URLs is that it sometimes requires drastic changes to their URL structure, and this isn't always possible. To avoid having to change the URLs directly, you can put a link tag with rel="canonical" in the page head, pointing at the preferred URL. This is supported by Google's web crawler:

http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html

It's a step in the right direction. For more information on normalized URLs, the wiki article is helpful:

http://en.wikipedia.org/wiki/URL_normalization
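To make the idea of normalization concrete, here is a minimal sketch in Python. The function name normalize_url and the specific rules (lowercase scheme and host, drop the default port, drop the fragment, strip trailing slashes) are my own assumptions for illustration, not a standard. In particular, stripping the trailing slash is just one possible convention; the point is to pick one and apply it consistently.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports that can be dropped without changing the resource.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Map equivalent spellings of a URL to one form (hypothetical helper).

    Assumed site policy: no trailing slash except on the root path.
    Any userinfo (user:password@) is dropped for simplicity.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()

    # Keep the port only when it is not the default for the scheme.
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"

    # Strip trailing slashes, but never reduce the path below "/".
    path = parts.path.rstrip("/") or "/"

    # The fragment is never sent to the server, so it is discarded.
    return urlunsplit((scheme, netloc, path, parts.query, ""))
```

With these rules, http://example.com/path/ and HTTP://Example.com:80/path both come out as http://example.com/path, so a server could issue a permanent redirect whenever the requested URL differs from its normalized form.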

As for trailing slashes, I'm not sure whether web crawlers count these variations as distinct URLs. If, in your example, http://example.com/path/ is a directory, then it should have a trailing slash. If path is the name of a file, the trailing slash should be omitted. In IIS at least, when a trailing slash is omitted, the server hunts for a file first, and if no file is found, checks to see if a directory by that name exists. If the directory exists, it redirects internally by adding a trailing slash. This amounts to extra work on the web server's end that you can avoid by generating the internal links on your pages with the correct form in the first place.
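The lookup order described above can be sketched roughly like this in Python. This is only an illustration of the file-first, then-directory logic, not IIS's actual implementation; the function name and the files/directories sets are hypothetical stand-ins for a real filesystem check.

```python
from typing import Optional


def trailing_slash_redirect(
    path: str, files: set[str], directories: set[str]
) -> Optional[str]:
    """Return the path to redirect to, or None if no redirect applies.

    `files` and `directories` are hypothetical sets of known paths,
    stored without a trailing slash.
    """
    if path.endswith("/"):
        return None  # already in directory form, nothing to add
    if path in files:
        return None  # a file matches first, so it is served as-is
    if path in directories:
        return path + "/"  # directory found: add the trailing slash
    return None  # neither exists; the caller would answer 404
```

A request for /docs with "/docs" in directories comes back as /docs/, while a request for /docs/ needs no lookup at all, which is the extra work that correctly formed internal links avoid.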

Robert Hui
A: 

"Demystifying the 'duplicate content penalty'" is a pretty nice article on various duplicate content issues. Google's Duplicate Content help page seems to be kept up to date on the best ways to handle it from a technical perspective.

gk5885