I have noticed that many blogs use URLs that look like this:

http://www.hanselman.com/blog/VirtualCamaraderieAPersistentVideoPortalForTheRemoteWorker.aspx

I assume this is done for search engine optimization.

How is this read from the underlying data model? Do you really search for

VirtualCamaraderieAPersistentVideoPortalForTheRemoteWorker

in the database?

If so, how is the description managed? If it is a key, is the rule that it can never be changed once it is created, without breaking web links?

+2  A: 

Typically, when the article is created, that string is stored in the database as a key, yes. Some blog engines, like WordPress, allow you (the author) to manually change that string, and after you do, links to the old string will no longer work.

WordPress calls this the "permalink," although other engines have their own names for it. I don't think there is a universal term for it.
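
To make the "string stored as a key" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `posts` table, its columns, and the `find_post_by_slug` helper are all hypothetical names chosen for illustration; no particular blog engine is known to use this exact schema.

```python
import sqlite3

def create_schema(conn):
    # The UNIQUE constraint makes the URL string usable as a lookup key
    # and gives it an index, so the search is not a full table scan.
    conn.execute(
        "CREATE TABLE posts ("
        " id INTEGER PRIMARY KEY,"
        " slug TEXT NOT NULL UNIQUE,"
        " title TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )

def find_post_by_slug(conn, slug):
    # Returns (id, title, body), or None when no post has this URL string.
    return conn.execute(
        "SELECT id, title, body FROM posts WHERE slug = ?", (slug,)
    ).fetchone()

conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute(
    "INSERT INTO posts (slug, title, body) VALUES (?, ?, ?)",
    (
        "VirtualCamaraderieAPersistentVideoPortalForTheRemoteWorker",
        "Virtual Camaraderie: A Persistent Video Portal for the Remote Worker",
        "Post body goes here.",
    ),
)

post = find_post_by_slug(
    conn, "VirtualCamaraderieAPersistentVideoPortalForTheRemoteWorker"
)
```

If the author later edits the stored string, the old URL simply stops matching any row, which is the link breakage described above.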

OverloadUT
+1 for Permalink
Robert Harvey
A: 

Read up on Dynamic URLs and Apache mod_rewrite.

Also check this mod_rewrite rule generator out.

As an example of what mod_rewrite can do for you as far as descriptive URLs go, the permalink might be thought of as the URL:

 http://www.somesite.com/catalog.php?cat=widgets&product_id=1234

And the rewrite module will create the much more descriptive and simpler:

 http://www.somesite.com/catalog/widgets-1234.html

dynamically as needed. I am not sure if any of these mappings are cached server-side for future use, but I don't imagine it uses a huge amount of overhead to process the rules. Here is the rule that did the above rewrite which is placed in a .htaccess file:

 RewriteEngine On
 RewriteBase /
 RewriteCond %{QUERY_STRING} ^cat\=([^&]+)\&product_id\=([^&]+)$
 RewriteRule ^$ /catalog/%1-%2.html [R=301,

This example was found here.

It does not take much overhead to dynamically generate a descriptive URL on the fly and have it serve the permalink content. I don't think they are worrying about storing or caching the rules in a database at all.

SEO enthusiasts seem to highly recommend creating a Google sitemap.xml to help Google index these statically generated pages. There may be infinitely many of them, or as many as URL length allows; the maximum length is undefined, but URLs longer than about 2000 characters won't work in many browsers. As long as the rules are deterministic, they might as well be permalinks.

Sean A.O. Harney
This is how to do URL rewriting, which is not what his question was about. He's wondering specifically about the database techniques used when dealing with string-based URLs instead of ID-based ones.
OverloadUT
Isn't URL rewriting how just about everyone does descriptive URLs?
Sean A.O. Harney
You would still have to look up the description in the database, wouldn't you? If you already have the database record, why bother with URL rewriting?
Robert Harvey
+1  A: 

There are different strategies for search engine friendly URLs. Given your example URL, you could, for example, search for the whole string, or use the C# hash value as a (probably non-unique) key. Either way, links to this page will break if the title is changed. One solution is to embed an additional unique key in the URL (see amazon.com for examples).
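
The embedded-key approach might look roughly like the sketch below. The `/blog/<id>/<title-text>` layout and the `parse_post_path` helper are assumptions made up for this example, not the scheme used by amazon.com or dasBlog. Only the numeric ID is used for the lookup; the title text is decorative.

```python
import re

# Hypothetical URL layout: /blog/<numeric id>/<optional descriptive text>
_POST_PATH = re.compile(r"/blog/(\d+)(?:/([^/]*))?")

def parse_post_path(path):
    """Return (post_id, title_text), or None if the path is not a post URL."""
    match = _POST_PATH.fullmatch(path)
    if match is None:
        return None
    # The title part may be missing or stale; the ID alone identifies the post.
    return int(match.group(1)), match.group(2) or ""

old_link = parse_post_path("/blog/123/old-title")
new_link = parse_post_path("/blog/123/new-title")
```

Both links resolve to post 123, so renaming the article does not break existing links. The cost is a slightly less clean URL.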

If you're interested in the way dasBlog handles URLs, you can get the full source code at http://www.dasblog.info/ .

Malte Clasen
+1 for the hash idea, hadn't thought of that before. I suppose your titles would have to be longer than 32 bytes for this to benefit, since a SHA256 would be needed to guarantee uniqueness.
Robert Harvey
+3  A: 

You are correct that it is done for search engine optimization. It works best, however, if you separate the individual words with dashes or underscores.

These SE-friendly url portions are often called slugs or url slugs. A slug must be unique in your application, and generally the function that creates or checks them must take this into account.

Just like anything else, there are multiple ways to implement something like this. Generally you store a string of text about a database item, e.g. an article title. You can convert this into the URL slug dynamically at load time if you don't want to store it, or you can save both the real title and the URL slug at insert/update time and use the slug as your database selection criterion when loading the relevant page.
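
As a rough illustration of the slug creation step, here is one possible sketch in Python. The `slugify` and `unique_slug` names, the "untitled" fallback, and the numeric-suffix rule for collisions are all assumptions for this example; real engines have their own rules.

```python
import re
import unicodedata

def slugify(title):
    """Turn a title into a lowercase, dash-separated, ASCII-only slug."""
    # Drop accents and any other non-ASCII characters.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    return slug or "untitled"

def unique_slug(title, existing):
    """Append -2, -3, ... until the slug is not in the `existing` set."""
    base = slugify(title)
    candidate = base
    suffix = 2
    while candidate in existing:
        candidate = f"{base}-{suffix}"
        suffix += 1
    return candidate

slug = slugify("Virtual Camaraderie: A Persistent Video Portal")
```

In a real application the `existing` set would be a database check, ideally backed by a unique constraint so that two concurrent inserts cannot claim the same slug.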

If you want to be super-robust with your app, you could automatically save a slug history, and generate "301 Moved Permanently" headers whenever a slug changed.
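
A slug history could be sketched like this. It is an in-memory stand-in for what would normally be a database table, and the `SlugRegistry` class, its methods, and the `/blog/` prefix are hypothetical names for illustration only.

```python
class SlugRegistry:
    """Remembers every slug a post has ever had."""

    def __init__(self):
        self._current = {}  # post_id -> current slug
        self._owner = {}    # every slug ever used -> post_id

    def set_slug(self, post_id, slug):
        owner = self._owner.get(slug)
        if owner is not None and owner != post_id:
            raise ValueError(f"slug {slug!r} already belongs to post {owner}")
        self._owner[slug] = post_id
        self._current[post_id] = slug

    def resolve(self, slug):
        """Return (http_status, post_id, redirect_location)."""
        post_id = self._owner.get(slug)
        if post_id is None:
            return 404, None, None
        current = self._current[post_id]
        if slug != current:
            # Old slug: answer with 301 Moved Permanently to the new URL.
            return 301, post_id, "/blog/" + current
        return 200, post_id, None

registry = SlugRegistry()
registry.set_slug(1, "old-title")
registry.set_slug(1, "new-title")  # the author renamed the post
```

With this in place, a request for the old slug gets a 301 pointing at the current one instead of a 404, so old links keep working after a rename.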

zombat
I'm afraid that you are incorrect, and these URLs are worthless to Google now. More information here: http://bit.ly/xWc1p
Dan Atkinson
Robert Harvey
@Dan - What am I incorrect about? Assuming everything in that Google blog post is correct, that doesn't make anything in my answer erroneous. You should reconsider your downvote.
zombat
@zombat, you are correct! :)
Dan Atkinson
Appreciate the reconsideration, thanks. :)
zombat
+1  A: 

There's no reason why VirtualCamaraderieAPersistentVideoPortalForTheRemoteWorker couldn't be a key in the 'posts' table, given that even a really big blog won't have more than, say, a couple of thousand rows in the posts table.

If you did decide to rewrite it, then you could create a 301 redirect for that url without a lot of damage SEO-wise.

But, as I discussed in the comments to your question, the bearing of static urls like this on SEO is no longer relevant. The real benefit is for the user to have a structure that's visually easier to navigate ('hackable' urls).

Google wouldn't care if the url said:

hanselman.com/blog/index.aspx?id=123

or

hanselman.com/blog/foobar.aspx

The ranking would be the same, regardless.

Dan Atkinson
Why do you keep linking to an article that has nothing to do with descriptive URLs?
Robert Harvey
A static url has a LOT to do with a pretty url! Why go to all the trouble of creating a static url if it wasn't the least bit readable?! I agree with what you're saying though, and zombat is correct about the answer to your question. I've sort of taken away from the focus of what the actual question is. My recommendation would be to accept zombat's answer. :)
Dan Atkinson
Alright, well you get +1 for effort. I did look at the Google Webmaster blog, and Google neither confirms nor denies either point of view. They do say there may be a marginal benefit for descriptive URLs in the Google Toolbar. We will run their Content Analysis tool on our own website, and see what it says. Cheers!
Robert Harvey
Thanks! Also, if you have IIS 7, consider using the IIS Search Engine Optimization Toolkit: http://bit.ly/MajYo
Dan Atkinson
Cool. Didn't know about that.
Robert Harvey