To make pretty URLs from article titles I am using a simple function. However, lately I am concerned about the ideal length of these "slugs". It is said that too many dashes are bad.

However, some article titles can be long, and a URL that is too long may not be liked by Google. Of course, that defeats the whole idea of having URL slugs.

So does anyone have any idea how long a URL slug should be? Should there be a limit on the number of dash characters used?

+2  A: 

My guess is, if you throw out all unimportant words from your URL, there won't be much left ... for this question that leaves ideal, length, url and slug.

As far as I understand, Google is very keyword-centric, and words like "what", "is", "the", "of", "an" are not really good keywords.

Anyway, if you only keep the most characteristic and important words, the slug cannot really become too long, because any shorter URL would have to drop important information.

Of course this is just speculation.

greetz
back2dos

back2dos
+1  A: 

I don't think a limit on the number of dash characters will make a difference. You should keep the whole string limited to around 80-100 characters max.

Like back2dos said you can remove some common words but ideally, the slug should make sense as a page title. With this page for example, removing all the common words gives you ideal-length-url-slug which kinda works. But I'd say ideal-length-of-url-slug is better.
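If a slug still exceeds the 80-100 character range suggested above, one option is to cut it at a dash so no word is chopped in half. The following is only a minimal sketch: the function name `truncate_slug` and the default of 80 are assumptions for illustration, not something from the thread.

```python
def truncate_slug(slug: str, max_length: int = 80) -> str:
    """Shorten a dash-separated slug to at most max_length characters,
    cutting at a dash where possible so words stay whole."""
    if len(slug) <= max_length:
        return slug
    # Look one character past the limit so a dash sitting exactly at the
    # boundary still counts as a valid cut point.
    window = slug[:max_length + 1]
    head, _, _ = window.rpartition("-")
    # No usable dash inside the window: fall back to a hard cut.
    return head if head else slug[:max_length]
```

For example, `truncate_slug("ideal-length-of-url-slug", 15)` keeps whole words and returns `ideal-length-of`.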

DisgruntledGoat
+7  A: 

If you really want to be economical with URL space, I'd remove articles (the, a, an, etc.) more aggressively than prepositions and verbs, since removing articles doesn't change the semantics of the sentence that much.

e.g.

What is the ideal length of an URL slug

remove articles

What is ideal length of URL slug

remove "What is"

ideal length of URL slug

normalization

ideal-length-of-url-slug
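
The steps above could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: the names `make_slug`, `ARTICLES` and `LEADING_PHRASES` are made up here, only "what is" is taken from the example, and the normalization only handles ASCII letters and digits (accented characters would need transliteration first).

```python
import re

ARTICLES = {"a", "an", "the"}
# Leading question phrases to drop. Only "what is" comes from the example
# above; anything else added here is your own assumption.
LEADING_PHRASES = (("what", "is"),)


def make_slug(title: str) -> str:
    # Lowercase and keep only runs of ASCII letters and digits.
    words = re.findall(r"[a-z0-9]+", title.lower())
    # Remove articles.
    words = [w for w in words if w not in ARTICLES]
    # Remove one leading question phrase such as "what is".
    for phrase in LEADING_PHRASES:
        if tuple(words[:len(phrase)]) == phrase:
            words = words[len(phrase):]
            break
    # Normalization: join the remaining words with dashes.
    return "-".join(words)
```

Called on the title used above, `make_slug("What is the ideal length of an URL slug")` gives `ideal-length-of-url-slug`.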
Timo Westkämper
You could remove `of` and it would still be understandable.
Chacha102
@chacha102, yes, but IMO it hurts readability much more than removing articles does
Timo Westkämper
What do you do if all these transformations still yield an unwieldy long URL? My feeling is that there really isn't a reasonable way to automate this.
dreeves
+1 - but let's not forget that URLs are for humans first and search engines second. Anything you do that benefits SEO but makes things worse for people is ill-conceived.
Sohnee
+1  A: 

I recommend shortening the slug to the point that the whole URL is at the very most 72 characters long. That's an age-old convention for email to allow a few levels of quoting before reaching the standard 80 character limit. I know modern technology means we don't have to care about line length limits but it's still a reasonable convention for various reasons. See this related discussion: http://stackoverflow.com/questions/110928/80-char-width. There's also the practical concern that your URL may not stay intact and clickable in some email clients if they wrap it.

As for how to keep your URLs to a reasonable length, I think URLs should be chosen manually whenever possible. You've written a whole article, might as well make up a concise URL for it as well. Below is the .htaccess file for my blog in case you find it helpful. Every article has a long URL like

myblog.com/2010/05/30/ideal-length-of-url-slug

(WordPress suggests a default but I typically condense it down a bit by hand.) And then I use a rewrite rule like the ones below to make a short-as-possible version that I can usually remember and mention easily (or use on Twitter, of course). Something like

myblog.com/slugs

Here are the contents of my .htaccess file, in /var/www/html/myblog:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

RewriteRule ^admin/?$       wp-admin  [L]

# Aliases for blog posts: (NB: NEVER CHANGE/DELETE THESE; ONLY ADD NEW ONES!)
RewriteRule ^flu/?$         2009/03/21/the-future-is-yesterday [R,L]
RewriteRule ^oracle/?$      2009/03/25/the-oracle-of-brackets [R,L]
RewriteRule ^perfbrack/?$   2009/03/29/the-perfect-bracket [R,L]
RewriteRule ^nytimes/?$     2009/04/01/anon-sources-at-the-new-york-times [R,L]
RewriteRule ^mktbottom/?$   2009/04/07/finding-the-market-bottom [R,L]
RewriteRule ^landlords/?$   2009/05/24/landlords [R,L]
RewriteRule ^buyrent/?$     2009/06/01/buyrent [R,L]
RewriteRule ^sunk/?$        2009/06/23/sunk [R,L]
RewriteRule ^horse/?$       2009/07/01/horse [R,L]
RewriteRule ^wellmanblog/?$ 2009/07/31/wellmanblog [R,L]
RewriteRule ^centmail/?$    2009/08/15/centmail [R,L]
RewriteRule ^longtail/?$    2009/08/31/anatomy-of-the-long-tail [R,L]
RewriteRule ^scarequotes/?$ 2009/09/30/scarequotes [R,L]
RewriteRule ^scare/?$       2009/09/30/scarequotes [R,L]
RewriteRule ^dst/?$         2009/10/31/dst [R,L]
RewriteRule ^searchpred/?$  2009/11/30/what-can-search-predict [R,L]
RewriteRule ^scrooge/?$     2009/12/31/scrooge [R,L]
RewriteRule ^pmhype/?$      2010/01/14/prediction-without-markets [R,L]
RewriteRule ^predmarkets/?$ 2010/01/14/prediction-without-markets [R,L]
RewriteRule ^calibration/?$ 2010/02/28/calibration [R,L]
RewriteRule ^calib/?$       2010/02/28/calibration [R,L]
RewriteRule ^calresults/?$  2010/03/31/calibration-results [R,L]
RewriteRule ^misleadingmeans/?$ 2010/04/30/misleading-means [R,L]

</IfModule>

With hindsight, though, I would ditch the dates in the URL and do something like what StackOverflow does in an attempt to have the best of both worlds with concise vs descriptive URLs. StackOverflow lets you truncate or even change URLs for questions as much as you like after the question ID part. So all of the following are links to this question:

The last one is still too long for my tastes though, so I would have the canonical URL be

myblog.com/foo

and then allow a slash followed by anything else, like

myblog.com/foo/fooing-and-barring-in-the-modern-world

Here's a rewrite rule for that:

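# Redirect /foo/<anything> to the canonical /foo. A bare /foo is not matched, so it cannot redirect to itself.
RewriteRule ^foo/.*$   foo [R,L]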

Additional rules could be added if you wanted, say, myblog.com/bar to be an alias for myblog.com/foo:

RewriteRule ^bar(/.*)?$   foo [R,L]
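
To make the intended behaviour of the canonical-plus-anything scheme and the alias rule explicit, here is a small sketch of the same decision in Python. It is only an illustration: `CANONICAL`, `ALIASES` and `redirect_target` are hypothetical names, and in practice Apache would do this work.

```python
from typing import Optional

# Hypothetical data: canonical article keys and aliases that point at them.
CANONICAL = {"foo"}
ALIASES = {"bar": "foo"}


def redirect_target(path: str) -> Optional[str]:
    """Return the canonical path to redirect to, or None when the path is
    already canonical or does not belong to a known article."""
    # The first path segment identifies the article; the rest is ignored.
    key, _, _ = path.lstrip("/").partition("/")
    key = ALIASES.get(key, key)
    if key not in CANONICAL:
        return None
    target = "/" + key
    # Never redirect the canonical URL to itself.
    return None if path == target else target
```

So `/foo/fooing-and-barring-in-the-modern-world` and `/bar` both resolve to `/foo`, while `/foo` itself is served without a redirect.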

The downside here is that if you want your article slugs to be in the global namespace of your website -- which I think is preferable to something like myblog.com/articles/foo -- then you have to add a rewrite rule for every article.
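
One way to soften that downside is to generate the rules from a list of aliases instead of typing them by hand. A minimal sketch, assuming the aliases live in a plain dictionary (the function name `rewrite_rules` is made up, and the rule format simply copies the one used in the .htaccess file above):

```python
import re
from typing import Dict, List


def rewrite_rules(aliases: Dict[str, str]) -> List[str]:
    """Build one RewriteRule line per short alias, in alias order."""
    rules = []
    for alias, target in sorted(aliases.items()):
        # Escape the alias so regex metacharacters in it are matched literally.
        rules.append(f"RewriteRule ^{re.escape(alias)}/?$ {target} [R,L]")
    return rules


if __name__ == "__main__":
    example = {
        "flu": "2009/03/21/the-future-is-yesterday",
        "oracle": "2009/03/25/the-oracle-of-brackets",
    }
    print("\n".join(rewrite_rules(example)))
```

The output can be pasted between the `<IfModule mod_rewrite.c>` lines; existing aliases should still never be changed or deleted, only added to.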

dreeves