Hi everybody. I was wondering what the best way is to turn a string (e.g. a post title) into a descriptive URL. The simplest way that comes to mind is using a regex, as in:

public static Regex regex = new Regex(
    "\\W+",
    RegexOptions.IgnoreCase
    | RegexOptions.CultureInvariant
    | RegexOptions.IgnorePatternWhitespace
    | RegexOptions.Compiled
    );

string result = regex.Replace(InputText,"_");

which turns

"my first (yet not so bad) cupcake!! :) .//\."

into

my_first_yet_not_so_bad_cupcake_

Then I can strip the trailing "_" and check the result against my DB to see whether it is already present. In that case I would append a number to make it unique and recheck.

I could use it in, say

http://myblogsite.xom/posts/my_first_yet_not_so_bad_cupcake

But is this approach safe? Should I check other things (like the length of the string)? Is there any other, better method you prefer? Thanks.
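The strip, length-check, and recheck steps described above could be sketched roughly as follows. This is only a sketch under assumptions: `SlugSketch`, `ToSlug`, `MakeUnique`, and the `maxLength` parameter are hypothetical names, and `slugExists` stands in for whatever database lookup is actually used.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlugSketch
{
    private static readonly Regex NonWord = new Regex(@"\W+", RegexOptions.Compiled);

    // Hypothetical helper: replaces runs of non-word characters with "_",
    // trims "_" from both ends, and caps the length.
    public static string ToSlug(string title, int maxLength)
    {
        string slug = NonWord.Replace(title, "_").Trim('_');
        if (slug.Length > maxLength)
        {
            // Cut, then trim again in case the cut landed right after a "_".
            slug = slug.Substring(0, maxLength).TrimEnd('_');
        }
        return slug;
    }

    // Hypothetical helper: 'slugExists' stands in for the database check.
    // Appends _2, _3, ... until a free slug is found.
    public static string MakeUnique(string slug, Func<string, bool> slugExists)
    {
        if (!slugExists(slug))
        {
            return slug;
        }
        for (int n = 2; ; n++)
        {
            string candidate = slug + "_" + n;
            if (!slugExists(candidate))
            {
                return candidate;
            }
        }
    }
}
```

For a quick try-out, a `HashSet<string>` can play the role of the database: `SlugSketch.MakeUnique(SlugSketch.ToSlug(title, 80), taken.Contains)`. Note that a title made only of punctuation yields an empty slug, and that two concurrent requests can pass the check at the same time, so a unique constraint on the slug column is still worth having.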

+1  A: 

string result = regex.Replace(InputText,"-");

Use a hyphen (-) instead of an underscore; that gives an added advantage with the Google search engine.

See the post below for more details:

http://www.mattcutts.com/blog/dashes-vs-underscores/

Kthevar
What do hyphens change instead of underscores for Google?
Joey
Apparently this problem did exist, but was neutralized in 2007. A search for "google url underscore hyphen" yielded this post: http://www.seroundtable.com/archives/014260.html
lc
http://www.mattcutts.com/blog/dashes-vs-underscores/
Kthevar
A: 

You could look into a URL re-writing HTTPModule. There are many examples on the net.

Once it is set up in your web.config, you simply specify a regular expression that maps the SEO-friendly name to the "real" page:

<!-- Rule 1: example... "/admin/Page" is rewritten to "/UI/Forms/Admin/frmPage.aspx" -->

  <add key="^/admin/(.*)" value="/UI/Forms/Admin/frm$1.aspx" />
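To see what such a rule does to a path, here is a small sketch. It assumes the rewriting module applies the pattern and target the way `Regex.Replace` does, which may not hold for every module; `RewriteSketch` and `Apply` are hypothetical names, not part of any rewriting library.

```csharp
using System.Text.RegularExpressions;

public static class RewriteSketch
{
    // Hypothetical helper: applies one rule (pattern -> target) to a request path.
    // "$1" in the target is replaced by the first capture group.
    // When the pattern does not match, the path comes back unchanged.
    public static string Apply(string path, string pattern, string target)
    {
        return Regex.Replace(path, pattern, target);
    }
}
```

With the rule above, `RewriteSketch.Apply("/admin/Page", "^/admin/(.*)", "/UI/Forms/Admin/frm$1.aspx")` gives `/UI/Forms/Admin/frmPage.aspx`.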
Konrad
Well, URL rewriting is just one part of the bigger picture. I still need a "urlified" title to hand to the URL rewriter.
pomarc
+1  A: 

Here's what I do. regStripNonAlpha removes every character that is not a word character (letter, digit, or underscore), whitespace, or "-". Trim() removes leading and trailing spaces (so we don't end up with dashes on either side). regSpaceToDash converts each space (or run of spaces) into a single dash. This has worked well for me.

static Regex regStripNonAlpha = new Regex(@"[^\w\s\-]+", RegexOptions.Compiled);
static Regex regSpaceToDash = new Regex(@"[\s]+", RegexOptions.Compiled);

public static string MakeUrlCompatible(string title)
{
    return regSpaceToDash.Replace(
      regStripNonAlpha.Replace(title, string.Empty).Trim(), "-");
}
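One edge case: a title that already contains dashes surrounded by spaces, such as "Rock - Paper - Scissors!", comes out as "Rock---Paper---Scissors" with the code above. A possible variant, offered only as a sketch (`DashSlugSketch` and its field names are made up for illustration), treats dashes and whitespace as one run and trims dashes from both ends:

```csharp
using System.Text.RegularExpressions;

public static class DashSlugSketch
{
    // Same character filter as above: keep word characters, whitespace, and "-".
    private static readonly Regex StripOther = new Regex(@"[^\w\s\-]+", RegexOptions.Compiled);

    // Any mix of whitespace and dashes counts as a single separator.
    private static readonly Regex RunsToDash = new Regex(@"[\s\-]+", RegexOptions.Compiled);

    public static string MakeUrlCompatible(string title)
    {
        string stripped = StripOther.Replace(title, string.Empty);
        // Collapse separator runs, then drop any dash left at either end.
        return RunsToDash.Replace(stripped, "-").Trim('-');
    }
}
```

With this variant, "Rock - Paper - Scissors!" becomes "Rock-Paper-Scissors".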
Keltex
+1  A: 

Here's a method I wrote not too long ago that takes a string and formats it to a permalink.

        private string FormatPermalink(string title)
        {
            StringBuilder result = new StringBuilder();
            title = title.Trim();
            bool lastOneChanged = false;
            for (int i = 0; i < title.Length; i++)
            {
                char c = title[i];
                if (!char.IsLetterOrDigit(c))
                {
                    c = '_';
                    if (lastOneChanged)
                    {
                        continue;
                    }
                    lastOneChanged = true;
                }

                else
                {
                    lastOneChanged = false;
                }

                result.Append(c);
            }

            if (result.Length > 0 && result[result.Length - 1] == '_') // if the last character is an underscore, remove it (the length check avoids an exception on an empty title)
            {
                result = result.Remove(result.Length - 1, 1);
            }
            return result.ToString();
        }

This takes special characters into account as well: any run of characters that are not letters or digits is collapsed into a single underscore, and the loop moves on to the next character.

BFree
Good, but I'm wondering: when would the result of this differ from (or be better than) the regex solution?
pomarc
A: 

If you want to avoid doing this yourself, an HttpModule like http://urlrewriter.net/ could help. It's pretty good but requires a bit of setting up.

SirDemon
A: 

Personally, I'd couple your special-character removal with a date, so your example would look like:

http://myblogsite.xom/posts/2009/04/03/my_first_yet_not_so_bad_cupcake

That way, if you have content with the same title, it gets differentiated by date too. I see this often on some blogs I visit, where they use "Five Random Things Make A Post" a lot (but not within the same day).
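Building such a path could look roughly like this. It is a sketch only: `PostUrlSketch` and `BuildPath` are hypothetical names, and the "/posts/" prefix is taken from the example URL above.

```csharp
using System;
using System.Globalization;

public static class PostUrlSketch
{
    // Hypothetical helper: builds "/posts/yyyy/MM/dd/slug".
    public static string BuildPath(DateTime published, string slug)
    {
        // The slashes are quoted so they stay literal; an unquoted "/" in a
        // .NET custom date format is the culture's date separator, which is
        // not "/" in every culture.
        string datePart = published.ToString("yyyy'/'MM'/'dd", CultureInfo.InvariantCulture);
        return "/posts/" + datePart + "/" + slug;
    }
}
```

For example, `PostUrlSketch.BuildPath(new DateTime(2009, 4, 3), "my_first_yet_not_so_bad_cupcake")` gives `/posts/2009/04/03/my_first_yet_not_so_bad_cupcake`.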

BonkaBonka
That would be super cool as a navigable URL system, where http://myblogsite.xom/posts/2009/04/03 would give you all the posts of that day, http://myblogsite.xom/posts/2009/04 all the posts of April, and so on.
pomarc
A: 

This article is very good!

led