I came across this piece of code, which is supposed to determine the parent URL in a hierarchy of dynamic (rewritten) URLs. The basic logic goes like this:

"/testing/parent/default.aspx"  --> "/testing/default.aspx"
"/testing/parent.aspx"          --> "/testing/default.aspx"
"/testing/default.aspx"         --> "/default.aspx"
"/default.aspx"                 --> null

...

private string GetParentUrl(string url)
{
    string parentUrl = url;

    if (parentUrl.EndsWith("Default.aspx", StringComparison.OrdinalIgnoreCase))
    {
        parentUrl = parentUrl.Substring(0, parentUrl.Length - 12);

        if (parentUrl.EndsWith("/"))
            parentUrl = parentUrl.Substring(0, parentUrl.Length - 1);
    }

    int i = parentUrl.LastIndexOf("/");

    if (i < 2) return null;

    parentUrl = parentUrl.Substring(0, i + 1);

    return string.Format(CultureInfo.InvariantCulture, "{0}Default.aspx", parentUrl);
}

This code works, but it smells to me, and it will not work with URLs that have a query string. How can I improve it using regex?

+1  A: 

A straightforward approach is to split the URL at "?", work on the path part, and concatenate the query string back on at the end...
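
A minimal sketch of that idea, not a definitive implementation. The class and method names are hypothetical, and it assumes the query string should be carried over unchanged to the parent URL. The existing path logic (for example the GetParentUrl method from the question) is passed in as a delegate:

```csharp
using System;

static class ParentUrlSketch
{
    // Hypothetical helper: strips the query string, runs the supplied
    // path-only logic, then re-appends the query string to the result.
    public static string GetParentUrlKeepingQuery(string url, Func<string, string> getParentPath)
    {
        int q = url.IndexOf('?');

        // Everything before '?' is the path; the rest (including '?') is the query.
        string path = q < 0 ? url : url.Substring(0, q);
        string query = q < 0 ? string.Empty : url.Substring(q);

        string parent = getParentPath(path);

        // Assumption: no parent means no result, with or without a query string.
        return parent == null ? null : parent + query;
    }
}
```

It could be called as ParentUrlSketch.GetParentUrlKeepingQuery(url, GetParentUrl). If the parent page should not receive the child's query string, return only the parent path instead.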

Leon
+4  A: 

Have a look at the answers to SO question "Getting the parent name of a URI/URL from absolute name C#"

This shows how to use System.Uri to access the segments of a URL. System.Uri also lets you manipulate the URL the way you want (well, apart from the custom logic) without the danger of creating invalid URLs. There is no need to hack together your own functions to dissect URLs.
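
As a rough sketch of how that might look for the rules in the question: the class name and the placeholder base address are assumptions, and the input is assumed to be a site-relative URL such as "/testing/parent.aspx". Uri.Segments only works on absolute URIs, so the relative URL is first resolved against a dummy host. The query string is ignored because Segments covers only the path.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class UriParentSketch
{
    // Assumed placeholder host; it is only needed so that Uri.Segments can be used.
    private static readonly Uri PlaceholderBase = new Uri("http://localhost/");

    public static string GetParentUrl(string url)
    {
        // Resolve the site-relative URL against the placeholder base.
        Uri uri = new Uri(PlaceholderBase, new Uri(url, UriKind.Relative));

        // Segments look like "/", "testing/", "parent.aspx"; keep the non-empty names.
        List<string> parts = uri.Segments
            .Select(s => s.Trim('/'))
            .Where(s => s.Length > 0)
            .ToList();

        // A trailing Default.aspx stands for its folder, so drop it.
        if (parts.Count > 0 &&
            parts[parts.Count - 1].Equals("Default.aspx", StringComparison.OrdinalIgnoreCase))
        {
            parts.RemoveAt(parts.Count - 1);
        }

        // Nothing left means the URL was the site root, which has no parent.
        if (parts.Count == 0) return null;

        // Drop the current page or folder to move one level up.
        parts.RemoveAt(parts.Count - 1);

        string folder = parts.Count == 0 ? "/" : "/" + string.Join("/", parts) + "/";
        return folder + "Default.aspx";
    }
}
```

Note that this sketch always emits "Default.aspx" with a capital D, as the original method does, and that an absolute input URL would need different handling.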

f3lix
+1 for not recommending regex! Regex is not the right tool for this job, the standard library is.
Will
Thank you so much. I didn't think of that.
deverop
+1  A: 

I recommend not using regex in this scenario. A regex that solves this task would be a real code smell. The code above isn't so bad; use the recommendations from f3lix and Leon Shmulevich to make it better.

Roman