I need to improve the method below.

The goal is to extract the first folder of a URL, if it exists. The URLs passed in may or may not include a domain, for example: http://www.xxx.com/es/test/test.aspx, http://xxx.com/es/test/ or /us/xxx/xxx.aspx.

public string ExtractURL(string url)
{
    string result = "";
    try
    {
        string[] urlSplitted = url.Split("//".ToCharArray());
        // if we found a /
        if (urlSplitted.Length > 0)
        {
            string strFin = urlSplitted[urlSplitted.GetUpperBound(0) - 1];
            // check that there is something
            if (String.IsNullOrEmpty(strFin))
            {
                result = url;
            }
            else
            {
                // return the url up to /ES or /EN
                result = url.Substring(0,url.ToLower().IndexOf("/" +strFin.ToLower()));
            }
        }
        else
        {
            result = url;
        }
    }
    catch
    {
        result = "";
    }
    return result;
}
A: 

Consider using the System.Uri class to pre-parse the URL and extract the path with the LocalPath property. Then use String.Split().
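
A minimal sketch of that idea, not a definitive implementation. The names UrlHelper, ExtractFirstFolder and PlaceholderBase are hypothetical, and the base address is only a placeholder so that a relative URL can be resolved, since LocalPath is not available on a relative Uri. It also assumes that a segment counts as a folder only when a slash follows it.

```csharp
using System;

public static class UrlHelper
{
    // Hypothetical placeholder base, used only to resolve relative URLs.
    private static readonly Uri PlaceholderBase = new Uri("http://localhost/");

    public static string ExtractFirstFolder(string url)
    {
        Uri parsed;
        if (!Uri.TryCreate(url, UriKind.RelativeOrAbsolute, out parsed))
        {
            return "";
        }
        if (!parsed.IsAbsoluteUri)
        {
            // LocalPath throws on a relative Uri, so make it absolute first.
            parsed = new Uri(PlaceholderBase, parsed);
        }

        // "/es/test/test.aspx" splits into "", "es", "test", "test.aspx".
        string[] parts = parsed.LocalPath.Split('/');

        // Assumption: parts[1] is a folder only if something (even an empty
        // entry, as in "/es/") follows it; "/test.aspx" has no folder.
        return parts.Length > 2 ? parts[1] : "";
    }
}
```

For example, UrlHelper.ExtractFirstFolder("http://xxx.com/es/test/") gives "es", and "/us/xxx/xxx.aspx" gives "us".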

Sergii Volchkov
A: 
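public string ExtractURL(string URL)
{
    string result = "";
    try
    {
        // Strip a leading "http://" if present. TrimStart(char[]) removes any of
        // the given characters rather than the prefix as a whole, so it could
        // also eat the start of the host name; Substring avoids that.
        const string scheme = "http://";
        if (URL.StartsWith(scheme, StringComparison.OrdinalIgnoreCase))
        {
            URL = URL.Substring(scheme.Length);
        }
        // urlArray[0] is the domain (or empty when the URL starts with "/"),
        // so the first folder, if any, is urlArray[1].
        string[] urlArray = URL.Split('/');
        if (urlArray.Length > 1)
        {
            result = urlArray[1];
        }
    }
    catch
    {
        result = "";
    }
    return result;
}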

That should do what you want, I think.

AutomatedTester
A: 

I agree with Sergii Volchkov, and I think using System.Uri is the right way to go, but instead of using String.Split you may want to use Path.GetParentDirectory() on the local path.

This method does not exist. There is Path.GetPathRoot, but it does not work.
netadictos
Whoops, sorry, I meant "GetDirectoryName()". You may have to call this method several times to get to the first folder.
+1  A: 

Convert the string to a Uri and then use the Segments property. You'll actually want the second segment because the first is just the leading slash.

public string ExtractURL(string url)
{
    Uri webAddress = null;
    string firstFolder = null;
    if (Uri.TryCreate(url, UriKind.RelativeOrAbsolute, out webAddress))
    {
        if (!webAddress.IsAbsoluteUri)
        {
            webAddress = new Uri(Request.Url, url);
        }
        // Segments[0] is "/"; each folder segment keeps its trailing slash, e.g. "es/".
        if (webAddress.Segments.Length >= 2)
        {
            firstFolder = webAddress.Segments[1];
        }
    }
    return firstFolder;
}
Jacob Proffitt
Sorry, you cannot use Segments for relative URLs, and the Count property does not exist; it is Length. The first problem is the important one.
netadictos
True. Add a check for relative path and you're golden. Example fixed to reflect that.
Jacob Proffitt
I think you cannot use Segments for relative URLs. Thanks.
netadictos
That's not a relative URL any longer, because webAddress = new Uri(Request.Url, url); turns it into an absolute URL.
Jacob Proffitt
OK, that is true, it is made into an absolute URL. Oops! I prefer the regex solution, because converting to an absolute URL just to extract one part is not so efficient, but I am upvoting this answer because it is another very good approach.
netadictos
A: 

If you want to solve this with a regular expression (as you tagged your question "regex"), try this:

public string ExtractURL(string url)
{
  return Regex.Match(url, "(?<!/)/[^/?#]+").Value;
}

This regex works on absolute URLs, and on relative URLs that begin with a slash. If it also needs to work on relative URLs without a slash, try this:

public string ExtractURL(string url)
{
  return Regex.Match(url, @"(\w*:(//[^/?#]+)?/)?(?<folder>[^/?#]+)").Groups["folder"].Value;
}
Jan Goyvaerts
Could you explain the magic for the first method?
netadictos
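
One way to read the first pattern is to spell it out with RegexOptions.IgnorePatternWhitespace so that each piece can carry a comment. This is only an annotated sketch of the same expression; the names FirstFolderRegex and Extract are hypothetical, and the # inside the character class is escaped here as a precaution because # starts a comment in this mode.

```csharp
using System.Text.RegularExpressions;

public static class FirstFolderRegex
{
    private static readonly Regex Pattern = new Regex(@"
        (?<!/)     # negative lookbehind: the slash matched next must not follow
                   # another slash, which rejects the second slash of http://
        /          # a literal slash
        [^/?\#]+   # one or more characters other than slash, question mark or
                   # hash; this rejects the first slash of http:// as well,
                   # because the character after it is a slash
        ", RegexOptions.IgnorePatternWhitespace);

    public static string Extract(string url)
    {
        // The match keeps its leading slash, e.g. "/es".
        return Pattern.Match(url).Value;
    }
}
```

With this, FirstFolderRegex.Extract("http://www.xxx.com/es/test/test.aspx") yields "/es", and "/us/xxx/xxx.aspx" yields "/us": the first slash that passes both checks is the one that starts the path.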