I need to parse the domain name from a string. The string can vary and I need the exact domain.

Example strings:

http://somename.de/
www.somename.de/
somename.de/
somename.de/somesubdirectory
www.somename.de/?pe=12

I need it in the following format, with just the domain name, the TLD, and the www if present:

www.somename.de

How do I do that using C#?

+1  A: 

I checked out Regular Expression Library, and it looks like something like this might work for you:

^(([\w][\w\-\.]*)\.)?([\w][\w\-]+)(\.([\w][\w\.]*))?$
Brandon Satrom
No, this is not working.
Umair Ashraf
@Umair Ashraf - you should probably explain how it doesn't work. Can you give an example of a line it doesn't match?
Kobi
How it doesn't work: it does not remove the protocol for instance (`http://`).
Wrikken
I passed this pattern straight to the Regex constructor, like (@"^(([\w][\w\-\.]*)\.)?([\w][\w\-]+)(\.([\w][\w\.]*))?$")
Umair Ashraf
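To make the failure described in the comments concrete, here is a minimal sketch that tests the pattern above against two inputs. The class and method names are hypothetical and exist only for illustration. Because the pattern is anchored with ^ and $ and none of its character classes allow ':' or '/', it rejects anything with a protocol or a path.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FirstRegexDemo
{
    // Pattern copied from the answer above; it is anchored at both ends.
    private static readonly Regex Pattern =
        new Regex(@"^(([\w][\w\-\.]*)\.)?([\w][\w\-]+)(\.([\w][\w\.]*))?$");

    // Hypothetical helper: reports whether the whole input matches.
    public static bool Matches(string input)
    {
        return Pattern.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(Matches("www.somename.de"));     // True: bare host name
        Console.WriteLine(Matches("http://somename.de/")); // False: ':' and '/' are not allowed
    }
}
```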
+1  A: 

Try this:

^(?:\w+://)?([^/?]*)

This is a weak regex: it doesn't validate the string but assumes it is already a URL. It skips the protocol, if there is one, and captures everything up to the first slash. To get the domain, look at the first captured group, for example:

string url = "http://www.google.com/hello";
Match match = Regex.Match(url, @"^(?:\w+://)?([^/?]*)");
string domain = match.Groups[1].Value;

As a bonus, it also stops capturing at the first ?, so the URL google.com?hello=world will work as expected.
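As a rough check, the sketch below runs the same pattern over the example strings from the question. The class name RegexDomainDemo and the helper ExtractHost are hypothetical. Note that the pattern does no validation, so a port number such as :8080 would stay in the captured value.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexDomainDemo
{
    // Optional "scheme://" prefix, then capture everything up to the first '/' or '?'.
    private static readonly Regex HostPattern = new Regex(@"^(?:\w+://)?([^/?]*)");

    // Hypothetical helper: returns the first captured group (empty string if nothing was captured).
    public static string ExtractHost(string input)
    {
        return HostPattern.Match(input).Groups[1].Value;
    }

    public static void Main()
    {
        string[] samples =
        {
            "http://somename.de/",          // somename.de
            "www.somename.de/",             // www.somename.de
            "somename.de/",                 // somename.de
            "somename.de/somesubdirectory", // somename.de
            "www.somename.de/?pe=12"        // www.somename.de
        };

        foreach (string sample in samples)
        {
            Console.WriteLine(ExtractHost(sample));
        }
    }
}
```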

Kobi
+8  A: 

As an alternative to a regex solution, you can let the System.Uri class parse the string for you. You just have to make sure the string contains a scheme.

string uriString = "http://www.google.com/search";

if (!uriString.Contains(Uri.SchemeDelimiter))
{
    uriString = string.Concat(Uri.UriSchemeHttp, Uri.SchemeDelimiter, uriString);
}

string domain = new Uri(uriString).Host;

This solution also filters out any port number and converts IPv6 addresses to their canonical form.
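The same idea can be wrapped in a small helper and applied to the example strings from the question. This is only a sketch: the names UriDomainDemo and GetHost are hypothetical, and it swaps the Uri constructor for Uri.TryCreate so that malformed input yields null instead of an exception. The scheme check is a heuristic; an input with no scheme but with "://" somewhere in its query string would not get the prefix.

```csharp
using System;

public static class UriDomainDemo
{
    // Hypothetical helper: prepends "http://" when no scheme delimiter is found, then reads Uri.Host.
    public static string GetHost(string input)
    {
        if (!input.Contains(Uri.SchemeDelimiter))
        {
            input = string.Concat(Uri.UriSchemeHttp, Uri.SchemeDelimiter, input);
        }

        Uri uri;
        // TryCreate returns false for malformed input instead of throwing.
        return Uri.TryCreate(input, UriKind.Absolute, out uri) ? uri.Host : null;
    }

    public static void Main()
    {
        string[] samples =
        {
            "http://somename.de/",          // somename.de
            "www.somename.de/",             // www.somename.de
            "somename.de/",                 // somename.de
            "somename.de/somesubdirectory", // somename.de
            "www.somename.de/?pe=12"        // www.somename.de
        };

        foreach (string sample in samples)
        {
            Console.WriteLine(GetHost(sample));
        }
    }
}
```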

Niels van der Rest
Your answer looks valid too.
Umair Ashraf
+2  A: 

I simply used

Uri uri = new Uri("http://www.google.com/search?q=439489");
string url = uri.Host;
return url;

because this way you can be sure of the result.

steven spielberg