Hi,

I'd like to implement a RegExp (regular expression) that can check a string to see if it contains "http://" (i.e. it contains a URL), and then copy that whole URL into a new string variable. The string I am using is not HTML; it is simply text with any arrangement of words, characters, numbers and URLs.

I imagine I'd look for "http://" within my string and extract a new string that starts at "http://" and ends at the next whitespace character after the full URL.

PLEASE HELP, I've looked high and low for this to no avail!

Thanks in advance, Alex

+2  A: 

I answered something like this here. I guess that code could be changed to suit your needs; it loads a text file and searches it for URLs.

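using System;
using System.IO;
using System.Text.RegularExpressions;

using (StreamReader reader = new StreamReader(File.OpenRead("c:\\test.txt")))
{
    string content = reader.ReadToEnd();
    // Scheme, then "//" or "\\", then any run of characters allowed in the URL body.
    // The hyphen sits last in the character class so it is a literal, not a range
    // (the earlier "\+-=" form was a range that also let "<" through).
    string pattern = @"((https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))+[\w:#@%/;$()~?+,=\\.&-]*)";
    MatchCollection matches = Regex.Matches(content, pattern);
    foreach (Match match in matches)
    {
        // match.Value is the whole URL; match.Index is its offset in the text.
        Console.WriteLine("'{0}' found at position {1}", match.Value, match.Index);
    }
}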
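
If you only need what the question describes (everything from "http://" up to the next whitespace), a much smaller pattern may be enough. The following is a minimal sketch under that assumption, not a full URL validator; `UrlExtractor` and `ExtractUrls` are illustrative names I made up for this example, and I've allowed "https://" as well, which goes slightly beyond the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UrlExtractor
{
    // Hypothetical helper: collects every run of non-whitespace characters
    // that starts with "http://" or "https://".
    public static List<string> ExtractUrls(string text)
    {
        var urls = new List<string>();
        // \S+ stops at the first whitespace character or at the end of the string.
        foreach (Match match in Regex.Matches(text, @"https?://\S+"))
        {
            urls.Add(match.Value);
        }
        return urls;
    }
}
```

Calling `UrlExtractor.ExtractUrls(content)` on the text read from the file gives you the URLs as plain strings. Because the match only stops at whitespace, punctuation that directly follows a URL (a trailing full stop or closing bracket, say) ends up inside the match, so you may want to trim it afterwards.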

Hope this helps, regards

serge_gubenko
Thanks a lot, looks good. I will try it tomorrow and let you know :)
AlexW
I adapted this a bit, but the RegEx part is PERFECT for extracting any Internet address from a string. Thanks!!
AlexW