ansaurus

Question

regular expression C#

Answer 1

A:

http://www.regular-expressions.info/ has tons of information

Mel Gerats 2009-08-27 07:10:44

This doesn't seem to be specific to the question being asked, and I don't see a reliable way of scraping linked URLs from a page without potentially pulling in comments/text which also contain URLs.

Conspicuous Compiler 2009-08-27 07:23:02

Answer 2

A:

You might want to try actually parsing the page and traversing the DOM.

Try: http://www.codeplex.com/htmlagilitypack

Chris T 2009-08-27 07:12:02

Answer 3

+3 A:

HTML Agility Pack is ideal for this; this is almost the same as the example on the home page:

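// Assumes doc is an already-loaded HtmlAgilityPack.HtmlDocument.
// Current releases expose DocumentNode (older ones used DocumentElement).
foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
{
    string href = link.Attributes["href"].Value;
}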

Now just parse "href"; perhaps something like:

Match match = Regex.Match(href, @"[&?]\w+=(\d+)");
int i;
if (match.Success && int.TryParse(match.Groups[1].Value, out i))
{
    Console.WriteLine(i);
}
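
If only links of one particular shape are wanted, one option is to test each href against a stricter pattern and skip everything else. A minimal sketch, assuming the template is an href such as out.php?i=1456 (optional leading slash, one numeric "i" parameter); LinkFilter and ExtractIds are hypothetical names used only for illustration, and the hrefs are assumed to have been collected already, for example by the loop above:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkFilter
{
    // Assumed template: the whole href is "out.php?i=<digits>",
    // optionally preceded by a single slash. Adjust to the real URL shape.
    private static readonly Regex Template =
        new Regex(@"^/?out\.php\?i=(\d+)$", RegexOptions.IgnoreCase);

    // Returns the numeric ids of the hrefs that match the template;
    // hrefs that do not match are ignored.
    public static List<int> ExtractIds(IEnumerable<string> hrefs)
    {
        var ids = new List<int>();
        foreach (string href in hrefs)
        {
            Match match = Template.Match(href);
            int id;
            if (match.Success && int.TryParse(match.Groups[1].Value, out id))
            {
                ids.Add(id);
            }
        }
        return ids;
    }
}
```

Calling LinkFilter.ExtractIds(new[] { "out.php?i=1456", "about.html" }) keeps only the first entry. Anchoring the pattern with ^ and $ is what stops unrelated links from slipping through; the looser pattern above matches any href that carries a numeric query parameter.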

Marc Gravell 2009-08-27 07:35:34

Hmm, the main question was that I need only links matching a template like /with_us.php?page=digit, for example <a href=out.php?i=1456 target=_blank><b>go</a>, but your sample with HTML Agility Pack gets ALL links from the page. I asked the question to find only the selected links directly.

kusanagi 2009-08-27 07:45:33

Define "selected links"?

Marc Gravell 2009-08-27 09:29:22

Answer 4

A:

link text

Dilse Naaz 2009-08-27 08:42:54

Your link appears to go to an error page.

Conspicuous Compiler 2009-08-27 18:55:35
