I have an HTML page with links like /with_us.php?page=digit and out.php?i=digit. How can I get all such links from the page? Even better, how can I collect only the digits from these links directly?

A: 

http://www.regular-expressions.info/ has tons of information

Mel Gerats
This doesn't seem to be specific to the question being asked, and I don't see a reliable way of scraping linked URLs from a page without potentially pulling in comments/text which also contain URLs.
Conspicuous Compiler
A: 

You might want to try actually parsing the page and traversing the DOM.

Try: http://www.codeplex.com/htmlagilitypack

Chris T
+3  A: 

HTML Agility Pack is ideal for this; this is almost the same as the example on the home page:

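// SelectNodes returns null when nothing matches, so check for that in real code.
foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
{
    // Read the raw href attribute of each anchor.
    string href = link.Attributes["href"].Value;
}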

Now just parse "href"; perhaps something like:

Match match = Regex.Match(href, @"[&?]\w+=(\d+)");
int i;
if (match.Success && int.TryParse(match.Groups[1].Value, out i))
{
    Console.WriteLine(i);
}
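
If only the two URL shapes from the question are wanted, the pattern can name them explicitly instead of accepting any query parameter. The following is a minimal sketch under that assumption; the class name LinkDigits and the method Extract are hypothetical names chosen for illustration, and it expects the href values to have been collected already (for example by the loop above).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class LinkDigits
{
    // Matches only with_us.php?page=N and out.php?i=N, at the start of the
    // href or right after a slash, and captures the digits.
    private static readonly Regex Wanted = new Regex(
        @"(?:^|/)(?:with_us\.php\?page|out\.php\?i)=(\d+)",
        RegexOptions.IgnoreCase);

    // Returns the number from each matching href; other links are skipped.
    public static List<int> Extract(IEnumerable<string> hrefs)
    {
        var result = new List<int>();
        foreach (string href in hrefs)
        {
            Match m = Wanted.Match(href);
            int n;
            if (m.Success && int.TryParse(m.Groups[1].Value, out n))
            {
                result.Add(n);
            }
        }
        return result;
    }
}
```

For example, passing "/with_us.php?page=12", "out.php?i=1456" and "index.php?id=7" yields 12 and 1456; the third link is ignored because it matches neither shape.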
Marc Gravell
Hmm, the main point of my question was that I need only links matching a pattern such as /with_us.php?page=digit, for example <a href=out.php?i=1456 target=_blank><b>go</a>, but your HTML Agility Pack sample gets ALL links from the page. I asked the question to find out how to pick up only the selected links directly.
kusanagi
Define "selected links"?
Marc Gravell
A: 

link text

Dilse Naaz
Your link appears to go to an error page.
Conspicuous Compiler