ansaurus

Question

Answer 1

+3 A:

HTML Agility Pack will save you tons of headaches. Try it instead of using regexps to parse HTML.

For what it's worth, in the page you link to, the quote data is indeed in JavaScript code; check http://www.nseindia.com/js/getquotedata.js and http://www.nseindia.com/js/quote_data.js

Vinko Vrsalovic 2010-02-20 12:44:20

+1 Agreed, and of course the obligatory link: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

Nick Craver 2010-02-20 12:46:15

Hi Vinko, I will follow your suggestion. Thanks for your help.

Raghavendra 2010-02-20 13:14:43

Hello Vinko, one more question. I downloaded the content of the link into a string, as @Asad Butt suggests, but when I look at the source of this string there are no values (Open Interest) in it. I think the report is generated dynamically using JavaScript. Can HTML Agility Pack handle this situation?

Raghavendra 2010-02-21 08:16:28

@Raghavendra: No. You have to understand how the data is stored in the JavaScript file, download that file, and extract the relevant part.

Vinko Vrsalovic 2010-02-21 12:22:28

@Vinko, I went through the JavaScript file and found that there is an iframe, which gave me the exact link. The middle part of the page above loads its data from a separate URL, and I found it. Once again, thank you very much.

Raghavendra 2010-02-22 07:16:42
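
The approach reached in the comments above (locate the iframe, then fetch the separate URL it loads) could be sketched roughly as follows. This is only an illustration under assumptions: the class name IframeFinder, the method name FindIframeSources and the parameter pageUri are hypothetical and do not come from the thread, and the code presumes the iframe tag is present in the downloaded markup.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package, not part of the .NET base library

// Hypothetical helper, named here for illustration only.
public static class IframeFinder
{
    // Returns the absolute URL of every <iframe src="..."> in the given HTML.
    // pageUri is the address the HTML was downloaded from; it is used to
    // resolve relative src values.
    public static List<Uri> FindIframeSources(string html, Uri pageUri)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        var result = new List<Uri>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection frames = document.DocumentNode.SelectNodes("//iframe[@src]");
        if (frames == null)
        {
            return result;
        }

        foreach (HtmlNode frame in frames)
        {
            // The second argument is the default used when the attribute is missing.
            string src = frame.GetAttributeValue("src", "");
            Uri absolute;
            if (src.Length > 0 && Uri.TryCreate(pageUri, src, out absolute))
            {
                result.Add(absolute);
            }
        }
        return result;
    }
}
```

Each returned URL can then be downloaded with WebClient.DownloadString and parsed in the same way as the outer page. If the iframe is created by a script rather than written in the markup, it will not appear in the downloaded string, and the URL has to be read out of the JavaScript file instead, as described in the comments.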

Answer 2

+1 A:

As per @Vinko Vrsalovic's answer, HTML Agility Pack is your friend. Here is a sample:

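  // Needs: using System.Net; using HtmlAgilityPack;
  // url is the address of the page to parse.
  string source;
  using (WebClient client = new WebClient())
  {
   source = client.DownloadString(url);
  }

  HtmlAgilityPack.HtmlDocument document = new HtmlAgilityPack.HtmlDocument();
  document.LoadHtml(source);

  // SelectNodes returns null (not an empty collection) when nothing matches.
  HtmlNodeCollection nodes = document.DocumentNode.SelectNodes("//*[@href]");

  if (nodes != null)
  {
   foreach (HtmlNode node in nodes)
   {
    // GetAttributeValue returns the default ("") when the attribute is absent.
    if (node.GetAttributeValue("class", "").Contains("StockData"))
    {
     // Here is our info
    }
   }
  }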

Asad Butt 2010-02-20 14:04:22

html parsing problem using C#
