Straight to the point: I would like to parse the source/DOM of an HTML page. However, I cannot, because some of the information is missing and is only loaded through JavaScript and AJAX.

I am using C# and .NET. The site I am working with has two sections I am interested in: the wiki and the media section. If I have the direct link to a media or wiki page, I can parse it with no problems. However, the site uses AJAX and JavaScript to browse between pages (their page size is horrendous, so I think this is done for performance reasons).

The links are formatted as . I would like to know: is there a way I can parse these pages easily? Maybe by using an IE control and doing something like ie.set("htmlpage", "4"); ie.run(); parse(ie.source());

A: 

I'm not sure if I understand what you're trying to do, but this post may help solve your problem. The accepted answer has a lot of useful details.

Andrew Flanagan
A: 

I'm not sure what you are asking either. Please try to rephrase it.

I want to help but I don't understand the question and I think I'm not the only one.

eipipuz