I am retrieving the HTML source code from a remote URL via C# and storing it in a string. I would like to parse it, but I do not want to use RegEx. Instead, I want to leverage the jQuery engine to parse it. Is this possible somehow?

This is how I am getting the HTML from the remote URL:

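HttpWebRequest wr = (HttpWebRequest)WebRequest.Create(url); // needs: using System.IO; using System.Net;
string html;
using (var reader = new StreamReader(wr.GetResponse().GetResponseStream()))
    html = reader.ReadToEnd(); // disposing the reader also closes the response stream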

I found the Fizzler library (http://code.google.com/p/fizzler/), but it does not use the jQuery engine, so there are a lot of things missing from it. Any suggestions on how to do this properly?

+1  A: 

You can set this up server-side (it isn't pretty), but I'd recommend taking a look at something designed for just this purpose on the C# side: the HTML Agility Pack (which the library you linked is based on).

jQuery just isn't designed for this; go with a C# solution (or any .NET language/library you can include). Trust me, when you're done you'll have much more hair.
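To give a rough idea of what the C# route looks like, here is a minimal sketch that pulls every link out of the downloaded string. It assumes the HtmlAgilityPack package is referenced in your project; the `LinkExtractor` class and `GetLinks` method are hypothetical names made up for this example, and the XPath query is just one possible selection.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party package, assumed to be referenced

public static class LinkExtractor
{
    // Hypothetical helper: returns the href value of every <a> element
    // that has an href attribute in the given markup.
    public static List<string> GetLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parses the string; tolerant of malformed markup

        var links = new List<string>();

        // SelectNodes takes an XPath expression and returns null
        // (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode anchor in anchors)
        {
            links.Add(anchor.GetAttributeValue("href", string.Empty));
        }
        return links;
    }
}
```

You would call it as `LinkExtractor.GetLinks(html)` with the string from your question. The selection here is XPath rather than jQuery-style selectors; if you prefer CSS selectors, that is the gap Fizzler tries to fill on top of this library.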

Nick Craver
i might just do this for the hell of it since i'm bald anyway - woo hoo, here comes my hair again :)
jim