I'm looking for something similar to Mechanize for .NET (C#).

If you don't know what Mechanize is, see http://search.cpan.org/dist/WWW-Mechanize/

I will maintain a list of suggestions here. Anything for browsing, posting, or screen scraping is welcome (other than WebRequest and the WebBrowser control).

Parsing

  1. HTMLAgilityPack - http://www.codeplex.com/htmlagilitypack

Web App Testing

  1. WatiN - Web Application Testing Framework (.NET) - http://watin.sourceforge.net/

  2. Selenium - http://seleniumhq.org/

  3. Design Canvas - But it costs money =(

Other

  1. Tools for finding page structure - Firebug for Firefox or Internet Explorer Developer Toolbar for IE

Note: WatiN is pretty close to what I am looking for, except that it opens up a browser, which is annoying and awesome at the same time, depending on what you are doing. I converted some stuff to run in WatiN, and it is fun to watch.

+2  A: 

You need to use the HTML Agility Pack, which can parse tag soup from real websites into a DOM structure.

SLaks
That's good for parsing the pages once you get them, but what about everything else? Like, logging into sites, managing cookies, handling redirection etc?
Blankasaurus
Use the `WebClient` class, or HAP's `HtmlWeb` class, or the `HttpWebRequest` class with a `CookieContainer`.
SLaks
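A minimal sketch of the `WebClient` plus `CookieContainer` approach suggested above, for readers who want to see roughly what it involves. The class name `CookieAwareWebClient`, the example.com URLs, and the form field names are hypothetical and chosen only for illustration; a real site will have its own login URL and fields.

```csharp
using System;
using System.Collections.Specialized;
using System.Net;

// Hypothetical helper: a WebClient that keeps cookies between requests.
public class CookieAwareWebClient : WebClient
{
    public CookieContainer Cookies { get; } = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request is HttpWebRequest http)
        {
            // Share one cookie container across every request made by this client.
            http.CookieContainer = Cookies;
            // Following redirects is already the default; set here to make it explicit.
            http.AllowAutoRedirect = true;
        }
        return request;
    }
}

class Example
{
    static void Main()
    {
        using (var client = new CookieAwareWebClient())
        {
            // Assumed form fields; inspect the real login form to find the actual names.
            var form = new NameValueCollection
            {
                { "username", "alice" },
                { "password", "secret" }
            };
            client.UploadValues("https://example.com/login", "POST", form);

            // Cookies set during login are sent again on this request.
            string page = client.DownloadString("https://example.com/account");
            Console.WriteLine(page.Length);
        }
    }
}
```

This covers cookies and redirects only. Form discovery, link following, and similar conveniences that Mechanize provides would still have to be written by hand.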
I know you can do it that way in .NET, but I'm looking for something a little higher level, like Mechanize. It's OK if there isn't one; I was just curious whether there is a library that does what I have done using WebClient etc.
Blankasaurus
A friend of mine wrote a program that does what my C# program does using Mechanize, and it is 13 lines. Mine is WAY more. 13 LINES! =P
Blankasaurus
+1  A: 

You can use the WebBrowser control, which can be automated to an extent.

John Saunders
@Downvoter: why bother to downvote if you can't be bothered to say why you downvoted? Do you think I _care_ about 2 points?
John Saunders
A: 

You want HttpWebRequest for automating web requests and HtmlAgilityPack for processing the resulting HTML.

qntmfred
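As a rough illustration of the parsing half of this suggestion, the sketch below loads an HTML string into the Html Agility Pack and lists the link targets it finds. It assumes the HtmlAgilityPack package is referenced by the project. The HTML literal is made up for the example; in practice the string would come from the response of an `HttpWebRequest` or `WebClient` call.

```csharp
using System;
using HtmlAgilityPack;

class LinkLister
{
    static void Main()
    {
        // Made-up sample markup with unclosed tags, standing in for a downloaded page.
        string html = "<html><body><p>See <a href='/first'>one</a> and <a href='/second'>two</a></body>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links == null)
        {
            Console.WriteLine("No links found.");
            return;
        }

        foreach (HtmlNode link in links)
        {
            // The second argument is the fallback value if the attribute is missing.
            Console.WriteLine(link.GetAttributeValue("href", ""));
        }
    }
}
```

The XPath query here is one possible way to pick out nodes; the same document object can be walked node by node if that suits the page better.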
+2  A: 

I've been using WatiN to great effect. It's an easy way to 1) automate user input w/ IE and 2) navigate the DOM.

ZaijiaN
+1 Best answer so far. Thanks!
Blankasaurus
WatiN requires actually launching browser windows. Mechanize is in-memory only.
qntmfred
It is a bit slow, but it is fun to watch the browser be automated. Even more fun is to call .Highlight on whichever part of the DOM you're processing, and you can watch the processing happen.
ZaijiaN
Another benefit of WatiN: since it interacts with IE, it can process the *live* DOM. That is, there's no problem if the page has been built by JavaScript. I don't believe the HTML Agility Pack can do that.
ZaijiaN
@qntmfred: You can hide those windows: "IE.Settings.MakeNewIeInstanceVisible = false". There is another property which handles dialogs: "IE.Settings.AutoStartDialogWatcher".
tsocks
+1  A: 

You can also use Selenium. It's for unit testing web sites. It has a Java application that drives the browser and a C# interface that you can write your code in. It also has the downside of showing the browser, but it's pretty full-featured in terms of control, waiting on responses, and getting the results.

Pete McKinney
+1  A: 

Design Canvas is the best tool out there for this type of thing. Works with IE, Firefox, Safari, or an in-memory browser. It allows you to record and then playback any kind of web interaction.

jaws