.NET, scrape dynamic (Java App?) webpage for information?

I am trying to get some information from a website. The information I need is on the missouri.edu site, so it is publicly available. Here is the process I need to automate:

- Navigate to https://webapps.missouri.edu/ODDSearchEngine/oddsearch
- Search for a department name like "business"
- Click any of the department names, like "Business College, Advancement"
- Programmatically view the source of the page that is output after clicking "Business College, Advancement"

I would like to be able to get the source of each page for each department under business (or whatever department I put in, like "Accounting").

Is this possible with a Windows program? It looks like the "ODDSearchEngine" that runs this is a Java applet. I'm not sure how to interface with it to get the pages.

For reference, if I feed the address that the ODDSearchEngine outputs into my existing program, it returns the source code of the search page along with two "java.lang.NullPointerException" errors.

Is there an easy way to get this information through .NET?

Those all look like they might do the trick, thanks very much. I'm leaning toward HttpWebRequest/Response.

Pselus 2009-10-20 15:30:45

Actually, none of them worked. I got the HttpWebRequest/Response approach to the point of clicking the "Go" button for me, but if I then try to get the sub-page ("Business College, Advancement"), it still gives the same java.lang.NullPointerException errors. WatiN might work, except I can't see how to get the source once I've gone through the "click" process. On top of that, the page is so poorly formatted that nothing distinguishes the links to click other than a different "ggid" at the end of the address each one points to.

Pselus 2009-10-20 16:29:13
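
For anyone trying the same thing, here is a rough sketch of the HttpWebRequest approach described above. It rests on two unconfirmed assumptions: that the NullPointerException text comes from the server missing session state when a sub-page is requested cold, which a shared CookieContainer might address, and that the department links can be recognised by a "ggid" query parameter as noted in the comment above. The class name OddScraper, the sample markup and the "action=view" parameter are made up for illustration; the real page may look different.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

static class OddScraper
{
    // One cookie jar shared by every request, so the server sees a single session.
    static readonly CookieContainer Cookies = new CookieContainer();

    // Downloads a page's HTML, sending and storing cookies through the shared jar.
    public static string Fetch(Uri url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = Cookies;
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    // Returns the absolute address of every link whose href carries a "ggid" parameter.
    public static List<Uri> ExtractGgidLinks(string html, Uri pageUrl)
    {
        var links = new List<Uri>();
        var hrefPattern = "href\\s*=\\s*[\"']([^\"']+)[\"']";
        foreach (Match m in Regex.Matches(html, hrefPattern, RegexOptions.IgnoreCase))
        {
            // Decode entities such as &amp; before inspecting the query string.
            string href = WebUtility.HtmlDecode(m.Groups[1].Value);
            if (Regex.IsMatch(href, "[?&]ggid="))
            {
                // Resolve relative hrefs against the address of the page they came from.
                links.Add(new Uri(pageUrl, href));
            }
        }
        return links;
    }

    static void Main()
    {
        var page = new Uri("https://webapps.missouri.edu/ODDSearchEngine/oddsearch");

        // Hypothetical markup standing in for the real results page (no network call here).
        string sample =
            "<a href=\"oddsearch?action=view&amp;ggid=123\">Business College, Advancement</a>" +
            "<a href=\"help.html\">Help</a>";

        foreach (Uri link in ExtractGgidLinks(sample, page))
        {
            Console.WriteLine(link);
        }
    }
}
```

Against the live site, the idea would be to call Fetch on the search address first, submit the search with the same cookie jar, pass the returned HTML to ExtractGgidLinks, and then call Fetch on each link to get its source. Whether that clears the NullPointerException errors is untested.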
