I download 2-3 MB of raw data from a web site, and that data then feeds an ETL process that loads it into my data mart. Unfortunately, the data provider is the US Dept. of Agriculture (USDA), and they do not allow downloading via FTP. They require that I use a web form to select the elements I want, click through 2-3 screens, and eventually click to download the file. I'd like to automate this download process. I am not a web developer, but it seems that some tool should be able to tell me exactly what PUT/GET/magic goes to the server in the final request. If I had a tool that said, "pass these parameters to this URL and wait for a response," I could then hack something together in Perl to automate this process.

I realize that if I deconstructed all 5 of their pages, read through the JavaScript includes, and tapped my heels together 3 times, I could get this info from what I have access to. But I want a faster and more direct path that does not require me to manually parse all their JS.

Restatement of the final question: Is there a tool or method that will show clearly what the final request sent from a web form was and how it was structured?

+1  A: 

Use Fiddler2 as a proxy to see what is being passed back and forth. I've done this successfully in similar circumstances.

Home page is here: http://www.fiddler2.com/fiddler2/
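
Once the proxy shows the final request, replaying it mostly comes down to sending the same form fields to the same URL. Below is a minimal Python sketch of that idea (the question mentions Perl, where the same approach applies). It assumes the final step turns out to be a plain URL-encoded POST; the URL and field names are hypothetical placeholders, to be replaced with whatever the capture actually shows.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

# Hypothetical values: replace with the URL and form fields seen in the capture.
CAPTURED_URL = "https://example.invalid/download"
CAPTURED_FIELDS = {"commodity": "corn", "year": "2009", "format": "csv"}


def build_request(url, fields):
    """Build a POST request carrying the captured form fields."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def download(url, fields, out_path):
    """Send the request and save the response body to out_path."""
    # A cookie-aware opener keeps any session cookie the server sets.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    with opener.open(build_request(url, fields), timeout=60) as resp:
        with open(out_path, "wb") as out:
            out.write(resp.read())


# Example call (needs the real URL and fields from the capture):
# download(CAPTURED_URL, CAPTURED_FIELDS, "raw_data.csv")
```

If the site sets a session cookie on one of the earlier screens, the same opener may need to request those pages first so the cookie is present when the final request goes out.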

Paul
Thanks for the fast and good reply, Paul. That's exactly what I wanted but could not articulate. Thanks!
JD Long
+1  A: 

A tamperer's best friends (these are Firefox extensions; you could also use something like Wireshark):

HTTPFox

Tamper Data
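
These tools typically show the POST body as a single URL-encoded string. As a rough sketch, the snippet below splits such a string into readable name/value pairs using Python's standard library. The captured body here is a made-up example, not actual USDA form data.

```python
from urllib.parse import parse_qsl

# Hypothetical captured POST body; paste the real one from the extension.
captured_body = "commodity=corn&year=2009&format=csv&notes=two+words%21"


def decode_form_body(body):
    """Return the form fields as a list of (name, value) pairs, in order."""
    # keep_blank_values=True preserves fields that were submitted empty.
    return parse_qsl(body, keep_blank_values=True)


for name, value in decode_form_body(captured_body):
    print(f"{name} = {value}")
```

The decoded pairs are the "pass these parameters to this URL" list the question asks for.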

Best of luck

John T
After a few months I have switched to the Tamper Data add-on for Firefox. Thanks, John!
JD Long
A: 

Same as the other responses, except my tool of choice is Charles.

micmcg
A: 

What about using a web testing toolkit, like Watir with Ruby?

It makes filling in the forms easy, and you can just use the output.

Joe K
A: 

Use WatiN and combine it with WatiN TestRecorder (Google for it).

It can "simulate" a user sitting in front of the browser punching in values which you can supply from your own C# code...

Thomas Hansen