I had to take a SurveyMonkey survey today, and the format was as follows: a question was asked, and after I hit the Next button, the answer was displayed as "Answer: _" along with an explanation. For kicks, I'd like to write a program that takes this survey: answer with any letter, go to the next page and read the correct answer, go back and change the answer to the correct one, then skip two pages ahead and repeat.

I am familiar with Java and Python, but I'm not sure how to make a program "know" where the button is, or how to "read" the text without resorting to image recognition.

This is just a fun project, nothing serious, but I would appreciate any ideas to get me started.

A: 

Would it be unrealistic to post directly to the SurveyMonkey pages? You could then use a regex to pull "Answer: _" out of the response and look for that pattern in the original page. That would definitely be easier than trying to click things in a browser. Basically, write a Java or Python app that sends HTTP POSTs to the survey pages in order, uses regexes to find the next page, and uses a stack to keep track of the history.
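A rough Python sketch of the regex-plus-stack idea, in case it helps. It assumes the reveal page contains plain text like "Answer: B". The `submit` callable is hypothetical and stands in for whatever HTTP POST the real survey needs. I don't know the actual SurveyMonkey URLs or field names, so those are left out.

```python
import re

# Assumption: the reveal page contains text such as "Answer: B".
ANSWER_RE = re.compile(r"Answer:\s*([A-Za-z])\b")


def extract_answer(page_html):
    """Return the answer letter found in the page, or None if there is none."""
    match = ANSWER_RE.search(page_html)
    return match.group(1).upper() if match else None


def collect_answers(submit, question_count, guess="A"):
    """Answer every question, correcting each guess after reading the reveal page.

    submit(question_number, letter) is a hypothetical callable that posts the
    answer and returns the HTML of the page shown afterwards.
    """
    history = []  # stack of (question_number, letter) that ended up submitted
    for number in range(1, question_count + 1):
        reveal_page = submit(number, guess)
        correct = extract_answer(reveal_page)
        if correct is not None and correct != guess:
            submit(number, correct)  # "go back" and resubmit the right letter
            history.append((number, correct))
        else:
            history.append((number, guess))
    return history
```

To try it without touching a real site, pass a fake `submit` that returns canned HTML for each question number.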

Edit: if this isn't clear, let me know and I'll clarify.

Edit 2: I completely forgot about HtmlUnit, my bad. Like the tools jsight suggested, it is aimed at testing, but it is specifically for Java: a headless browser that is typically driven from JUnit tests. Because it is designed for testing web applications, it can also be used to automate interactions with other sites.

Chris Thompson
+1  A: 

Assuming that the text was just that (text rather than images), there are a few useful tools for you:

  • .NET WebBrowser control - I've scripted this before from .NET. It has the advantage that all of the JS on the page still works. I know this isn't Java, but it is surprisingly easy to work with for this kind of task.
  • Selenium - It is primarily a web testing framework, but it would be easy to script it from Java to auto-submit forms.
  • TagSoup for Java - If the pages do not have significant JavaScript that needs to run, there are many HTML parsers for Java, TagSoup among them, that could be used to build a scraper.
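
Since the asker also knows Python, here is a minimal sketch of the parser route using only the standard library's `html.parser`. The point is that the "button" is just a form field in the markup, so no image recognition is needed. The sample form, its `action` URL, and the field names are made up for illustration. The real survey markup will differ.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the form's action URL and every named <input> on the page."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = []  # list of (name, type, value) tuples

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action")
        elif tag == "input" and "name" in attrs:
            self.fields.append(
                (attrs["name"], attrs.get("type", "text"), attrs.get("value", ""))
            )


def parse_form(page_html):
    """Return (action, fields) for the form found in page_html."""
    parser = FormFieldParser()
    parser.feed(page_html)
    return parser.action, parser.fields


# Hypothetical markup, only to show the shape of the result.
sample = """
<form action="/s/next" method="post">
  <input type="radio" name="q1" value="A">
  <input type="radio" name="q1" value="B">
  <input type="submit" name="next" value="Next">
</form>
"""
action, fields = parse_form(sample)
```

The `action` and `fields` values tell you where to post and which name/value pairs to send, which is all a scripted submission needs when the page works without JavaScript.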
jsight