I wish to retrieve the source of a page that is dynamically generated when a link is clicked. The link itself is shown below:

<a onclick="function(); return false" href="#">Link</a>

This stops me from directly requesting a URL that would return the dynamically generated page (with urllib/urllib2).

How would one retrieve, via Python, the source of the page generated by the above function? Is there a way to bypass the return false and href="#", or the onclick entirely, and get the actual URL?

If there is another way to generate the page from the abstract link above, so that one can fetch it with urllib in Python, please point me to it.


EDIT:

I generalized the code seen above; however, I've been told that one has to reverse engineer the specific JavaScript to be able to use it.

Link to .js - http://a.quizlet.com/j/english/create%5Fsetku80j8.js

Link to site with link:

<a onclick="importText(); return false" href="#">Bulk-import data</a>

Actual URL of site: http://quizlet.com/create%5Fset/

Beautified JS of relevant .js above: http://pastie.org/737042

+2  A: 

You will probably have to reverse engineer the JavaScript to work out what is going on.

Can you provide the site and the link in question?

nullptr
http://quizlet.com/create_set/ - You need to make an account >.>. How would one go about reverse engineering Javascript?
Nazarius Kappertaal
It's so that I can import my set of cards -> without invoking a web browser. Their API only allows for calls and no input <.<.
Nazarius Kappertaal
JavaScript with the relevant importText() function - http://a.quizlet.com/j/english/create_setku80j8.js.
Nazarius Kappertaal
+1  A: 

I don't immediately see any content-generation or link-following code in that script; all importText does is toggle whether a few divs are shown.
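If the function only shows and hides divs, the form it reveals is probably already present in the page's static HTML, so it may be readable without running any JavaScript. Below is a minimal sketch of that idea using Python 3's standard html.parser. The sample markup, the form action /hypothetical-import/ and the field name import_text are invented for illustration and are not taken from the real site.

```python
from html.parser import HTMLParser


class FormFinder(HTMLParser):
    """Collect the action, method and named fields of every <form>."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action", ""),
                "method": attrs.get("method", "get").lower(),
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag in ("input", "textarea", "select") and self._current is not None:
            # Only named controls are sent when the form is submitted.
            name = attrs.get("name")
            if name:
                self._current["fields"].append(name)

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


# Hypothetical page fragment: the link only un-hides a div that is already there.
SAMPLE_HTML = """
<a onclick="importText(); return false" href="#">Bulk-import data</a>
<div id="import-box" style="display: none">
  <form action="/hypothetical-import/" method="post">
    <textarea name="import_text"></textarea>
    <input type="submit" value="Import">
  </form>
</div>
"""


def find_forms(html):
    """Return a list of dicts describing the forms found in html."""
    parser = FormFinder()
    parser.feed(html)
    return parser.forms


forms = find_forms(SAMPLE_HTML)
```

Running find_forms on the real page source (fetched while logged in) would show whether a hidden form exists and which URL and fields it submits.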

If you want to study the calls the webapp makes to do a particular action, in order to reproduce them from a bot, you're probably best off looking at the HTTP requests (form submissions and AJAX calls) that the browser makes whilst performing that action. You can use Firebug's ‘Net’ panel to study this for Firefox, or Fiddler for IE.
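Once the Net panel or Fiddler shows which request the browser sends, it can be rebuilt by hand. Here is a rough sketch with Python 3's urllib.request (urllib2.Request plays the same role in Python 2). The URL, the field name import_text and the cookie value are placeholders; the real ones have to be copied from the captured request.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_post(url, fields, cookie=None):
    """Build (but do not send) a form-encoded POST request."""
    data = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    if cookie:
        # A logged-in site usually needs the session cookie from the browser.
        headers["Cookie"] = cookie
    return Request(url, data=data, headers=headers, method="POST")


# All values below are hypothetical stand-ins for what the capture shows.
request = build_form_post(
    "http://example.com/hypothetical-import/",
    {"import_text": "cat\tgato"},
    cookie="session=PLACEHOLDER",
)
```

Passing the request to urllib.request.urlopen(request) would actually send it; that step is left out here so nothing is posted to a real server by accident.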

bobince
It does, let me look at the site's source more closely.
Nazarius Kappertaal