Hi,
Background: I have to download web pages together with their resources for offline viewing. As part of this I have to "rewrite" the URLs of the links within the HTML page so they still work. This is fine for the standard types of links, but I'm now realizing that some links are created dynamically by JavaScript.
Question: What approach (or existing library) could I use to convert a web page with dynamically generated links (from JavaScript) into a web page with normal, non-dynamic links? I could then do the URL rewriting I need to do.
Notes:
- It's almost as if I need a JavaScript interpreter library that I pass the page HTML to, and that then spits out the HTML the scripts generate. Then I can rewrite the links as I wish (the result would no longer use the dynamic JavaScript approach).
- Context is a C# WinForms (.NET 3.5) application.
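To make the interpreter idea in the notes above more concrete, here is a rough sketch of what I imagine, written in JavaScript and run under Node.js purely for illustration (my real application is C#, so I would need some equivalent script host there). It evaluates an inline script against a stub `document` and collects whatever is passed to `document.write`. The function name `captureDocumentWrite`, the stub object, and the simplified script are all made up for this example; a real page would need far more of the DOM stubbed out.

```javascript
// Sketch only: run an inline script in a sandbox and capture document.write output.
const vm = require("vm");

function captureDocumentWrite(scriptSource, pathname) {
  const chunks = [];
  // Minimal stand-in for the browser's document object (an assumption,
  // just enough for scripts like the ones in my examples).
  const document = {
    write: (...parts) => chunks.push(parts.join("")),
    writeln: (...parts) => chunks.push(parts.join("") + "\n"),
    location: { pathname },
  };
  // Expose the stub both as `document` and as `window.document`.
  const sandbox = { document, window: { document } };
  vm.createContext(sandbox);
  vm.runInContext(scriptSource, sandbox, { timeout: 1000 });
  return chunks.join("");
}

// Simplified version of the kind of script I am running into.
const script = `
  if (window.document.location.pathname.indexOf("mysite.asp") != -1) {
    document.write('<a href="/mysite.asp">My Site (active)</a>');
  } else {
    document.write('<a href="/mysite.asp">My Site</a>');
  }
`;

const staticHtml = captureDocumentWrite(script, "/home.asp");
console.log(staticHtml);
```

The returned string is plain static HTML, which I could then splice in place of the `<script>` block and run through my normal URL rewriting.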
Thanks
PS. Some examples:
<script type="text/javascript">
<!--
document.write("<a href=\"/home.asp\" onMouseOver=\"MM_swapImage('tab_home','','/_includes/images/tab_home_.gif',1)\" onMouseOut=\"MM_swapImgRestore()\"><img src=\"/includes/images/tab_home.gif\" alt=\"Home\" name=\"tab_home\" width=\"45\" height=\"18\" border=\"0\" id=\"tab_home\"><\/a>");
if (window.document.location.pathname.indexOf("mysite.asp") != "-1") {
document.write("<a href=\"/mysite.asp\" onMouseOver=\"MM_swapImage('tab_my_site','','/_includes/images/tab_my_site_.gif',1)\" onMouseOut=\"MM_swapImgRestore()\"><img src=\"/_includes/images/tab_my_site_.gif\" alt=\"My Site\" name=\"tab_my_site\" width=\"76\" height=\"18\" border=\"0\" id=\"tab_my_site\"><\/a>");
}
else {
document.write("<a href=\"/mysite.asp\" onMouseOver=\"MM_swapImage('tab_my_site','','/_includes/images/tab_my_site_.gif',1)\" onMouseOut=\"MM_swapImgRestore()\"><img src=\"/_includes/images/tab_my_site.gif\" alt=\"My Site\" name=\"tab_my_site\" width=\"76\" height=\"18\" border=\"0\" id=\"tab_my_site\"><\/a>");
}
//-->
</script>
and
<script type="text/javascript">
var fo = new FlashObject("/homepage/ia/flash/hero/banner.swf?q=1", "hero", "642", "250", "8", "#ffffff");
fo.addParam("wmode", "transparent");
fo.addParam("allowScriptAccess", "always");
fo.addParam("base", "/homepage/ia/flash/hero/");
fo.write("flashContent");
</script>
and
<td width="1%">
<a href="javascript:checksubmit(this);"
onmouseover="MM_swapImage('but_srch_go','','/_includes/images/but_srch_go_.gif',1)"
onmouseout="MM_swapImgRestore()">
<img src="http://localhost:3000/sites/http://qheps.health.qld.gov.au/_includes/images/but_srch_go.gif" alt="Go" name="but_srch_go" width="57" height="40" border="0">
</a>
</td>
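
In this last example the `<img src>` has already been rewritten by my code, but the image path inside the `onmouseover` handler has not. One thing I have considered, as a sketch only, is rewriting root-relative paths that appear as quoted string literals inside inline handler code. The function name `rewriteHandlerUrls` is hypothetical, the prefix is simply copied from the rewritten `src` above, and the pattern is a heuristic that would miss URLs built up by string concatenation.

```javascript
// Prefix copied from the already-rewritten <img src> in the example above.
const PREFIX = "http://localhost:3000/sites/http://qheps.health.qld.gov.au";

// Rewrite root-relative paths that appear as quoted string literals inside
// inline handler code, e.g. MM_swapImage('id','','/path/img.gif',1).
function rewriteHandlerUrls(handlerCode, prefix) {
  // Matches '...' or "..." whose content starts with a single "/" (not "//").
  return handlerCode.replace(
    /(['"])(\/(?!\/)[^'"\s]*)\1/g,
    (match, quote, path) => quote + prefix + path + quote
  );
}

const handler =
  "MM_swapImage('but_srch_go','','/_includes/images/but_srch_go_.gif',1)";
console.log(rewriteHandlerUrls(handler, PREFIX));
```

This only helps with literal paths in attributes; it does nothing for links generated at run time, which is the part I am really stuck on.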