views:

101

answers:

3

Basically, what I'm trying to do is simply make a small script that accesses finds the most recent post in a forum and pulls some text or an image out of it. I have this working in python, using the htmllib module and some regex. But, the script still isn't very convenient as is, it would be much nicer if I could somehow put it into an HTML document. It appears that simply embedding Python scripts is not possible, so I'm looking to see if theres a similar feature like python's htmllib that can be used to access some other webpage and extract some information from it.

(Essentially, if I could get this script going in the form of an html document, I could just open one html document, rather than navigate to several different pages to get the information I want to check)

I'm pretty sure that javascript doesn't have the functionality I need, but I was wondering about other languages such as jQuery, or even something like AJAX?

A: 

You can embed Python. The most straightforward way would be to use the cgi module. If the script will be run often and you're using Apache it would be more efficient to use mod_python or mod_wsgi. You could even use a Python framework like Django and code the entire site in Python.

You could also code this in Javascript, but it would be much trickier. There's a lot of security concerns with cross-site requests (ah, the unsafe internet) and so it tends to be a tricky domain when you try to do it through the browser.

Kai
+1  A: 

There are two general approaches:

  • Modify your Python code so that it runs as a CGI (or WSGI or whatever) module and generate the page of interest by running some server side code.
  • Use Javascript with jQuery to load the content of interest by running some client side code.

The difference between these two approaches is where the third party server sees the requests coming from. In the first case, it's from your web server. In the second case, it's from the browser of the user accessing your page.

Some browsers may not handle loading content from third party servers very gracefully (that is, they might pop up warning boxes or something).

Greg Hewgill
+2  A: 

As Greg mentions, an Ajax solution will not work "out of the box" when trying to load from remote servers.

If, however, you are trying to load from the same server, it should be fairly straightforward. I'm presenting this answer to show how this could be done using jQuery in just a few lines of code.

<div id="placeholder">Please wait, loading...</div>

<script type="text/javascript" src="/path/to/jquery.js">
</script>
<script type="text/javascript>
$(document).ready(function() {
    $('#placeholder').load('/path/to/my/locally-served/page.html');
});
</script>

If you are trying to load a resource from a different server than the one you're on, one way around the security limitations would be to offer a proxy script, which could fetch the remote content on the server, and make it seem like it's coming from your own domain.

Here are the docs on jQuery's load method : http://docs.jquery.com/Ajax/load

There is one other nice feature to note, which is partial-page-loading. For example, lets say your remote page is a full HTML document, but you only want the content of a single div in that page. You can pass a selector to the load method, as in my example above, and this will further simplify your task. For example,

$('#placeholder').load('/path/to/my/locally-served/page.html #someTargetDiv');

Best of luck!
-Mike

Funka