Let's assume I browse a specific web page that uses JavaScript to update its view constantly (using Web 2.0 techniques to talk to its server and retrieve updated data).
Now I'd like to run some code on my own computer that monitors the contents and alerts me if some specific data appears on the page, so that I can record that data, for instance.
I am looking for ways to accomplish that. Since it's a private project, I am flexible in the choices of my tools (I can program in C and REALbasic, and could manage a little JavaScript as well). The only thing out of my control is the page I want to monitor.
I would prefer a solution I can employ on Mac OS X, but Linux or Windows would be feasible, too.
First, I wonder if there are already solutions for this out there. Something like a user-scriptable web browser, for instance.
If that's not available, I wonder how best to approach this by programming it myself. E.g., can someone tell me whether Apple's WebKit allows me to introspect a dynamically updating web page?
As a last resort, I guess I would have to insert my own JavaScript code into the viewed web page (I think I could do that easily at the time the page is loaded over the internet), and then have that script run periodically, introspecting the page it's in. The only thing I don't know in this case is how to get it to communicate with the outside, i.e. my computer. I could certainly write an app that it could try talking to, but how could it access my computer's resources at all to establish such communication? As far as I understand the sandboxing of web pages, they cannot read or write local files or communicate with a socket on the computer they're running on, or can they?
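To make the last-resort idea more concrete, here is a minimal sketch of the receiving side I have in mind, written in Python only for brevity. It assumes the injected script is actually allowed to POST a text snapshot of the page to a port on 127.0.0.1, which is exactly the part I am unsure about. The pattern, the names `find_matches`, `SnapshotHandler` and `make_server`, and the CORS header are placeholders of mine, not anything the page or a browser prescribes.

```python
import re
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical pattern for the data I want to be alerted about.
WATCH_PATTERN = re.compile(r"price:\s*\d+")

# Matches seen so far; a file or database would work just as well.
recorded = []


def find_matches(page_text):
    """Return every occurrence of WATCH_PATTERN in a page snapshot."""
    return WATCH_PATTERN.findall(page_text)


class SnapshotHandler(BaseHTTPRequestHandler):
    """Accepts a POSTed text snapshot and records any matching data."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8", errors="replace")
        recorded.extend(find_matches(body))
        self.send_response(204)  # nothing to send back to the script
        # Assumption: a browser may want this header before it lets a
        # script from another origin talk to this server.
        self.send_header("Access-Control-Allow-Origin", "*")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the console quiet


def make_server(port=0):
    """Bind to the loopback interface only; port 0 picks a free port."""
    return HTTPServer(("127.0.0.1", port), SnapshotHandler)
```

Calling `make_server(8765).serve_forever()` (the port number is arbitrary) would keep it listening; whether a sandboxed page can reach it is the open question.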
So, any ideas are welcome, as long as they take into account that I have to let a browser or its engine render the page and run the page's JavaScript.