I'd like to scrape a website to programmatically collect any external links within any Flash elements on the page. I'd also like to collect any other text, if possible, but the links are the important part. Is this possible? A freeware library/service to accomplish this task would be preferable, but if none is available, how can I accomplish the task on my own? Is it possible to get the source code and pull the links from that?
As a very crude first step you could use Google to get a text snippet out of the swf, given that the swf has been indexed by Google and that you know its URL, e.g.:
http://www.google.com/search?q=site%3Awww.michaelgraves.com%2Fmga.swf
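Building that query URL can be scripted; here's a rough shell sketch (the swf address is just the example from above, and only the slashes are percent-encoded):

SWF="www.michaelgraves.com/mga.swf"
# encode the slashes and print the Google "site:" query for this swf
printf 'http://www.google.com/search?q=site%%3A%s\n' "${SWF//\//%2F}"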
Yanking "external links" out of a flash can be as simple as, for instance:
curl -s http://hostname/path/to/file.swf | strings | grep http
Of course, this'll fail if the author has made any attempt to hide the URL.
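If you want just the links rather than the whole matching lines, a slight variation (assuming GNU grep, for the -E and -o options) pulls out the bare URLs and dedupes them:

curl -s http://hostname/path/to/file.swf | strings | grep -Eo 'https?://[^ "]+' | sort -u

Also note that most swfs published for Flash 6 and later are zlib-compressed (the file starts with "CWS" rather than "FWS"), in which case strings won't see much of the body until it's decompressed.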
YMMV a lot. Good luck!
Decompiling the Flash source would let you see the ActionScript part of the Flash file, which in my experience is where links and similar strings usually end up.
A free decompiler is Flare. It's command line only, and works fine. It won't decode some of the info in newer Flash formats (>CS3 I think). It dumps all the AS into one file.
Sothink SWF Decompiler is a more sophisticated commercial program. It has worked fine with every Flash file I've tried, and the results are quite thorough and well organized. It's GUI-based and I don't know whether it is easily automated.
With Flare, since it's a command line tool, one could easily write a script to obtain the SWF, decompile it, grep for 'http://', and log the results.
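A minimal sketch of that pipeline, assuming Flare's usual invocation (flare file.swf, which writes the decompiled ActionScript to file.flr next to the input); the swf URL and file names here are placeholders:

#!/bin/sh
# Fetch the swf, decompile it with Flare, then pull out anything that looks like a link.
SWF_URL="http://hostname/path/to/file.swf"
curl -s -o movie.swf "$SWF_URL"
flare movie.swf    # should leave movie.flr (the decompiled ActionScript) beside the swf
grep -Eo 'https?://[^ "'\'']+' movie.flr | sort -u >> harvested-links.log

From there it's straightforward to loop over a list of swf URLs and aggregate the log.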