ansaurus

Question

How to scrape images from a web site with javascript and servlets

Answer 1

+1 A:

The JavaScript is probably manipulating the DOM and adding an image. If so, the image's path (ending in .jpg, .png or .gif) should appear somewhere inside the JavaScript file, and should look something like this:

var image = new Image();
image.src = "/path/to/image.jpg";

You can use regular expressions to extract the path and filename from the JavaScript code.
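As a rough sketch of that idea, the snippet below pulls quoted image paths out of a script's source text. The sample source and the function name `extractImagePaths` are made up for illustration, and the pattern only finds paths written as complete string literals; it will miss a path that the script assembles from variables.

```javascript
// Hypothetical sample of script source; a real script would be downloaded first.
const scriptSource = `
  var image = new Image();
  image.src = "/path/to/image.jpg";
  var thumb = '/thumbs/preview.png';
  var label = "not an image";
`;

// Return every quoted string literal that ends in .jpg, .jpeg, .png or .gif.
function extractImagePaths(source) {
  const pattern = /["']([^"']+\.(?:jpe?g|png|gif))["']/gi;
  const paths = [];
  let match;
  while ((match = pattern.exec(source)) !== null) {
    paths.push(match[1]); // capture group 1 is the path without the quotes
  }
  return paths;
}

console.log(extractImagePaths(scriptSource));
```

If the list comes back empty, the path is probably built at runtime, and inspecting the live DOM is the more reliable route.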

Luca Matteis 2010-01-26 21:19:32

OK, I updated the post to reflect what's going on. When I'm in Firefox and I press View->Page Source, I'm shown the exact source code as shown above. I had originally modified the URL a bit too much in order to protect some private information, but I've changed it to look more like what it looks like in reality now. There is nothing else in the page source; the 5 lines that you see above are all I see when I view the page source.

Lirik 2010-01-26 21:31:41

Have you tried downloading the HTML file with a download manager (not Firefox) and taking a look at the source?

svens 2010-01-26 21:50:02

@svens I have saved the page locally, I viewed the source in notepad++ and there is nothing different. It's identical to what I see in firefox too.

Lirik 2010-01-26 21:56:09

Use Firebug to inspect the DOM after the image is showing. If it's shown via HTML, you should see it there. Then it's a matter of writing some JS to find that DOM node. (If it's shown via Flash/ActiveX/etc., then this approach won't work.)
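A minimal sketch of such a lookup, assuming the image ends up as an ordinary `<img>` element. The function name `collectImageUrls` is hypothetical; it accepts any object with a `getElementsByTagName` method, so in a browser console you would pass `document`.

```javascript
// Collect the src of every <img> element in a document-like object.
// `doc` is assumed to expose getElementsByTagName, as a browser `document` does.
function collectImageUrls(doc) {
  const nodes = doc.getElementsByTagName("img");
  const urls = [];
  for (let i = 0; i < nodes.length; i++) {
    if (nodes[i].src) {
      urls.push(nodes[i].src); // skip <img> elements with no src set
    }
  }
  return urls;
}

// In a browser console, once the image is visible:
//   collectImageUrls(document);
```

Running it after the page's script has finished matters here, because the node does not exist in the static page source.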

Frank Schwieterman 2010-01-26 22:06:43

@Frank thank you VERY MUCH! After opening up the source code in firebug I was able to see the javascript code and I was able to figure out the variables required to get the image! Once I had the right tools, then all the other comments and answers made sense! :)

Lirik 2010-01-26 23:11:55

Answer 2

+1 A:

Instead of saving a local copy of the HTML file, you should save a local copy of the JavaScript file to see exactly how it adds the image to the page's DOM. That should let you figure out how to construct requests to get the images you need.
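Once the script shows which path and variables it uses, the request URL can be rebuilt outside the browser. The sketch below uses the standard `URL` class; the page address, the `/servlet/image` path and the `id` parameter are placeholders invented for illustration, not values from the site in question.

```javascript
// Resolve an image path found in the script against the page's URL,
// then attach whatever query parameters the script was seen to use.
function buildImageUrl(pageUrl, imagePath, params) {
  const url = new URL(imagePath, pageUrl); // handles relative and absolute paths
  for (const [key, value] of Object.entries(params || {})) {
    url.searchParams.set(key, String(value));
  }
  return url.href;
}

// Hypothetical values standing in for what the real script contains.
console.log(
  buildImageUrl("http://example.com/gallery/page.html", "/servlet/image", { id: 42 })
);
```

The resulting URL can then be requested with any HTTP client to download the image.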

Will McCutchen 2010-01-26 22:02:29
