tags:

views:

518

answers:

1

Hi there,

I'm attempting to use wget to recursively grab only the .jpg files from a particular website, with a view to creating an amusing screensaver for myself. Not such a lofty goal really.

The problem is that the pictures are hosted elsewhere (mfrost.typepad.com), not on the main domain of the website (www.cuteoverload.com).

I have tried using "-D" to specify the allowed domains, but sadly no cute jpgs have been forthcoming. How could I alter the line below to make this work?

wget -r -l2 -np -w1 -D www.cuteoverload.com,mfrost.typepad.com -A.jpg -R.html.php.gif www.cuteoverload.com/

Thanks.

+2  A: 

An examination of wget's man page[1] says this about -D:

Set domains to be followed. domain-list is a comma-separated list of domains. Note that it does not turn on -H.

This advisory about -H looks interesting:

Enable spanning across hosts when doing recursive retrieving.

So you merely need to add the -H flag to your invocation.
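Putting that together with your original command, a sketch of the corrected invocation might look like this (note that I've also separated the -R reject list with commas, since wget expects a comma-separated list there, just as with -D):

```shell
# -H  enable spanning across hosts during recursive retrieval
# -D  restrict that spanning to these two domains only
# -A  accept only .jpg files; -R reject html/php/gif (comma-separated)
wget -r -l2 -np -w1 -H \
  -D www.cuteoverload.com,mfrost.typepad.com \
  -A .jpg -R .html,.php,.gif \
  www.cuteoverload.com/
```

Without -H, the -D list is never consulted, because wget refuses to leave the starting host in the first place.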

(Having done this, it looks like all the images are restricted to mfrost.typepad.com/cute_overload/images/2008/12/07 and mfrost.typepad.com/cute_overload/images/2008/12/08).

-- [1] Although wget's primary reference manual is in info format.

Cirno de Bergerac