I was able to produce useful output with the following command:
$ wget --spider -r -nv -nd -np http://localhost:3209/ 2>&1 | ack -o '(?<=URL:)\S+'
http://localhost:3209/
http://localhost:3209/robots.txt
http://localhost:3209/agenda/2008/08
http://localhost:3209/agenda/2008/10
http://localhost:3209/agenda/2008/09/01
http://localhost:3209/agenda/2008/09/02
http://localhost:3209/agenda/2008/09/03
^C
A quick reference of the wget
arguments:
# --spider don't download anything.
# -r, --recursive specify recursive download.
# -nv, --no-verbose turn off verboseness, without being quiet.
# -nd, --no-directories don't create directories.
# -np, --no-parent don't ascend to the parent directory.
About ack
ack
is like grep
but use perl
regexps, which are more complete/powerful.
-o
tells ack
to only output the matched substring, and the pattern I used looks for anything non-space preceded by 'URL:'