Can wget be used to get all the files on a server? Suppose my site, foo.com, is built with the Django framework.

Suppose this is the directory structure:

            /web/project1
            /web/project2
            /web/project3
            /web/project4
            /web/templates

Without knowing the names of the directories (project1, project2, ...), is it possible to download all the files?

A: 

Try recursive retrieval: the -r option.

Jayan
Can you give an example? I tried the -drc options, but I'm not sure that's right.
Rajeev
Also, if this is allowed, it would be a security issue, right?
Rajeev
Most web servers let you specify whether the directory structure can be enumerated or not.
Jason
A: 

You could use

wget -r -np http://www.foo.com/pool/main/z/

-r (fetch files/folders recursively)

-np (do not ascend to the parent directory when retrieving recursively)

or

wget -nH --cut-dirs=2 -r -np http://www.foo.com/pool/main/z/

--cut-dirs=number (makes Wget ignore that number of leading remote directory components when it builds the local directory structure)

-nH (by default, invoking Wget with -r http://fly.srk.fer.hr/ creates a structure of directories beginning with fly.srk.fer.hr/; this option disables that behavior)
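
To make the effect of -nH and --cut-dirs easier to picture, here is a minimal Python sketch that approximates the local path a recursive Wget run would save a URL under. The helper name local_path, its parameters, and the file name file.txt are made up for illustration; this mimics the documented behavior and is not Wget's actual code.

```python
from urllib.parse import urlsplit


def local_path(url, no_host_dirs=False, cut_dirs=0):
    """Approximate the relative path a recursive Wget run saves `url` under.

    Hypothetical helper: no_host_dirs stands in for -nH and cut_dirs for
    --cut-dirs=N. It assumes the URL names a file and ignores query strings,
    index.html naming, and other details that real Wget handles.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # The last segment is the file name; everything before it is a directory.
    *dirs, filename = segments
    # --cut-dirs drops that many leading remote directory components.
    dirs = dirs[cut_dirs:]
    # Without -nH, the host name becomes the top-level local directory.
    if not no_host_dirs:
        dirs = [parts.hostname] + dirs
    return "/".join(dirs + [filename])


url = "http://www.foo.com/pool/main/z/file.txt"
print(local_path(url))                                 # www.foo.com/pool/main/z/file.txt
print(local_path(url, no_host_dirs=True))              # pool/main/z/file.txt
print(local_path(url, no_host_dirs=True, cut_dirs=2))  # z/file.txt
```

So with -nH --cut-dirs=2, a file under /pool/main/z/ ends up in a local z/ directory rather than in www.foo.com/pool/main/z/.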

MovieYoda
+1  A: 

First of all, wget can only retrieve files served by the web server. It's not clear from your question whether you mean the actual files on the server or the web pages. From the way you phrased it, I would guess your intent is to download the server files, not the web pages served by Django. If this is correct, then no, wget won't work. You need to use something like rsync or scp.

If you do mean using wget to retrieve all of the generated pages from Django, then this will only work if links point to those directories. So, you need a page that has code like:

<ul>
<li><a href="/web/project1">Project1</a></li>
<li><a href="/web/project2">Project2</a></li>
<li><a href="/web/project3">Project3</a></li>
<li><a href="/web/project4">Project4</a></li>
<li><a href="/web/templates">Templates</a></li>
</ul>

wget is not a psychic; it can only pull in pages it knows about.
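
To illustrate why links matter, here is a rough Python sketch of the discovery step a recursive downloader performs: it parses a page and follows only the href values it finds. The class name LinkCollector and the sample page are assumptions for illustration (the page deliberately leaves out project4 and templates); real wget does considerably more than this.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


# Hypothetical index page: project4 and templates may exist on the server,
# but nothing here links to them.
INDEX_HTML = """
<ul>
<li><a href="/web/project1">Project1</a></li>
<li><a href="/web/project2">Project2</a></li>
<li><a href="/web/project3">Project3</a></li>
</ul>
"""

collector = LinkCollector("http://foo.com/")
collector.feed(INDEX_HTML)
discovered = collector.links
print(discovered)
```

Only the three linked URLs show up in discovered. Anything the page does not link to stays invisible to the crawler, which is the same limitation wget -r has.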

Jordan Reiter