views: 51
answers: 2

In the same vein as http://stackoverflow.com/questions/2593399/process-a-set-of-files-from-a-source-directory-to-a-destination-directory-in-pyth I'm wondering if it is possible to create a function that, given a web directory, lists out the files in that directory. Something like...

files = []

for file in urllib.listdir(dir):  # hypothetical function; urllib has no listdir()
    if file.isdir:
        pass  # handle this as a directory
    else:
        pass  # handle as a file

I assume I would need to use the urllib library, but there doesn't seem to be an easy way of doing this, at least none that I've seen.

+1  A: 

What is a web directory?

A web page has links. The page with the links may or may not be generated by the web server from the contents of a directory.

An example of automatically generated links can be found here; it is possibly the result of something like a mod_dir configuration in the web server, Apache.

What tools like wget and curl do is take a page and download all the links on that page, possibly recursively. I think that is the best you can achieve. And I have the feeling that questions about python + curl are abundant here on SO.
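As a rough sketch of that approach: fetch the index page, pull the `href` targets out of its anchor tags, and treat links ending in `/` as directories. The HTML sample, the `LinkParser` name, and the trailing-slash heuristic below are all assumptions for illustration, not a guaranteed server convention:

```python
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    """Collect href targets from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

# Hypothetical sample of what an Apache-style index listing might emit;
# in practice you would fetch this with urllib.request.urlopen(url).read().
sample = """
<html><body>
<a href="../">Parent Directory</a>
<a href="subdir/">subdir/</a>
<a href="notes.txt">notes.txt</a>
</body></html>
"""

parser = LinkParser()
parser.feed(sample)

for link in parser.links:
    if link.endswith("/"):
        print("directory:", link)
    else:
        print("file:", link)
```

This only works when the server actually renders a listing page; if directory indexes are disabled, there is nothing to parse.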

extraneon
A: 

You may have the concepts confused. A directory is a file-system concept. URLs do not have a concept of a directory. A URL path looks similar to a file-system path and often maps to a directory, but there is no requirement for it to be backed by a file system.

For example, http://stackoverflow.com/questions/2593399/ may map to a directory

/htdocs/questions/2593399/

But more likely it is generated from a database query and does not map to anything in the file system.
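To make that concrete, the path component of a URL is just a string that the server is free to interpret however it likes. A small illustration using the standard library's `urllib.parse` (Python 3):

```python
from urllib.parse import urlsplit

url = "http://stackoverflow.com/questions/2593399/"
path = urlsplit(url).path

# The path *looks* like a directory hierarchy...
print(path)  # /questions/2593399/

# ...but nothing forces the server to serve it from disk; it is
# typically routed to a handler that runs a database query instead.
```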

Wai Yip Tung