How can I download files from a website using wildcards in Python? I have a site that I need to download a file from periodically. The problem is that the filename changes each time, although a portion of it stays the same. How can I use a wildcard to specify the unknown portion of the filename in a URL?
+7
A:
If the filename changes, there must still be a link to the file somewhere (otherwise nobody would ever guess the filename). A typical approach is to get the HTML page that contains a link to the file, search through that looking for the link target, and then send a second request to get the actual file you're after.
Web servers generally do not implement the kind of "wildcard" facility you describe, so you must use other techniques.
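As a rough sketch of that two-step approach, using only the standard library: fetch the page, collect the link targets, keep those whose filename matches a shell-style pattern, then request the first match. The names LinkCollector, find_matching_links and download_first_match are made up for illustration, and the code assumes the file is linked from an ordinary <a href="..."> tag on a page you already know the URL of.

```python
import fnmatch
import posixpath
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_matching_links(html, base_url, pattern):
    """Return absolute URLs whose filename matches a shell-style pattern."""
    collector = LinkCollector()
    collector.feed(html)
    matches = []
    for href in collector.links:
        url = urljoin(base_url, href)  # resolve relative links
        filename = posixpath.basename(urlparse(url).path)
        if fnmatch.fnmatchcase(filename, pattern):
            matches.append(url)
    return matches


def download_first_match(page_url, pattern, dest):
    """First request: the page with the link. Second request: the file."""
    with urllib.request.urlopen(page_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    matches = find_matching_links(html, page_url, pattern)
    if not matches:
        raise LookupError(f"no link matching {pattern!r} on {page_url}")
    urllib.request.urlretrieve(matches[0], dest)
    return matches[0]
```

Calling something like download_first_match(page_url, "report_*.csv", "report.csv") would then save the first matching file locally; the pattern and destination here are placeholders. If several links match, you would probably want to sort or filter them rather than take the first.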
Greg Hewgill
2009-08-31 19:48:53
+1
A:
You could try logging in to the FTP server using ftplib. From the Python docs:
from ftplib import FTP
ftp = FTP('ftp.cwi.nl') # connect to host, default port
ftp.login() # user anonymous, passwd anonymous@
The ftp object has a dir method that lists the contents of a directory.
You could use this listing to find the name of the file you want.
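A minimal sketch of how that might look, assuming the files are actually reachable over anonymous FTP. It uses nlst rather than dir, because nlst returns the names as a list while dir prints a full listing by default. The helper names filter_names and download_matching are made up for illustration, and some servers return names from nlst with a directory prefix, which this does not handle.

```python
import fnmatch
from ftplib import FTP


def filter_names(names, pattern):
    """Keep only the names that match a shell-style wildcard pattern."""
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


def download_matching(host, directory, pattern, dest):
    """Download the first file in `directory` whose name matches `pattern`."""
    with FTP(host) as ftp:           # connect to host, default port
        ftp.login()                  # anonymous login
        ftp.cwd(directory)
        matches = filter_names(ftp.nlst(), pattern)
        if not matches:
            raise LookupError(f"no file matching {pattern!r} in {directory}")
        with open(dest, "wb") as out:
            ftp.retrbinary("RETR " + matches[0], out.write)
    return matches[0]
```

Here the wildcard is applied on the client side to the directory listing, so the server does not need to understand it.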
DoR
2009-08-31 20:18:42