views:

1152

answers:

2

Hi. I have part of a model defined like this:

logo_image = models.ImageField(upload_to=lambda i, fn: "logo_%s"%(fn), height_field="logo_image_height", width_field="logo_image_width")

and had a question about the upload_to function.

According to django's documentation for FileField.upload_to, the second paramater, filename is "The filename that was originally given to the file."

Now, knowing about HTTP, file uploads, etc, the end user's client can easily fake the filename. In particular, couldn't the end client upload a file called "/etc/passwd", for example, and then if I use my naive code (lambda i, fn: "logo_%s"%(fn)), wouldn't the resulting file be uploaded to /etc/passwd? Do I need to escape the filename parameter?

#using django's example of using full paths in settings module,
#MEDIA_ROOT="/tmp/media"
>>> os.path.join("/tmp/media/", "apple.jpg")
'/tmp/media/apple.jpg'
>>> os.path.join("/tmp/media/", "/etc/passwd")
'/etc/passwd'

Thanks for any suggestions / answers / clarification.

Edit

The important methods to look at are in files.py, near line 272:

272         def get_directory_name(self):
273             return os.path.normpath(force_unicode(datetime.datetime.now().strftime(smart_str(self.upload_to))))
274     
275         def get_filename(self, filename):
276             return os.path.normpath(self.storage.get_valid_name(os.path.basename(filename)))
277     
278         def generate_filename(self, instance, filename):
279             return os.path.join(self.get_directory_name(), self.get_filename(filename))

Defining a custom upload_to replaces generate_filename() as seen here:

226             if callable(upload_to):
227                 self.generate_filename = upload_to

Then, in the save() method:

89      def save(self, name, content, save=True):
90          name = self.field.generate_filename(self.instance, name)
91          self.name = self.storage.save(name, content)

And the returned filename is passed to the storage class which eventually calls a django replacement function in the _os.py util module safe_join.

That function appears to alleviate my fears:

24    def safe_join(base, *paths):
25      """
26      Joins one or more path components to the base path component intelligently.
27      Returns a normalized, absolute version of the final path.
28  
29      The final path must be located inside of the base path component (otherwise
30      a ValueError is raised).
31      """
A: 

Answered my own question.

The_OP
A: 

I think you've answered your own question. One point of clarification in that the way os.path.join() works is to strip out the preceding directories are tossed out (according to the Python docs related to os.path). So the behavior you observed in invoking os.path.join() is consistent with how it is described.

One other thing to note: The get_filename() function invokes os.path.basename(), which will strip out any directory paths and return back only the basename. So without an upload_to= parameter, there is no danger of this possibility.

However, if you override the ImageField() with your own upload_to function, this function will not be called and it may be better to invoke os.path.basename(). First, it will avoid saving the filename as the full directory path too. I've found it therefore preferable to also invoke os.path.basename() within my upload_to function. Has anyone else encoutnered this issue?

For more details, see: http://hustoknow.blogspot.com/2010/08/try-me-out.html

PRH