views:

1822

answers:

5

How can I check if a file uploaded by a user is a real jpg file in Python (Google App Engine)?

This is how far I got by now:

Script receives image via HTML Form Post and is processed by the following code

...
incomming_image = self.request.get("img")
image = db.Blob(incomming_image)
...

I found mimetypes.guess_type, but it does not work for me.

A: 

You might have to check the bytes of the image http://www.obrador.com/essentialjpeg/headerinfo.htm describes how the jpeg header is set up

Harald Scheirich
+14  A: 

If you need more than looking at extension, one way would be to read the JPEG header, and check that it matches valid data. The format for this is:

Start Marker  | JFIF Marker | Header Length | Identifier
0xff, 0xd8    | 0xff, 0xe0  |    2-bytes    | "JFIF\0"

so a quick recogniser would be:

def is_jpg(filename):
    data = open(filename,'rb').read(11)
    if data[:4] != '\xff\xd8\xff\xe0': return False
    if data[6:] != 'JFIF\0': return False
    return True

However this won't catch any bad data in the body. If you want a more robust check, you could try loading it with PIL. eg:

from PIL import Image
def is_jpg(filename):
    try:
        i=Image.open(filename)
        return i.format =='JPEG'
    except IOError:
        return False
Brian
A: 

Use PIL. If it can open the file, it's an image.

From the tutorial...

>>> import Image
>>> im = Image.open("lena.ppm")
>>> print im.format, im.size, im.mode
S.Lott
This is not going to work in App Engine: PIL contains C code and is therefore not available. The Images API (http://code.google.com/appengine/docs/images/) uses PIL, but it's stubbed out.
chryss
+1  A: 

No need to use and install the PIL lybrary for this, there is the imghdr standard module exactly fited for this sort of usage.

See http://docs.python.org/library/imghdr.html

import imghdr

image_type = imghdr.what(filename)
if not image_type:
    print "error"
else:
    print image_type

As you have an image from a stream you may use the stream option probably like this :

image_type = imghdr.what(filename, incomming_image)


Actualy this works for me in Pylons (even if i have not finished everything) : in the Mako template :

${h.form(h.url_for(action="save_image"), multipart=True)}
Upload file: ${h.file("upload_file")} <br />
${h.submit("Submit", "Submit")}
${h.end_form()}

in the upload controler :

def save_image(self):
    upload_file = request.POST["upload_file"]
    image_type = imghdr.what(upload_file.filename, upload_file.value)
    if not image_type:
        return "error"
    else:
        return image_type
A: 

also PIL raises a memmory error if you upload some files. (FE: i tried to feed it a 8kb xls...)

Roel Kramer