Hi, I have a file that contains a TIFF image and an XML document in a multipart MIME document. I would like to extract the image from this file. How can I do that?

I have this code, but with a big file (for example 30 MB) it takes practically forever to extract the image, so it is useless.

import email
import sys

f=open("content_file.txt","rb")
msg = email.message_from_file(f)
j=0
image=False
for i in msg.walk():
    if i.is_multipart():
        #print "MULTIPART: "
        continue
    if i.get_content_maintype() == 'text':
        j=j+1
        continue
    if i.get_content_maintype() == 'image':
        image=True
        j=j+1
        pl = i.get_payload(decode=True)
        localFile = open("map.out.tiff", 'wb')
        localFile.write(pl)
        continue
        f.close()
    if (image==False):
        sys.exit(0);

Thank you so much.

A: 

It is not quite clear to me why your code hangs. The indentation looks a bit wrong, and the opened files are not properly closed. You may also be running low on memory.

This version works fine for me:

import email
import mimetypes

with open('email.txt') as fp:
    message = email.message_from_file(fp)

for i, part in enumerate(message.walk()):
    if part.get_content_maintype() == 'image':
        filename = part.get_filename()
        if not filename:
            ext = mimetypes.guess_extension(part.get_content_type())
            filename = 'image-%02d%s' % (i, ext or '.tiff')
        with open(filename, 'wb') as fp:
            fp.write(part.get_payload(decode=True))

(Partly taken from http://docs.python.org/library/email-examples.html#email-examples)

Bernd Petersohn
It works fine for small files, but I have to handle big files (for example 30 MB), and there it doesn't work well. It takes too long, and the CPU is constantly loaded.
michele
Any suggestions? Thanks.
michele
A: 

Solved:

# Python 2 only: mimetools, multifile and StringIO are Python 2 modules.
import mimetools
import multifile
import StringIO

def extract_mime_part_matching(stream, mimetype):
    """Return the first element in a multipart MIME message on stream
    matching mimetype."""

    msg = mimetools.Message(stream)
    msgtype = msg.gettype()
    params = msg.getplist()

    data = StringIO.StringIO()
    if msgtype[:10] == "multipart/":

        file = multifile.MultiFile(stream)
        file.push(msg.getparam("boundary"))
        while file.next():
            submsg = mimetools.Message(file)
            try:
                # Decode each part into a fresh buffer.
                data = StringIO.StringIO()
                mimetools.decode(file, data, submsg.getencoding())
            except ValueError:
                continue
            if submsg.gettype() == mimetype:
                break
        file.pop()
    return data.getvalue()

From: http://docs.python.org/release/2.6.6/library/multifile.html
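Note for anyone on Python 3: mimetools and multifile no longer exist there. Below is a rough sketch of the same idea (walk the parts by boundary and decode only the one you want) that writes the decoded bytes straight to an output file instead of building the whole message in memory. The names extract_base64_part and _read_headers are my own, not a library API. It assumes a flat (non-nested) multipart message, a base64-encoded target part, and a stream opened in binary mode. I have not benchmarked it against the versions above.

```python
import base64
import email


def _read_headers(stream):
    """Read header lines up to the first blank line and parse them."""
    lines = []
    for line in iter(stream.readline, b""):
        if line in (b"\n", b"\r\n"):
            break
        lines.append(line)
    return email.message_from_bytes(b"".join(lines))


def extract_base64_part(stream, mimetype, out):
    """Write the first base64 part of type `mimetype` to `out`.

    Hypothetical helper: assumes a flat multipart message read from a
    binary stream. Returns True if a matching part was written.
    """
    boundary = _read_headers(stream).get_boundary()
    if boundary is None:
        return False  # not a multipart message
    delimiter = b"--" + boundary.encode("ascii")
    closing = delimiter + b"--"

    # Skip the preamble up to the first delimiter line.
    for line in iter(stream.readline, b""):
        if line.rstrip() == delimiter:
            break
    else:
        return False  # no part found at all

    while True:
        part = _read_headers(stream)
        encoding = str(part.get("Content-Transfer-Encoding", "")).strip().lower()
        wanted = part.get_content_type() == mimetype and encoding == "base64"
        pending = b""
        last_part = True  # stays True if the stream ends without a delimiter
        for line in iter(stream.readline, b""):
            marker = line.rstrip()
            if marker in (delimiter, closing):
                last_part = marker == closing
                break
            if wanted:
                # Decode only whole 4-character groups; keep the rest
                # for the next line.
                pending += line.strip()
                usable = len(pending) - len(pending) % 4
                out.write(base64.b64decode(pending[:usable]))
                pending = pending[usable:]
        if wanted:
            return True
        if last_part:
            return False
```

To use it, open the source with open("content_file.txt", "rb") and the target with open("map.out.tiff", "wb"), then call extract_base64_part(src, "image/tiff", dst). Parts of other types are skipped line by line without being decoded.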

Thank you for the support.

michele