I have a parsed email object, created like this:
import email.Parser

# Parse the message file into a Message instance (Python 2 email package).
fp = open(self.currentEmailPath, "rb")
p = email.Parser.Parser()
self._currentEmailParsedInstance = p.parse(fp)
fp.close()
From this object, self._currentEmailParsedInstance, I want to get the body of the email as plain text only, no HTML.
How do I do it?
Something like this?
newmsg=self._currentEmailParsedInstance.get_payload()
body=newmsg[0].get_content....?
And then strip the HTML from body? What exactly is the method that returns the actual text? Maybe I misunderstand you:
msg=self._currentEmailParsedInstance.get_payload()
print type(msg)
Output: <type 'list'>
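The list shown above is what `get_payload()` returns for a multipart message: one sub-message per MIME part. A minimal sketch of that behaviour follows, written for Python 3 (the snippets in this question are Python 2). The `RAW` message is a hypothetical two-part example built inline, not the email from this question.

```python
import email

# Hypothetical two-part message, built inline purely for illustration.
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="BOUNDARY"\n'
    "\n"
    "--BOUNDARY\n"
    'Content-Type: text/plain; charset="us-ascii"\n'
    "\n"
    "plain body\n"
    "--BOUNDARY\n"
    'Content-Type: text/html; charset="us-ascii"\n'
    "\n"
    "<p>html body</p>\n"
    "--BOUNDARY--\n"
)

msg = email.message_from_string(RAW)

# For a multipart message, get_payload() returns a list of sub-messages.
parts = msg.get_payload()
print(msg.is_multipart(), type(parts).__name__, len(parts))

# Each item in the list is itself a message with its own content type.
content_types = [part.get_content_type() for part in parts]
print(content_types)

# A non-multipart part returns its body as a string instead of a list.
print(repr(parts[0].get_payload()))
```

Checking `is_multipart()` first tells you whether to expect a list or a string from `get_payload()`.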
The email:
Return-Path: [email protected]
Received: from xx.xx.net (xxxx) by mxx3.xx.net (xxx)
id 485EF65F08EDX5E12 for [email protected]; Thu, 23 Oct 2008 06:07:51 +0200
Received: from xxxxx2 (ccc) by fxxxx.net (ccc) (authenticated as [email protected])
id 48798D4001146189 for [email protected]; Thu, 23 Oct 2008 06:07:51 +0200
From: "xxx" [email protected]
To: [email protected]
Subject: FW: xxx
Date: Thu, 23 Oct 2008 12:07:45 +0800
Organization: xx
Message-ID: <001601c934c4$xxxx30$a9ff460a@xxx>
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----=_NextPart_000_0017_01C93507.F6F64E30"
X-Mailer: Microsoft Office Outlook 11
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138
Thread-Index: Ack0wLaumqgZo1oXSBuIpUCEg/wfOAABAFEA
This is a multi-part message in MIME format.
------=_NextPart_000_0017_01C93507.F6F64E30
Content-Type: multipart/alternative;
boundary="----=_NextPart_001_0018_01C93507.F6F64E30"
------=_NextPart_001_0018_01C93507.F6F64E30
Content-Type: text/plain;
charset="us-ascii"
Content-Transfer-Encoding: 7bit
From: xxxx.xxxx [mailto:[email protected]]
Sent: Thursday, October 23, 2008 11:37 AM
To: [email protected]
Subject: S/I for xxxxx (B/L
No.:4357-0120-810.044)
Pls find attached the xxxx.doc),
Thanks.
B.rgds,
xxx xxx
------=_NextPart_001_0018_01C93507.F6F64E30
Content-Type: text/html;
charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
xmlns:o=3D"urn:schemas-microsoft-com:office:office" =
xmlns:w=3D"urn:schemas-microsoft-com:office:word" =
xmlns:st1=3D"urn:schemas-microsoft-com:office:smarttags" =
xmlns=3D"http://www.w3.org/TR/REC-html40">
HTML STUFF till
------=_NextPart_001_0018_01C93507.F6F64E30--
------=_NextPart_000_0017_01C93507.F6F64E30
Content-Type: application/msword;
name="xxxx.doc"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename="xxxx.doc"
0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/CQAGAAAAAAAAAAAAAAABAAAAYAAAAAAAAAAA EAAAYgAAAAEAAAD+////AAAAAF8AAAD///////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////// ///////////////////////////////////////////////////////////////////////////s pcEAI2AJBAAA+FK/AAAAAAAAEAAAAAAABgAAnEIAAA4AYmpiaqEVoRUAAAAAAAAAAAAAAAAAAAAA AAAECBYAMlAAAMN/AADDfwAAQQ4AAAAAAAAPAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD//w8AAAAA AAAAAAD//w8AAAAAAAAAAAD//w8AAAAAAAAAAAAAAAAAAAAAAKQAAAAAAEYEAAAAAAAARgQAAEYE AAAAAAAARgQAAAAAAABGBAAAAAAAAEYEAAAAAAAARgQAABQAAAAAAAAAAAAAAFoEAAAAAAAA4hsA AAAAAADiGwAAAAAAAOIbAAA4AAAAGhwAAHwAAACWHAAARAAAAFoEAAAAAAAABzcAAEgBAADmHAAA FgAAAPwcAAAAAAAA/BwAAAAAAAD8HAAAAAAAAPwcAAAAAAAA/BwAAAAAAAD8HAAAAAAAAPwcAAAA AAAAMjYAAAIAAAA0NgAAAAAAADQ2AAAAAAAANDYAAAAAAAA0NgAAAAAAADQ2AAAAAAAANDYAACQA AABPOAAAaAIAALc6AACOAAAAWDYAAGkAAAAAAAAAAAAAAAAAAAAAAAAARgQAAAAAAABHLAAAAAAA AAAAAAAAAAAAAAAAAAAAAAD8HAAAAAAAAPwcAAAAAAAARywAAAAAAABHLAAAAAAAAFg2AAAAAAAA
------=_NextPart_000_0017_01C93507.F6F64E30--
I just want to get:
From: xxxx.xxxx [mailto:[email protected]]
Sent: Thursday, October 23, 2008 11:37 AM
To: [email protected]
Subject: S/I for xxxxx (B/L
No.:4357-0120-810.044)
Pls find attached the xxxx.doc),
Thanks.
B.rgds,
xxx xxx
I'm not sure whether the mail is malformed. It seems that if the mail also carries an HTML version, you have to do this:
parts=self._currentEmailParsedInstance.get_payload()
print parts[0].get_content_type()
... multipart/alternative
textParts=parts[0].get_payload()
print textParts[0].get_content_type()
... text/plain
body=textParts[0].get_payload()
print body
... and I get the text without a problem!
Thank you so much, Vinko.
So it's kind of like dealing with XML: recursive in nature.
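Because the structure is recursive, hard-coded indexes such as `parts[0]` only work for this particular layout. One way to avoid them is `Message.walk()`, which visits every part of the tree depth first. The sketch below is written for Python 3 and is only an illustration: the helper name `get_plain_text_body` and the `RAW` sample (which mimics the multipart/mixed layout of the email above) are hypothetical, and it assumes the first non-attachment text/plain part is the body you want.

```python
import email


def get_plain_text_body(msg):
    """Return the first non-attachment text/plain part as text, or None."""
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        # Skip plain-text files that were sent as attachments.
        disposition = str(part.get("Content-Disposition", ""))
        if disposition.lower().startswith("attachment"):
            continue
        # decode=True undoes base64 / quoted-printable and returns bytes.
        payload = part.get_payload(decode=True)
        charset = part.get_content_charset() or "us-ascii"
        return payload.decode(charset, "replace")
    return None


# Hypothetical sample: multipart/mixed wrapping multipart/alternative
# plus one attachment, similar in shape to the email in the question.
RAW = "\n".join([
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="OUTER"',
    "",
    "--OUTER",
    'Content-Type: multipart/alternative; boundary="INNER"',
    "",
    "--INNER",
    'Content-Type: text/plain; charset="us-ascii"',
    "Content-Transfer-Encoding: 7bit",
    "",
    "Pls find attached the file.",
    "Thanks.",
    "--INNER",
    'Content-Type: text/html; charset="us-ascii"',
    "",
    "<p>Pls find attached the file.</p>",
    "--INNER--",
    "",
    "--OUTER",
    'Content-Type: application/octet-stream; name="file.bin"',
    "Content-Transfer-Encoding: base64",
    'Content-Disposition: attachment; filename="file.bin"',
    "",
    "AAAA",
    "--OUTER--",
    "",
])

body = get_plain_text_body(email.message_from_string(RAW))
print(body)
```

If a message has no text/plain part at all (HTML only), the helper returns `None`, and that is the case where stripping HTML from the text/html part would actually be needed.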