Hi,

I used Python's imaplib to pull mail from a Gmail account, but I got an email with this confusing text body:

> RGF0ZSBldCBoZXVyZTogICAgICAgICAgICAgICAgICAgICAgICAgICAyMi8wOC8yMDEwIDE0
> OjMzOjAzIEdNVCBVbmtub3duDQpQcsOpbm9tOiAgICAgICAgICAgICAgICAgICAgICAgICAg
> ICAgICAgICAgamFjaW50bw0KTm9tOiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
> ICAgICBjYXJ2YWxobw0KRS1NYWlsOiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg

Can anyone help me read this text from my email?

Thx

lo

+11  A: 

It looks like Base64. In Python you can decode it with base64.b64decode or, in Python 2 only, str.decode('base64').

message = '''
RGF0ZSBldCBoZXVyZTogICAgICAgICAgICAgICAgICAgICAgICAgICAyMi8wOC8yMDEwIDE0
OjMzOjAzIEdNVCBVbmtub3duDQpQcsOpbm9tOiAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAgamFjaW50bw0KTm9tOiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICBjYXJ2YWxobw0KRS1NYWlsOiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
'''

print message.decode('base64')

Result:

Date et heure:                           22/08/2010 14:33:03 GMT Unknown
Prénom:                                   jacinto
Nom:                                     carvalho
E-Mail:

The é may display incorrectly, depending on your terminal's encoding. The text was encoded as UTF-8 before it was Base64-encoded, so you also need to decode the UTF-8 bytes:

print message.decode('base64').decode('utf-8')

Result:

...
Prénom:
...
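The snippets above use Python 2 syntax (the print statement and str.decode('base64')). A minimal sketch of the same two-step decoding in Python 3, assuming the body from the question is available as a string; the variable names raw and text are only illustrative:

```python
import base64

# Body copied from the question. b64decode ignores the line breaks
# by default, so the wrapped text can be passed in as-is.
message = """
RGF0ZSBldCBoZXVyZTogICAgICAgICAgICAgICAgICAgICAgICAgICAyMi8wOC8yMDEwIDE0
OjMzOjAzIEdNVCBVbmtub3duDQpQcsOpbm9tOiAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAgamFjaW50bw0KTm9tOiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICBjYXJ2YWxobw0KRS1NYWlsOiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
"""

raw = base64.b64decode(message)   # step 1: Base64 -> bytes (still UTF-8 encoded)
text = raw.decode("utf-8")        # step 2: UTF-8 bytes -> str
print(text)
```

In Python 3 the result of b64decode is a bytes object, so the UTF-8 step is explicit rather than something the terminal may or may not do for you.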

One other thing to be aware of is that there are different variants of Base64 that differ in the two symbols they use for values 62 and 63. With base64.b64decode you can specify these two characters through the altchars argument if the defaults (+ and /) don't work for you.
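To illustrate the variant issue, here is a small sketch comparing the standard alphabet with the URL-safe one. The sample bytes are made up and were chosen only because they encode to values 62 and 63:

```python
import base64

# Hypothetical sample bytes that map exactly to the values 62, 63, 63, 62.
data = b"\xfb\xff\xfe"

standard = base64.b64encode(data)          # uses "+" and "/" for 62 and 63
urlsafe = base64.urlsafe_b64encode(data)   # uses "-" and "_" instead

# Decoding the URL-safe form by naming the two alternative characters.
decoded = base64.b64decode(urlsafe, altchars=b"-_")

print(standard, urlsafe, decoded == data)
```

base64.urlsafe_b64decode does the same thing for this particular variant; altchars is the more general option when some other pair of characters is used.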

Mark Byers
OP kinda got lucky it didn't have any usernames/passwords in there :)
sdolan
+1  A: 

Mark is mostly correct, but it is also UTF-8 encoded, as evidenced by the bytes \xc3\xa9 for the é in "Prénom".

(And I'm trying to figure out why I get a proper decoding and Mark doesn't for the same code, but that's another issue.)
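A short sketch of why the same bytes can look right on one terminal and wrong on another: the two bytes of the UTF-8 encoded é are read as one character under UTF-8 but as two under Latin-1. The variable names are illustrative only:

```python
# The word as it appears in the decoded Base64 payload (UTF-8 bytes).
raw = b"Pr\xc3\xa9nom"

as_utf8 = raw.decode("utf-8")      # bytes C3 A9 become the single character é
as_latin1 = raw.decode("latin-1")  # each byte becomes its own character

print(as_utf8)
print(as_latin1)
```

A terminal configured for UTF-8 effectively performs the first decoding when raw bytes are printed, which is why the output can differ between machines for the same code.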

msw
@msw: Maybe your terminal is set to UTF-8 by default?
Mark Byers
@Mark: aye, 'twas it (learn something every day (and my mother said all that time spent forging passports would never do me any good...))
msw
A: 

It's Base64-encoded UTF-8 text. This particular text says:

Date et heure:                           22/08/2010 14:33:03 GMT Unknown
Prénom:                                   jacinto
Nom:                                     carvalho
E-Mail:                                
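Since the mail was fetched with imaplib, the standard email package can also undo the Base64 step when the message headers declare it. A hedged sketch: raw_email below is a hypothetical hand-built message standing in for the bytes returned by an IMAP fetch, and it assumes the headers name the transfer encoding and charset, which the question does not show:

```python
import email

# Hypothetical stand-in for the raw bytes of a fetched message.
raw_email = (
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"UHLDqW5vbTogamFjaW50bw==\r\n"
)

msg = email.message_from_bytes(raw_email)

# decode=True reverses the Content-Transfer-Encoding (Base64 here).
payload = msg.get_payload(decode=True)

# Fall back to UTF-8 if the message does not declare a charset.
charset = msg.get_content_charset() or "utf-8"
body = payload.decode(charset)

print(body)
```

For multipart messages get_payload returns a list of parts instead, so each text part would need the same treatment.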
editor
Thanks a lot... I succeeded. You are all great, thanks!
laurent
Great! Would you mind picking an answer to close this out?
editor