What sort of encoding, image, and formatting issues come up when building a webmail client that pulls email via POP3?

Some things I already know I will have to handle:

  1. attachments
  2. inline images
  3. html emails

What other possible headaches are there?

+1  A: 

Many!

I highly suggest reading the POP3 RFC as a starting point.

http://www.faqs.org/rfcs/rfc1939.html

You can also download a few open source projects to see how they implement the RFCs.

Pierre 303
Well, I'll be using JavaMail (javax.mail).
Blankman
Reading the RFC is, IMHO, the minimum you should do to learn what you are asking about.
Pierre 303
+2  A: 

It's quite a lot of work, and there are already plenty of solutions out there, but that shouldn't deter you! Your three points cover almost everything in general terms. The fact that the mail arrives over POP3 isn't all that relevant: IMAP, or even EWS (Exchange Web Services), all require attention to the following points:

  • Attachments can be referenced inline in an email (a combination of your points 1, 2 and 3): an HTML email can display an image that is itself an attachment.
  • There are many MIME types you have to support.
  • Emails can be single-part, multipart/mixed, multipart/alternative, or combinations thereof. A good newsletter will send both a text and an HTML version of the same content, leaving the client to choose which one to display. That email could have one or more attachments, and an attachment can itself be another text/HTML email with its own attachment, and so on ad nauseam.
  • HTML: as you've already pointed out, rendering email HTML inside your page without its styles clashing with your own is tricky. You'll also want to filter out bad content: potentially JavaScript includes, and embedded images that might have privacy implications.
  • There are several character encodings that can be used - this ties into MIME types, but is worth independently noting (for headache-worthiness alone).
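
To make the character-encoding point concrete: header values such as Subject often arrive as RFC 2047 encoded-words (for example `=?UTF-8?B?...?=`), which have to be decoded before display. Below is a minimal sketch using only the JDK. The class name `EncodedWordDecoder` and the method `decodeHeader` are made up for illustration, and the code assumes well-formed input and a charset the JVM knows. With JavaMail you would normally call `MimeUtility.decodeText` rather than rolling your own.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class EncodedWordDecoder {
    // One RFC 2047 encoded-word: =?charset?B-or-Q?payload?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?([bBqQ])\\?([^?]*)\\?=");

    /** Decodes every encoded-word in a raw header value; other text is kept as-is. */
    static String decodeHeader(String raw) {
        // Whitespace between two adjacent encoded-words is not part of the text.
        String joined = raw.replaceAll("(\\?=)\\s+(=\\?)", "$1$2");
        Matcher m = ENCODED_WORD.matcher(joined);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(joined, last, m.start());
            // Throws if the JVM does not know the charset; a real client needs a fallback.
            Charset charset = Charset.forName(m.group(1));
            byte[] bytes = m.group(2).equalsIgnoreCase("B")
                    ? Base64.getDecoder().decode(m.group(3))
                    : decodeQ(m.group(3));
            out.append(new String(bytes, charset));
            last = m.end();
        }
        out.append(joined, last, joined.length());
        return out.toString();
    }

    // "Q" encoding: '_' stands for a space, "=XX" is one byte written in hex.
    private static byte[] decodeQ(String payload) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (int i = 0; i < payload.length(); i++) {
            char c = payload.charAt(i);
            if (c == '_') {
                buf.write(' ');
            } else if (c == '=' && i + 2 < payload.length()) {
                buf.write(Integer.parseInt(payload.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                buf.write(c);
            }
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(decodeHeader("=?UTF-8?B?SGVsbG8=?=")); // Hello
    }
}
```

The same charset problem shows up again in message bodies, where the charset comes from the Content-Type header of each part instead of from the encoded-word itself.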

Basically, you have to be a jack of a number of trades to generate and decode emails.

Rudu
+1  A: 

I agree with Pierre that you should read the specs to fully understand what's going on behind the scenes.

One thing I would add, though, is that my main worries would be the security of the mailboxes you are reading, and spam. Emails often contain references to JavaScript or images that can be used to track whether the message has been opened. This is the main reason many mail clients don't show images unless you turn them on.

Along with the other measures, you will probably have to parse each message and strip any references that could cause privacy issues, unless the sender is trusted.
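
As a rough illustration of that stripping step, here is a sketch using only JDK regular expressions. The class `RemoteContentBlocker`, the method `sanitize`, and the `trustedSender` flag are all hypothetical names. Regex-based filtering is easy to bypass (event-handler attributes, CSS URLs and protocol-relative links are not covered here), so treat this as a picture of the idea and use a proper HTML sanitizer in a real client.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not a complete sanitizer.
class RemoteContentBlocker {
    // <script ...> ... </script>, case-insensitive, may span several lines
    private static final Pattern SCRIPT = Pattern.compile(
            "<script\\b[^>]*>.*?</script\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // <img> tags whose src is an http(s) URL; cid: references to inline
    // attachments do not match and are therefore kept
    private static final Pattern REMOTE_IMG = Pattern.compile(
            "<img\\b[^>]*\\bsrc\\s*=\\s*[\"']?https?://[^>]*>",
            Pattern.CASE_INSENSITIVE);

    /** Always drops script blocks; drops remote images unless the sender is trusted. */
    static String sanitize(String html, boolean trustedSender) {
        String cleaned = SCRIPT.matcher(html).replaceAll("");
        if (!trustedSender) {
            cleaned = REMOTE_IMG.matcher(cleaned).replaceAll("[remote image blocked]");
        }
        return cleaned;
    }

    public static void main(String[] args) {
        String html = "<p>Hi</p><script>alert(1)</script>"
                + "<img src=\"http://tracker.example/p.gif\">";
        System.out.println(sanitize(html, false));
    }
}
```

Keeping `cid:` images while blocking `http(s)` ones matches the usual client behaviour: inline attachments are already in the mailbox, so showing them tells the sender nothing.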

Steve Smith