I would like to know if it is possible to parse .eml and .msg files in .NET (preferably from a MemoryStream) so that I can use them on an ASP.NET page.

+3  A: 

Yes, you can, at least for .eml files. They are just regular text files, nothing fancy.

This is what an .eml file looks like on the inside:

X-Sender: [email protected]
X-Receiver: [email protected]
MIME-Version: 1.0
From: [email protected]
To: [email protected]
Date: 7 Jun 2009 18:58:01 -0400
Subject: From someone you know
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable

This is the body
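
For a message as simple as the one above, the header block can be separated from the body by hand. The following is only a minimal sketch: `EmlSketch` and `ReadHeaders` are hypothetical names, not part of the .NET Framework or any library, and the code assumes a single-part message whose body starts after the first blank line. It does not handle multipart messages, encoded header words, or repeated headers.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not a library API): splits a simple,
// single-part message into header fields and the raw body.
public static class EmlSketch
{
    public static Dictionary<string, string> ReadHeaders(TextReader reader, out string body)
    {
        // Header names are case-insensitive.
        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string lastName = null;
        string line;

        // The header block ends at the first empty line.
        while ((line = reader.ReadLine()) != null && line.Length > 0)
        {
            if ((line[0] == ' ' || line[0] == '\t') && lastName != null)
            {
                // A line starting with whitespace continues the previous header.
                headers[lastName] += " " + line.Trim();
                continue;
            }

            int colon = line.IndexOf(':');
            if (colon <= 0)
            {
                continue; // not a "Name: value" line, skip it
            }

            lastName = line.Substring(0, colon).Trim();
            // Simplification: a repeated header overwrites the earlier value.
            headers[lastName] = line.Substring(colon + 1).Trim();
        }

        // Everything after the blank line is the still-encoded body.
        body = reader.ReadToEnd();
        return headers;
    }
}
```

To read from a MemoryStream instead of a string, wrap it first, for example `new StreamReader(stream, Encoding.ASCII)`, and pass that reader to `ReadHeaders`.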
Daniel A. White
Hmmm... OK, great. I thought .eml was the default format for emails saved from Outlook, but it turns out .msg is (which doesn't seem to be a text format), so where do .eml files come from?
CodeKiwi
From Outlook Express, and ASP.NET also drops them off when you have the pickup directory set.
Daniel A. White
What about when the email has html as content-type?
Eduardo Molteni
I would imagine you look for the content type, and the HTML would follow after it.
Daniel A. White
Nope, it's encoded.
Eduardo Molteni
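
To illustrate what "encoded" can mean here: the sample headers above declare `Content-Transfer-Encoding: quoted-printable`, where `=XX` stands for a byte in hex and a trailing `=` marks a soft line break. Below is a rough sketch of a decoder for that one encoding. `QuotedPrintableSketch` is a hypothetical name, the input is assumed to be ASCII text, and the charset must be taken from the Content-Type header. Base64 bodies and malformed input need separate handling.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

// Hypothetical helper (not a library API): decodes a quoted-printable body.
public static class QuotedPrintableSketch
{
    public static string Decode(string input, Encoding charset)
    {
        var bytes = new MemoryStream();
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c != '=')
            {
                // Assumption: the input is ASCII, so the cast is lossless.
                bytes.WriteByte((byte)c);
                continue;
            }

            // Soft line break: "=" directly followed by LF or CRLF.
            if (i + 1 < input.Length && input[i + 1] == '\n')
            {
                i += 1;
                continue;
            }
            if (i + 2 < input.Length && input[i + 1] == '\r' && input[i + 2] == '\n')
            {
                i += 2;
                continue;
            }

            // Escaped byte: "=" followed by two hex digits.
            byte value;
            if (i + 2 < input.Length &&
                byte.TryParse(input.Substring(i + 1, 2), NumberStyles.AllowHexSpecifier,
                              CultureInfo.InvariantCulture, out value))
            {
                bytes.WriteByte(value);
                i += 2;
                continue;
            }

            // Malformed escape: keep the "=" literally.
            bytes.WriteByte((byte)'=');
        }

        // Interpret the decoded bytes using the charset of the message part.
        return charset.GetString(bytes.ToArray());
    }
}
```

An HTML part would go through the same step before the markup becomes readable; real messages also use base64 and nest parts inside multipart containers, which this sketch ignores.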
A good EML/MIME parser should be able to correctly parse the MIME Torture sample message from Mark Crispin (coauthor of MIME and IMAP RFCs). The structure of the message can be displayed at https://www.rebex.net/secure-mail.net/sample-mime-explorer.aspx . The test message is included in the download package. Several attachments, signatures, different encoding types and quirks make the task quite interesting ;-)
Martin Vobr
A: 

EML (MIME messages)

EML files are in most cases MIME-encoded mail messages. Common sources of EML files include messages saved from Outlook Express or Thunderbird and messages downloaded from IMAP or POP3 servers.

Loading an EML file correctly is not as easy as it looks. You can write an implementation that works in 95% of cases within a few days. The remaining 5% would take at least several months ;-). I know, because I was involved in developing one.

Consider the following difficulties:

  • Unicode emails
  • right-to-left languages
  • correcting malformed EML files caused by well-known errors in popular mail clients and servers
  • dealing with S/MIME (encrypted and signed email messages)
  • dealing correctly with several methods of encoding attachments
  • dealing with inline images and stylesheets embedded into HTML emails
  • making sure that it correctly parses the MIME torture message from Mark Crispin (coauthor of MIME and IMAP RFCs)
  • making sure that a malformed message will not result in a buffer overrun or other application crash
  • handling hierarchical messages (message with attached messages)
  • making sure that it handles correctly very big emails

Maturing such a parser takes years and continuous feedback from its users. Right now there is no such parser included in the .NET Framework. Until that changes, I would suggest getting a third-party MIME parser from an established vendor.

The following code uses our Rebex Secure Mail component, but I'm sure that a similar task could be replicated easily with components from other vendors as well.

The code is based on the Mail Message tutorial.

using System.IO;
using Rebex.Mail;

// create an instance of MailMessage
MailMessage message = new MailMessage();

// load the message from a local disk file
message.Load("c:\\message.eml");

// load the message from a MemoryStream
MemoryStream stream = new MemoryStream();
// TODO: fill the stream (e.g. with the bytes of an uploaded file)
// rewind to the beginning so that Load reads from the start
stream.Position = 0;
message.Load(stream);

MSG (Outlook messages)

MSG is an email message format introduced by Microsoft in Microsoft Outlook. There is a format specification on the Microsoft site. You may also want to try a third-party component. I'm aware of one MSG format component from IndependentSoft but haven't tried it personally.
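
If the page receives a stream without a reliable file extension, one way to decide which parser to use is to look at the first bytes. MSG files are stored as OLE compound files, which start with a fixed 8-byte signature, while EML files are plain text. The sketch below is only an illustration: `MailFileSniffer` and `LooksLikeMsg` are hypothetical names, and the stream is assumed to be seekable (a MemoryStream is). Other compound files, such as old .doc or .xls documents, share the same signature, so a match is a hint rather than proof.

```csharp
using System;
using System.IO;

// Hypothetical helper (not a library API): guesses whether a stream holds
// an Outlook .msg file by checking the OLE compound file signature.
public static class MailFileSniffer
{
    private static readonly byte[] CompoundFileSignature =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    public static bool LooksLikeMsg(Stream stream)
    {
        long start = stream.Position;
        var header = new byte[CompoundFileSignature.Length];

        // Read may return fewer bytes than requested, so loop until done.
        int read = 0;
        while (read < header.Length)
        {
            int n = stream.Read(header, read, header.Length - read);
            if (n == 0)
            {
                break; // end of stream
            }
            read += n;
        }

        // Restore the position so that a parser can read from the start.
        stream.Position = start;

        if (read < header.Length)
        {
            return false; // too short to be a compound file
        }
        for (int i = 0; i < header.Length; i++)
        {
            if (header[i] != CompoundFileSignature[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

A stream for which the check returns false can then be handed to the EML/MIME parser, and one for which it returns true to an MSG component.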

Martin Vobr