Does anyone know of a .NET library that will process HTML e-mails and can be used to trim out the reply-chain? It needs to be able to accept HTML -or- text mails and then trim out everything but the actual response, removing the trail of messages that are not original content. I don't expect it to be able to handle responses when they're interleaved into the previous mail ("responses in-line") - that case can fail.

We have a home-built one based on SgmlReader and a series of XSL transforms, but it requires constant maintenance to deal with new e-mail clients. I'd like to find one I can buy... :)

Thanks, Steve

+1  A: 

This does not answer much of your question, but the W3C's "Converting HTML to Other Formats" page has a section on converting HTML to text. I hope it helps someone develop a full answer to your question!

Joseph Holsten
+1  A: 

One free and very useful library we've used for dealing with HTML, including malformed HTML, is the HtmlAgilityPack.

There is no StripOutPreviousResponses() function, but it may help you with your home-made one.
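
For what it's worth, here is a rough sketch of the kind of heuristic a home-made trimmer usually starts from, written in Python only for brevity (the same regular expressions should port to .NET). It assumes the mail has already been reduced to plain text, and the names `REPLY_MARKERS` and `trim_reply_chain` are made up for illustration. The marker patterns are guesses at a few common client conventions, not a complete list, which is exactly why this approach needs ongoing maintenance.

```python
import re

# Hypothetical marker patterns: illustrative guesses at common client
# conventions, not an exhaustive or authoritative list.
REPLY_MARKERS = [
    # Attribution line such as "On Mon, Jan 5, 2009, Steve wrote:"
    re.compile(r"^On .+ wrote:\s*$"),
    # Divider such as "-----Original Message-----"
    re.compile(r"^-{2,}\s*Original Message\s*-{2,}\s*$", re.IGNORECASE),
    # Quoted line starting with ">"
    re.compile(r"^>"),
]


def trim_reply_chain(body: str) -> str:
    """Keep only the text above the first line that looks like quoted history."""
    kept = []
    for line in body.splitlines():
        candidate = line.strip()
        if any(pattern.match(candidate) for pattern in REPLY_MARKERS):
            break  # everything from here down is treated as the reply chain
        kept.append(line)
    return "\n".join(kept).rstrip()


if __name__ == "__main__":
    sample = (
        "Sounds good, see you then.\n"
        "\n"
        "On Mon, Jan 5, 2009, Steve wrote:\n"
        "> Are we still on for lunch?\n"
    )
    print(trim_reply_chain(sample))
```

Because it cuts at the first marker it sees, this sketch drops everything after the first quoted line, so in-line responses are lost, which is the case the question already allows to fail.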

Judah Himango
Thanks, Judah. We currently use the SgmlReader code, which might be an ancestor of HtmlAgilityPack. I'd love to move to a supported library. Unfortunately, it's the recognition of HTML intent that's our main problem here, rather than the manipulation of the HTML itself. But I appreciate the answer!
Steve Eisner