ansaurus

Question

Removing previous part from Reply emails

Answer 1

A:

No, an IMAP server will not and does not remove anything.
No such library exists because there is no standard; every email provider does it differently, and Gmail and the others have developed their own tools.
You have to look for a pattern that begins with headers naming the recipient as the sender, like...

From: <recipient>
From: "NAME" <recipient>
From: recipient

and you have to omit everything from this line downwards. However, checking for this line alone will not be sufficient: From is usually followed by Subject, Cc, To, etc., so the whole pattern needs to be checked. I think some open source project or text library may exist, but it is too difficult to find on Google.
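The check described above can be sketched roughly as follows. This is only an illustration under assumptions: the function name strip_quoted_original, the list of follow-up headers, and the four-line lookahead window are all made up for the example and are not taken from any mail library.

```python
import re

# Headers that commonly follow a quoted "From:" line (an assumed, incomplete list).
FOLLOW_UP = re.compile(r"^(Subject|To|Cc|Sent|Date):", re.IGNORECASE)


def strip_quoted_original(body: str, recipient: str) -> str:
    """Cut the body at a quoted header block whose From: line names `recipient`."""
    # Accepts: From: <addr>, From: "NAME" <addr>, From: addr
    from_line = re.compile(
        r'^From:\s*(?:"[^"]*"\s*)?<?' + re.escape(recipient) + r'>?\s*$',
        re.IGNORECASE,
    )
    lines = body.splitlines()
    for i, line in enumerate(lines):
        if from_line.match(line.strip()):
            # A lone From: line is not enough; require another header nearby.
            if any(FOLLOW_UP.match(l.strip()) for l in lines[i + 1:i + 5]):
                return "\n".join(lines[:i]).rstrip()
    return body  # no quoted header block found, leave the body untouched
```

A body that contains no such header block is returned unchanged, so the function errs on the side of keeping content.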

Akash Kava 2010-09-25 10:52:46

Answer 2

+1 A:

Personally I think that you are out of luck here, as the message copy is part of the body. So in order to remove it you will have to process the message's body and write an extraction method for each known format (obviously the problem is that you cannot know all possible formats).

So, instead of parsing the body why don't you persist the whole message into the database? Normally the size of the message should not be the problem with modern DBMS. If it really is a problem you always can compress the body and store it in a BLOB.
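As a minimal sketch of the compress-and-store idea, assuming a SQLite table named messages with a BLOB column (the table, column, and function names here are hypothetical):

```python
import sqlite3
import zlib


def store_message(conn: sqlite3.Connection, message_id: str, body: str) -> None:
    """Compress the full message body and store it as a BLOB."""
    blob = zlib.compress(body.encode("utf-8"))
    conn.execute("INSERT INTO messages (id, body) VALUES (?, ?)", (message_id, blob))


def load_message(conn: sqlite3.Connection, message_id: str) -> str:
    """Fetch and decompress a stored message body."""
    row = conn.execute(
        "SELECT body FROM messages WHERE id = ?", (message_id,)
    ).fetchone()
    return zlib.decompress(row[0]).decode("utf-8")


# In-memory database, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id TEXT PRIMARY KEY, body BLOB)")
```

Quoted reply chains repeat a lot of text, so they tend to compress well, but the actual saving depends on the messages.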

Obalix 2010-09-25 10:54:39

I disagree. Size is not the constraint most of the time, but we need to display only the message itself in the view, not the quoted replies.

Akash Kava 2010-09-25 11:25:52

I agree with you that the copied text is just clutter. However, one has to make a tradeoff: 1. Develop a filter that will only ever catch part of the clutter and that risks removing relevant content as well, which will most likely prove costly. Or 2. Live with the clutter and deliver the project with much lower risk. But as I said, it is a tradeoff!

Obalix 2010-09-25 13:48:47

Answer 3

A:

If you are able to associate a reply (RE:) message with the original/previous message that it is a reply to, then I would think that you could grab the body text of the original/previous message from your database, and then remove that text from the body of the reply. However, this method will not be 100% accurate, because clients could convert an HTML/Rich Text email into plain text, or vice versa. In any such case, this method probably wouldn't work. Even so, this technique would be generic and would probably work the majority of the time.
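A rough sketch of that idea, assuming both bodies are already available as plain text. The function names are made up for illustration, and the line-by-line comparison is just one possible way to do the removal:

```python
def normalize_line(line: str) -> str:
    """Drop leading quote markers and surrounding whitespace before comparing."""
    return line.lstrip("> \t").strip()


def remove_original(reply_body: str, original_body: str) -> str:
    """Drop every reply line that also appears in the original message."""
    original_lines = {
        normalize_line(l) for l in original_body.splitlines() if normalize_line(l)
    }
    kept = [
        l for l in reply_body.splitlines()
        if normalize_line(l) not in original_lines
    ]
    return "\n".join(kept).strip()
```

Note the limitation: a new line in the reply that happens to be identical to a line of the original (a bare "Thanks", say) is removed too, and any re-wrapping or HTML-to-text conversion by the client will defeat the comparison.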

In addition, the email provider may add certain header fields, or preambles, to the beginnings of a quoted message in a reply. In this case, I don't think there is any "catch all" solution.

My recommendation would be to target a few of the really huge web-mail providers (Gmail, Yahoo, Microsoft, etc), learn the formats that they use for their replies and parse the messages accordingly. In addition, you could likely handle a few generic formats as well. For instance, the '>' character is commonly used at the beginning of each line of quoted text in a reply.

If you're going to be developing in a language like C#, create yourself an Interface like IReplyFormat, with a corresponding implementation for each provider, and possibly some generic formats.
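The same design can be sketched in Python with an abstract base class in place of a C# interface. Everything here is hypothetical: the class names, the strip_quote method, and in particular the "On ... wrote:" attribution pattern, which is an assumed example format and not a documented provider specification.

```python
import re
from abc import ABC, abstractmethod
from typing import List, Optional


class ReplyFormat(ABC):
    """One implementation per provider or generic quoting convention."""

    @abstractmethod
    def strip_quote(self, body: str) -> Optional[str]:
        """Return the body without quoted text, or None if this format does not apply."""


class AngleQuoteFormat(ReplyFormat):
    """Generic format: quoted lines start with '>'."""

    def strip_quote(self, body: str) -> Optional[str]:
        lines = body.splitlines()
        kept = [l for l in lines if not l.lstrip().startswith(">")]
        if len(kept) == len(lines):
            return None  # nothing was quoted this way
        return "\n".join(kept).strip()


class AttributionLineFormat(ReplyFormat):
    """Assumed format: everything after an 'On ... wrote:' line is quoted."""

    _marker = re.compile(r"^On .+ wrote:\s*$", re.MULTILINE)

    def strip_quote(self, body: str) -> Optional[str]:
        match = self._marker.search(body)
        return body[:match.start()].strip() if match else None


def extract_reply(body: str, formats: List[ReplyFormat]) -> str:
    """Try each format in order; fall back to the unmodified body."""
    for fmt in formats:
        result = fmt.strip_quote(body)
        if result is not None:
            return result
    return body
```

Ordering matters: more specific formats should come before the generic ones, and an unrecognized message is returned whole rather than guessed at.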

I don't think you will find any catch-all/perfect solution to this problem, as there are simply too many mail providers with too many different formats. However, I think you can at the very least find some techniques, like the ones mentioned above, that will work for you more times than not, which is the best you can hope for at this point.

Justin Holzer 2010-09-30 13:16:29

Answer 4

A:

I agree with Obalix. It's too hard to filter out replies reliably, so you must keep the whole message. However, when you present the email to the user, you can hide some parts of it. Those parts can be revealed with an optional "Click here to see the full message" link or similar. For instance, a regular expression to match the leading '>' characters of quoted lines would look something like this: @"^[ \f\t\v>]*"
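Here is a small sketch of how that expression might be used to separate the visible part from the collapsible part. It reuses the pattern from the answer in Python syntax; the function name and the rule "a line is quoted if its leading run contains '>'" are assumptions for the example.

```python
import re
from typing import Tuple

# Same character class as in the answer: leading spaces, form feeds, tabs,
# vertical tabs and '>' characters.
QUOTE_PREFIX = re.compile(r"^[ \f\t\v>]*")


def split_for_display(body: str) -> Tuple[str, str]:
    """Return (visible, hidden): the new text and the quoted text to collapse."""
    visible, hidden = [], []
    for line in body.replace("\r\n", "\n").split("\n"):
        prefix = QUOTE_PREFIX.match(line).group(0)
        if ">" in prefix:
            hidden.append(line[len(prefix):])  # quoted line, marker removed
        else:
            visible.append(line)
    return "\n".join(visible).strip(), "\n".join(hidden).strip()
```

The stored message is never modified; only the view decides what to collapse, so nothing is lost if the filter guesses wrong.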

SlavaGu 2010-10-01 12:43:39
