Hi,

I wonder how sites like Yahoo Mail or Gmail move the messages that we mark as spam into the spam folder. As far as I know, a Bayesian analysis algorithm checks whether a message is spam, based on its content or some other probability. But what algorithm do these sites (Yahoo Mail or Gmail) use to move a message from one folder to another dynamically?

+1  A: 

Most mail systems allow the insertion of filter programs that are used to, among other things, determine if a message is spam or not. Procmail is, perhaps, the best known of these. The basic process:

  1. Send mail to filter program.
  2. Filter program checks spamminess, adds header and/or subject information.
  3. Sorting program (procmail, etc.) looks for header/subject information indicating spam level. If above some threshold, deliver to Spam folder. If not, deliver to Inbox.
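
The three steps above can be sketched in a few lines of Python. This is only an illustration, not how procmail or any real filter works internally: the `X-Spam-Score` header name, the `SPAM_WORDS` list, the `THRESHOLD` value and all function names are made up for the example, and the scoring is a deliberately naive word count.

```python
from email.message import EmailMessage

# Hypothetical values chosen for this sketch only.
SPAM_WORDS = {"viagra", "lottery", "winner"}
THRESHOLD = 0.5


def spam_score(msg: EmailMessage) -> float:
    """Return the fraction of body words that appear in SPAM_WORDS."""
    words = msg.get_content().lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if w.strip(".,!?") in SPAM_WORDS)
    return hits / len(words)


def filter_message(msg: EmailMessage) -> EmailMessage:
    """Step 2: the filter program records its verdict in a header."""
    del msg["X-Spam-Score"]  # no error if the header is absent
    msg["X-Spam-Score"] = f"{spam_score(msg):.2f}"
    return msg


def choose_folder(msg: EmailMessage) -> str:
    """Step 3: the sorting program only looks at the header."""
    score = float(msg["X-Spam-Score"])
    return "Spam" if score >= THRESHOLD else "Inbox"
```

The point of the split is that the sorter never inspects the message body. It only reads what the filter wrote, so either half can be swapped out independently.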

Note that procmail and other similar software also allow a lot more functionality for automating delivery and/or filtering tasks - this is a fairly trivial example.

Harper Shelby
A: 

This is a strange question, but the literal answer is that email service providers like Google, Yahoo and so on would implement this differently, depending on how they internally store mail messages and folders. For example, if email messages are stored as individual files and folders are represented as directories, then moving an email to the spam folder would be done as a file rename / move. On the other hand, if mail is stored in an SQL database, moving a message from one folder to another would be an UPDATE to a row in (say) a mail descriptor TABLE.
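
To make the two representations concrete, here is a rough Python sketch of both. It assumes a made-up layout (one directory per folder, one file per message) and a made-up `messages(id, folder)` table in SQLite. Neither reflects what any real provider does.

```python
import sqlite3
import tempfile
from pathlib import Path


def move_file_message(root: Path, name: str, src: str, dst: str) -> Path:
    """Folders are directories: moving a message is a file rename."""
    target_dir = root / dst
    target_dir.mkdir(exist_ok=True)
    target = target_dir / name
    (root / src / name).rename(target)
    return target


def move_db_message(conn: sqlite3.Connection, message_id: int, dst: str) -> int:
    """Folders are a column: moving a message is a single UPDATE.

    Assumes a hypothetical table: messages(id INTEGER PRIMARY KEY, folder TEXT).
    Returns the number of rows changed (0 if the id does not exist).
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE messages SET folder = ? WHERE id = ?", (dst, message_id)
        )
    return cur.rowcount
```

In both cases the message content itself is untouched. Only the record of which folder it belongs to changes.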

There are many possible ways to represent email messages and folders. Each email service provider is likely to do it differently, and we have no way of knowing how they do it.

I would hesitate to call this process an "algorithm". Certainly, there will be no single algorithm, given that representations vary, and that models of what a folder is vary.

I don't see any connection between your question and the "java" or "javamail" tags. The chances are the big providers don't implement their email services in Java.

Stephen C
A: 

Check out POPFile (http://getpopfile.org/). The software lets you classify emails the same way you would sort spam, but into multiple folders. You just move an email into the correct folder and it starts learning.

Over time, it learns how it should classify email. It works using a Bayesian formula.
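
To give an idea of what "learning from moved emails" can look like, here is a minimal naive Bayes sketch in Python with add-one smoothing. It is not POPFile's actual code. The `FolderClassifier` class and its methods are invented for this example, and the tokenizer is just a whitespace split.

```python
import math
from collections import Counter, defaultdict


class FolderClassifier:
    """Toy multi-folder naive Bayes classifier (hypothetical, for illustration)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # folder -> word -> count
        self.message_counts = Counter()          # folder -> messages seen
        self.vocabulary = set()

    def train(self, folder: str, text: str) -> None:
        """Call this when the user moves a message into `folder`."""
        words = text.lower().split()
        self.word_counts[folder].update(words)
        self.message_counts[folder] += 1
        self.vocabulary.update(words)

    def classify(self, text: str):
        """Return the most probable folder, or None if nothing was trained."""
        words = text.lower().split()
        total = sum(self.message_counts.values())
        best_folder, best_score = None, -math.inf
        for folder, seen in self.message_counts.items():
            # log P(folder) + sum of log P(word | folder), add-one smoothed
            score = math.log(seen / total)
            denom = sum(self.word_counts[folder].values()) + len(self.vocabulary)
            for w in words:
                score += math.log((self.word_counts[folder][w] + 1) / denom)
            if score > best_score:
                best_folder, best_score = folder, score
        return best_folder
```

Every move the user makes is one more `train` call, so the word statistics for each folder improve over time. Working with log probabilities avoids underflow on long messages.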

ralu