Hi,

One more question, following on from "Detecting language of email body":

I want to determine the language of the emails I receive on my system so that I can reply to the sender in the same language.

The email headers include one of this kind:

'Content-Type: text/plain; charset=ISO-8859-1'

How reliable is this header for determining the language of the email body?

For example (all headers taken from Gmail):

  1. for a Chinese subject and body: 'Content-Type: text/plain; charset=GB2312'

  2. for a Korean subject and body: 'Content-Type: text/plain; charset=EUC-KR'

  3. for a French/Italian subject and body: 'Content-Type: text/html; charset=ISO-8859-1'
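For reference, reading the charset out of such a header can be sketched with Python's standard `email` package. This is only a minimal sketch: the helper name `charset_of` is made up for illustration, and it looks only at the top-level Content-Type, so a multipart message would need each part inspected separately.

```python
from email import message_from_string


def charset_of(raw_message):
    """Return the charset declared in the top-level Content-Type header.

    The value comes back lowercased; None means no charset was declared.
    (charset_of is a hypothetical helper name, not part of any library.)
    """
    msg = message_from_string(raw_message)
    return msg.get_content_charset()


if __name__ == "__main__":
    sample = "Content-Type: text/plain; charset=GB2312\n\nbody text"
    print(charset_of(sample))
```

The charset is whatever the sending mail client chose to declare, so it is a hint about the encoding, not a statement about the language.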

Also, is there a list somewhere that maps languages to charsets?

Thanks in advance,

Ashish

+1  A: 

Here is the required list
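To show how such a list might be used, and where it stops helping, here is a rough sketch. The table below is a small illustrative assumption, not a complete or authoritative mapping, and `language_hint` is a hypothetical helper name. The point is that GB2312 and EUC-KR each point to one language, while ISO-8859-1 is shared by many Western European languages, so the charset alone cannot tell French from Italian.

```python
# Illustrative subset only (an assumption, not a full mapping):
# charset name -> language codes commonly written in that charset.
CHARSET_LANGUAGE_HINTS = {
    "gb2312": ["zh"],   # Simplified Chinese
    "euc-kr": ["ko"],   # Korean
    "iso-8859-1": ["fr", "it", "de", "es", "en"],  # many Western European languages
}


def language_hint(charset):
    """Return a language code only when the charset is unambiguous.

    Returns None for unknown charsets and for charsets shared by several
    languages; in those cases the body text itself has to be analysed.
    """
    candidates = CHARSET_LANGUAGE_HINTS.get((charset or "").lower(), [])
    return candidates[0] if len(candidates) == 1 else None


if __name__ == "__main__":
    for name in ("GB2312", "EUC-KR", "ISO-8859-1", "UTF-8"):
        print(name, language_hint(name))
```

Anything sent as UTF-8 falls into the "unknown" case as well, since UTF-8 can carry any language.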

I would suggest using the Google API to detect the language, as suggested here.

org.life.java