You need to use MIME. Add mail headers:
MIME-Version: 1.0
Content-Type: text/plain;charset=utf-8
(If you are already using a MIME multipart/alternative to put HTML and text in the same mail, you put the Content-Type: text/plain;charset=utf-8 on the sub-headers of the text part instead.)
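As a rough sketch of the multipart case, the body could be assembled like this. The helper name build_alternative_body and the boundary string are made up for illustration, and both parts are base64-encoded here simply to keep the example 7-bit safe:

```php
<?php
// Hypothetical helper: build a multipart/alternative body from a text and an HTML part.
// Each part carries its own Content-Type sub-header with charset=utf-8.
function build_alternative_body($text, $html, $boundary)
{
    return "--$boundary\r\n"
         . "Content-Type: text/plain;charset=utf-8\r\n"
         . "Content-Transfer-Encoding: base64\r\n"
         . "\r\n"
         . chunk_split(base64_encode($text))   // chunk_split wraps at 76 chars and ends with CRLF
         . "--$boundary\r\n"
         . "Content-Type: text/html;charset=utf-8\r\n"
         . "Content-Transfer-Encoding: base64\r\n"
         . "\r\n"
         . chunk_split(base64_encode($html))
         . "--$boundary--\r\n";                // closing boundary
}

// Assumption: any boundary works as long as it cannot occur inside a part.
// Base64 output never contains "-", so this one is safe.
$boundary = 'example-boundary-42';
$headers  = "MIME-Version: 1.0\r\n"
          . "Content-Type: multipart/alternative; boundary=\"$boundary\"";
$body     = build_alternative_body("Hello \xC3\xA1", "<p>Hello \xC3\xA1</p>", $boundary);
```

The top-level headers then only declare multipart/alternative; the charset lives on each part.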
This is assuming that the encoding you'll be sending your “international” characters in is UTF-8. If you are expecting to cater for multiple countries, UTF-8 is the only reasonable choice of encoding to use throughout your application, but if you haven't really thought about that yet, your site may be defaulting to a Western European encoding. Check that things like Chinese characters work correctly in your site and database before worrying about them in mail.
Derail: there are locales where sending mail in UTF-8 isn't the most effective thing. I don't know about China, but in Japan there are still some backwards and ridiculous mail systems (especially webmail) that can't cope with Unicode and have to be given a locale-specific encoding such as Shift-JIS instead. If you are concentrating on those markets you'll often end up having to use iconv to create specially-encoded versions of the mail. Unpleasant.
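For what it's worth, the iconv step might look something like the sketch below. The helper name to_shift_jis is hypothetical, and the encoding name 'SHIFT_JIS' is an assumption: the names iconv accepts vary between builds, so check what yours supports.

```php
<?php
// Hypothetical helper: convert a UTF-8 string for a legacy Japanese mail client.
// 'SHIFT_JIS' is an assumption; accepted encoding names differ between iconv builds.
function to_shift_jis($utf8)
{
    $converted = iconv('UTF-8', 'SHIFT_JIS', $utf8);
    if ($converted === false) {
        // iconv() returns false when the conversion fails.
        throw new RuntimeException('Text cannot be converted to Shift-JIS');
    }
    return $converted;
}

$body    = to_shift_jis("\xE3\x81\x82"); // the hiragana "a", given as UTF-8 bytes
$headers = "MIME-Version: 1.0\r\n"
         . "Content-Type: text/plain;charset=Shift_JIS";
```

The charset in the Content-Type header has to match whatever encoding the body was actually converted to.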
Now, because many mail servers can't cope with non-ASCII characters in the mail body, you'll have to encode them. You can choose quoted-printable or base64 for this; quoted-printable is generally smaller and more readable for content that has ASCII characters in it too:
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: quoted-printable

Hello! An a-acute is =C3=A1
The function to encode in this format is quoted_printable_encode. However, you need PHP 5.3 or later to get that function; if you don't have it, you could set the Content-Transfer-Encoding to base64 instead and use base64_encode (wrapped in chunk_split so the encoded lines stay short).
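Putting the two options together, a minimal sketch could look like this. The helper name encode_utf8_body and the recipient address are invented for the example, and the mail() call is left commented out:

```php
<?php
// Hypothetical helper: pick a transfer encoding for a UTF-8 text body.
// Returns array(extra headers, encoded body).
function encode_utf8_body($body)
{
    if (function_exists('quoted_printable_encode')) {
        $encoding = 'quoted-printable';
        $encoded  = quoted_printable_encode($body);
    } else {
        // Fallback for PHP builds that lack quoted_printable_encode.
        $encoding = 'base64';
        $encoded  = chunk_split(base64_encode($body)); // wrap at 76 characters
    }
    $headers = "MIME-Version: 1.0\r\n"
             . "Content-Type: text/plain;charset=utf-8\r\n"
             . "Content-Transfer-Encoding: $encoding";
    return array($headers, $encoded);
}

list($headers, $body) = encode_utf8_body("Hello! An a-acute is \xC3\xA1");
// mail('someone@example.com', 'Test', $body, $headers);
```

The array() and list() syntax is used deliberately so the sketch still parses on the older PHP versions the fallback is meant for.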
Finally, if you want to include non-ASCII characters in the headers (for example in From, To or Subject), there is a completely different syntax:
Subject: =?utf-8?b?QW4gYS1hY3V0ZSBpcyDDoQ==?=
Where that QW...== mess in the middle is the base64_encode of “An a-acute is á” in UTF-8.
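A small sketch of producing that header value follows. The helper name encode_header_value is made up, and it only handles short strings: a single encoded-word is limited to 75 characters, so long subjects would need splitting into several, which this does not attempt.

```php
<?php
// Hypothetical helper: wrap a UTF-8 string as a base64 "encoded-word" for a mail header.
// Assumes the value is short enough to fit in one encoded-word.
function encode_header_value($text)
{
    return '=?utf-8?b?' . base64_encode($text) . '?=';
}

$subject = encode_header_value("An a-acute is \xC3\xA1");
// mail('someone@example.com', $subject, $body, $headers);
```

For From and To, only the display name should go through this; the address itself stays as plain ASCII outside the encoded-word.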