In short, you should use IETF language tags, because they are already used by HTTP, HTML, XML and many other technologies. They are based on several standards, including the ISO 639 family (yes, language, region and culture selection is not so simple to define).
I wrote a more detailed article on choosing and using the proper language codes. The idea is to use the simplest, shortest ISO 639-1 codes and to add more detail only for special cases. The article covers the codes for the ~30 most used languages and explains why I consider one alternative better than another.
If you want to skip reading the entire article, here is a short list of language codes (not to be confused with country codes): ar, cs, da, de, el, en, en-gb, es, fr, fi, he, hu, it, ja, ko, nb, nl, pl, pt, pt-pt, ro, ru, sv, tr, uk, zh, zh-hant
As you may observe, some of the choices are not so obvious:
- en is used for en-us (American English); British English uses en-gb.
- pt is used for pt-br, and not for pt-pt, which has far fewer speakers.
- zh is used instead of zh-hans, zh-CN, ...
- zh-hant (Traditional Chinese) is used instead of more specific codes like zh-hant-TW or zh-TW.
You can find more explanations inside the article.
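
As a rough sketch of how the short list could be applied in code, the Python snippet below maps an incoming tag to one of the codes above. The names SUPPORTED, ALIASES and normalize_language_tag, the fallback to "en", and the acceptance of underscores are my own assumptions for illustration, not part of any standard or library. It lowercases the tag, applies the few explicit substitutions mentioned above, and otherwise keeps dropping the last subtag until it finds a supported code.

```python
# Short list of supported codes, copied from the answer above.
SUPPORTED = frozenset({
    "ar", "cs", "da", "de", "el", "en", "en-gb", "es", "fr", "fi", "he",
    "hu", "it", "ja", "ko", "nb", "nl", "pl", "pt", "pt-pt", "ro", "ru",
    "sv", "tr", "uk", "zh", "zh-hant",
})

# Explicit substitutions (hypothetical table, built from the remarks above).
# zh-tw needs an entry because simple truncation would wrongly yield "zh".
ALIASES = {
    "en-us": "en",
    "pt-br": "pt",
    "zh-hans": "zh",
    "zh-cn": "zh",
    "zh-tw": "zh-hant",
}


def normalize_language_tag(tag, default="en"):
    """Map a language tag to one of the SUPPORTED short codes.

    Assumption: unknown languages fall back to `default`.
    """
    # Tags are compared case-insensitively; underscores (as in "en_US")
    # are accepted here only as a convenience.
    parts = tag.strip().replace("_", "-").lower().split("-")
    while parts:
        candidate = "-".join(parts)
        candidate = ALIASES.get(candidate, candidate)
        if candidate in SUPPORTED:
            return candidate
        parts.pop()  # drop the most specific subtag and try again
    return default


if __name__ == "__main__":
    for tag in ("en-US", "en-GB", "pt-BR", "pt-PT", "zh-CN", "zh-Hant-TW"):
        print(tag, "->", normalize_language_tag(tag))
```

With this approach a tag such as fr-CA falls back to fr, while en-GB and pt-PT keep their explicit codes because they are in the supported list.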