Hi,

I have heard that PHP6 will natively support Unicode, which should make multi-language support much easier. PHP5, however, has fairly weak support for Unicode and multiple languages (i.e. just a bunch of specialized string functions).

I was wondering what strategies you use to enable Unicode and multi-language support in your PHP5 applications.

Also, how do you store translations, since PHP5 doesn't have a WebResource file like ASP.NET does?

+1  A: 

Well, for app development in PHP I use CodeIgniter, which takes care of handling multiple language files. It's very powerful and easy to use.

Here is a link to their Language Class

flexterra
never used Codeigniter before, thanks +1
oykuo
+6  A: 

It's not all that hard really, but you may want to make your question a bit more specific.

If you're talking to a database, make sure your database stores data in UTF-8 and the connection to your database is in UTF-8 (a common pitfall). Make sure to run something like this when establishing a connection:

mysql_query('SET NAMES utf8');
mysql_query('SET CHARACTER SET utf8');

For user input, set the accept-charset attribute on your forms.

<form accept-charset="utf-8">

Serve your sites with an appropriate HTTP header:

header('Content-Type: text/html; charset=utf-8');

or at least set appropriate meta tags for your site:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Keep your source code files encoded in UTF-8.

If you keep everything in UTF-8, you usually don't need to worry about anything. It only gets problematic once you start mixing encodings throughout your app.

Once you get into string manipulation, of course, you'll have to take a little more care. Mostly you'll want to use the mb_* set of string functions, as you pointed out yourself.
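To illustrate why the mb_* functions matter, here is a minimal sketch comparing byte-oriented and multi-byte-aware calls. It assumes the mbstring extension is loaded; the sample word and variable names are made up for the example, and the string is written with explicit byte escapes so the result does not depend on how the source file is saved.

```php
<?php
// "héllo" as UTF-8: the "é" is the two bytes C3 A9.
$word = "h\xC3\xA9llo";

// strlen() counts bytes, mb_strlen() counts characters.
$byteLength = strlen($word);              // 6
$charLength = mb_strlen($word, 'UTF-8');  // 5

// substr() cuts on byte offsets and can split a character in half;
// mb_substr() cuts on character offsets.
$firstTwoBytes = substr($word, 0, 2);              // "h" plus the first byte of "é"
$firstTwoChars = mb_substr($word, 0, 2, 'UTF-8');  // "hé"

// mb_strtoupper() is encoding-aware, so the accented letter is converted too.
$upper = mb_strtoupper($word, 'UTF-8');   // "HÉLLO"

echo $byteLength, ' bytes, ', $charLength, " characters\n";
```

Passing the encoding explicitly on every call is a little verbose; calling mb_internal_encoding('UTF-8') once at startup should let you drop that argument.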

deceze
Meta tags are *only* parsed by the client if the HTTP response headers don't already specify the charset. This means that in practice they are pretty worthless. On the other hand, you *should* set the header with PHP's function for the same, aptly named `header`.
troelskn
+1. For the sake of completeness, you might also want to mention `htmlentities()` in this answer. When working with UTF-8, it should be used as follows: `htmlentities($str, ENT_COMPAT, 'UTF-8');` See http://stackoverflow.com/questions/724926/converting-fractions-to-html-entities/799876#799876 for more information.
Mathias Bynens
+1  A: 

For translations, you can either use a framework or just roll your own library. You can store translations in CSV files and use PHP's fgetcsv() to parse them. CSV files can be edited with any spreadsheet app.

For an example, you can look at the code of Zend_Translate (part of Zend Framework). It's easy to follow along.
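A roll-your-own version of the CSV approach could look roughly like the sketch below. The helper names `load_translations()` and `translate()`, the two-column "key,translation" layout and the sample strings are all assumptions made up for illustration, not part of any framework. fgetcsv() takes the locale into account, so it is worth testing with your own data and locale settings.

```php
<?php
// Hypothetical helper: read "key,translation" rows from a UTF-8 CSV file.
function load_translations($path)
{
    $translations = array();
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return $translations;
    }
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        // Blank lines and rows without a translation column are skipped.
        if (count($row) < 2) {
            continue;
        }
        $translations[$row[0]] = $row[1];
    }
    fclose($handle);
    return $translations;
}

// Hypothetical helper: fall back to the key itself when no translation exists.
function translate($key, array $translations)
{
    return isset($translations[$key]) ? $translations[$key] : $key;
}

// Usage: write a tiny sample file, load it, then look up strings.
$file = tempnam(sys_get_temp_dir(), 'lang');
file_put_contents($file, "greeting,Gr\xC3\xBC\xC3\x9F Gott\nfarewell,Auf Wiedersehen\n");
$de = load_translations($file);
unlink($file);

echo translate('greeting', $de), "\n";
echo translate('missing.key', $de), "\n";
```

Falling back to the key means an untranslated string shows up as its identifier instead of breaking the page, which makes gaps easy to spot.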

Joeri Sebrechts
So I take it that fgetcsv() is Unicode-safe then?
oykuo
+1  A: 

Related to using the mb_* set of functions while maintaining compatibility with existing code, see the mbstring.func_overload php.ini directive.

It lets you keep calling the regular string functions, which are then overloaded by their multi-byte-enabled counterparts.

HabarNam