I am creating a web application framework that supports multilingual content. By that I mean a piece of content, say a paragraph, can have two sentences in English and another two in Hindi (an Indian language). I have several questions about this.

1) A user or admin will add content to the website. They will be presented with a textarea into which they can paste their content. When they submit the post, I will save the content in a database. I also want to provide a web-based typewriter interface where they can type content in a given language, copy it, and paste it into my main textarea. Questions: 1a) Do I need to do anything to the textarea so that it accepts Unicode characters? 1b) Where can I find a typewriter interface for a language of my choice? Does TinyMCE support that? 1c) I should set the database encoding to UTF-8, right?

2) Then I need to fetch the content from the database and display it in a web page. This content is UTF-8 encoded, since it can contain many languages. What do I need to do? I am guessing that just setting the page encoding to UTF-8 will do. What happens if a font required by a language is not installed on the client's PC?

I am using the PhpEd editor. Must my PHP files themselves be encoded in UTF-8, or is specifying UTF-8 in the HTML encoding tag enough?

I am a bit stumped. Please help.

+3  A: 

1a) Yes, the text area will accept text in any language, as long as the web page that contains it is encoded in UTF-8. If it doesn't work, double-check both the HTTP Content-Type header and the HTML META http-equiv tag for Content-Type. If both are present, they must agree; either one alone would be sufficient.
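As a rough illustration of 1a, here is a minimal PHP sketch of a page that declares UTF-8 in both the HTTP header and the META tag, so the two agree. The function name `render_form_page` and the field name `content` are made up for this example and are not part of any framework.

```php
<?php
// Assumption: this script is the page that shows the form.
// render_form_page() and the "content" field are hypothetical names.
declare(strict_types=1);

function render_form_page(string $submitted = ''): string
{
    // Escape the echoed text, telling PHP it is UTF-8.
    $safe = htmlspecialchars($submitted, ENT_QUOTES, 'UTF-8');

    return <<<HTML
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Multilingual post</title>
</head>
<body>
<form method="post" accept-charset="utf-8">
<textarea name="content" rows="8" cols="60">{$safe}</textarea>
<input type="submit" value="Save">
</form>
</body>
</html>
HTML;
}

// HTTP header: must be sent before any output and must match the META tag.
header('Content-Type: text/html; charset=utf-8');

$content = $_POST['content'] ?? '';
echo render_form_page(is_string($content) ? $content : '');
```

With the page served this way, the browser should submit whatever is pasted into the textarea as UTF-8 bytes, whether it is English, Hindi, or a mix.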

1c) What to do with your database depends on the specific DBMS you use. If it supports these settings, make sure that

1. the table encoding
2. the connection/the client encoding

are both set to UTF-8.
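Since the details depend on the DBMS, the following is only a sketch under the assumption that you use MySQL through PDO. The helper names, the `posts` table, and the `body` column are hypothetical. In MySQL, `utf8mb4` is the character set that covers all of UTF-8.

```php
<?php
// Assumptions: MySQL accessed via PDO; the host, database, table and
// column names below are hypothetical.
declare(strict_types=1);

// Connection/client encoding: set through the charset part of the DSN.
function mysql_utf8_dsn(string $host, string $dbname): string
{
    return "mysql:host={$host};dbname={$dbname};charset=utf8mb4";
}

// Table encoding: declared when the table is created.
function posts_table_ddl(): string
{
    return 'CREATE TABLE IF NOT EXISTS posts ('
         . 'id INT AUTO_INCREMENT PRIMARY KEY, '
         . 'body TEXT NOT NULL'
         . ') DEFAULT CHARACTER SET utf8mb4';
}

// Store the submitted text using a prepared statement.
function save_post(PDO $pdo, string $body): void
{
    $stmt = $pdo->prepare('INSERT INTO posts (body) VALUES (?)');
    $stmt->execute([$body]);
}
```

You would then connect with something like `new PDO(mysql_utf8_dsn('localhost', 'example'), $user, $password)`, run `posts_table_ddl()` once via `$pdo->exec()`, and call `save_post()` for each submission. Other DBMSs have their own equivalents of these two settings.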

2) Again, set the page encoding to UTF-8 (see 1a). If the client system lacks suitable fonts, you lose - but in that case the end user most likely couldn't have read the text anyway (most users do have fonts for their native languages).

The encoding of the PHP files themselves is only relevant if they contain non-ASCII text (which you should avoid).

Martin v. Löwis
Thank you Martin.
Krishna Kant Sharma