I'm writing a PHP server, and the client sends data in a specific character encoding. I want the server to read and write data in that same encoding.

How should I specify the character encoding for PHP's socket_read and socket_write functions?

+1  A: 

These functions transmit the string unconverted (i.e. byte by byte), so you cannot set a character encoding. That also means you don't have to: the other side receives exactly the same bytes, in exactly the same encoding.

You can convert it before sending/after receiving the string using mb_convert_encoding().
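
A minimal sketch of that conversion step, assuming the client speaks ISO-8859-1 and your own code works in UTF-8. Both encodings and the helper names `decodeFromWire` and `encodeForWire` are made up for illustration; substitute whatever your client actually uses. It needs the mbstring extension.

```php
<?php
// Assumed encodings (hypothetical): adjust to what your client really sends.
const WIRE_ENCODING = 'ISO-8859-1';
const INTERNAL_ENCODING = 'UTF-8';

// Convert bytes received from the client into the internal encoding.
function decodeFromWire(string $bytes): string
{
    return mb_convert_encoding($bytes, INTERNAL_ENCODING, WIRE_ENCODING);
}

// Convert an internal string into the bytes the client expects.
function encodeForWire(string $text): string
{
    return mb_convert_encoding($text, WIRE_ENCODING, INTERNAL_ENCODING);
}

// Usage around the socket calls ($client is an already accepted socket):
//   $text = decodeFromWire(socket_read($client, 2048));
//   socket_write($client, encodeForWire($reply));
```

With a multi-byte wire encoding, a single socket_read may end in the middle of a character, so you would buffer until you have a complete message before converting.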

soulmerge
This I understand for receiving. But when the server is sending data, which character encoding will it use? How can I set this default character encoding?
Tom
It is sent in the character set the string has.
soulmerge
Well... the following string can be Unicode but can also be UTF-8: "hello". So how do I make it pick one of the two?
Tom
Or, when I set the string... $string = "hello" - how do I make it set the string in a specific encoding?
Tom
If you write $string = "hello", it will have the encoding of the source file the statement is in. Your "hello" example suggests that you need more background on character encodings (UTF-8 is one of the encodings of Unicode, not an alternative to it). I would recommend this: http://www.joelonsoftware.com/articles/Unicode.html
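
To see what that means in bytes, here is a small sketch that assumes the source file is saved as UTF-8 (or plain ASCII) and that mbstring is available. The variable names are only for illustration. The literal is just a byte sequence; a different encoding of the same text is a different byte sequence.

```php
<?php
// Assumption: this file is saved as UTF-8 or ASCII, so the literal is 5 bytes.
$utf8Bytes = "hello";
$utf8Hex = bin2hex($utf8Bytes);   // 68656c6c6f

// The same text re-encoded as UTF-16 (big-endian) uses two bytes per character.
$utf16Bytes = mb_convert_encoding($utf8Bytes, 'UTF-16BE', 'UTF-8');
$utf16Hex = bin2hex($utf16Bytes); // 00680065006c006c006f

echo $utf8Hex, "\n", $utf16Hex, "\n";
```

socket_write would send either byte sequence unchanged.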
soulmerge
Alright, so how can I set the encoding of my source file?
Tom
That's something you do in your editor. Example for vim: `:set fileencoding=utf8`
soulmerge
+1  A: 

PHP has close to zero automatic character encoding support. You'll have to keep track of which encoding any given string is in and make sure it gets converted to the appropriate encoding every time it enters or leaves your code. One good way of doing this is to decide that all internal strings are in a certain encoding (e.g. UTF-8) and to convert every incoming string to that encoding on its way in.
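
A rough sketch of such a boundary, assuming UTF-8 as the internal encoding. The function names `toInternal` and `fromInternal` are hypothetical, and the mbstring extension is required. Incoming bytes are checked against the encoding the sender claims to use before they are converted.

```php
<?php
// Hypothetical boundary helpers: everything inside the application is UTF-8.

// Validate and convert incoming bytes to the internal encoding.
function toInternal(string $bytes, string $sourceEncoding, string $internal = 'UTF-8'): string
{
    if (!mb_check_encoding($bytes, $sourceEncoding)) {
        throw new InvalidArgumentException("Input is not valid $sourceEncoding");
    }
    return mb_convert_encoding($bytes, $internal, $sourceEncoding);
}

// Convert an internal string to the encoding the receiver expects.
function fromInternal(string $text, string $targetEncoding, string $internal = 'UTF-8'): string
{
    return mb_convert_encoding($text, $targetEncoding, $internal);
}
```

Calling these only at the points where data enters or leaves (socket reads and writes, files, databases) keeps the rest of the code free of encoding concerns.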

(Or use a different language)

grahamparks