
Okay, so emoji basically show up like the above on a computer. Is that another programming language? How do I put those little boxes into a PHP file? When I put them into a PHP file, they turn into question marks and whatnot. Also, how can I store them in MySQL without them turning into question marks and other weird things?

A: 

I'm using FF3.5 and WinXP. I see little boxes in my browser, too.

This tells me the string requires a character set not installed on my computer.

When you put the string into a PHP file, the question marks tell you the same thing: your computer doesn't know how to display the characters.

You could store these emoji characters in MySQL if you encoded them differently, probably using UTF-8.

Do a web search for character encoding, as it relates to MySQL.

pavium
The string requires a *font* not installed on your computer (or most computers). The character set on all StackOverflow pages is Unicode, served in the UTF-8 encoding.
bobince
+2  A: 

This has nothing to do with programming languages, just with encoding and fonts. As a very brief overview: Every character is stored by its character code (e.g.: 0x41 = A, 0x42 = B, etc), which is rendered as a meaningful character on your screen using a font (which says "the character with the code 0x41 should look like this ...").
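To make the "character code" idea concrete, here is a minimal sketch using PHP's built-in ord() and chr(), which map between a single byte and its numeric code. The variable names are just for illustration.

```php
<?php
// Each character is stored as a number; the font decides what it looks like.
$codeOfA   = ord('A');   // numeric code of "A": 65, i.e. 0x41
$charFor42 = chr(0x42);  // character for code 0x42: "B"

printf("A = 0x%02X, 0x42 = %s\n", $codeOfA, $charFor42);
// prints: A = 0x41, 0x42 = B
```

Note that both functions work on single bytes, so this only illustrates the idea for plain ASCII.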

These emoji occupy the "private use area" of the Unicode table, which is a range of codes that are undefined and free for anyone to use. That makes them perfectly valid character codes, it's just that no standard font has an appropriate character to display for them, since they are undefined. Only the iPhone and other handhelds, mostly in Japan, have appropriate icons for these codes. This is done to save bandwidth; instead of transmitting relatively large image files back and forth, emoji can be transmitted using a single character code.
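As a rough illustration of what "private use area" means in practice, the sketch below checks whether a code point falls in the Basic Multilingual Plane's private range, U+E000 through U+F8FF. The function and constant names are made up for this example, and it deliberately ignores the two supplementary private use planes.

```php
<?php
// BMP Private Use Area: U+E000 through U+F8FF.
const PUA_START = 0xE000;
const PUA_END   = 0xF8FF;

/** Returns true if the code point lies in the BMP Private Use Area. */
function isPrivateUse(int $codePoint): bool
{
    return $codePoint >= PUA_START && $codePoint <= PUA_END;
}

var_dump(isPrivateUse(0x41));   // bool(false): the ordinary letter "A"
var_dump(isPrivateUse(0xE001)); // bool(true): a code inside the private range
```

A check like this only tells you that a code has no standard meaning; it cannot tell you which network's emoji it was meant to be.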

As for how to store them: They should be storable as is, as long as you don't try to convert them to another encoding, in which case they may get lost. Just be aware that they only make sense on the iPhone and other SoftBank phones in Japan.


[Image: Character Viewer]

If you're on OS X you can copy and paste the character into the Character Viewer to find out what it is. I think there's a similar Character Map on Windows (albeit inferior ;-P). You could put it through PHP's ord(), but that only looks at the first byte of the string, so it only gives the right answer for ASCII characters. See the discussion on the ord manual page for UTF-8 functions.
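If you'd rather find the character code from PHP itself, something along these lines should work. It is only a sketch: utf8CodePoint() is a hypothetical helper, not a built-in, and it assumes the input is non-empty, well-formed UTF-8. Where the mbstring extension is available (PHP 7.2 and later), the built-in mb_ord() does the same job.

```php
<?php
/**
 * Decodes the first UTF-8 character of $s into its Unicode code point.
 * Hypothetical helper; assumes $s is non-empty, well-formed UTF-8.
 */
function utf8CodePoint(string $s): int
{
    $b0 = ord($s[0]);
    if ($b0 < 0x80) {                 // 1 byte: 0xxxxxxx (plain ASCII)
        return $b0;
    }
    if (($b0 & 0xE0) === 0xC0) {      // 2 bytes: 110xxxxx 10xxxxxx
        return (($b0 & 0x1F) << 6) | (ord($s[1]) & 0x3F);
    }
    if (($b0 & 0xF0) === 0xE0) {      // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return (($b0 & 0x0F) << 12)
            | ((ord($s[1]) & 0x3F) << 6)
            | (ord($s[2]) & 0x3F);
    }
    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return (($b0 & 0x07) << 18)
        | ((ord($s[1]) & 0x3F) << 12)
        | ((ord($s[2]) & 0x3F) << 6)
        | (ord($s[3]) & 0x3F);
}

printf("U+%04X\n", utf8CodePoint("A"));            // U+0041
printf("U+%04X\n", utf8CodePoint("\xE6\x97\xA5")); // U+65E5 (日)
```

Paste one of the mystery characters into the string (with the file saved as UTF-8) and a result between U+E000 and U+F8FF means it is a private use code.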


BTW, just for the fun of it, these characters display fine on the iPhone as is, because the iPhone has a font which has icons for them:

[Image: the emoji as displayed on an iPhone]

deceze
So how does one find the "encodings" for these?
Doug
See edit. What you can find with this are the *character codes*, the *encoding* is something different and can only be guessed (in this case it's probably UTF8).
deceze
+2  A: 

how do I put those little boxes into a php file?

Same way as any other Unicode character. Just paste them and make sure you're saving the PHP file and serving the PHP page as UTF-8.

When I put it into a php file, it turns into question marks and what not

Then you have an encoding problem. Work it out with Unicode characters you can actually see properly first, for example ąαд™日本, before worrying about the emoji.
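One quick way to test this, sketched below, is to check whether the string PHP actually sees is well-formed UTF-8. The isValidUtf8() name is made up for this example; it relies on the fact that preg_match() with the /u modifier refuses subjects that are not valid UTF-8.

```php
<?php
/** Returns true if $s is well-formed UTF-8 (the /u modifier makes PCRE validate the subject). */
function isValidUtf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

$sample = "ąαд™日本";            // the visible test characters from above
var_dump(isValidUtf8($sample));  // bool(true) when this file is saved as UTF-8
var_dump(strlen($sample));       // byte count, larger than the 6 visible characters
```

If the first line prints bool(false), the file was most likely saved in some other encoding, and that needs fixing before anything else.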

Your PHP file should be saved as UTF-8; the page it produces should be served as Content-Type: text/html; charset=UTF-8 (or with a similar meta tag); the MySQL database should be using a UTF-8 character set and collation to store data; and PHP should be talking to MySQL using UTF-8.
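In PHP that checklist might look roughly like the sketch below. It assumes PDO with the MySQL driver and a reasonably recent PHP and MySQL; the function names, host, and database name are placeholders, not anything from the question. It uses utf8mb4 rather than MySQL's older 3-byte utf8, since utf8mb4 also covers 4-byte characters.

```php
<?php
/** Sends the header that tells the browser the page is UTF-8. Call before any output. */
function sendUtf8Header(): void
{
    header('Content-Type: text/html; charset=UTF-8');
}

/** Builds a PDO DSN that makes the MySQL connection itself use the given charset. */
function mysqlDsn(string $host, string $dbName, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbName, $charset);
}

/** Opens the connection; host, database, and credentials are placeholders. */
function connect(string $user, string $password): PDO
{
    return new PDO(mysqlDsn('localhost', 'example_db'), $user, $password, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}

echo mysqlDsn('localhost', 'example_db'), "\n";
// mysql:host=localhost;dbname=example_db;charset=utf8mb4
```

The tables and columns themselves still have to be created with a matching character set; the DSN only controls the encoding used on the connection.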

However. Even handling everything correctly like this, PCs will still not show the emoji. That's because:

  1. they don't have fonts that include shapes for those characters, and

  2. emoji are still completely unstandardised. Those characters you posted are in the Unicode Private Use Area, which means they don't have any official meaning at all.

Each network in Japan uses different character codes for their emoji, mapped to different areas in the PUA. So even on another mobile phone, it probably won't display the correct character, unless you spend ages manually converting emoji codes for different networks. I'm guessing the ones you posted above are from SoftBank (iPhone?).

There is an ongoing proposal led by Google and Apple to collate the different networks' emoji and give them a proper standardised place in Unicode. Until then, getting emoji to display consistently across networks is an exercise in unhappiness. See the character overview from the standardisation work to see how much converting you would have to do.

God, I hate emoji. All that pain for such a load of useless twee rubbish.

bobince
+1 for the last sentence. ;o)
deceze