Are these squares a representation of Chinese characters being turned into Unicode?

EDIT: [Here I entered the squares with numbers inside them into the post, but they didn't render.]

I'd like to either turn this back into the original characters when displayed on Android, or get MySQL to store them as Chinese characters rather than as Unicode escapes.

BufferedReader reader = new BufferedReader(new InputStreamReader(is, "UTF-8"), 8);

While debugging, it shows the string's value as "\u001a\u001a\u001a\u001a".

 byte[] bytes = chinesestringfromdatabase.getBytes();

turns it into "[26, 26, 26, 26]"

String fresh = new String(bytes, "UTF-8");

and then this turns it back into the squares. EDIT: [Here I entered the squares with numbers inside them into the post, but they didn't render.]

My phone can display Chinese text.

MySQL charset: UTF-8 Unicode (utf8)

While typing my question, I realized that perhaps I have the wrong charset altogether. I'm lost as to whether my issue is coding-related at all, whether it is just a setting, or whether PHP cannot handle the character set.

I'd like to store and render multiple language character sets that could contain a mixture of languages.

A: 

Here I entered the squares with numbers inside them into the post but they didn't render

By "squares with numbers inside", do you mean the same as those you see for some exotic languages near the bottom of the Wikipedia homepage when browsing with Firefox? (In other browsers, such as MSIE, Chrome, and Safari, you would only see meaningless empty squares.)

If so, it simply means that the font which the web browser/viewer has been instructed to use has no glyphs for those characters.

I'd like to store and render multiple language character sets that could contain a mixture of languages.

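Use UTF-8 all the way. Only keep in mind that MySQL's utf8 charset only supports the BMP (Basic Multilingual Plane) of Unicode (max 3 bytes per character), not the supplementary planes (4 bytes per character). So the SIP (Supplementary Ideographic Plane, which contains the rarer CJK ideographs) is out of range for it. MySQL 5.5.3 and later offer the utf8mb4 charset, which does cover those characters.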
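As a rough illustration of the 3-byte limit mentioned above, here is a minimal Java sketch that checks whether a string contains any character outside the BMP, and that encodes and decodes with the charset named explicitly on both sides. The class and method names (Utf8Check, hasSupplementaryCodePoint, roundTrip) are made up for this example, and it assumes Java 7+ (or Android API 19+) for StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Check {

    /** Returns true if any code point lies outside the BMP (4 bytes in UTF-8). */
    public static boolean hasSupplementaryCodePoint(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (Character.isSupplementaryCodePoint(cp)) {
                return true;
            }
            // Advance by 1 char for BMP code points, 2 for surrogate pairs.
            i += Character.charCount(cp);
        }
        return false;
    }

    /** Encodes to UTF-8 and decodes again, never relying on the platform default. */
    public static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String bmp = "\u4e2d\u6587";            // two common Chinese characters, both in the BMP
        String supplementary = "\uD840\uDC00";  // U+20000, a CJK ideograph outside the BMP

        System.out.println(hasSupplementaryCodePoint(bmp));           // false
        System.out.println(hasSupplementaryCodePoint(supplementary)); // true
        System.out.println(bmp.getBytes(StandardCharsets.UTF_8).length);           // 6
        System.out.println(supplementary.getBytes(StandardCharsets.UTF_8).length); // 4
        System.out.println(roundTrip(bmp).equals(bmp));               // true
    }
}
```

A check like this could be run on user input before it is sent to the server, so that text a 3-byte column cannot hold is caught early rather than silently damaged.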

BalusC
[Side question: how do I quote you as you quoted me?] "With "squares with numbers inside", do you mean the same as those which you also see for some exotic languages somewhere at the bottom of the Wikipedia homepage, while browsing with Firefox browser? (in all other browsers -MSIE, Chrome, Safari, etc- you would only see nothing-saying empty squares). If true, then it simply means that there are no glyphs available for those characters in the font which the webbrowser/viewer is been instructed to use." YES!
opted out
Just post a comment :) Well, then you need to install a proper font, or instruct the web browser/viewer to use a different font that you can be certain has the proper glyphs.
BalusC
I went into the native Android browser, copied some Chinese text (as a test), then went back into my application, pasted the Chinese text into an EditText (where it is shown properly), and submitted it. From there it is sent to a PHP file through an HTTP POST and inserted into the MySQL database, which in turn is read and shown in a TextView, which only renders the square characters. I'm assuming that without those middle steps, if I just set my OnClickListener to change the TextView to the contents of the EditText, it would be displayed properly.
opted out
I'll try to extract the typeface that the EditText is using and apply it to my TextView, and if that doesn't work I'll include my own font as an asset and try that. OK, using the typeface from the EditText (which does render Chinese characters properly): I debugged and saw unique values when calling getText().toString() on the EditText, but once returned from the database each character holds a generic byte value of 26, and the squares all appear to hold the same value when inspected. The font may have been an issue, but something is also stripping what makes these non-rendered characters unique.
opted out
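The observation in the last comment (every character coming back as byte value 26) can be narrowed down by logging the code points of the string at each hop: in the app before sending, in the PHP script, and after reading back from the database. The value 26 is U+001A, the ASCII SUBSTITUTE control character, so once a string consists only of that value, the original characters are already gone and no re-decoding or font change on the client can restore them. Below is a minimal sketch of such a dump; the class name CodePointDump and its dump method are hypothetical helpers, not part of any library.

```java
public class CodePointDump {

    /** Formats each code point of the string as U+XXXX, separated by spaces. */
    public static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            // Advance by 1 char for BMP code points, 2 for surrogate pairs.
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Intact text: each character keeps its own distinct code point.
        System.out.println(dump("\u4e2d\u6587"));   // U+4E2D U+6587
        // Damaged text as seen in the debugger: every character is SUBSTITUTE.
        System.out.println(dump("\u001a\u001a"));   // U+001A U+001A
    }
}
```

The first hop at which the dump shows U+001A instead of distinct values is where the encoding is being lost.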
A: 
bobince