Can anybody please tell me the range of printable Unicode (UTF-8) characters? [E.g., the ASCII printable character range is \u0020 - \u007e.]

+1  A: 

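Strictly speaking, Unicode has no single range of printable characters. The code space itself is finite (U+0000 to U+10FFFF), but new characters keep being assigned within it.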

What you gave are code points written as escapes, not UTF-8. UTF-8 is an encoding that uses 1 byte for each ASCII character and more bytes for everything else.
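To make the code point vs. encoding distinction concrete, here is a small Python sketch. The helper name `utf8_length` and the sample characters are only illustrative choices, not something from the question.

```python
def utf8_length(ch):
    """Return how many bytes the single character ch occupies in UTF-8."""
    return len(ch.encode("utf-8"))

# Sample characters: ASCII letter, accented Latin letter, CJK ideograph, emoji.
for ch in ["A", "\u00e9", "\u4e2d", "\U0001F600"]:
    # The code point (U+XXXX) is the character's number; the hex string is its UTF-8 encoding.
    print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s): {ch.encode('utf-8').hex()}")
```

The code point stays the same whatever encoding you pick; only the byte sequence changes.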

As for the range, I believe there is no fixed range of printable characters. The set keeps evolving as new characters are added. Check the page I gave above.

Wernight
+3  A: 

See http://en.wikipedia.org/wiki/Unicode_control_characters

You might want to look especially at the C0 and C1 control characters: http://en.wikipedia.org/wiki/C0_and_C1_control_codes

According to the wiki, the C0 control characters are in the range U+0000 to U+001F, plus U+007F (the same control codes as in ASCII), and the C1 control characters are in the range U+0080 to U+009F.

Other than the C0/C1 control characters, Unicode also has many formatting control characters, e.g. the zero-width non-joiner, which prevents adjacent characters from being joined into a ligature or cursive connection, or the bidirectional text controls. These formatting control characters are rather scattered.
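Because the formatting characters are scattered, it is usually easier to ask the Unicode database for a character's general category than to hard-code ranges. A minimal sketch using Python's standard `unicodedata` module (the helper name `is_control_or_format` is made up for this example): category `Cc` covers the C0/C1 controls and `Cf` covers the formatting controls.

```python
import unicodedata

def is_control_or_format(ch):
    """True if ch is a control (Cc) or formatting (Cf) character."""
    return unicodedata.category(ch) in ("Cc", "Cf")

# C0 controls, DEL, a C1 control, zero-width non-joiner,
# a bidi override, then two ordinary characters for contrast.
samples = ["\u0000", "\u001f", "\u007f", "\u0085", "\u200c", "\u202e", "A", " "]
for ch in samples:
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {is_control_or_format(ch)}")
```

Note that the result depends on the Unicode version bundled with your Python build.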

More importantly, what are you doing that requires you to know Unicode's non-printable characters? More likely than not, whatever you're trying to do is the wrong approach to solve your problem.

Lie Ryan
I want to create a random unicode string generator which will generate printable characters.
Anindya Chatterjee
Printable by whom? Do you want to include e.g. all the Chinese characters? Many users won't have fonts for them, so ‘printing’ them would give you nothing, a blank box, or some other useless replacement character.
bobince
+2  A: 

First, you should remove the word 'UTF8' from your question; it's not pertinent (UTF8 is just one of the encodings of Unicode, which is orthogonal to your question).

Second: the meaning of "printable/non-printable" is less clear in Unicode. Perhaps you mean a "graphical character"; one can even dispute whether a space is printable/graphical. The non-graphical characters would consist, basically, of control characters: the range 0x00-0x1f plus some others that are scattered.

Anyway, the vast majority of assigned Unicode characters (more than 100,000) are "graphical". But this certainly does not imply that they are printable in your environment.

If you intend to generate a "random printable" Unicode string, it seems to me a bad idea to try to include all "printable" characters.
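One possible way to do that instead, as a rough sketch only: pick a few ranges you know your environment can display, and drop anything in them that is not graphical. The ranges in `RANGES` (Basic Latin, Latin-1, Greek, Cyrillic) and the function names are assumptions made for this example. Excluding spaces is also just a choice here, since whether a space counts as printable is debatable.

```python
import random
import unicodedata

# Hypothetical selection of ranges; adjust to what your fonts actually cover.
RANGES = [(0x0020, 0x007E), (0x00A0, 0x00FF), (0x0370, 0x03FF), (0x0400, 0x04FF)]

def is_graphical(ch):
    """Reject C* (control, format, surrogate, private use, unassigned) and Z* (separators)."""
    return unicodedata.category(ch)[0] not in ("C", "Z")

def build_alphabet(ranges):
    """Collect every graphical character inside the given inclusive code point ranges."""
    return [chr(cp) for lo, hi in ranges for cp in range(lo, hi + 1) if is_graphical(chr(cp))]

ALPHABET = build_alphabet(RANGES)

def random_printable_string(length, rng=random):
    """Draw `length` characters uniformly from ALPHABET."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))

# A seeded generator makes the output reproducible for testing.
print(random_printable_string(20, random.Random(0)))
```

Even then, "graphical" only means the character is meant to be visible; whether it renders still depends on the fonts available.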

leonbloy