The curses.ascii module defines some handy functions that let you, for example, recognize which characters are printable (curses.ascii.isprint(ch)).

But different character codes can be printable depending on which locale is in use. For example, there are certain Polish characters:

>>> ord('a')
97
>>> ord('ą')
177
>>>

I'm wondering whether there is a better way to tell if a number represents a printable character than the one used in the curses.ascii module:

def isprint(c): return _ctoi(c) >= 32 and _ctoi(c) <= 126

which is kind of locale-unfriendly.

+4  A: 

If you convert the character to a unicode object, you can use unicodedata:

>>> import unicodedata
>>> unicodedata.category(u'ą')[0] in 'LNPS'
True
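
The same check can be wrapped in a small helper. This is only a sketch, assuming Python 3 (where every str is already Unicode); the name is_printable is made up for illustration and is not part of curses.ascii or unicodedata.

```python
import unicodedata

def is_printable(ch):
    """Hypothetical helper: True if the one-character string `ch` falls in a
    Letter, Number, Punctuation or Symbol general category."""
    # The first letter of the general category is the major class:
    # L = letter, N = number, P = punctuation, S = symbol.
    return unicodedata.category(ch)[0] in "LNPS"

print(is_printable("a"))     # ASCII letter
print(is_printable("ą"))     # Polish letter, category Ll
print(is_printable("\x07"))  # BEL control character, category Cc
```

Note that a plain space has category Zs, so this test rejects it, whereas curses.ascii.isprint accepts code 32. Add 'Z' handling if spaces should count as printable in your application.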
Ignacio Vazquez-Abrams
+2  A: 

Well, it is called curses.ascii, so using ASCII rules for what's printable should not be a surprise. If you are using an ISO 8-bit code, or you are operating from a known code page, you will need rules that correspond to what the actual codes and their displays are.
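
As a rough illustration of rules tied to a known code page, one could decode the byte value with that encoding first and only then classify the resulting character. This is a sketch in Python 3 under assumptions: the helper name is_printable_byte is hypothetical, ISO 8859-2 is picked as the default only because the question's ord('ą') == 177 matches that encoding, and the Unicode category test stands in for whatever rules the real terminal needs.

```python
import unicodedata

def is_printable_byte(code, encoding="iso-8859-2"):
    """Hypothetical helper: decode one byte value using the given 8-bit
    encoding, then test the Unicode general category of the result."""
    if not 0 <= code <= 255:
        return False  # not a single-byte code at all
    try:
        ch = bytes([code]).decode(encoding)
    except UnicodeDecodeError:
        return False  # the byte is undefined in this encoding
    # L = letter, N = number, P = punctuation, S = symbol.
    return unicodedata.category(ch)[0] in "LNPS"

print(is_printable_byte(97))            # 'a'
print(is_printable_byte(177))           # 'ą' in ISO 8859-2
print(is_printable_byte(177, "ascii"))  # outside ASCII, cannot be decoded
```

The encoding has to be supplied by the caller (or taken from the locale); the byte value alone does not say which character it stands for.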

I think using Unicode characters and the standard Unicode classifications is fine. It just might not match what your particular curses and console setup can actually display properly.

You also need to consider which characters are acceptable to the application, since a character can be displayable and still be unwanted.

orcmid