ansaurus

Question

KOI8-R: Having trouble translating a string

Answer 1

+2 A:

In Python 2, s = unicode(s) decodes s with the default encoding, which is ASCII, so it fails on any byte outside the ASCII range. You need to pass the encoding your input is actually in, e.g. s = unicode(s, 'utf-8').
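
To make the difference concrete, here is a minimal sketch. It assumes Python 3, where bytes.decode() plays the role that unicode(s, encoding) plays in Python 2, and it uses a hard-coded KOI8-R byte string as a stand-in for whatever raw_input returned:

```python
# Python 3 sketch. The sample bytes are an assumption standing in for
# terminal input; they are the word "Код" encoded as KOI8-R.
raw = "Код".encode("koi8-r")

try:
    # Roughly what Python 2's unicode(s) attempts: an ASCII decode.
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    # Every byte here is >= 0x80, so the ASCII decode fails.
    ascii_ok = False

# Naming the real encoding makes the decode succeed.
text = raw.decode("koi8-r")
```

Passing the wrong encoding (for example 'utf-8' for KOI8-R input) would still fail or produce garbage, so the encoding has to match the data.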

laalto 2009-06-15 11:15:07

That's very sad, btw. They should've used the locale default.

alamar 2009-06-15 11:15:56

Oh, I don't know @alamar - I find any time I'm using or talking to anyone about character encodings, failure to be explicit on both ends causes problems, and eventually there's an edge case where you have to supply the information anyhow - better to train people to do it all the time! :-)

Blair Conrad 2009-06-15 11:17:48

Well, the docs also don't specify what the default is, which is even worse.

alamar 2009-06-15 11:18:36

Answer 2

+1 A:

Try unicode(s, encoding), where encoding is whatever encoding your terminal uses.
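
If you don't know the terminal encoding, one possible way to look it up is sketched below. This assumes Python 3; guess_terminal_encoding and decode_input are hypothetical helper names made up for this example, not part of any library:

```python
import locale
import sys


def guess_terminal_encoding():
    """Best-effort guess at the encoding of terminal input (hypothetical helper)."""
    # sys.stdin.encoding can be missing or None when input is piped or
    # redirected, so fall back to the locale's preferred encoding.
    enc = getattr(sys.stdin, "encoding", None)
    return enc or locale.getpreferredencoding(False)


def decode_input(raw, encoding=None):
    """Decode raw bytes with an explicit encoding, or the guessed one."""
    return raw.decode(encoding or guess_terminal_encoding())
```

The guess is only a starting point. If you know the data is KOI8-R, pass 'koi8-r' explicitly, e.g. decode_input(raw, 'koi8-r').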

alamar 2009-06-15 11:15:28

2009-06-15 11:39:24

What's your terminal encoding?

alamar 2009-06-15 12:21:55

Answer 3

A:

Looking at the error messages that you are seeing, it seems to me that your terminal encoding is probably set to KOI8-R, in which case you don't need to perform any decoding on the input data. If this is the case then all you need is:

>>> s = raw_input("Enter a string you want to translit: ")
>>> print ''.join([chr(ord(c) & 0x7F) for c in s])
kOD oBMENA iNFORMACIEJ, 8 BIT

You can double-check this by calling s.decode('koi8-r'), which should succeed and return the equivalent unicode string.
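
For reference, here is a sketch of the same trick in Python 3, where input arrives as str and the raw bytes have to be handled explicitly. The sample phrase is an assumption (a Russian phrase that produces the output shown above), and strip_high_bit is a hypothetical helper name:

```python
def strip_high_bit(raw):
    """Clear bit 7 of every byte and read the result as ASCII."""
    return bytes(b & 0x7F for b in raw).decode("ascii")


# Assumed sample input, encoded as KOI8-R to mimic a KOI8-R terminal.
raw = "Код Обмена Информацией, 8 бит".encode("koi8-r")

translit = strip_high_bit(raw)

# The decode check mentioned above: this should round-trip cleanly.
decoded = raw.decode("koi8-r")
```

The letter case comes out inverted because KOI8-R places lowercase Cyrillic letters at the positions that map onto uppercase Latin letters once the high bit is cleared, and vice versa.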

mhawke 2009-06-16 02:30:56
