I read Japanese, and want to try processing some Japanese text. I tried this using Python 3:
for i in range(1,65535):
    print(chr(i), end='')
Python then gave me tons of errors. What went wrong?
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~Traceback (most recent call last):
  File "C:\test\char.py", line 11, in <module>
    print(chr(i), end='')
  File "C:\Python31\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\x80' in position 0: character maps to <undefined>
My understanding is that the chr function converts Unicode code points into the corresponding characters, Japanese ones included. If so, why are the Japanese characters not printed? Why does it crash right after the list of Roman characters?
Please also correct me if I am mistaken in my understanding that the Unicode set was devised solely to cater for non-Western languages.
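To make that understanding of chr concrete, here is a minimal sketch. It assumes the Hiragana block runs from U+3041 to U+3096 (taken from the Unicode charts, not from my original script), and `encodable` is a hypothetical helper name I made up for illustration. It avoids printing the characters themselves, so it should run even on a console that cannot display them:

```python
# chr() maps a Unicode code point (an int) to a one-character str.
# Assumption: Hiragana letters occupy U+3041..U+3096.
hiragana = ''.join(chr(cp) for cp in range(0x3041, 0x3097))


def encodable(text, encoding):
    """Return True if every character in text can be encoded with encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(len(hiragana))                    # number of code points in the range
print(encodable(hiragana, 'utf-8'))     # UTF-8 can represent any code point
print(encodable(hiragana, 'cp1252'))    # cp1252 is a Western European code page
print(encodable(chr(0x80), 'cp1252'))   # the character named in the traceback
```

If this is right, chr itself succeeds for every value; the failure would come from encoding the result as cp1252 when it is written out, which seems to match the codec named in the traceback.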
EDIT:
I tried the 3 lines suggested by John Machin in IDLE, and the output worked!
Before this, I had been using Programmer's Notepad, with its Tools set up to capture the output of python.exe. Perhaps that is why the errors came about.
However, output from most other scripts is captured properly, so why does it fail for this one in particular? In other words, why does the code work in the IDLE Python Shell but not through Programmer's Notepad's output capture? Shouldn't the output be the same regardless of the interface?
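One way to compare the two environments might be to check which encoding Python picks for standard output in each. The sketch below is only an assumption about how to diagnose this: `printable` is a hypothetical helper, and replacing unencodable characters with '?' is a workaround for inspection rather than a real fix.

```python
import sys


def printable(text, encoding):
    """Return text with characters the encoding cannot represent replaced by '?'."""
    return text.encode(encoding, errors='replace').decode(encoding)


# sys.stdout.encoding can be None when output is redirected in some setups,
# so fall back to ASCII in that case.
encoding = sys.stdout.encoding or 'ascii'
print('stdout encoding:', encoding)

# Prints without raising, whatever the output encoding turns out to be.
print(printable('abc\x80\u3042', encoding))
```

Running this in both IDLE and Programmer's Notepad should show whether the two report different encodings, which would explain why the same script behaves differently.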