ansaurus

Question

Answer 1

A:

Encode the filename to the filesystem encoding before calling `os.stat`. See the locale module.

Ignacio Vazquez-Abrams 2010-01-16 08:35:11

Thanks for this, but I'm not sure I follow. Are you saying that I can tell Django to adapt an uploaded file's name? I don't see anything about this in the locale module.

interstar 2010-01-16 08:57:52

You have to use the native system's encoding to refer to files. Try `locale.nl_langinfo(locale.CODESET)`.

Ignacio Vazquez-Abrams 2010-01-16 08:59:56
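As a rough illustration of the "encode before calling" advice in the comments above, here is a minimal sketch. It is not from the original thread: `encode_for_filesystem` and `stat_size` are hypothetical helper names, and the sketch asks `sys.getfilesystemencoding()` for the encoding instead of `locale.nl_langinfo(locale.CODESET)`, on the assumption that it is the more portable way to ask the same question (`nl_langinfo` is only available on Unix).

```python
import os
import sys


def encode_for_filesystem(name):
    """Hypothetical helper: return `name` as bytes in the filesystem encoding."""
    if isinstance(name, bytes):
        return name  # already encoded; pass it through untouched
    # Fall back to UTF-8 if the interpreter cannot report an encoding (assumption).
    encoding = sys.getfilesystemencoding() or "utf-8"
    return name.encode(encoding)


def stat_size(name):
    """Stat a file by its encoded name and return its size in bytes."""
    return os.stat(encode_for_filesystem(name)).st_size
```

If the reported encoding cannot represent the name (for example an ASCII-only locale and a name such as 漢字), the `encode` call raises `UnicodeEncodeError`, which is usually a sign that the locale itself needs fixing.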

Answer 2

+1 A:

I'm assuming you're on Unix. If not, please remember to say which OS you're using.

Make sure your locale is set to UTF-8. Most modern Linux distributions do this by default, usually by setting the environment variable LANG to "en_US.UTF-8" or the equivalent for another language. Also, make sure your filenames are encoded in UTF-8.

With that set, there's no need to mess with encodings to access files named in any language, even in Python 2.x.

[~/test] echo $LANG
en_US.UTF-8
[~/test] echo testing > 漢字
[~/test] python2.6
Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.stat("漢字")
posix.stat_result(st_mode=33188, st_ino=548583333L, st_dev=2049L, st_nlink=1, st_uid=1000, st_gid=1000, st_size=8L, st_atime=1263634240, st_mtime=1263634230, st_ctime=1263634230)
>>> os.stat(u"漢字")
posix.stat_result(st_mode=33188, st_ino=548583333L, st_dev=2049L, st_nlink=1, st_uid=1000, st_gid=1000, st_size=8L, st_atime=1263634240, st_mtime=1263634230, st_ctime=1263634230)
>>> open("漢字").read()
'testing\n'
>>> open(u"漢字").read()
'testing\n'

If this doesn't work, run "locale"; if the values are "C" instead of en_US.UTF-8, you may not have the locale installed correctly.
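The same check can be made from inside Python. The following is only a sketch under stated assumptions, not part of the original answer: `describe_locale` is a hypothetical helper, and it reports what the interpreter sees after adopting the environment's locale settings.

```python
import locale
import sys


def describe_locale():
    """Hypothetical helper: collect the settings that affect filename encoding."""
    try:
        # Adopt the user's environment settings (LANG, LC_ALL, ...).
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The requested locale is not installed; keep the current one.
        pass
    return {
        "filesystem_encoding": sys.getfilesystemencoding(),
        "preferred_encoding": locale.getpreferredencoding(False),
        # Calling setlocale with no second argument only queries the setting.
        "lc_ctype": locale.setlocale(locale.LC_CTYPE),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_locale().items()):
        print("%s: %s" % (key, value))
```

An `lc_ctype` of "C" or an ASCII encoding here points to the same problem as a "C" value in the output of `locale`.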

If you're on Windows, I think Unicode filenames should always just work (at least for the os module), since Python supports the Windows Unicode file API transparently.

Glenn Maynard 2010-01-16 09:42:43


Python os.stat and unicode file names
