As Mike says, you can get the system codepage from sys.getfilesystemencoding(). This encoding is used to convert Windows's native Unicode strings into bytes for all C stdio functions used by Python, including the filesystem calls that use byte string filepaths, and os.environ.
What this means is that you will be able to read a string with non-ASCII characters from os.environ and use it directly as a filepath without any special encode/decode step.
Unfortunately, the %APPDATA% variable may contain Unicode characters that are not present in the system codepage. For example, on a German (cp1252) Windows install, your path might be C:\Documents and Settings\αβγ\Application Data. In that case the characters will already have been mangled before you get the chance to use them, and decoding the byte string to Unicode using the filesystem encoding won't help.
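To see why decoding can't undo the damage, here is a small sketch of the round trip. It only imitates the conversion: Python's "replace" error handler is an assumption standing in for Windows's own lossy conversion, which may pick different substitute characters.

```python
# -*- coding: utf-8 -*-
# The path from the example above, with the Greek letters written as escapes.
original = u"C:\\Documents and Settings\\\u03b1\u03b2\u03b3\\Application Data"

# Lossy conversion to the system codepage (cp1252 assumed here).
# "replace" substitutes "?" for anything cp1252 cannot represent.
mangled = original.encode("cp1252", "replace")

# Decoding with the same codepage cannot bring the Greek letters back.
recovered = mangled.decode("cp1252")

print(recovered == original)  # False
```

Once the characters have been replaced, the byte string no longer carries enough information to rebuild the original path.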
Here's a function you can use on recent Python versions that have the ctypes extension, to read Windows native Unicode environment variables.
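import ctypes

def getEnvironmentVariable(name):
    # name should be a Unicode string, because the W API takes wide characters.
    # Calling with no buffer returns the required size in characters, or 0 if unset.
    n = ctypes.windll.kernel32.GetEnvironmentVariableW(name, None, 0)
    if n == 0:
        return None
    buf = ctypes.create_unicode_buffer(u'\0'*n)
    ctypes.windll.kernel32.GetEnvironmentVariableW(name, buf, n)
    return buf.value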
In Python 3, the os.environ dictionary contains Unicode strings taken directly from Windows with no codepage encoding, so you don't have to worry about this problem there.
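If the same code has to run on both Python 2 and Python 3, one option is a small wrapper that picks the right source for each. This is only a sketch: get_env_unicode is a hypothetical name, and the non-Windows Python 2 branch (decoding with the filesystem encoding) is an assumption that goes beyond what is described above.

```python
import os
import sys

def get_env_unicode(name, default=None):
    """Return an environment variable as text, avoiding codepage loss where possible."""
    if sys.version_info[0] >= 3:
        # Python 3: os.environ already holds Unicode strings.
        return os.environ.get(name, default)
    if sys.platform == "win32":
        # Python 2 on Windows: ask the wide-character API directly.
        import ctypes
        get_var = ctypes.windll.kernel32.GetEnvironmentVariableW
        name = unicode(name)  # only reached on Python 2
        n = get_var(name, None, 0)  # required size in characters, 0 if unset
        if n == 0:
            return default
        buf = ctypes.create_unicode_buffer(n)
        get_var(name, buf, n)
        return buf.value
    # Python 2 elsewhere (assumption): decode the byte string ourselves.
    value = os.environ.get(name)
    if value is None:
        return default
    return value.decode(sys.getfilesystemencoding() or "utf-8")
```

Calling get_env_unicode("APPDATA") then returns a Unicode string on either version, or None when the variable is not set.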