I want to create a folder on a Windows system (Vista or Windows 7, with an NTFS file system). The folder names may contain the characters å, ä and/or ö, for example "förjävligt".

The Python files, and every string in them, are currently encoded in UTF-8. How do I convert the strings to suit the Windows file system?

+3  A: 

If you're working with normal Python 2 strings, you can simply convert them to Unicode

# -*- coding: utf-8 -*-
normalString = "äöü"

# Now convert to unicode. Specified encoding must match the file encoding
# in this example. In general, you must specify how the bytes-only string
# contained in "normalString" is encoded.
unicodeString = unicode(normalString, "utf-8")

with open(unicodeString, "w") as f:
    ...

and create the files using those Unicode strings. Python (and indirectly the Windows API) will take care of the rest.
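Since the question is about folders rather than files, here is a minimal sketch of the same idea using os.mkdir. The helper name make_unicode_folder and the temporary parent directory are only assumptions for illustration. The name is decoded with .decode("utf-8"), which does the same job as unicode(..., "utf-8") in Python 2 and also runs on Python 3.

```python
# -*- coding: utf-8 -*-
import os
import tempfile


def make_unicode_folder(parent, utf8_name):
    """Decode a UTF-8 byte string and create a folder with that name.

    Hypothetical helper, for illustration only.
    """
    # Decode the raw UTF-8 bytes into a Unicode string first.
    name = utf8_name.decode("utf-8")
    path = os.path.join(parent, name)
    # Passing a Unicode path lets Python hand the name to the OS
    # without guessing an encoding for the bytes.
    os.mkdir(path)
    return path


# "förjävligt" written out as explicit UTF-8 bytes, so the example
# does not depend on the encoding of this source file.
parent = tempfile.mkdtemp()  # assumed scratch location
created = make_unicode_folder(parent, b"f\xc3\xb6rj\xc3\xa4vligt")
```

On Python 2 this assumes the parent path itself is plain ASCII, otherwise joining it with a Unicode name would fail.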

AndiDog
+1  A: 

If you want file names that are guaranteed to be safe on Windows, you can use this safeFilenameCodec. It restricts names to a subset of the allowable characters, so you won't have to worry about anything unexpected slipping through. It also has a generous license.

rox0r
That library should be called "uglyFilenamesCodec". It works but the filenames won't be very readable. I created a similar library for naming music files (escapes characters like \/:*?"<>| in the song title) - but I wouldn't use such a solution for *all* filenames. Anyway +1 for the suggestion.
AndiDog
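For the narrower approach mentioned in the comment above (escaping only the characters \/:*?"<>| and leaving everything else readable), a rough sketch might look like the following. It is not the safeFilenameCodec API and not the commenter's library. The function name escape_filename and the percent-style escaping are assumptions chosen for illustration.

```python
# -*- coding: utf-8 -*-

# Characters that Windows does not allow in file names.
RESERVED = u'\\/:*?"<>|'


def escape_filename(name):
    """Percent-escape reserved characters in a Unicode file name.

    Hypothetical helper. Letters such as å, ä and ö are kept as-is,
    so the result stays readable.
    """
    out = []
    for ch in name:
        # Escape "%" as well, so an escaped name can be told apart
        # from a name that already contained a percent sign.
        if ch in RESERVED or ch == u"%":
            out.append(u"%%%02X" % ord(ch))
        else:
            out.append(ch)
    return u"".join(out)


safe_title = escape_filename(u"AC/DC: Who Made Who?")
# safe_title is now u"AC%2FDC%3A Who Made Who%3F"
```

This only covers the reserved characters listed in the comment. Other Windows restrictions, such as reserved device names or trailing dots and spaces, are not handled here.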