I'm using os.walk to create a list of all music files under a folder. Some of these filenames contain non-ASCII characters, for example:
01 空即是色.mp3
I'm using the mutagen library to parse metadata for this file, and it professes complete Unicode support. The filename is retrieved as unicode and prints correctly. However, no matter what I do (including normalising the unicode beforehand, or encoding it as UTF-8 beforehand), mutagen attempts to open()
01 \xe7\xa9\xba\xe5\x8d\xb3\xe6\x98\xaf\xe8\x89\xb2.mp3
or
01 \u7a7a\u5373\u662f\u8272.mp3
How can I force it to open() the correct filename (the one it is perfectly capable of printing)?
The full code is here.
Note: I am rather new to Python and to programming in general, so any advice you could give on my code would be very much appreciated. Thanks in advance.
EDIT: Okay, this is a rather embarrassing error of mine: the problem was not the character encoding, it was that the directory path was not being included in the filename passed to the open() call. How do I find the full path for a file found via walk()? The files are 2-3 directories deep.
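For reference, here is a minimal sketch of what I understand the usual approach to be, not necessarily the only one. os.walk yields a (dirpath, dirnames, filenames) tuple for each directory it visits, so joining dirpath with each filename should give a path that open() can use no matter how deep the file is. The function name find_music_files and the extensions parameter are hypothetical names chosen for illustration, not anything from mutagen or from my original code.

```python
import os


def find_music_files(root, extensions=(".mp3",)):
    """Return paths (rooted at `root`) of files whose names end with one of `extensions`.

    `find_music_files` and `extensions` are hypothetical names for this sketch.
    """
    matches = []
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory visited.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # str.endswith accepts a tuple of suffixes; lower() makes the check case-insensitive.
            if name.lower().endswith(extensions):
                # Join the containing directory with the bare filename so the
                # result can be opened from any working directory.
                matches.append(os.path.join(dirpath, name))
    return matches
```

Each returned path could then be handed to whatever opens the file, instead of the bare filename. If root is a relative path, the results are relative too; wrapping them in os.path.abspath should make them absolute.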