I'm currently trying to write a simple script that looks in a folder and returns a list of all the file names as an RSS feed. However, I've hit a major wall: whenever I try to read file names with Japanese characters in them, they show up as question marks. I've tried the solutions mentioned here: http://stackoverflow.com/questions/482342/php-readdir-problem-with-japanese-language-file-name - however, they do not work for some reason, even with:

header('Content-Type: text/html; charset=UTF-8');
setlocale(LC_ALL, 'en_US.UTF8');
mb_internal_encoding("UTF-8");

at the top of the script (I'm exporting as plain text until I can sort this out).

What can I do? I need this to work and I don't have much time.

A: 

This is not possible. It is a limitation of PHP itself. PHP does not use the wide WIN32 API calls, so you're limited by the codepage. UTF-8 (65001) is not valid for this purpose.

If you set a breakpoint at readdir_r() in win32\readdir.c, you'll see that FindNextFile already returns a filename with question marks in place of the characters you want, so there's nothing you can do about it, apart from patching PHP itself.

Artefacto
Well, that sucks... Since PHP is not an option, what alternative language handles Windows' filename encoding? I just need to export an RSS feed of the filenames in a folder (along with the exact path and some other simple text).
Jon
I suggest you use Java. ROME (https://rome.dev.java.net/) is a very good RSS library.
Artefacto
A: 

You can do it in PHP. Write a small C program to read directories and call that program from PHP.

See also:
http://en.literateprograms.org/Directory_listing_(C,_Windows)
http://www.daniweb.com/forums/thread74944.html
http://forums.devshed.com/c-programming-42/reading-a-directory-in-windows-36169.html
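
The PHP side of this work-around might look like the sketch below. It assumes a hypothetical helper, here called listdir.exe, that enumerates the directory with the wide Win32 calls and prints one UTF-8 file name per line. Neither the helper's name nor its output format comes from the links above, and directory paths that themselves contain non-ASCII characters may still need extra care when passed on the command line.

```php
<?php
// Sketch only: "listdir.exe" is a hypothetical helper that prints one
// UTF-8 encoded file name per line. It is not part of PHP.

/**
 * Split the helper's raw output into a list of file names.
 */
function parse_listing(string $output): array
{
    // Accept both Windows (CRLF) and Unix (LF) line endings.
    $lines = preg_split('/\r\n|\n|\r/', $output);
    $names = [];
    foreach ($lines as $line) {
        // Skip blank lines and the "." and ".." directory entries.
        if ($line !== '' && $line !== '.' && $line !== '..') {
            $names[] = $line;
        }
    }
    return $names;
}

/**
 * Run the helper on a directory and return the file names it reports.
 */
function list_directory(string $helper, string $dir): array
{
    $command = escapeshellarg($helper) . ' ' . escapeshellarg($dir);
    $output = shell_exec($command);
    // shell_exec() does not return a string when the command fails
    // or produces no output, so fall back to an empty list.
    return is_string($output) ? parse_listing($output) : [];
}
```

A call would then be something like list_directory('C:\\tools\\listdir.exe', 'C:\\feeds'), where both paths are placeholders.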

Full Decent
You call this doing it in PHP? :p It's an acceptable work-around, though.
Artefacto
A: 

This displays Japanese filenames correctly on a Windows server:

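// $this->dir is the directory to list (this snippet lives inside a class method).
if ($handle = opendir($this->dir)) {
    // readdir() returns false once there are no more entries.
    while (false !== ($file = readdir($handle))) {
        // Convert from Windows code page 932 (Shift_JIS) to UTF-8.
        // This only helps when the server's ANSI code page is 932.
        $name = mb_convert_encoding($file, "UTF-8", "SJIS-win");
        echo "$name<br>";
    }
    closedir($handle);
}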
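
Since the goal in the question is an RSS feed, the converted names could then be escaped and wrapped in item elements. This is a minimal sketch, assuming the server's ANSI code page is 932 so that readdir() returns Shift_JIS bytes. The function names and the base URL are illustrative and do not come from the answer.

```php
<?php
// Sketch only: sjis_to_utf8(), rss_item() and the base URL are
// hypothetical names chosen for illustration.

/** Convert a file name from Windows code page 932 to UTF-8. */
function sjis_to_utf8(string $rawName): string
{
    return mb_convert_encoding($rawName, 'UTF-8', 'SJIS-win');
}

/** Build one RSS <item> element for a UTF-8 file name. */
function rss_item(string $utf8Name, string $baseUrl): string
{
    // Escape XML special characters in the title.
    $title = htmlspecialchars($utf8Name, ENT_QUOTES | ENT_XML1, 'UTF-8');
    // Percent-encode the name for the URL path, then escape it for XML.
    $link = htmlspecialchars(
        $baseUrl . rawurlencode($utf8Name),
        ENT_QUOTES | ENT_XML1,
        'UTF-8'
    );
    return "<item><title>$title</title><link>$link</link></item>";
}
```

Inside the loop above, each entry could be passed through rss_item(sjis_to_utf8($file), 'http://example.com/files/') instead of being echoed directly.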
dmortell