I can't use mkdir to create folders with UTF-8 characters.
<?php
$dir_name = "Depósito";
mkdir($dir_name );
?>
But, when I browse this folder in Windows Explorer, the folder name looks like this:
Depósito
What should I do?
The problem is that Windows uses UTF-16 for filesystem strings, whereas Linux and other systems use different character sets, often UTF-8. You provided a UTF-8 string, but Windows interprets it as some other 8-bit encoding, perhaps Latin-1, so the non-ASCII character, which takes 2 bytes in UTF-8, is handled as if it were 2 separate characters.
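To make the byte-level picture concrete, here is a minimal sketch that reproduces the garbled name inside PHP. It assumes the mbstring extension is available and that the "other 8-bit encoding" is ISO-8859-1; `misread_as_latin1` is a hypothetical helper name, not something from the question.

```php
<?php
// Hypothetical helper: treat every byte of a UTF-8 string as one
// ISO-8859-1 character, then re-encode the result as UTF-8 for display.
function misread_as_latin1(string $utf8): string
{
    return mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
}

$name = "Dep\xC3\xB3sito";            // "Depósito" written as UTF-8 bytes

echo bin2hex("\xC3\xB3"), "\n";       // c3b3: the two bytes that encode "ó"
echo misread_as_latin1($name), "\n";  // each of those bytes becomes its own character
```

The second line of output shows the same two-character sequence in place of the accented letter that Windows Explorer displays.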
A common solution is to keep your source code 100% ASCII and to store strings somewhere else. PHP 6 was meant to introduce Unicode functions, but it was never released, so do not count on those.
Just urlencode the string desired as a filename. All characters returned from urlencode are valid in filenames (NTFS/HFS/UNIX), so you can then just urldecode the filenames back to UTF-8 (or whatever encoding they were in).
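A minimal sketch of that round trip follows. The helper names `encode_filename` and `decode_filename` and the throwaway directory under the system temp folder are assumptions made for illustration; only `urlencode`, `urldecode`, `mkdir`, and `scandir` come from the discussion itself.

```php
<?php
// Hypothetical helpers wrapping the urlencode/urldecode round trip.
function encode_filename(string $utf8Name): string
{
    return urlencode($utf8Name);
}

function decode_filename(string $storedName): string
{
    return urldecode($storedName);
}

// Work in a throwaway directory so the sketch does not touch real data.
$base = sys_get_temp_dir() . DIRECTORY_SEPARATOR . uniqid('urlenc_demo_');
mkdir($base);

$name   = "Dep\xC3\xB3sito";           // "Depósito" as UTF-8 bytes
$stored = encode_filename($name);       // "Dep%C3%B3sito", pure ASCII on disk
mkdir($base . DIRECTORY_SEPARATOR . $stored);

// Read the entries back and decode them to the original UTF-8 names.
foreach (scandir($base) as $entry) {
    if ($entry === '.' || $entry === '..') {
        continue;
    }
    echo decode_filename($entry), "\n";
}

// Clean up the demo directories.
rmdir($base . DIRECTORY_SEPARATOR . $stored);
rmdir($base);
```

The name on disk is plain ASCII, so it looks the same in Windows Explorer and in PHP, at the cost of being less readable for anyone browsing the folder.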
Caveats (all apply to the solutions below as well):
[...] glob or reopening an individual file.
[...] scandir or similar functions for alpha-sorting. You must urldecode the filenames, then use a sorting algorithm aware of UTF-8 (and collations).
The following are less attractive solutions, more complicated and with more caveats.
On Windows, the PHP filesystem wrapper expects and returns ISO-8859-1 strings for file/directory names. This gives you two choices:
Use UTF-8 freely in your filenames, but understand that non-ASCII characters will appear incorrect outside PHP. A non-ASCII UTF-8 character will be stored as multiple single ISO-8859-1 characters. E.g. ó will appear as ó in Windows Explorer.
Limit your file/directory names to characters representable in ISO-8859-1. In practice, you'll pass your UTF-8 strings through utf8_decode before using them in filesystem functions, and pass the entries scandir gives you through utf8_encode to get the original filenames in UTF-8.
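The second choice can be sketched as a pair of conversion helpers. This assumes the ISO-8859-1 behavior described above and the mbstring extension; `to_fs_name` and `from_fs_name` are hypothetical names, and `mb_convert_encoding` is used here in place of `utf8_decode`/`utf8_encode`.

```php
<?php
// Hypothetical helpers; both assume the mbstring extension is loaded.
function to_fs_name(string $utf8Name): string
{
    // UTF-8 -> ISO-8859-1, for passing to mkdir(), fopen(), and similar.
    return mb_convert_encoding($utf8Name, 'ISO-8859-1', 'UTF-8');
}

function from_fs_name(string $fsName): string
{
    // ISO-8859-1 -> UTF-8, for names returned by scandir() and similar.
    return mb_convert_encoding($fsName, 'UTF-8', 'ISO-8859-1');
}

$name   = "Dep\xC3\xB3sito";           // "Depósito" as UTF-8 bytes
$fsName = to_fs_name($name);            // "ó" is now the single byte 0xF3

echo bin2hex($fsName), "\n";            // 446570f37369746f
var_dump(from_fs_name($fsName) === $name);
```

Characters that have no ISO-8859-1 equivalent cannot survive this round trip, which is why the approach only works if names are limited to that repertoire.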
Caveats galore!
[...] mb_convert_encoding instead of utf8_decode.
[...] unicode_semantics = On may change everything...
This nightmare is why you should probably just transliterate to create filenames.
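As a rough illustration of transliterating, the sketch below maps a handful of accented letters to ASCII with a hand-written table and replaces anything else outside printable ASCII with an underscore. The function name `transliterate_name` and the contents of the map are assumptions for this example only; a real project would likely need a much larger table or a dedicated transliteration library.

```php
<?php
// Hypothetical helper: map a few accented letters to ASCII and replace
// any remaining bytes outside printable ASCII with an underscore.
function transliterate_name(string $utf8Name): string
{
    // Illustrative map only; extend it for the characters you expect.
    $map = [
        "\xC3\xA1" => 'a', // á
        "\xC3\xA9" => 'e', // é
        "\xC3\xAD" => 'i', // í
        "\xC3\xB3" => 'o', // ó
        "\xC3\xBA" => 'u', // ú
        "\xC3\xB1" => 'n', // ñ
    ];
    $ascii = strtr($utf8Name, $map);

    // Collapse each run of leftover non-ASCII bytes into one underscore.
    return preg_replace('/[^\x20-\x7E]+/', '_', $ascii);
}

echo transliterate_name("Dep\xC3\xB3sito"), "\n"; // Deposito
```

Unlike the urlencode approach, this is lossy: the original UTF-8 name cannot be recovered from the filename, so store it separately if you need it later.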
It is possible to interact with the filesystem on Windows using a combo of 8.3 ShortPath and a COM Scripting.FileSystemObject :
http://github.com/nicolas-grekas/Patchwork/blob/lab/windows/class/WIN.php
It is not bulletproof (ShortPath support can be disabled on NTFS, for example), but it should work quite well for experimenting at least.