Unfortunately there is currently not a single solution that works with all browsers.
There are at least three "more obvious" approaches to the problem.
a) Content-type: application/octet-stream; charset=utf-8
+ filename=<utf8 byte sequence>
e.g. filename=Москва.txt
This violates the standards, but Firefox shows the name correctly. IE doesn't.
b) Content-type: application/octet-stream; charset=utf-8
+ filename=<urlencode(utf8 byte sequence)>
e.g. filename=%D0%9C%D0%BE%D1%81%D0%BA%D0%B2%D0%B0.txt
This works with IE but not with Firefox.
c) providing the name as specified in RFC 2231
e.g. filename*=UTF-8''%D0%9C%D0%BE%D1%81%D0%BA%D0%B2%D0%B0.txt
Again, Firefox supports this; IE doesn't.
For a more comprehensive comparison, see http://greenbytes.de/tech/tc2231/
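To make the three variants concrete, here is a minimal sketch that only builds the header values so they can be compared side by side. The helper name dispositionVariants and its array keys are made up for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper (not from the original answer): returns the three
// Content-Disposition values discussed above for a UTF-8 file name.
function dispositionVariants(string $utf8Name): array
{
    // rawurlencode() percent-encodes the UTF-8 bytes of the name.
    $encoded = rawurlencode($utf8Name);
    return [
        'a_raw_utf8'   => 'attachment; filename=' . $utf8Name,
        'b_urlencoded' => 'attachment; filename=' . $encoded,
        'c_rfc2231'    => "attachment; filename*=UTF-8''" . $encoded,
    ];
}

// Print the variants instead of sending them, so they can be inspected.
$variants = dispositionVariants('Москва.txt');
foreach ($variants as $label => $value) {
    echo $label, ': Content-Disposition: ', $value, "\n";
}
```

In a real script you would pick one variant and pass it to header() before any output is sent.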
Edit: When I said that there is no single solution, I meant via header('...'). But there is something of a workaround.
When there is no usable filename=xyz header, browsers use the basename of the path part of the URL. I.e. for <a href="test.php/lala.txt"> both Firefox and IE suggest lala.txt as the filename.
You can append extra path components after the actual path to your PHP script (when using Apache httpd, see the AcceptPathInfo directive: http://httpd.apache.org/docs/2.1/mod/core.html#acceptpathinfo).
E.g. if you have a file test.php in your document root and request it as http://localhost/test.php/x/y/z, the variable $_SERVER['PATH_INFO'] will contain /x/y/z.
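As a rough sketch of how a script might take that value apart, the snippet below splits a PATH_INFO string into its segments. The function name pathInfoSegments is hypothetical, and the path is passed in as a plain string so the snippet also runs outside a web server.

```php
<?php
// Hypothetical helper: splits a PATH_INFO value such as "/x/y/z" into
// its segments. Returns an empty array when there is no extra path.
function pathInfoSegments(?string $pathInfo): array
{
    $trimmed = trim((string) $pathInfo, '/');
    return $trimmed === '' ? [] : explode('/', $trimmed);
}

// In a real request the argument would be $_SERVER['PATH_INFO'] ?? null.
$segments = pathInfoSegments('/x/y/z');
print_r($segments);
```

The first segment can then act as a command and the second as an id, with anything after that left for the browser's name suggestion.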
Now, if you put a link like
<a href="/test.php/download/moskwa/Москва">Москва</a>
in your document, you can fetch the download/moskwa/... part from $_SERVER['PATH_INFO'] and initiate the download of the file. Without sending any filename=... information, both Firefox and IE suggest the "right" name.
You can even combine it with sending the name according to RFC 2231. That's why I also put moskwa into the link: it is the id the script uses to find the file it is supposed to send. IE ignores the filename*=... information and still uses the basename part of the URL to suggest a name. That means for Firefox (and any other client that supports RFC 2231) the part after the id is meaningless*, but for IE (and other clients not supporting RFC 2231) it is used for the name suggestion.
self-contained example:
<?php // test.php
// Demo data: 'htmlentities' holds the display name (plain UTF-8 or HTML entities).
$files = array(
    'moskwa' => array(
        'htmlentities' => 'Москва',
        'content' => '55° 45′ N, 37° 37′ O'
    ),
    'athen' => array(
        'htmlentities' => 'Αθήνα',
        'content' => '37° 59′ N, 23° 44′ O'
    )
);

// Extract the file id from the extra path info, e.g. /download/moskwa/...
$fileid = null;
if ( isset($_SERVER['PATH_INFO']) && preg_match('!^/download/([^/]+)!', $_SERVER['PATH_INFO'], $m) ) {
    $fileid = $m[1];
}

if ( is_null($fileid) ) {
    // No id requested: list a download link for every known file.
    foreach($files as $fileid=>$bar) {
        printf(
            '<a href="./test.php/download/%s/%s.txt">%s</a><br />',
            $fileid, $bar['htmlentities'], $bar['htmlentities']
        );
    }
}
else if ( !isset($files[$fileid]) ) {
    echo 'no such file';
}
else {
    $f = $files[$fileid];
    // Decode any HTML entities to UTF-8; plain UTF-8 input passes through unchanged.
    $utf8name = html_entity_decode($f['htmlentities'], ENT_QUOTES, 'UTF-8');
    // Percent-encode for filename*= (rawurlencode turns a space into %20, not "+").
    $utf8name = rawurlencode($utf8name);
    header("Content-type: text/plain");
    header("Content-Disposition: attachment; filename*=UTF-8''$utf8name.txt");
    header("Content-length: " . strlen($f['content']));
    echo $f['content'];
}
*) That's a bit like here on Stack Overflow. The link for this question is shown as
http://stackoverflow.com/questions/2578349/while-downloading-filenames-from-non-english-languages-are-not-getting-displayed
but it also works with
http://stackoverflow.com/questions/2578349/mary-had-a-little-lamb
The important part is the id 2578349.