Hey.

I recently encountered a strange problem on my website. Images with æ, ø and å in their filenames (Western European characters) won't display.

The character encoding on all pages is ISO-8859-1, and I can print æ, ø and å on the page without problems. If I right-click the broken image and choose Properties, it displays the filename with the European characters (/admin/content/galleri/å.jpg).

The code for the img tag looks like this:

<img name='bilde'
     src='content/{$_SESSION["linkname"]}/{$row["img"]}'
     class='topmargin_ss leftmargin_ms rightmargin_s'
     width='80' height='80'>

(Wasn't allowed to post images so the code is without starting and ending brackets)

I made 4 files: z.jpg, æ.jpg, ø.jpg and å.jpg.

Only z.jpg shows up, even though all four are the exact same JPEG. The images are uploaded using PHP code, which works: it uploads to the right directory and has no problem with the European characters.

Does anybody know what could be causing this?

A: 

You should percent-encode the URL as %xx, where xx is the hexadecimal value of a byte. Which character encoding those bytes are interpreted in depends on your web server; in most cases it is UTF-8.

The same encoding method may be used for encoding characters whose use, although technically allowed in a URL, would be unwise due to problems of corruption by imperfect gateways or misrepresentation due to the use of variant character sets, or which would simply be awkward in a given environment.

Whether the special symbol gets converted to UTF-8 and URL-encoded is browser dependent. I guess it isn't (otherwise it would work), because hardly anybody uses special symbols in file names, just a small subset of ASCII.
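A minimal sketch of doing the encoding on the server instead of relying on the browser. The helper name encode_image_path is hypothetical, and it assumes the path string arrives as ISO-8859-1 (like the page), that the server stores file names as UTF-8, and that the mbstring extension is available. If your file system uses a different encoding, change the target charset accordingly.

```php
<?php
// Hypothetical helper: convert a path to UTF-8, then percent-encode
// every path segment while leaving the "/" separators untouched.
// Assumption: $path is ISO-8859-1 and the file system uses UTF-8 names.
function encode_image_path(string $path, string $sourceCharset = 'ISO-8859-1'): string
{
    // Re-encode the bytes so they match the assumed file system encoding.
    $utf8 = mb_convert_encoding($path, 'UTF-8', $sourceCharset);

    // rawurlencode() turns each non-ASCII byte into %XX.
    $segments = array_map('rawurlencode', explode('/', $utf8));

    return implode('/', $segments);
}

// "\xE5" is the single ISO-8859-1 byte for å.
echo encode_image_path("content/galleri/\xE5.jpg"), "\n";
// prints: content/galleri/%C3%A5.jpg
```

The result would then go into the src attribute in place of the raw path.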

Pindatjuh
+1  A: 

This htmlentities('string', ENT_QUOTES, "UTF-8") works for me. For you that might be

$img = "<img name='bilde'
     src='" . htmlentities("content/{$_SESSION['linkname']}/{$row['img']}", ENT_QUOTES, "UTF-8") . "' class='topmargin_ss leftmargin_ms rightmargin_s' width='80' height='80'>";

You might need to apply utf8_decode($string) to the URL, but I never needed to do that when using htmlentities with "UTF-8".

NOTE : This assumes that the page is already utf-8 encoded. This can be done using header('Content-Type: text/html; charset=utf-8');. And the data in the db is saved as utf-8. This can be done by calling mysql_set_charset('utf8'); before you start making MySQL queries; the query "SET NAMES 'utf8'" does the same.

partoa
Tried that in the code but got an "Unexpected T_string" error.
Rakoon
Fixed that for you.
partoa
Rakoon
Maybe saving the data as UTF-8 would help. For a MySQL database all you need to do is call mysql_set_charset('utf8');
partoa
A: 

Have a look at bin2hex - you need to percent-encode those crazy umlauts
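Roughly what that looks like by hand, as a sketch only. The function name percent_encode_bytes is made up for illustration, and it encodes every byte it is given, so you would only pass it the non-ASCII part of a name. In practice PHP's built-in rawurlencode() covers this without the manual loop.

```php
<?php
// Hypothetical helper: percent-encode every byte of a string using bin2hex().
function percent_encode_bytes(string $bytes): string
{
    if ($bytes === '') {
        return '';
    }

    $out = '';
    foreach (str_split($bytes) as $byte) {
        // bin2hex() gives two lowercase hex digits per byte.
        $out .= '%' . strtoupper(bin2hex($byte));
    }

    return $out;
}

// "\xC3\xA5" is å encoded as UTF-8 (two bytes).
echo percent_encode_bytes("\xC3\xA5"), "\n";
// prints: %C3%A5
```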

Kevin Sedgley
+2  A: 

You've probably got a mismatch between the web page (in ISO-8859-1 == Latin1) and the filesystem the image files are on, which is probably UTF-8.

I would suggest:

a) Encode the web-pages in UTF-8 - it's more likely to work in more places.

b) Only use ASCII for filenames to avoid these problems.

Douglas Leeder
Thanks for the reply. I also arrived at the conclusion that the file system had to be using something different, since all other aspects of the page that could be using æ, ø and å (like the comments) work. I decided to try str_replacing the æ, ø and å in the file names with ae, oe and aa to avoid the problem.
Rakoon
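
The replacement described in the last comment could look something like the sketch below. The function name ascii_filename is hypothetical, and it assumes the script file and the incoming file name use the same encoding (UTF-8 here); if the upload arrives as ISO-8859-1, convert it first or the keys will not match. It uses strtr() with a map rather than several str_replace() calls, which is a substitution of mine, not what the comment specifies.

```php
<?php
// Hypothetical helper: replace æ, ø and å with ASCII digraphs
// before the uploaded file is saved.
// Assumption: this source file and $name are both UTF-8.
function ascii_filename(string $name): string
{
    $map = [
        'æ' => 'ae', 'ø' => 'oe', 'å' => 'aa',
        'Æ' => 'Ae', 'Ø' => 'Oe', 'Å' => 'Aa',
    ];

    // strtr() with an array replaces each key in a single pass.
    return strtr($name, $map);
}

echo ascii_filename('blåbær.jpg'), "\n";
// prints: blaabaer.jpg
```

Applying this at upload time means the stored name and the name written into the img tag are both plain ASCII, so the page encoding no longer matters for the URL.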