Hello!

I'm building a site and I've set its content type to use charset UTF-8. I'm also using HTML escapes for special characters, i.e. instead of á I write &aacute;.

Now I wonder (while still building the site) whether it was really necessary to do both. Looking for the answer, I found this:

http://www.w3.org/International/questions/qa-escapes.en.php

It says that I should not use HTML escapes for any special characters except >, < and &. But the reason given is that escapes

can make it difficult to read and maintain source code, and can also significantly increase file size.

I think that's true, but it's a very weak argument. Is using the escapes really THE SAME thing as using the special characters themselves?

+7  A: 

The article is in fact correct. If you have proper UTF-8 encoded data, there is no reason to use HTML entities for special characters on normal web pages any more.

I say "on normal web pages", because there are highly exotic borderline scenarios where using entities is still the safest bet (e.g. when serving JavaScript code to an external page with unknown encoding). But for serving pages to a browser, this doesn't apply.
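As a rough illustration of why the two forms are equivalent once the page is parsed, here is a minimal Python sketch. It uses the standard library's `html` module as a stand-in for a browser's HTML parser, which is an assumption made for demonstration only; the sample strings and variable names are made up for this example.

```python
import html

literal = "á"          # the character itself, stored as UTF-8 in the page
named = "&aacute;"     # named character reference
numeric = "&#225;"     # numeric character reference for U+00E1

# After parsing, both references resolve to the same code point as the literal.
assert html.unescape(named) == literal
assert html.unescape(numeric) == literal

# The file-size argument from the W3C article: bytes needed for each form.
literal_size = len(literal.encode("utf-8"))
named_size = len(named.encode("utf-8"))
print(literal_size, named_size)

# Escaping only the markup-significant characters (&, < and >) and leaving
# everything else as literal UTF-8 text.
sample = "café < thé & crème"  # hypothetical page text
escaped = html.escape(sample, quote=False)
print(escaped)
```

The parsed result is identical either way, so the choice only affects the source: the literal character takes 2 bytes in UTF-8, while the named escape takes 8.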

Pekka
In theory, that's right. However, you should confirm that it works across all browsers that you wish to support, including older versions.
Spudley
@Spudley good point. But as far as I know, UTF-8 characters are supported in all browsers down to IE 6 (IE5 had problems). Display problems are usually the result of incorrectly declared encodings and the like. Are there other cases of an older product that doesn't fully support UTF-8 characters and works better with entities? I'd be interested to know.
Pekka
@Pekka, @Spudley: I am developing for all major browsers out there (including IE6) and I can confirm that this works everywhere if you specify/encode everything correctly.
elusive
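To make the "incorrectly declared encoding" failure mentioned in the comments concrete, here is a small hedged sketch in Python. It assumes a page whose bytes are UTF-8 but which is decoded as Latin-1; decoding in Python only approximates what a browser does with a wrong charset declaration, and the variable names are made up for this example.

```python
# The page is saved as UTF-8, so "á" is stored as two bytes.
page_bytes = "á".encode("utf-8")

# Decoded with the declared (correct) charset, the character survives.
correct = page_bytes.decode("utf-8")

# Decoded with a wrongly declared charset, each byte becomes its own character.
misread = page_bytes.decode("latin-1")
print(repr(correct), repr(misread))

# An entity is plain ASCII, so it reads the same under either charset.
# This is why entities remain the safer choice when the encoding of the
# consuming page is unknown.
entity_bytes = "&aacute;".encode("ascii")
assert entity_bytes.decode("utf-8") == entity_bytes.decode("latin-1")
```

In this sketch the misread text comes out as two characters instead of one, which matches the usual symptom of a mismatched declaration rather than missing UTF-8 support.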