ansaurus

Question

Answer 1

A:

It's charset, not chartset.

1) It depends on where the apostrophe is used. It's a valid ASCII character as well, so depending on what the character is for (whether it's for display only, inside a DOMText node, or used in code), you may or may not be able to use a literal apostrophe.

2) If your editor is a modern editor, it will be storing text as UTF-8 byte sequences rather than as one byte per character. Most of the characters used in code are just plain ASCII (and ASCII is a subset of UTF-8), so those characters take up one byte each. Other characters may take up two, three or even four bytes. They will still be displayed to you as one character, but the relation between characters and bytes is no longer one-to-one.
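To make the character/byte relation concrete, here is a minimal sketch in Python (Python is my choice for illustration, it is not part of the original answer, and the sample characters are arbitrary). It counts how many bytes a few single characters occupy once encoded as UTF-8.

```python
# Count the UTF-8 bytes used by a few single characters.
# The sample characters are arbitrary picks for illustration.
samples = [
    "'",           # U+0027 APOSTROPHE (plain ASCII)
    "\u00e9",      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
    "\u25ba",      # U+25BA BLACK RIGHT-POINTING POINTER
    "\U0001F600",  # U+1F600 GRINNING FACE
]

# Map each character to the length of its UTF-8 encoding.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in byte_lengths.items():
    print(f"U+{ord(ch):04X}: {n} byte(s)")
```

Each entry is still one character to the reader, but the byte count grows from one to four as the code point gets larger.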

Anyway, since all valid ASCII characters are encoded exactly the same in ASCII, UTF-8 and even windows-1252, you should not see any problems using UTF-8. And you can still use numeric and named entities, because they are written in those valid characters. You just don't have to.
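A quick way to check this claim yourself, sketched in Python under the assumption that the codec name `cp1252` stands in for windows-1252 (the sample markup is made up for the demonstration):

```python
# ASCII-only markup, including entities, encodes to identical bytes
# in all three encodings. The markup string is a made-up sample.
text = "<span title='x'>It's plain ASCII: &#39; &amp;</span>"

encodings = ["ascii", "utf-8", "cp1252"]
encoded = {name: text.encode(name) for name in encodings}

# True when every encoding produced exactly the same byte string.
all_identical = len(set(encoded.values())) == 1
print(all_identical)

# A non-ASCII character is where the encodings start to differ.
e_acute_utf8 = "\u00e9".encode("utf-8")
e_acute_cp1252 = "\u00e9".encode("cp1252")
print(e_acute_utf8, e_acute_cp1252)
```

Because entities such as `&#39;` are spelled entirely with ASCII characters, they survive unchanged whichever of these encodings the file is saved in.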

P.S. All modern browsers can do UTF-8 just fine, but our definitions of "modern" may vary.

Kris 2010-10-13 09:52:05

Answer 2

A:

Entities have three purposes: encoding characters that can't be represented in the character encoding used (not relevant with UTF-8), encoding characters that aren't convenient to type on a given keyboard, and encoding characters that are illegal unescaped.

`&#9658;` (the numeric reference for ►) should always produce ► no matter what the encoding. If it doesn't, it's a bug elsewhere.

► directly in the source is fine in UTF-8. You can do either that or the entity, and it makes no difference.
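As a rough illustration that the reference and the literal character end up as the same thing once parsed, here is a small Python sketch. It uses the standard library's `html.unescape`, which is only a stand-in for what a browser's parser does, not a claim about any particular browser.

```python
import html

# Decimal and hexadecimal numeric references for U+25BA.
decimal_ref = html.unescape("&#9658;")
hex_ref = html.unescape("&#x25BA;")

# The same character written directly (escaped here so this
# snippet does not depend on the file's own encoding).
literal = "\u25ba"

print(decimal_ref == hex_ref == literal)
```

The references are resolved to a code point, so the encoding of the file they were written in plays no part in the result.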

A literal ' is fine in most contexts, but not all. Both apostrophes in the following are allowed:

<span title="Jon's example">This is Jon's example</span>

But it would have to be encoded in:

<span title='Jon&#x27;s example'>This is Jon's example</span>

because otherwise it would be taken as the ' that ends the attribute value.
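If the markup is generated by code, the escaping can be left to a library. This is a minimal sketch assuming Python's standard `html.escape`. The variable names are made up for the example, and other languages have their own equivalents.

```python
import html

title = "Jon's example"

# quote=True (the default) also escapes " and ', so the result is
# safe inside either a single- or double-quoted attribute value.
escaped = html.escape(title, quote=True)

# Only the attribute value needs the escaped apostrophe; in element
# text a literal apostrophe is fine.
single_quoted = f"<span title='{escaped}'>This is {title}</span>"
print(single_quoted)
```

Decoding the attribute value with `html.unescape(escaped)` gives back the original string, so nothing is lost by escaping.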

Jon Hanna 2010-10-13 10:01:57

Thanks Jon. Some of my keywords include apostrophes; do you know how search engines interpret the entities? For example, do they see `widget&#39;s` as the same as `widget's`? I have been wondering if they stop at the entity and just see `widget`. That would be a good reason for me not to use the entity in this circumstance.

cranfan 2010-10-13 10:39:50

A search engine that couldn't follow the basic rules of HTML to the extent that it knows `&#39;` in source is the same as `'` (or even that `&#74;` is the same as `J`; there's just never much point doing that) isn't going to be worth worrying about. As it is, they'll not only understand that it's an apostrophe, they'll even be quite sophisticated in working out whether or not to include the apostrophe in matching it to search terms, etc.
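For what it's worth, any ordinary HTML parser behaves this way. The sketch below uses Python's standard `html.parser` as a stand-in for a crawler's parser; it says nothing about how a real search engine is implemented, and `TextCollector` and `visible_text` are hypothetical names made up for the example.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the text content of a document, ignoring the tags."""

    def __init__(self):
        # convert_charrefs=True makes the parser decode character
        # references before handing text to handle_data.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(markup):
    """Return the decoded text a parser sees in the given markup."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()  # flush any text still buffered
    return "".join(parser.chunks)


with_entity = visible_text("<p>widget&#39;s</p>")
with_literal = visible_text("<p>widget's</p>")
print(with_entity == with_literal)
```

Both inputs yield the same text, so anything working from the parsed document never sees a difference between the entity and the literal apostrophe.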

Jon Hanna 2010-10-13 13:31:26

chartset-utf8 and character entities