I'm trying to find out how character sets/encoding are implemented in browsers, specifically Unicode.

  1. Are character sets/encodings implemented separately in each browser, or is it OS-specific?
  2. Is it possible to find out which version of the Unicode Character Database (UCD) is being used?
  3. How are UCD updates pushed to each browser/OS? (Are they ever pushed out via automatic updates, or is the version fixed by whatever browser/OS version you're using?)
  4. Links to character sets/encoding information for each browser/OS manufacturer would be nice.

Thanks

+2  A: 

I don't believe the browsers worry about the UCD at all.

A well-formed page will have a charset defined for it. Example:
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />

Any text that is being displayed will have a list of fonts defined for it (in preferred order). Example:
p { font-family: Verdana, Arial, sans-serif; }

For any character on the page, the browser simply looks up the glyph in the first font in the list. If that font has no glyph for the character, it moves to the next font in the list. If it runs out of listed fonts, it probably just uses whatever uber-font the OS provides (Arial, for example).
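
The lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the coverage sets, the `SystemFallback` name, and the `pick_font` helper are all made up for the example, and real browsers read coverage from the font files themselves and apply more involved matching rules.

```python
from typing import List, Optional

# Hypothetical coverage tables: font name -> set of code points it has glyphs for.
# Real fonts carry this information in their own tables; these values are invented.
FONTS = {
    "Verdana": set(range(0x20, 0x7F)),                             # basic Latin only
    "Arial": set(range(0x20, 0x7F)) | set(range(0x0400, 0x0500)),  # Latin + Cyrillic
}

# Hypothetical OS-provided fallback font, assumed here to cover the Thai block.
SYSTEM_FALLBACK = "SystemFallback"
FALLBACK_COVERAGE = set(range(0x0E00, 0x0E80))


def pick_font(ch: str, font_stack: List[str]) -> Optional[str]:
    """Return the first font in the stack that has a glyph for ch.

    Falls back to the (assumed) system font, and returns None when
    nothing covers the character.
    """
    code_point = ord(ch)
    for name in font_stack:
        if code_point in FONTS.get(name, set()):
            return name
    if code_point in FALLBACK_COVERAGE:
        return SYSTEM_FALLBACK
    return None


stack = ["Verdana", "Arial"]
for ch in ["A", "\u0416", "\u0e01"]:
    print(repr(ch), "->", pick_font(ch, stack))
```

Note that the choice is made per character, so a single paragraph can end up drawn from several fonts.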

Chris Perkins
I'm thinking more of internationalized support. If I know which UCD version is being used, then I know what glyphs the user will need to type in their language. If the UCD version is specific to a particular browser or OS, then I can redirect the user to a specific page, or validate text accordingly, etc.
EddyR
It doesn't work like that. By specifying the character set (and, to a lesser extent, the desired font), the page has done its job ("This is what you need to view me"). The browser will follow those instructions as best it can and will lean on the OS to provide glyphs for anything it can't find in a named font. These days the default fallback uber-font on modern OSes is very robust: common Chinese, Japanese, Hebrew, Arabic and Cyrillic are all there. I build web pages in all of these and there is never a problem. If the user reads Thai, they'll likely have Thai glyphs installed.
Chris Perkins
EddyR