views:

175

answers:

3

I'm not sure if it's my system, although I haven't done anything unusual with it, but I've started noticing incorrectly rendered characters popping up in web pages, text-files, like this:

I have a hunch it's a related to the fairly recent trend to use unicode for everything, which is a good thing I think, combined with fonts that don't support all possible characters.

So, does anyone know what's causing these blips (am I right?), and how do I stop this showing up in my own content?

+2  A: 

It appears that for this particular author, the text was edited in some editor that assumed it wasn't UTF8, and then re-wrote it out in UTF8. I'm basing this off the fact that if I tell my browser to interpret the page as different common encodings, none make it display correctly. This tells me that some conversion was done at some point improperly.

The only problem with UTF8 is that there isn't a standardized way to recognize that a file is UTF8, and until all editors are standardizing on UTF8, there will still be conversion errors. For other unicode variants, a Byte Order Mark (BOM) is fairly standard to help identify a file, but BOMs in UTF8 files are pretty rare.

To keep it from showing up in your content, make sure you're always using unicode-aware editors, and make sure that you always open your files with the proper encodings. It's a pain, unfortunately, and errors will occasionally crop up. The key is just catching them early so that you can undo it or make a few edits.

NilObject
A: 

I'm fairly positive it's nothing you can do. I've seen this on the front page of digg alot recently. It more than likely has to do with a character being encoded improperly. Not necessarily a factor of the font, just a mistake made somewhere in translation.

contagious
A: 

It looked for a while like the underscore and angle bracket problem had gone away, but it seems it might not be fixed.

here's a small sample, which should look like this:

__
#include

____
#include <stdio.h>
____
#include 

Update: it looks like it's fixed in display mode, and only broken in edit mode

Mark Harrison