All the character encodings that I have seen have a code table, in which each code point corresponds to a character that should be represented or drawn. This seems to fit the MVC pattern.

The weird characters appear because a program looks up the given code points in the wrong table.

If so, it should be no problem for me to copy some weird characters from MS Notepad into UltraEdit and then choose a suitable encoding in UltraEdit to read them. But I can't do that. Why? Am I wrong?

A: 

Well, it is a bit more than simply a code table, as different encodings use different numbers of bytes per code point; UTF-8, for example, is variable-length, which makes it particularly risky to get wrong.
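To make the bytes-per-code-point difference concrete, here is a minimal Python sketch. The sample string and the helper name `encoded_lengths` are just illustrative choices, not anything from the question:

```python
# Sample text: an ASCII letter, an accented letter, and the euro sign.
text = "Aé€"

def encoded_lengths(s, encoding):
    """Return how many bytes each character of s occupies in the given encoding."""
    return [len(ch.encode(encoding)) for ch in s]

utf8 = encoded_lengths(text, "utf-8")       # variable-length: [1, 2, 3]
utf16 = encoded_lengths(text, "utf-16-le")  # two bytes each here: [2, 2, 2]
cp1252 = encoded_lengths(text, "cp1252")    # single-byte code page: [1, 1, 1]

print(utf8, utf16, cp1252)
```

So the same three characters take 6, 6, or 3 bytes depending on the encoding, and a reader that guesses wrong will not even agree on where one character ends and the next begins.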

Ultimately, if you try to open a file with the wrong encoding, then the data should be considered corrupt and suspect.
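As a rough illustration of why the damage may not be reversible, here is a small Python sketch. It assumes the wrong decoder substitutes a replacement character for bytes it cannot interpret, which is one common behaviour; I am not claiming it is exactly what Notepad or UltraEdit do internally:

```python
original = "café"
raw = original.encode("cp1252")          # b'caf\xe9'

# Wrong guess: the bytes are read as UTF-8. The lone byte 0xE9 is not a
# valid UTF-8 sequence, so it is replaced by U+FFFD.
garbled = raw.decode("utf-8", errors="replace")

# Once only the decoded text is kept (for example after copy and paste),
# the original byte is gone and no choice of encoding brings it back.
attempt = garbled.encode("utf-8").decode("cp1252")

# The opposite mistake can be undone, but only while the bytes are intact.
mojibake = "é".encode("utf-8").decode("cp1252")          # 'Ã©'
restored = mojibake.encode("cp1252").decode("utf-8")     # 'é'

print(repr(garbled), repr(attempt), repr(mojibake), repr(restored))
```

In other words, picking a different encoding only helps when the editor still has the original bytes; text that has already been decoded wrongly and copied elsewhere may have lost information for good.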

Marc Gravell