I have a ColdFusion script that does:

<cfset content = replace(content,"&##147;","""","all")>

This replaces &#147; with ". Google understands this too: if you type &#145; &#147; &#233; into its search box, it's transformed on the results page to ‘ “ é.

If I search for é on this HTML Entity Character Lookup page, it returns &#233; to me. But ‘ and “ don't return 145 and 147.

So the question is: what numeric reference, character encoding, or whatever else is being used here by ColdFusion? Where can I see that 145 maps to ‘, 147 maps to “ and 233 maps to é?

Bonus thanks if someone provides a page listing these characters (since more are replaced on the script).

Edit: Havenard made me realize I was typing the wrong characters because my browser (Opera) was displaying them badly, so now I'm using Firefox to edit this question, and hopefully it'll be clear :)

Cheers,

+1  A: 

If you are on Windows, you can use CHARMAP.EXE to get those codes.

HTML entities can represent symbols by their numeric reference (like the ones you have, which you can see in the Character Map) or by an alias like &lt; for <, &gt; for >, &quot; for ", etc.

Here is a list: http://www.w3schools.com/tags/ref%5Fentities.asp

Havenard
In your list, é matches #233, but ' and " don't match my numbers. Or am I overlooking something really obvious?...
inerte
Yes, it is not ' ". It is ‘ “. They are a little bit different. Maybe you can't even notice it depending on the font you're using.
Havenard
Ok, the visual difference in Opera is negligible, if it exists at all, so my previous comment doesn't make sense. But my question remains if inverted, from numeric to character. I still can't find what 145 and 147 are supposed to be. Even Wikipedia's en.wikipedia.org/wiki/ISO/IEC_8859-1 table has a gap in this range :(
inerte
145 and 147 are ‘ and “. Not ' and ". Trust me, they are different.
Havenard
@Havenard, I know *what* they are; I created an HTML document with them and saw it, but I still don't know where they came from. Your link, GaVrA's link, Wikipedia's: nowhere that I've searched answers this with something like "Oh, &#145; is ‘ because ISO-XXXX-X/Unicode/Latin1/ascii says so". I can *see* what 145 is, I just have no idea why! What's the encoding/reference/standard?
inerte
I believe I didn't get your point. Is this about programming or **grammar**?
Havenard
@Havenard, programming... `&#145; == ‘ on what encoding/reference?` in the title and `what's the numeric reference, character encoding, or whatever else, is being used here by ColdFusion? Where can I see that 145 maps to ‘, 147 maps to “ and 233 maps to é?` in the body of my question...
inerte
+1  A: 

Maybe something like this?

http://www.w3schools.com/TAGS/ref%5Furlencode.asp

GaVrA
It's *like* this, but not exactly, it's another numeric character reference. In yours, é == %E9 and ' == %27.
inerte
It's because urlencode uses hexadecimal values. E9 hex = 233 decimal.
Havenard
Yes, those numbers should work.
Chuck
+1  A: 

Found it: Windows-1252. It took me a long time, but thanks to everyone who tried to help :)
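
For anyone who wants to check the mapping themselves, here is a minimal sketch in Python (not ColdFusion, and not what the script above does internally). It assumes Python's built-in `cp1252` codec as the Windows-1252 table; the helper names `cp1252_char` and `cp1252_extras` are made up for illustration. In ISO-8859-1 and Unicode, code points 128 to 159 are control characters, which is why the Latin-1 tables show a gap there, while Windows-1252 puts printable characters in most of that range.

```python
import html


def cp1252_char(code):
    """Decode a single byte value (0-255) using the Windows-1252 table."""
    return bytes([code]).decode("cp1252")


def cp1252_extras():
    """Collect the printable characters Windows-1252 defines in 128-159."""
    table = {}
    for code in range(128, 160):
        try:
            table[code] = cp1252_char(code)
        except UnicodeDecodeError:
            # A few byte values in this range are unassigned in the codec.
            pass
    return table


# The three references from the question.
for code in (145, 147, 233):
    ch = cp1252_char(code)
    print(f"&#{code}; -> {ch} (U+{ord(ch):04X})")

# Python's HTML unescaping applies the same legacy remapping that browsers
# use for numeric references in the 128-159 range.
print(html.unescape("&#145; &#147; &#233;"))

# The full list of remapped characters, for the "bonus" part of the question.
for code, ch in cp1252_extras().items():
    print(code, ch)
```

The last loop prints every character in that range, so it can serve as the list of replacements to handle in the script.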

inerte