Obviously, there must be something stupid I'm doing. The Unicode chart for superscripts and subscripts says #00B2 is superscript 2, but I get scrambled output. 0078 is x, but I get N, and 0120 gives x. Am I reading the wrong manual?


EDIT

$x = '&#78;';

print html_entity_decode($x, ENT_NOQUOTES, 'UTF-8') . "\n";
+3  A: 

I think you might be confusing decimal and hexadecimal values. For example, hexadecimal 0x78 is lower-case x, but decimal 78 (hexadecimal 0x4e) is upper-case N.
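To see the two readings side by side, here is a minimal sketch using PHP's built-in `hexdec`, `dechex` and `chr`. The variable names are only for illustration.

```php
<?php
// The digits "78" read as hexadecimal are 120 in decimal.
$hexAsDecimal = hexdec('78');   // 120

// The number 78 read as decimal is 4e in hexadecimal.
$decimalAsHex = dechex(78);     // "4e"

// The two readings give different ASCII characters.
$fromHex     = chr(0x78);       // "x"
$fromDecimal = chr(78);         // "N"

printf("0x78 = %d => %s\n", $hexAsDecimal, $fromHex);
printf("78 = 0x%s => %s\n", $decimalAsHex, $fromDecimal);
```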

In HTML, you can write a numeric character reference either in decimal, as &#n;, or in hexadecimal, as &#xn; (where n is replaced with the decimal or hexadecimal character code). For a superscript 2, you could use either &#178; or &#xB2;.

In your example code, you are decoding the entity &#78;. This is a decimal entity, so you get the expected result (upper-case N). The Unicode tables you've linked to use hexadecimal. To get a lower-case x, you would have to use &#x78; as the input.
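The decimal and hexadecimal forms can be checked directly with the same `html_entity_decode` call as in the question. This is only a sketch: it assumes PHP 5.4 or later (for the short array syntax) and a UTF-8 terminal, and the entity list and the `$results` name are picked for illustration.

```php
<?php
// Decimal and hexadecimal spellings of N, x and superscript 2.
$entities = ['&#78;', '&#x78;', '&#120;', '&#178;', '&#xB2;'];

$results = [];
foreach ($entities as $entity) {
    // Same flags and charset as in the question.
    $results[$entity] = html_entity_decode($entity, ENT_NOQUOTES, 'UTF-8');
    printf("%-7s => %s\n", $entity, $results[$entity]);
}
```

Here &#78; should come out as N, while &#x78; and &#120; both give x, and &#178; and &#xB2; both give the superscript 2.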

Phil Ross
OK, that's great, it works. Could you please tell me if the Unicode chart I'm using is correct: http://www.unicode.org/charts/PDF/U2070.pdf . It says 00B2 is superscript 2.
robert
@robert The chart is correct. 00B2 is in hexadecimal. If you use `&#xB2;` you will get a superscript 2 (note the extra 'x').
Phil Ross
@robert: aren't you missing 'x' to specify the hexadecimal value?
Naveen
@robert Being an official Unicode chart, it is correct by definition.
Agos
OK, thanks. I missed the x.
robert