I run the following HTML snippet in IE8 and IE7 with non-English characters (we tried both Hebrew and Chinese), and the second link never works properly: the text displayed in the alert box is mangled.
This occurs in IE8 and IE7, but not in Firefox, and it does not depend on Windows' regional settings.

Here is the HTML snippet (HTML header and footer omitted for brevity; the page's content-type is "text/html; charset=utf-8", and the HTTP response header says the same):

<p>
  <a href="javascript:alert('abשלוםab')">link with English and Hebrew text</a>
  <a href="javascript:alert('ab%D7%A9%D7%9C%D7%95%D7%9Dab')">same text, url encoded</a>
</p>

Here is the alert box that pops up when clicking the second link:

[Screenshot: alert box showing the mangled text]

I know that the string "שלום" (4 letters) is encoded as 8 bytes in UTF-8, so there are 8 %NN items, and there are also 8 weird characters in the alert box. The problem is: how can I make IE recognize that this is UTF-8 encoded text, like Firefox does?
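To make the byte counting concrete, here is a small sketch that decodes the same escaped string in two ways. The helper name decodeBytewise is made up for illustration, and the idea that IE treats each %NN as its own single-byte character is only an assumption that happens to match the 8 weird characters; it is not confirmed behavior.

```javascript
// The escaped string from the second link.
var encoded = 'ab%D7%A9%D7%9C%D7%95%D7%9Dab';

// Decode the percent-escapes as UTF-8 byte sequences:
// every two bytes combine into one Hebrew letter.
var asUtf8 = decodeURIComponent(encoded);

// Hypothetical helper: decode each %NN as a separate character.
// ASSUMPTION: this imitates what IE appears to do; it is a guess.
function decodeBytewise(s) {
  return s.replace(/%([0-9A-Fa-f]{2})/g, function (match, hex) {
    return String.fromCharCode(parseInt(hex, 16));
  });
}
var asSingleBytes = decodeBytewise(encoded);

// "ab" + 4 letters + "ab" versus "ab" + 8 stray characters + "ab".
var lengths = [asUtf8.length, asSingleBytes.length];
```

Here asUtf8 holds the intended text, while asSingleBytes holds one character per byte, which is consistent with the count of odd characters in the alert box.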

UPDATE:

The full html (of the minimal example) is available here.
I tried decodeURI, decodeURIComponent, and unescape, but without success. Moving the code from href to onclick solves the issue. My problem is that some of the content is generated by other sources outside my control, so I end up with javascript: links inside the href attribute.

+4  A: 

Since the URI escapes aren't working reliably for you, I wonder if you might be better off with JavaScript Unicode escapes instead? E.g. (within the JavaScript string), \uXXXX, where XXXX is the four-digit hexadecimal Unicode code point of the character to display (for example, \u05E9 for ש). That also has the advantage of working outside a javascript: URI, in case you move the code to a JavaScript file or something.

So based on your reply about the code points, that would look like this:

<a href="javascript:alert('ab\u05E9\u05DC\u05D5\u05DDab')">same text, using JavaScript escapes</a>
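If the links are generated by code, the escapes could be produced mechanically rather than by hand. The following is a minimal sketch; the function name toJsUnicodeEscapes is hypothetical, and it only handles the non-ASCII part (it does not escape quotes or backslashes, which a real generator would also need to do).

```javascript
// Hypothetical helper: replace every non-ASCII UTF-16 code unit
// with a \uXXXX JavaScript escape, leaving ASCII untouched.
function toJsUnicodeEscapes(text) {
  var out = '';
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    if (code < 0x80) {
      // Plain ASCII is copied as-is.
      out += text.charAt(i);
    } else {
      // Pad to four uppercase hex digits, e.g. 0x5E9 becomes "05E9".
      var hex = code.toString(16).toUpperCase();
      out += '\\u' + ('0000' + hex).slice(-4);
    }
  }
  return out;
}

// Build the string literal that goes inside alert('...').
var escaped = toJsUnicodeEscapes('abשלוםab');
```

Because the loop works on UTF-16 code units, characters outside the Basic Multilingual Plane would come out as two \uXXXX escapes (a surrogate pair), which JavaScript string literals accept.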

...which looks like this in my tests:

IE7: IE7 result image

IE8: IE8 result image (via RD, hence the basic look)

Chrome: Chrome result image

Firefox: Firefox result image

(There would appear to be some disagreement about that last code point for some reason.)

T.J. Crowder