First of all, I am aware of this question:

and specifically the best answer therein, http://emilsblog.lerch.org/2009/07/javascript-hacks-using-xhr-to-load.html.

So accessing binary data from JavaScript works in Firefox (and in later versions of Chrome, which actually seem to work too; I don't know about Opera). So far so good. But I am still hoping to find a way to access binary data with a modern IE (ideally IE 6, but at least IE 7+), without using VBScript. It has been mentioned that XHR.responseBody would not work (if the data contains zero bytes), but I was wondering whether this might have been resolved in newer versions, or whether there are alternate settings that would allow simple binary data access.
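For reference, the Firefox/Chrome approach can be sketched roughly as below. This is my reading of the technique rather than code copied from the linked post: it assumes the `x-user-defined` charset trick, where the browser leaves each byte in the low 8 bits of a character code. The names `fetchBinary` and `responseTextToBytes` are made up for illustration.

```javascript
// Recover raw byte values from a responseText that was decoded as
// "x-user-defined": bytes 0x80..0xFF arrive as U+F780..U+F7FF, so masking
// with 0xFF gives back the original byte.
function responseTextToBytes(text) {
  var bytes = [];
  for (var i = 0; i < text.length; i++) {
    bytes.push(text.charCodeAt(i) & 0xFF);
  }
  return bytes;
}

// Hypothetical helper: load a URL and hand the bytes to callback(err, bytes).
// overrideMimeType is not available on the IE versions discussed here,
// which is exactly the problem this question is about.
function fetchBinary(url, callback) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url, true);
  xhr.overrideMimeType('text/plain; charset=x-user-defined');
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) return;
    if (xhr.status === 200) {
      callback(null, responseTextToBytes(xhr.responseText));
    } else {
      callback(new Error('HTTP ' + xhr.status));
    }
  };
  xhr.send(null);
}
```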

My specific use case is accessing data returned by a web service that uses a binary data transfer format (including byte combinations that are not legal in UTF-8).

A: 

OK, I have found some interesting leads, although no completely good solution yet.

One obvious thing I tried was to play with encodings. There are a few obvious candidates that really should work:

  • Latin-1 (aka ISO-8859-1): it is a single-byte encoding that maps one-to-one onto the first 256 Unicode code points. So theoretically it should be enough to declare a content type of "text/plain; charset=ISO-8859-1" and get one character per byte. Alas, due to the idiotic logic of browsers (and an even more idiotic mandate by HTML 5!), some transcoding occurs that changes the high control character range (codes 128 - 159) in strange ways. Apparently this is because browsers are required to treat the encoding as if it really were Windows-1252 (why? For some silly reasons... but it is what it is).
  • UCS-2 is a fixed-length 2-byte encoding that predated UTF-16; it simply splits 16-bit character codes into 2 bytes. Alas, browsers do not seem to support it.
  • UTF-16 might work, theoretically, but there is the problem of the surrogate code units (0xD800 - 0xDFFF), which are reserved. If the data contains byte pairs that encode these values, corruption occurs.

However: it seems the conversion for Latin-1 might be reversible, and if so, I bet I could make use of it after all. All mutations are from 1 byte (0x00 - 0xFF) into larger-than-byte values, and there are no ambiguous mappings, at least for Firefox. If this holds true for other browsers, it will be possible to map the values back and remove the ill effects of the automatic transcoding. And that would then work for multiple browsers, including IE.
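A minimal sketch of such a reverse mapping, assuming the browser applies the standard Windows-1252 table to bytes 0x80 - 0x9F and passes the five bytes that Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D) through unchanged. I have not verified this against every browser, and the names `CP1252_HIGH` and `latin1TextToBytes` are just illustrative.

```javascript
// Code points that Windows-1252 assigns to bytes 0x80..0x9F, in byte order.
// Assumption: undefined bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) map to themselves.
var CP1252_HIGH = [
  0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
  0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
  0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
  0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
];

// Build the reverse table: transcoded code point -> original byte.
var CP1252_REVERSE = {};
for (var k = 0; k < CP1252_HIGH.length; k++) {
  CP1252_REVERSE[CP1252_HIGH[k]] = 0x80 + k;
}

// Convert text that was served as "charset=ISO-8859-1" back into byte values.
function latin1TextToBytes(text) {
  var bytes = new Array(text.length);
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    if (code > 0xFF) {
      // Larger-than-byte value: must be one of the transcoded characters.
      var original = CP1252_REVERSE[code];
      if (original === undefined) {
        throw new Error('Unexpected character U+' + code.toString(16) + ' at index ' + i);
      }
      code = original;
    }
    bytes[i] = code;
  }
  return bytes;
}
```

Throwing on an unknown character is deliberate: if some browser uses a different table, it is better to find out than to silently corrupt the data.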

Finally, some useful links for conversions of datatypes are:

StaxMan
A: 

I guess the answer is a plain "no", as per this post: http://stackoverflow.com/questions/1919972/how-do-i-access-xhr-responsebody-from-javascript

(or: "use VBScript to help")

StaxMan