I've got a web page that uses XMLHttpRequest to download a binary resource.

Because it's binary, I'm trying to use xhr.responseBody to access the bytes. I've seen a few posts suggesting that it's impossible to access the bytes directly from JavaScript. This sounds crazy to me.

Weirdly, xhr.responseBody is accessible from VBScript, so the suggestion is that I must define a method in VBScript in the web page, and then call that method from JavaScript. See jsdap for one example.

var IE_HACK = (/msie/i.test(navigator.userAgent) && 
               !/opera/i.test(navigator.userAgent));   

if (IE_HACK) document.write('<script type="text/vbscript">\n\
     Function BinaryToArray(Binary)\n\
         Dim i\n\
         ReDim byteArray(LenB(Binary))\n\
         For i = 1 To LenB(Binary)\n\
             byteArray(i-1) = AscB(MidB(Binary, i, 1))\n\
         Next\n\
         BinaryToArray = byteArray\n\
     End Function\n\
</script>'); 

var xml = (window.XMLHttpRequest) 
    ? new XMLHttpRequest()      // Mozilla/Safari/IE7+
    : (window.ActiveXObject) 
      ? new ActiveXObject("MSXML2.XMLHTTP")  // IE6
      : null;  // Commodore 64?


xml.open("GET", url, true);
if (xml.overrideMimeType) {
    xml.overrideMimeType('text/plain; charset=x-user-defined');
} else {
    xml.setRequestHeader('Accept-Charset', 'x-user-defined');
}

xml.onreadystatechange = function() {
    if (xml.readyState == 4) {
        if (!binary) {
            callback(xml.responseText);
        } else if (IE_HACK) {
            // call a VBScript method to copy every single byte
            callback(BinaryToArray(xml.responseBody).toArray());
        } else {
            callback(getBuffer(xml.responseText));
        }
    }
};
xml.send('');

Is this really true? Is copying every byte really the best way? For a large binary stream that's not gonna be very efficient.
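For context, the non-IE branch above calls a `getBuffer` helper that isn't shown. Here is a minimal sketch of what such a helper might look like, assuming the response was fetched with `overrideMimeType('text/plain; charset=x-user-defined')` so that the low 8 bits of each character hold one byte of the original data. The function body is a guess for illustration, not jsdap's actual code.

```javascript
// Hypothetical helper: turn an x-user-defined responseText into byte values.
function getBuffer(text) {
    var bytes = new Array(text.length);
    for (var i = 0; i < text.length; i++) {
        // x-user-defined maps bytes 0x80-0xFF to U+F780-U+F7FF,
        // so masking with 0xFF recovers the original byte.
        bytes[i] = text.charCodeAt(i) & 0xff;
    }
    return bytes;
}
```

This is still a byte-by-byte copy, just done in JavaScript rather than VBScript.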

There is also a possible technique using ADODB.Stream, which is a COM equivalent of a MemoryStream. See here for an example. It does not require VBScript but does require a separate COM object.

if (typeof (ActiveXObject) != "undefined" && typeof (httpRequest.responseBody) != "undefined") {
    // Convert httpRequest.responseBody byte stream to shift_jis encoded string
    var stream = new ActiveXObject("ADODB.Stream");
    stream.Type = 1; // adTypeBinary
    stream.Open ();
    stream.Write (httpRequest.responseBody);
    stream.Position = 0;
    stream.Type = 1; // adTypeBinary;
    stream.Read....          /// ???? what here
}

I don't think that's gonna work - ADODB.Stream is disabled on most machines these days.


In The IE8 developer tools - the IE equivalent of Firebug - I can see the responseBody is an array of bytes and I can even see the bytes themselves. The data is right there. I don't understand why I can't get to it.

Is it possible for me to read it with responseText?

Any hints (other than defining a VBScript method)?

A: 

It's true that JavaScript can't do anything at all with byte arrays. It's not a supported data type and there's no built-in conversion. ADODB.Stream works fine for me on several "out of the box" machines; I use it with XHR for downloading files in a Windows Desktop Gadget. I can't say that I've mass-tested it, though.

There's a more cross-browser solution available from CodeProject which requires the server to send the data as a base64 string. I haven't tested it, but no "native" solution is likely to give great performance - it's just not what JavaScript was intended for.
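To give an idea of the client side of the base64 route, here is a rough sketch of a decoder that turns a base64 string into an array of byte values. It is not the CodeProject code; `base64ToBytes` and `BASE64_CHARS` are names made up for this example, and it assumes the server sends standard base64 (the `+` and `/` alphabet).

```javascript
// Hypothetical helper: decode a base64 string into an array of byte values
// without relying on atob(), which older IE versions do not provide.
var BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

function base64ToBytes(b64) {
    // Drop padding, line breaks and anything else outside the alphabet.
    var clean = b64.replace(/[^A-Za-z0-9+\/]/g, "");
    var bytes = [];
    var buffer = 0; // bit accumulator
    var bits = 0;   // number of valid bits currently in the accumulator
    for (var i = 0; i < clean.length; i++) {
        buffer = (buffer << 6) | BASE64_CHARS.indexOf(clean.charAt(i));
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            bytes.push((buffer >> bits) & 0xff);
            buffer &= (1 << bits) - 1; // keep only the leftover bits
        }
    }
    return bytes;
}
```

Usage would be along the lines of `var bytes = base64ToBytes(xhr.responseText);`. Note that base64 inflates the download by about a third.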

Andy E
+2  A: 

So far, the only answer I came up with was to use the VBScript method in IE, and use the regular XHR in other browsers.

I had to inject the VBScript without jQuery. I used document.write, and placed it at the end of the <body> tag. Also I had to split the <script> tags that were being emitted, so as not to confuse the browser. The end of my .htm file looks like this:

     <!-- regular html content here -->

  </body>

  <script type="text/javascript">
    // <!--

  var IE_HACK = (/msie/i.test(navigator.userAgent) &&
                 !/opera/i.test(navigator.userAgent));   

  if (IE_HACK) {

      var vbScript = '<scr' + 'ipt type="text/vbscript">\n'+
          '<!-' + '-\n' + 
          'Function BinaryToArray(Binary)\n'+
          '  Dim i\n'+
          '  ReDim byteArray(LenB(Binary))\n'+
          '  For i = 1 To LenB(Binary)\n'+
          '    byteArray(i-1) = AscB(MidB(Binary, i, 1))\n'+
          '  Next\n'+
          '  BinaryToArray = byteArray\n'+
          'End Function\n'+
          '--' + '>\n' + 
          '</scr' + 'ipt>';

      //$(vbScript).insertAfter("script:last");
      document.write(vbScript);
  }
  // -->
  </script>

</html>

The JS class that reads binary files exposes a single interesting method, readCharAt(i), which reads the character (a byte, really) at the i'th index. This is how I set it up:

// see doc on http://msdn.microsoft.com/en-us/library/ms535874(VS.85).aspx
function getXMLHttpRequest() 
{
    if (window.XMLHttpRequest) {
        return new window.XMLHttpRequest;
    }
    else {
        try {
            return new ActiveXObject("MSXML2.XMLHTTP"); 
        }
        catch(ex) {
            return null;
        }
    }
}

// this fn is invoked if IE
function IeBinFileReaderImpl(fileURL){
    var that = this; // the handler below refers to this reader as `that`
    this.req = getXMLHttpRequest();
    this.req.open("GET", fileURL, true);
    this.req.setRequestHeader("Accept-Charset", "x-user-defined");
    this.req.onreadystatechange = function(event){
        if (that.req.readyState == 4) {
            that.status = "Status: " + that.req.status;
            //that.httpStatus = that.req.status;
            if (that.req.status == 200) {
                // this doesn't work
                //fileContents = that.req.responseBody.toArray(); 

                // this works...
                // call a VBScript method to copy every single byte
                var fileContents = BinaryToArray(that.req.responseBody).toArray();

                // ReDim byteArray(LenB(Binary)) allocates one spare slot, hence the -1
                that.fileSize = fileContents.length-1;
                if(that.fileSize < 0) throwException(_exception.FileLoadFailed);
                that.readByteAt = function(i){
                    return fileContents[i];
                }
            }
            if (typeof callback == "function"){ callback(that);}
        }
    };
    this.req.send();
}

// this fn is invoked if non IE
function NormalBinFileReaderImpl(fileURL){
    var that = this; // the handler below refers to this reader as `that`
    this.req = new XMLHttpRequest();
    this.req.open('GET', fileURL, true);
    this.req.onreadystatechange = function(aEvt) {
        if (that.req.readyState == 4) {
            if(that.req.status == 200){
                var fileContents = that.req.responseText;
                that.fileSize = fileContents.length;

                that.readByteAt = function(i){
                    return fileContents.charCodeAt(i) & 0xff;
                }
                if (typeof callback == "function"){ callback(that);}
            }
            else
                throwException(_exception.FileLoadFailed);
        }
    };
    //XHR binary charset opt by Marcus Granado 2006 [http://mgran.blogspot.com] 
    this.req.overrideMimeType('text/plain; charset=x-user-defined');
    this.req.send(null);
}
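
With either implementation, callers only ever touch `readByteAt(i)`. As an illustration of how that one method is enough to parse a binary format, here is a sketch of two little-endian integer readers layered on top of it. The names `readUInt16LE` and `readUInt32LE` are made up for this example and are not part of the reader class above; `stubReader` exists only so the snippet runs on its own.

```javascript
// Hypothetical helpers built on top of a reader that exposes readByteAt(i).
function readUInt16LE(reader, offset) {
    return reader.readByteAt(offset) | (reader.readByteAt(offset + 1) << 8);
}

function readUInt32LE(reader, offset) {
    // Multiply instead of shifting by 16/24 bits so the result stays unsigned.
    return readUInt16LE(reader, offset) + readUInt16LE(reader, offset + 2) * 65536;
}

// Stand-in for a loaded file: any object with readByteAt(i) will do.
var sampleBytes = [0x50, 0x4b, 0x03, 0x04];
var stubReader = {
    readByteAt: function (i) { return sampleBytes[i]; }
};

var first16 = readUInt16LE(stubReader, 0); // 0x4b50
var first32 = readUInt32LE(stubReader, 0); // 0x04034b50
```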
Cheeso
What did you use to read binary data in other browsers? I see your question references a `getBuffer` function. What is that? AFAIK only IE supports `responseBody`. So *what did you use* ??
Crescent Fresh
I used MSXML2.XMLHTTP in IE, and XMLHttpRequest() in non-IE. In the non-IE browsers, I was able to use responseText to get the byte stream. Apparently IE thinks it is a string, and therefore in IE, beyond the first zero, responseText[i] returns "undefined". But that is not so for FF3.5. It just works.
Cheeso
But...then it may be a different issue: IE does not allow the `[]` operator on strings. It uses `.charAt(num)` instead. So it may just be you need `responseText.charAt(i)`?
Crescent Fresh
I used responseText.charCodeAt(i) - it didn't work in IE, after the first zero. I just updated the post with more code, to illustrate.
Cheeso
+1  A: 

Thanks so much for this solution. The BinaryToArray() function in VBScript works great for me.

Incidentally, I need the binary data to pass it to an applet. (Don't ask me why applets can't be used for downloading binary data. Long story short: weird MS authentication that can't go through applet (URLConn) calls. It's especially weird in cases where users are behind a proxy.)

The Applet needs a byte array from this data, so here's what I do to get it:

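    // result is the JavaScript array, which stringifies as "12,255,0,..."
    String[] results = result.toString().split(",");
    byte[] byteResults = new byte[results.length];
    for (int i = 0; i < results.length; i++) {
        // values 128-255 wrap to negative numbers, since Java bytes are signed
        byteResults[i] = (byte) Integer.parseInt(results[i]);
    }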

The byte array can then be converted into a ByteArrayInputStream for further processing.

rk2010
A: 

I would suggest two other (fast) options:

  1. First, you can use ADODB.Recordset to convert the byte array into a string. I would guess that this object is more common than ADODB.Stream, which is often disabled for security reasons. This option is VERY fast, less than 30ms for a 500kB file.

  2. Second, if the Recordset component is not accessible, there is a trick to access the byte array data from JavaScript. Send your xhr.responseBody to VBScript, pass it through any VBScript string function such as CStr (takes no time), and return it to JS. You will get a weird string with bytes concatenated into 16-bit Unicode characters (in reverse order). You can then convert this string quickly into a usable byte string through a regular expression with dictionary-based replacement. Takes about 1s for 500kB.

For comparison, the byte-by-byte conversion through loops takes several minutes for this same 500kB file, so it's a no-brainer :) Below is the code I have been using; insert it into your page header, then call the function ieGetBytes with your xhr.responseBody.

<!--[if IE]>    
<script type="text/vbscript">

    'Best case scenario when the ADODB.Recordset object exists
    'We will do the existence test in Javascript (see after)
    'Extremely fast, about 25ms for a 500kB file
    Function ieGetBytesADO(byteArray)
        Dim recordset
        Set recordset = CreateObject("ADODB.Recordset")
        With recordset
            .Fields.Append "temp", 201, LenB(byteArray)
            .Open
            .AddNew
            .Fields("temp").AppendChunk byteArray
            .Update
        End With
        ieGetBytesADO = recordset("temp")
        recordset.Close
        Set recordset = Nothing
    End Function

    'Trick to return a Javascript-readable string from a VBScript byte array
    'Yet the string is not usable as such by Javascript, since the bytes
    'are merged into 16-bit unicode characters. Last character missing if odd length.
    Function ieRawBytes(byteArray)
        ieRawBytes = CStr(byteArray)
    End Function

    'Careful: the last character is missing in case of odd file length
    'We will call the ieLastChr function (below) from JavaScript
    'Cannot merge directly within ieRawBytes as the final byte would be duplicated
    Function ieLastChr(byteArray)
        Dim lastIndex
        lastIndex = LenB(byteArray)
        if lastIndex mod 2 Then
            ieLastChr = Chr( AscB( MidB( byteArray, lastIndex, 1 ) ) )
        Else
            ieLastChr = ""
        End If
    End Function

</script>

<script type="text/javascript">
    try {   
        // best case scenario, the ADODB.Recordset object exists
        // we can use the VBScript ieGetBytesADO function to transform a byte array into a string
        var ieRecordset = new ActiveXObject('ADODB.Recordset');
        var ieGetBytes = function( byteArray ) {
            return ieGetBytesADO(byteArray);
        }
        ieRecordset = null;

    } catch(err) {
        // no ADODB.Recordset object, we will do the conversion quickly through a regular expression

        // initializes for once and for all the translation dictionary to speed up our regexp replacement function
        var ieByteMapping = {};
        for ( var i = 0; i < 256; i++ ) {
            for ( var j = 0; j < 256; j++ ) {
                ieByteMapping[ String.fromCharCode( i + j * 256 ) ] = String.fromCharCode(i) + String.fromCharCode(j);
            }
        }

        // since ADODB is not there, we replace the previous VBScript ieGetBytesADO function with a regExp-based function,
        // quite fast, about 1.3 seconds for 500kB (versus several minutes for byte-by-byte loops over the byte array)
        var ieGetBytes = function( byteArray ) {
            var rawBytes = ieRawBytes(byteArray),
                lastChr = ieLastChr(byteArray);

            return rawBytes.replace(/[\s\S]/g, function( match ) {
                return ieByteMapping[match]; }) + lastChr;
        }
    }
</script>
<![endif]-->
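
To see what the regular-expression fallback does without needing IE, here is a small self-contained sketch. `packLikeCStr` is a made-up helper that only imitates the string VBScript hands back, as described above (two bytes per 16-bit character, low byte first, odd trailing byte dropped). `unpackBytes` then expands it again. Unlike the code above, it computes each pair on the fly instead of looking it up in a prebuilt dictionary, which is easier to read but presumably slower in old IE.

```javascript
// Hypothetical stand-in for what CStr(byteArray) returns to JavaScript:
// each 16-bit character packs two bytes, low byte first.
function packLikeCStr(bytes) {
    var out = "";
    for (var i = 0; i + 1 < bytes.length; i += 2) {
        out += String.fromCharCode(bytes[i] + bytes[i + 1] * 256);
    }
    return out; // an odd trailing byte is dropped, as in the VBScript case
}

// Expand every 16-bit character back into two one-byte characters.
function unpackBytes(packed) {
    return packed.replace(/[\s\S]/g, function (ch) {
        var code = ch.charCodeAt(0);
        return String.fromCharCode(code & 0xff) + String.fromCharCode(code >> 8);
    });
}

var packed = packLikeCStr([0x48, 0x69, 0x00, 0xff]);
var unpacked = unpackBytes(packed); // four characters with codes 0x48, 0x69, 0x00, 0xff
```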
Louis LC
A: 

You could also just make a proxy script that fetches the address you're requesting and base64-encodes the result. Then you just have to pass a query string to the proxy script that tells it the address. In IE you have to decode the base64 manually in JS, though. But this is a way to go if you don't want to use VBScript.

I used this for my GameBoy Color emulator @ http://grantgalitz.org/gameboy/

My proxy script is at the subfolder res/proxy.php
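
On the calling side, the request URL might be built something like this. `buildProxyUrl` is a name made up for this example; the `url` parameter matches what the PHP script below reads from `$_GET['url']`.

```javascript
// Hypothetical helper: build the request URL for the base64 proxy.
function buildProxyUrl(proxyPath, targetUrl) {
    // encodeURIComponent keeps "?", "&" and "=" in the target address from
    // being read as part of the proxy's own query string.
    return proxyPath + "?url=" + encodeURIComponent(targetUrl);
}

var proxied = buildProxyUrl("res/proxy.php", "http://example.com/rom.gb?v=1&x=2");
```

The resulting string can be passed straight to `xhr.open("GET", proxied, true)`.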

Here is the PHP script that does the magic:

<?php
//Binary Proxy
if (isset($_GET['url'])) {
    try {
        $curl = curl_init();
        curl_setopt($curl, CURLOPT_URL, stripslashes($_GET['url']));
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($curl, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
        curl_setopt($curl, CURLOPT_POST, false);
        curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 30);
        $result = curl_exec($curl);
        curl_close($curl);
        if ($result !== false) {
            header('Content-Type: text/plain; charset=ASCII');
            header('Expires: '.gmdate('D, d M Y H:i:s \G\M\T', time() + (3600 * 24 * 7)));
            echo(base64_encode($result));
        }
        else {
            header('HTTP/1.0 404 File Not Found');
        }
    }
    catch (Exception $error) { }
}
?>
Grant Galitz