I'm writing a Web application that needs to store JSON data in a small, fixed-size server-side cache via AJAX (think: Opensocial quotas). I do not have control over the server.

I need to reduce the size of the stored data to stay within a server-side quota, and was hoping to be able to gzip the stringified JSON in the browser before sending it up to the server.

However, I cannot find much in the way of JavaScript implementations of Gzip. Any suggestions for how I can compress the data on the client side before sending it up?

A: 

Most browsers can decompress gzip on the fly. That might be a better option than a javascript implementation.

Yes, but I need to compress the data on the client side before sending it up...
David Citron
A: 

I guess a generic client-side JavaScript compression implementation would be a very expensive operation in terms of processing time as opposed to transfer time of a few more HTTP packets with uncompressed payload.

Have you done any testing that would give you an idea how much time there is to save? I mean, bandwidth savings can't be what you're after, or can it?

Tomalak
I need to keep the total data size within a certain quota--size is more important than time.
David Citron
Hm... Why is there a limit? Just curious.
Tomalak
Well, here's Google's take on it: http://code.google.com/apis/opensocial/articles/persistence-0.8.html#restrictions-quotas -- Typical Opensocial quotas are around 10K.
David Citron
I see, thanks for the clarification.
Tomalak
Depending on how intensive the compression, you could use web workers to perform the task behind the scenes.
zachleat
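Since the constraint discussed above is size rather than time, it may help to measure the stringified JSON before deciding whether compression is needed at all. The following is only a rough sketch: `QUOTA_BYTES`, `utf8ByteLength` and `fitsQuota` are hypothetical names, the 10K figure is taken from the "around 10K" remark above, and it assumes the server counts UTF-8 bytes, which may not be how the real quota is enforced.

```javascript
// Hypothetical quota based on the "around 10K" figure; adjust to the real limit.
const QUOTA_BYTES = 10 * 1024;

// Size of a string in bytes when encoded as UTF-8
// (assumption: this is how the server measures stored data).
function utf8ByteLength(text) {
    return new TextEncoder().encode(text).length;
}

// Stringify a value and report whether it would fit within the quota.
function fitsQuota(value, quotaBytes = QUOTA_BYTES) {
    const json = JSON.stringify(value);
    const bytes = utf8ByteLength(json);
    return { bytes: bytes, fits: bytes <= quotaBytes };
}
```

Calling `fitsQuota(myData)` before and after any compression step gives a quick idea of how much there is to gain.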
+2  A: 

You can use a 1 pixel per 1 pixel Java applet embedded in the page and use that for compression.

It's not JavaScript and the clients will need a Java runtime but it will do what you need.

Bogdan
Interesting, but I'd rather avoid including an applet if possible.
David Citron
+1, -1, woot! I wouldn't have used a java craplet, but it's a useful answer regardless, so +1 :)
August Lilleaas
+20  A: 

I don't know of any gzip implementations, but the jsolait library has functions for LZW compression/decompression. The code is covered under the LGPL.

// LZW-compress a string
function lzw_encode(s) {
    var dict = {};
    var data = (s + "").split("");
    var out = [];
    var currChar;
    var phrase = data[0];
    var code = 256;
    for (var i=1; i<data.length; i++) {
        currChar=data[i];
        if (dict[phrase + currChar] != null) {
            phrase += currChar;
        }
        else {
            out.push(phrase.length > 1 ? dict[phrase] : phrase.charCodeAt(0));
            dict[phrase + currChar] = code;
            code++;
            phrase=currChar;
        }
    }
    out.push(phrase.length > 1 ? dict[phrase] : phrase.charCodeAt(0));
    for (var i=0; i<out.length; i++) {
        out[i] = String.fromCharCode(out[i]);
    }
    return out.join("");
}

// Decompress an LZW-encoded string
function lzw_decode(s) {
    var dict = {};
    var data = (s + "").split("");
    var currChar = data[0];
    var oldPhrase = currChar;
    var out = [currChar];
    var code = 256;
    var phrase;
    for (var i=1; i<data.length; i++) {
        var currCode = data[i].charCodeAt(0);
        if (currCode < 256) {
            phrase = data[i];
        }
        else {
            phrase = dict[currCode] ? dict[currCode] : (oldPhrase + currChar);
        }
        out.push(phrase);
        currChar = phrase.charAt(0);
        dict[code] = oldPhrase + currChar;
        code++;
        oldPhrase = phrase;
    }
    return out.join("");
}
Matthew Crumley
How can the code be LGPL if the algorithm is patented? Or are all patents truly expired?
David Citron
According to Wikipedia, the patents expired a few years ago. It might be a good idea to check that out though.
Matthew Crumley
LZW is way too old to still be patented. Last patents ran out in 2003 or so. There are loads of free implementations.
ypnos
I see at least two problems with the code above: 1) try to compress "Test to compress this \u0110\u0111\u0112\u0113\u0114 non ascii characters.", 2) No error is reported if code > 65535.
some
And I forgot the third one: The output from encode is in UTF-16. Does your application handle that?
some
Here is some info on how to compress Unicode: http://unicode.org/faq/compression.html. It looks as if this is not so trivial.
Tomalak
There's a different LZW implementation at http://zapper.hodgers.com/files/javascript/lzw_test/lzw.js. I've no idea whether this addresses any of the above concerns. Also see the related blog post: http://zapper.hodgers.com/labs/?p=90
msanders
FWIW, I don't think the zapper.hodgers.com implementation addresses the problems described above. It worked fine with 'plain old ASCII', but when I tried it on a string generated by the HTML canvas toDataURL() method, for example, the compressed-then-decompressed string didn't match the original. Has anyone implemented JavaScript compression *and* decompression in a way that addresses the issues above *and* can cope performance-wise with strings up to about 500K in length? I realise this is a tall order!
Sam Dutton
@Sam - try utf8_encode(lzw_encode(my_string)). Here's a UTF8 encoder in Javascript: http://farhadi.ir/works/utf8.
Roy Tinker
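One possible way around the issues raised in the comments above (non-ASCII input colliding with dictionary codes, codes growing past 65535) is to run LZW over the UTF-8 bytes of the input instead of over its characters. The sketch below is not the jsolait code; it is a separate, minimal variant under stated assumptions. The names `lzwEncodeUtf8`, `lzwDecodeUtf8` and `maxCode` are hypothetical, the output is an array of numeric codes rather than a string, and how those codes are serialized for transport is left open.

```javascript
// Sketch only: byte-oriented LZW. Input is converted to UTF-8 first, so every
// literal is in 0..255 and can never collide with a dictionary code (>= 256).
function lzwEncodeUtf8(text, maxCode = 0xFFFF) {
    const bytes = new TextEncoder().encode(String(text));
    const dict = new Map();   // phrase (string of byte-valued chars) -> code
    const codes = [];
    let nextCode = 256;
    let phrase = "";
    for (const byte of bytes) {
        const ch = String.fromCharCode(byte);
        const candidate = phrase + ch;
        if (phrase === "" || dict.has(candidate)) {
            phrase = candidate;
            continue;
        }
        codes.push(phrase.length === 1 ? phrase.charCodeAt(0) : dict.get(phrase));
        // Stop growing the dictionary once maxCode is reached.
        if (nextCode <= maxCode) dict.set(candidate, nextCode++);
        phrase = ch;
    }
    if (phrase !== "") {
        codes.push(phrase.length === 1 ? phrase.charCodeAt(0) : dict.get(phrase));
    }
    return codes;
}

// Inverse of lzwEncodeUtf8; must be called with the same maxCode.
function lzwDecodeUtf8(codes, maxCode = 0xFFFF) {
    if (codes.length === 0) return "";
    const dict = new Map();   // code -> phrase
    let nextCode = 256;
    const lookup = (code, prev) => {
        if (code < 256) return String.fromCharCode(code);
        if (dict.has(code)) return dict.get(code);
        // The code the encoder created in its most recent step.
        if (prev !== null && code === nextCode && nextCode <= maxCode) {
            return prev + prev[0];
        }
        throw new RangeError("Invalid LZW code: " + code);
    };
    let prev = lookup(codes[0], null);
    const out = [prev];
    for (let i = 1; i < codes.length; i++) {
        const entry = lookup(codes[i], prev);
        out.push(entry);
        if (nextCode <= maxCode) dict.set(nextCode++, prev + entry[0]);
        prev = entry;
    }
    // Turn the byte-valued characters back into UTF-8 bytes, then into text.
    const bytes = Uint8Array.from(out.join(""), (c) => c.charCodeAt(0));
    return new TextDecoder().decode(bytes);
}
```

Capping the dictionary at `maxCode` keeps every code within 16 bits at the cost of weaker compression on long inputs. Whether that trade-off is acceptable depends on the data.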
A: 

I can't help you with GZipping on the "fly". But if you want to GZip your WebResource JS then Kariem's blog about GZipping in nAnt might help you...

Thomas Hansen
+3  A: 

Here are some other compression algorithms implemented in Javascript:

Mauricio Scheffer
this LZMA implementation requires BrowserPlus (a browser extension) and does not look to be pure Javascript
Piotr Findeisen
Thanks, you're right
Mauricio Scheffer
This LZ77 implementation is no longer available, and at least its Python version (published on the same page) was incorrect for quite simple inputs.
Piotr Findeisen
geocities dead, will update the link
Mauricio Scheffer
This is pretty close to what I want. Googling things too; will update here.
Theofanis Pantelides
A: 

Hello Guys,

The LZW functions are very nice and fast. I want to use them in an automatic JS compression script in PHP (with a cache), but there is one problem: chr() seems to behave differently from fromCharCode() in JavaScript. They do not give the same results when the char code is above 255. What can I do about this? Here is my translation of lzw_encode():

function lzw_encode($s) 
{
    $dict = array();
    $data = "".str_replace( "\r", "", $s );
    //print_r( $data ); die;
    $out = array();
    $currChar = 0;
    $phrase = $data[0];
    $code = 256;
    $c = strlen($data);
    for ($i=1; $i<$c; $i++){
        $currChar=$data[$i];
        if(isset($dict[$phrase.$currChar])) {
            $phrase.=$currChar;
        }
        else {
            $out[]=(strlen($phrase) > 1 ? $dict[$phrase] : ord($phrase[0]));
            //print( "'".((strlen($phrase) > 1) ? $dict[$phrase] : ord($phrase[0]))."'\n"); 
            $dict[$phrase.$currChar] = $code;
            $code++;
            $phrase=$currChar;
        }
    }

    $out[]=(strlen($phrase) > 1)? $dict[$phrase] : ord($phrase[0]);
    $c = count($out);

    for($i=0; $i<$c; $i++) {
         $out[$i] = chr($out[$i]); // <-- here it is
    } 

    return implode("",$out);
}

Thank you very much for your reply. Greetz, Erwinus

Erwinus
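The mismatch described above comes from the two functions producing different things: JavaScript's String.fromCharCode() returns a 16-bit code unit, while PHP's chr() returns a single byte, so values above 255 wrap around. One way to make both sides agree is to write each LZW code as exactly two bytes. The sketch below shows the JavaScript half; `codesToBytes` and `bytesToCodes` are hypothetical helper names, and it assumes the codes never exceed 65535. On the PHP side, `pack('n', $code)` should produce the same big-endian byte pair.

```javascript
// Sketch only: serialize LZW codes as big-endian 16-bit values,
// so a byte-oriented reader or writer sees the same data.
function codesToBytes(codes) {
    const bytes = new Uint8Array(codes.length * 2);
    codes.forEach((code, i) => {
        if (!Number.isInteger(code) || code < 0 || code > 0xFFFF) {
            throw new RangeError("Code does not fit in 16 bits: " + code);
        }
        bytes[2 * i] = code >> 8;         // high byte
        bytes[2 * i + 1] = code & 0xFF;   // low byte
    });
    return bytes;
}

// Inverse: read byte pairs back into numeric codes.
function bytesToCodes(bytes) {
    if (bytes.length % 2 !== 0) {
        throw new RangeError("Expected an even number of bytes");
    }
    const codes = [];
    for (let i = 0; i < bytes.length; i += 2) {
        codes.push((bytes[i] << 8) | bytes[i + 1]);
    }
    return codes;
}
```

Two bytes per code doubles the cost of every literal below 256, so this fixed-width format gives up some of the compression in exchange for an unambiguous byte layout.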
A: 

I did not test, but there's a javascript implementation of ZIP:

http://jszip.stuartk.co.uk/

Sirber
A: 

A JavaScript GZIP implementation: http://code.google.com/p/gzipjs/

sibnick
Does it actually do anything? The JavaScript files seem to be almost empty.
Kinopiko
There does not seem to be a "release version" published.
Nordes
+1  A: 

I ported an implementation of LZMA from a GWT module into standalone JavaScript. It's called LZMA-JS.

nmrugg
do you have a compatible php module for it?
Sirber