Hi folks,

I'm trying to install this library for LZJB compression: PyLZJB (link)

The library is a binding for a C library; the file is PyLZJB.so.


Unfortunately, after copying it to the site-packages directory, I get a "wrong ELF class" error on import:

>>> import PyLZJB
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: ./PyLZJB.so: wrong ELF class: ELFCLASS32

Help would be great. :)

PS: I'm running Ubuntu 10.04, 64-bit


Edit:

If someone could suggest an alternative compression algorithm, I would be equally happy. :)

The algorithm is for HTML compression, and it needs client-side JavaScript compression/decompression support too.

I really hope someone can help on this one. Thanks guys!

A: 

You can either run a 32-bit Python, compile your own PyLZJB rather than using the prebuilt binary, or get a 64-bit PyLZJB binary from somewhere.

Borealid
+5  A: 

You are running a 64-bit Python interpreter and trying to load a 32-bit extension; that is not allowed.

You need both your Python interpreter and your extension compiled for the same architecture. While you could get a 32-bit Python interpreter, it would probably be better to get a 64-bit extension.
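A quick way to confirm the mismatch, as a rough illustrative snippet (the PyLZJB.so filename is from the question; byte 4 of an ELF header is the class field, 1 for 32-bit and 2 for 64-bit):

# Print the running interpreter's pointer width, then the ELF class of the extension.
import struct

print(struct.calcsize("P") * 8)  # 64 on a 64-bit interpreter, 32 on a 32-bit one

with open("PyLZJB.so", "rb") as f:
    elf_class = struct.unpack("B", f.read(5)[4:5])[0]
print({1: "ELFCLASS32 (32-bit)", 2: "ELFCLASS64 (64-bit)"}.get(elf_class, "unknown"))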

What you should do is get the source for LZJB and build it yourself to get a 64-bit shared object.

R Samuel Klatchko
@Samuel: thanks for the reply! I'm a bit confused about how to build it though; could you give me a few pointers please? :)
RadiantHex
@RadiantHex, if the package uses `distutils`, it's usually as simple as running setup.py with your 64-bit Python runtime. Sometimes you need to resort to specifying `CFLAGS=-m64` and `CXXFLAGS=-m64` in your environment, depending on how the native code is built. These flags apply to x86-64 (a.k.a. AMD64), which both AMD and Intel 64-bit chips implement.
Nathan Ernst
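For what it's worth, a minimal distutils setup.py along those lines might look like the sketch below; the module name and the lzjb.c source file are hypothetical stand-ins, since the actual PyLZJB sources aren't shown here:

# Hypothetical setup.py sketch; "PyLZJB" and "lzjb.c" stand in for the real
# module name and source files, which may differ in the actual package.
from distutils.core import setup, Extension

setup(
    name="PyLZJB",
    version="0.1",
    ext_modules=[Extension("PyLZJB", sources=["lzjb.c"])],
)

Running `python setup.py build` under a 64-bit interpreter compiles the extension for that interpreter's architecture.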
+4  A: 

If someone could suggest an alternative compression algorithm, I would be equally happy.

There is always good old deflate, a much more common member of the LZ compression family. There is a JavaScript implementation (link), and Python's zlib module can handle raw deflate content (link).
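On the Python side, a minimal sketch of producing and consuming raw deflate (no zlib header or checksum trailer) with the standard zlib module:

import zlib

data = b"<html><body>hello hello hello</body></html>"

# wbits=-15 selects raw deflate: no zlib header and no checksum trailer.
compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
raw = compressor.compress(data) + compressor.flush()

# A negative wbits on decompression likewise expects raw deflate input.
restored = zlib.decompress(raw, -15)
assert restored == data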

This is a lot of overhead in relatively slow client-side code just to compress submission data, and it's not trivial to submit the raw bytes you will obtain from it.

Do they gzip GET parameters within a request?

GET form submissions in the query string must by nature be fairly short, or you will overrun browser or server URL length limits. There is no point compressing anything so small. If you have a lot of data, it needs to go in a POST form.

Even in a POST form, the default enctype is application/x-www-form-urlencoded, which means the majority of bytes are going to get encoded as %nn sequences. This will balloon your form submission, probably beyond the original uncompressed size. To submit raw bytes you would have to use an enctype="multipart/form-data" form.
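A rough illustration of that ballooning, using random bytes to stand in for compressed (high-entropy) output:

# Percent-encoding roughly 2.5x-expands high-entropy data.
import os
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote        # Python 2

blob = os.urandom(1000)
print(len(quote(blob)))  # typically around 2400-2500 characters for 1000 bytes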

Even then, you're going to have encoding problems. JS strings are Unicode, not bytes, and will get encoded using the encoding of the page containing the form. That should normally be UTF-8, but then you can't generate an arbitrary sequence of bytes for upload by encoding to it, since many byte sequences are not valid UTF-8. You could smuggle bytes-in-Unicode by mapping each byte to the code point with the same value, but that would bloat your compressed bytes by 50% on average, since half the code units (those from 0x80 up) encode to two UTF-8 bytes.
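The bloat is easy to demonstrate:

# Code points below 0x80 encode to one UTF-8 byte; 0x80 through 0xFF to two.
print(len(u"\x41".encode("utf-8")))  # 1
print(len(u"\x9c".encode("utf-8")))  # 2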

In theory, if you didn't mind losing proper internationalisation support, you could serve the page as ISO-8859-1 and use the escape/encodeURIComponent idiom to convert between UTF-8 and ISO-8859-1 for output. But that won't work, because browsers lie and actually use Windows code page 1252 for encoding/decoding content marked as ISO-8859-1. You could use another encoding that maps every byte to a character, but that would mean more manual encoding overhead and would further limit the characters you could use in the page.

You could avoid the encoding problems by using something like base64, but then, again, you've got more manual encoding overhead and 33% bloat.
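The 33% figure comes from base64 mapping every 3 input bytes to 4 output characters:

import base64

# 99 bytes -> 132 characters: a fixed 4/3 expansion.
print(len(base64.b64encode(b"x" * 99)))  # 132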

In summary, all the approaches are bad; I don't think you're going to get much benefit out of this.

bobince