views:

610

answers:

6

Hello, I barely know a thing about compression, so bear with me (this is probably a stupid and painfully obvious question).

So let's say I have an XML file with a few tags.

<verylongtagnumberone>
  <verylongtagnumbertwo>
    text
  </verylongtagnumbertwo>
</verylongtagnumberone>

Now let's say I have a bunch of these very long tags with many attributes across my multiple XML files. I need to compress them to the smallest size possible. The best way would be an XML-specific algorithm that assigns individual tags pseudonyms like vlt1 or vlt2. However, that wouldn't be as 'open' a way as I'm trying to go for, and I want to use a common algorithm like DEFLATE or LZ. It would also help if the archive were a .zip file.

Since I'm dealing with plain text (no binary files like images), I'd like an algorithm that suits plain text. Which one produces the smallest file size (lossless algorithms are preferred)?

By the way, the scenario is this: I am creating a standard for documents, like ODF or MS Office XML, that contain XML files, packaged in a .zip.

EDIT: The 'encryption' thing was a typo; it should have been 'compression'.

+2  A: 

It seems like you're more interested in compression than encryption. Is that the case? If so, this might prove an interesting read, even though it is not an exact solution.

mizipzor
A: 

Hey, I hope I understood correctly what you need to do.

First, I would like to say that there are no good or bad compression algorithms for text: zip, bzip, gzip, rar, and 7zip are all good enough to compress anything that has low entropy, i.e. a large file with a small character set. If I had to use them, I would choose 7zip as my first choice, rar as second, and zip as third. But the difference is very small, so you should try whichever is easiest for you.

Second, I could not understand what you are trying to encrypt. Suppose it is an XML file: then you should first compress it using your favourite compression algorithm and then encrypt it using your favourite encryption algorithm. In most cases, any modern algorithm, such as those implemented in PGP, will be secure enough for anything.

Hope that helps,
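A quick way to sanity-check the claim that these general-purpose compressors are all "good enough" on low-entropy text is to run a few of them over a repetitive XML sample. This is only a rough sketch using Python's standard-library bindings (zlib for DEFLATE as used by zip, bz2 for bzip2, lzma for the algorithm behind 7zip); exact sizes depend entirely on the input.

```python
# Rough comparison of general-purpose compressors on repetitive XML text.
# The sample input is made up for illustration; sizes vary with real data.
import bz2
import lzma
import zlib

xml = ("<verylongtagnumberone><verylongtagnumbertwo>text"
       "</verylongtagnumbertwo></verylongtagnumberone>\n") * 1000
data = xml.encode("utf-8")

results = {
    "zlib (DEFLATE, as in .zip)": len(zlib.compress(data, 9)),
    "bzip2": len(bz2.compress(data, 9)),
    "lzma (7zip's algorithm)": len(lzma.compress(data)),
}
for name, size in results.items():
    print(f"{name}: {len(data)} -> {size} bytes")
```

On text this repetitive, all three shrink the input dramatically, which is the answer's point: the differences between them are small compared with the gain from using any of them.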

       Jack David
A signature in an answer! That's new ;)
ivan_ivanovich_ivanoff
A: 

Your alternatives are:

  • Use a webserver that supports gzip compression. It will automatically compress all outgoing HTML, at a small CPU cost.
  • Use something like JSON. It will drastically reduce the size of the message.
  • There's also binary XML, but I have not tried it myself.
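The first bullet can be sketched in a few lines: a gzip-enabled webserver compresses the response body before sending it, and signals this with a `Content-Encoding: gzip` header. This is only an illustration of that step, not a full server.

```python
# Sketch of what a gzip-enabled webserver does to an outgoing HTML body.
# The HTML payload here is made up for illustration.
import gzip

html = b"<html><body>" + b"<p>hello</p>" * 500 + b"</body></html>"
compressed = gzip.compress(html)

# The server would send "Content-Encoding: gzip" plus the compressed bytes;
# the browser transparently decompresses on receipt.
print(len(html), "->", len(compressed), "bytes")
```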
Zepplock
A: 

By the way, the scenario is this: I am creating a standard for documents, like ODF or MS Office XML, that contain XML files, packaged in a .zip.

Then I'd suggest you use .zip compression, or your users will get confused.
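Packaging XML parts into a DEFLATE-compressed .zip, as ODF and Office Open XML do, is straightforward with standard tooling. A minimal sketch in Python (the part file names below are made up; each real format specifies its own layout):

```python
# Hypothetical sketch of packaging XML parts into a .zip with DEFLATE,
# the same container approach ODF and MS Office XML use.
# The file names and XML content are invented for illustration.
import io
import zipfile

parts = {
    "content.xml": "<verylongtagnumberone>text</verylongtagnumberone>",
    "meta.xml": "<meta><author>example</author></meta>",
}

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    for name, xml in parts.items():
        zf.writestr(name, xml)

# Any standard zip tool can now open the archive and read the parts back.
with zipfile.ZipFile(buf) as zf:
    assert zf.read("content.xml").decode() == parts["content.xml"]
```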

Pete Kirkham
+7  A: 

There is a W3C (not-yet-released) standard named EXI (Efficient XML Interchange).

It should become THE data format for compressing XML data in the future (it is claimed to be the last necessary binary format). Being optimized for XML, it compresses XML far more efficiently than any conventional compression algorithm.

With EXI, you can operate on compressed XML data on the fly (without the need to uncompress or re-compress it).

EXI = (XML + XMLSchema) as binary.

And here is the open-source implementation (I don't know if it's already stable):
Exificient

ivan_ivanovich_ivanoff
Ugh... XML was designed because "binary files are evil", and now we have this EXI stuff. This proves XML was just reinventing the wheel. Shouldn't we have used ASN.1?
J-16 SDiZ
A sub-standard (or something like that) of ASN.1 was a candidate for EXI. Binary files **are** evil, but EXI is not a binary file in the common sense: you don't need to write your own implementation to read/write it, nor do you have to define your own structure and type system. All of that is done for you by XML + XML Schema.
ivan_ivanovich_ivanoff
+1  A: 

Another alternative to "compress" XML would be FI (Fast Infoset).

XML stored as FI contains every tag and attribute name only once; all other occurrences reference the first one, thus saving space.
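The idea of storing each name once and referencing it afterwards can be shown with a toy vocabulary table. To be clear, this is NOT the actual Fast Infoset encoding (which is a binary format defined by ITU-T/ISO); it only illustrates the reference-the-first-occurrence principle described above.

```python
# Toy illustration of a vocabulary table: each tag name is stored
# literally once, and later occurrences become small integer indices.
# This is a simplification, not the real Fast Infoset wire format.
tags = ["verylongtagnumberone", "verylongtagnumbertwo",
        "verylongtagnumberone", "verylongtagnumbertwo"]

vocab, encoded = [], []
for t in tags:
    if t in vocab:
        encoded.append(("index", vocab.index(t)))   # repeat: cheap reference
    else:
        vocab.append(t)
        encoded.append(("literal", t))              # first occurrence: full string

print(encoded)
```

Only the first occurrence of each name carries the full string; every repeat costs just an index.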

See:

Very good article on java.sun.com, and of course
the Wikipedia entry

From the compression point of view, the difference to EXI is that Fast Infoset (being structured plaintext) is less efficient.

Another important difference is that FI is a mature standard with many implementations.
One of them: Fast Infoset Project @ dev.java.net

ivan_ivanovich_ivanoff