I'm making a project on double data compression, which uses a lossless data compression technique. Can anyone tell me how I should proceed? This is the first time I'm making a project. Please help me, friends.
In Ruby you could use the "rubyzip" gem, which does provide lossless compression.
In Python, here's a way to do "double data compression" by compressing a compressed string:
from zlib import compress
# zlib.compress works on bytes, so use a bytes literal (required on Python 3)
data_to_compress = b'double double, toil and trouble'
doubly_compressed_data = compress(compress(data_to_compress))
# but why stop there? why not have triple data compression too?
triply_compressed_data = compress(compress(compress(data_to_compress)))
Of course, this can be extended to "n" data compression:
from zlib import compress
def n_compress(data, n):
    # apply zlib compression n times in a row (data must be bytes)
    for _ in range(n):
        data = compress(data)
    return data
[tongue firmly in cheek]
avinash, why do you want to do double data compression? This will not result in better compression; in fact, it will quite possibly be worse than single compression.
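A quick way to check this is to measure the size after each pass. The sketch below uses Python's zlib; the helper name compression_sizes and the sample string are only illustrative, and the exact byte counts will depend on the zlib version and the input.

```python
import zlib

def compression_sizes(data, rounds):
    """Return [original size, size after 1 pass, size after 2 passes, ...]."""
    sizes = [len(data)]
    for _ in range(rounds):
        data = zlib.compress(data)
        sizes.append(len(data))
    return sizes

sample = b'double double, toil and trouble'
sizes = compression_sizes(sample, 3)
print(sizes)
```

On a short input like this, each extra pass should make the output larger: the compressed bytes have very little redundancy left, and every pass adds its own header and checksum.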
Imagine if you could keep compressing as many times as you liked, and get a smaller file / string each time. Eventually, you would end up with a file / string with a length of 1 byte (or possibly 1 bit). But wait - doesn't that mean that all files end up exactly the same? If so, there would be no way to tell them apart when decompressing, so the compression could not be lossless.
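The same point can be made by counting. This is only a small sketch, and the function names are made up for illustration: there are always fewer strictly shorter bit strings than strings of a given length, so no lossless scheme can shrink every input.

```python
def count_bit_strings(length):
    """Number of distinct bit strings with exactly `length` bits."""
    return 2 ** length

def count_shorter_bit_strings(length):
    """Number of distinct bit strings with fewer than `length` bits
    (including the empty string)."""
    return sum(2 ** k for k in range(length))

n = 8
print(count_bit_strings(n), count_shorter_bit_strings(n))
```

For any length there is one fewer shorter string than there are inputs, so at least two inputs would have to share the same compressed output.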
Instead, there is a limit to how much you can losslessly compress data. This limit is known as the entropy of the data. Note that the entropy of a file / string is an ideal: in practice, compression algorithms usually cannot reach this level of compression. One reason is that finding the best possible encoding would take too long, so real compressors rely on fast greedy heuristics. Another is that codes such as Huffman coding have to spend a whole number of bits on each symbol.
You can calculate the entropy of a string if you know the frequency of each character. The formula for this is