
http://en.wikipedia.org/wiki/Data%5Fcompression#Lossless%5Fdata%5Fcompression

For any given compression scheme, one can provide sample input that would result in no savings in space, right?

+1  A: 

Correct. Try zipping a zip file ... if the data is already compressed, you won't be able to get further compression.
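A quick sketch of that point, using Python's zlib here rather than an actual zip tool (just an assumption for brevity): data that is already compressed looks like random noise to the codec, so a second pass doesn't shrink it and typically adds a few bytes of overhead.

    import os
    import zlib

    # Incompressible random input stands in for an already-zipped file here.
    original = os.urandom(1 << 20)          # 1 MiB of random bytes
    once = zlib.compress(original, 9)       # "zip" it once
    twice = zlib.compress(once, 9)          # try to "zip" the result again
    print(len(original), len(once), len(twice))
    # Typically the second pass is no smaller than the first, just slightly larger.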

Jess
+4  A: 

For any given compression scheme, one can provide sample input that would result in no savings in space, right?

Yes: A single bit.
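To make the degenerate case concrete (using Python's zlib as a stand-in codec, which works on bytes rather than single bits): even a one-byte input comes out larger, because the compressed stream carries header and checksum overhead of its own.

    import zlib

    data = b"\x00"                            # the smallest non-empty input zlib accepts
    compressed = zlib.compress(data)
    print(len(data), "->", len(compressed))   # e.g. 1 -> 9 bytes with default settings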

Ben S
+4  A: 

Yes, there's always something that will grow larger. The pigeonhole principle says that if you have a space of inputs, and a 1-to-1 function (the lossless compression), then the number of outputs has to be the same as the number of inputs.

If the inputs are files of N bits, then the number of inputs is 2**N, and the number of outputs is 2**N. You can't store that many different outputs in files all shorter than N bits.
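A tiny counting check of that argument (N = 16 is an arbitrary choice here): there are 2**N distinct N-bit files but only 2**N - 1 files shorter than N bits, counting the empty file, so at least two inputs would have to share the same shorter output.

    N = 16                                           # arbitrary example length
    n_bit_inputs = 2 ** N                            # all distinct N-bit files
    shorter_outputs = sum(2 ** k for k in range(N))  # files of length 0 .. N-1
    print(n_bit_inputs, shorter_outputs)             # 65536 vs 65535: one short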

Ned Batchelder
doesn't that imply all lossless compression is useless in general?
Aaron F.
Theoretically it's not necessary for anything to get larger. But we can guarantee that there are inputs whose compressed form is at least as large as the original.
recursive
Of course, if no input gets larger, then no input can get smaller either.
Anon.
number of inputs would be 2**(8*N) inputs and outputs. (8 bits per byte.)
retracile
Not at all: they work really well on typical inputs. Just because a few files are larger doesn't mean that the average isn't much smaller.
Ned Batchelder
oops: yes, 2**(8*N), I fixed it.
Ned Batchelder
@aaron -- the idea is that you cleverly choose which files get bigger and which get smaller. You choose unusual data to be the sacrificial victims that get bigger, and more common data chunks as the ones which are better compressed.
Alex Feinman
Note that out of the 2^(8*2^30) possible 1 GB bitstreams, only a tiny fraction represent useful data, for example movies; almost all of them are just random noise to us. Lossless compression makes the useful bitstreams smaller at the cost of making the other, less useful bitstreams larger.
liori
+2  A: 

Absolutely.

If it weren't, you could conceivably feed the compressor's output back into the compressor again and again, ad infinitum, getting better compression each time until you were down to a single bit. That's obviously impossible.
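A quick experiment along those lines, assuming zlib as the compressor (any codec would show the same shape): the first pass shrinks repetitive text dramatically, and every pass after that produces output that is no smaller, so the "compress forever down to one bit" idea dies immediately.

    import zlib

    data = b"the quick brown fox jumps over the lazy dog\n" * 10000
    sizes = [len(data)]
    for _ in range(5):
        data = zlib.compress(data, 9)   # feed the output back into the compressor
        sizes.append(len(data))
    print(sizes)
    # Typically: a big drop on the first pass, then the sizes creep back upward.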

whatsisname
Though it would be nice...
Ben S
A: 

"if I give you a stream of integers, can you compress them always?"

In the "zipping a zipfile" example, why are you thinking of bytes in the zipfile as something other than a stream of integers?

That was a pretty concise example of an instance when you could "defeat" lossless data compression.

drakaan