views:

513

answers:

13

I have written a Java program for compression. I compressed some text files, and the file size was reduced after compression. But when I tried to compress a PDF file, I did not see any change in file size after compression.

So I want to know which other kinds of files will not shrink when compressed.

Thanks Sunil Kumar Sahoo

+4  A: 

Compressed files will not reduce their size after compression.

stefanw
That may not hold true, depending on the algorithms used.
Michael Foukarakis
+1  A: 

Generally you cannot compress data that has already been compressed. You might even end up with a compressed size that is larger than the input.
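
A minimal sketch of this effect, assuming the asker's program is roughly comparable to `java.util.zip.Deflater` (the class name `RecompressDemo` and the sample vocabulary are made up for illustration). It compresses text-like data once, then feeds the compressed output back into the same compressor:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

class RecompressDemo {
    // Compress a byte array with DEFLATE (zlib format) at the highest level.
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Hypothetical sample data: random words drawn from a small vocabulary.
    static byte[] sampleText(int words) {
        String[] vocab = {"file", "size", "data", "compress", "text", "pdf", "image", "java"};
        Random random = new Random(42); // fixed seed so runs are repeatable
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words; i++) {
            sb.append(vocab[random.nextInt(vocab.length)]).append(' ');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] original = sampleText(20000);
        byte[] once = deflate(original);   // first pass removes the redundancy
        byte[] twice = deflate(once);      // second pass has little left to remove
        System.out.println("original:         " + original.length + " bytes");
        System.out.println("compressed once:  " + once.length + " bytes");
        System.out.println("compressed twice: " + twice.length + " bytes");
    }
}
```

On data like this the second pass typically comes out about the same size as the first, or slightly larger because of the format's own header and checksum. Extremely repetitive input (for example one byte repeated many times) can be an exception, since the first pass may leave some regularity behind.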

Martin Liversage
+1  A: 

JPEG/GIF/AVI/MPEG/MP3 and other already-compressed files won't change much after compression. You may see a small decrease in file size.

waqasahmed
A: 

Media files don't tend to compress well. JPEG and MPEG don't compress, while you may be able to compress .png files.

AutomatedTester
Actually JPEG and MPEG files can often be compressed a few percent by a good compression algorithm.
Michael Borgwardt
Are you sure? Remember that special-purpose compression algorithms often discard data that is not important for the content (like noise in sound files or similar areas in images). That means they always achieve a better compression ratio than general-purpose compression algorithms (which are mainly lossless).
twk
But BMP files compress very well. It does not depend on the type of media, but on the type of compression. And yes, file formats are themselves a kind of compression of information.
smok1
+5  A: 

File compression works by removing redundancy. Therefore, files that contain little redundancy compress badly or not at all.

The kind of files with no redundancy that you're most likely to encounter is files that have already been compressed. In the case of PDF, that would specifically be PDFs that consist mainly of images which are themselves in a compressed image format like JPEG.
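
One rough way to put a number on "redundancy" is the byte-level (order-0) Shannon entropy of the file. The sketch below is an illustration only: the class and method names are invented here, and this measure ignores repeated phrases and other structure that real compressors also exploit, so it is an indicator rather than a prediction of the compressed size.

```java
class EntropyDemo {
    // Order-0 Shannon entropy in bits per byte.
    // Close to 0.0: very redundant. Close to 8.0: byte values look uniformly random.
    static double entropyBitsPerByte(byte[] data) {
        if (data.length == 0) {
            return 0.0;
        }
        long[] counts = new long[256];
        for (byte b : data) {
            counts[b & 0xFF]++; // mask to treat the byte as unsigned
        }
        double entropy = 0.0;
        for (long c : counts) {
            if (c == 0) {
                continue; // unused byte values contribute nothing
            }
            double p = (double) c / data.length;
            entropy -= p * (Math.log(p) / Math.log(2));
        }
        return entropy;
    }

    public static void main(String[] args) {
        byte[] text = "aaaaaaaaaabbbbbbbbbbaaaaaaaaaa".getBytes();
        System.out.println("bits per byte: " + entropyBitsPerByte(text));
    }
}
```

Plain English text usually lands well below 8 bits per byte, while JPEG data or the output of a compressor sits close to 8, which matches the behaviour described above.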

Michael Borgwardt
A: 

Files that are already compressed usually can't be compressed any further, for example MP3, JPG, FLAC, and so on. You could even get files that are bigger, because of the header added by the second compression.

klez
+2  A: 

The only files that cannot be compressed are random ones - truly random bits, or as approximated by the output of a compressor.

However, for any algorithm in general, there are many files that cannot be compressed by it but can be compressed well by another algorithm.
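
To illustrate the point about algorithms, here is a small sketch comparing a naive run-length encoder (written here just for the example, not a standard library API) with `java.util.zip.Deflater` on the same input. The input `abcabcabc...` has no runs of identical bytes, so the run-length encoder makes it bigger, while DEFLATE finds the repeated pattern:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

class RleVsDeflate {
    // Naive run-length encoding: each run becomes a (count, value) byte pair,
    // with runs capped at 255 so the count fits in one byte.
    static byte[] rle(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < input.length) {
            byte value = input[i];
            int run = 1;
            while (i + run < input.length && input[i + run] == value && run < 255) {
                run++;
            }
            out.write(run);
            out.write(value);
            i += run;
        }
        return out.toByteArray();
    }

    // DEFLATE via the JDK, default compression level.
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }

    // Helper: repeat a short ASCII string a number of times.
    static byte[] repeat(String unit, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(unit);
        }
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] pattern = repeat("abc", 1000);
        System.out.println("input:   " + pattern.length + " bytes");
        System.out.println("RLE:     " + rle(pattern).length + " bytes");
        System.out.println("DEFLATE: " + deflate(pattern).length + " bytes");
    }
}
```

The same run-length encoder does fine on a long run of one byte value, so "incompressible" here is a property of the pairing of file and algorithm, not of the file alone.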

Will
A: 

Really, it all depends on the algorithm that is used. An algorithm that is specifically tailored to use the frequency of letters found in common English words will do fairly poorly when the input file does not match that assumption.

In general, PDFs contain images and other content that is already compressed, so they will not compress much further. Your algorithm is probably only able to eke out meagre savings, if any, from the text strings contained in the PDF.

Coxy
A: 

You will probably have difficulty compressing encrypted files too as they are essentially random and will (typically) have few repeating blocks.

Colin Desmond
A: 

PDF files are already compressed. They use the following compression algorithms:

  • LZW (Lempel-Ziv-Welch)
  • FLATE (ZIP, in PDF 1.2)
  • JPEG and JPEG2000 (PDF version 1.5)
  • CCITT (the facsimile standard, Group 3 or 4)
  • JBIG2 compression (PDF version 1.4)
  • RLE (Run Length Encoding)

Depending on which tool created the PDF and which PDF version it targets, different types of compression are used. You can compress it further by using a more efficient algorithm, or by losing some quality, for example by converting images to low-quality JPEGs.

There is a great link on this here

http://www.verypdf.com/pdfinfoeditor/compression.htm
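
If you want a quick idea of which of these a given PDF uses, you can look for the stream filter names the PDF format assigns to them (`/LZWDecode`, `/FlateDecode`, `/DCTDecode` for JPEG, `/JPXDecode` for JPEG2000, `/CCITTFaxDecode`, `/JBIG2Decode`, `/RunLengthDecode`). The following is only a rough sketch with an invented class name: it does a plain text search over the raw bytes, so it will miss filters declared inside compressed object streams and may count a name that merely appears in page text.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

class PdfFilterScan {
    static final String[] FILTERS = {
        "/LZWDecode", "/FlateDecode", "/DCTDecode", "/JPXDecode",
        "/CCITTFaxDecode", "/JBIG2Decode", "/RunLengthDecode"
    };

    // Count how often each filter name occurs in the raw bytes of a PDF.
    static Map<String, Integer> countFilters(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to one char, so binary data survives the conversion.
        String text = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String filter : FILTERS) {
            int count = 0;
            int from = text.indexOf(filter);
            while (from >= 0) {
                count++;
                from = text.indexOf(filter, from + filter.length());
            }
            counts.put(filter, count);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java PdfFilterScan.java <file.pdf>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        countFilters(bytes).forEach((name, n) -> System.out.println(name + ": " + n));
    }
}
```

A file where almost every stream already has a filter is unlikely to shrink much under ZIP-style compression; a file with many unfiltered streams may still have room.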

badbod99
Not really. Not all PDF files automatically store their content in compressed form. But you're right, PDF supports compression. Unless your PDF contains only images, there's a high probability that you could save some extra space using ZIP or RAR.
Salamander2007
It 100% depends on the application which created the PDF, as mentioned in my post.
badbod99
A: 

Simple answer: compressed files (otherwise we could reduce any file to size 0 by compressing it multiple times :). Many file formats already apply compression, and you might find that the file size shrinks by less than 1% when compressing movies, MP3s, JPEGs, etc.

soulmerge
A: 

You can add all Office 2007 file formats to the list (of @waqasahmed):

Since the Office 2007 .docx and .xlsx (etc) are actually zipped .xml files, you also might not see a lot of size reduction in them either.
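
A compression program can spot many such files up front by looking at their first few bytes. This is a minimal sketch with a deliberately short, non-exhaustive signature list (the class and method names are made up for the example). A .docx or .xlsx starts with the same `PK` signature as any other ZIP file:

```java
class MagicBytes {
    // Guess the container/compression format from the leading bytes of a file.
    static String detect(byte[] head) {
        if (startsWith(head, 0x50, 0x4B, 0x03, 0x04)) {
            return "zip"; // also .docx, .xlsx, .jar and other ZIP-based formats
        }
        if (startsWith(head, 0x1F, 0x8B)) {
            return "gzip";
        }
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) {
            return "jpeg";
        }
        if (startsWith(head, 0x89, 0x50, 0x4E, 0x47)) {
            return "png";
        }
        return "unknown";
    }

    // True if data begins with the given unsigned byte values.
    static boolean startsWith(byte[] data, int... magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] docxStart = {0x50, 0x4B, 0x03, 0x04, 0x14, 0x00};
        System.out.println(detect(docxStart));
    }
}
```

Keep in mind that a matching signature only says the file is a ZIP container; individual entries inside a ZIP can be stored without compression, so this is a hint and not a guarantee.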

GvS
+1  A: 

Files encrypted with a good algorithm like IDEA or DES in CBC mode don't compress anymore regardless of their original content. That's why encryption programs first compress and only then run the encryption.
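
The ordering can be tried out with the JDK's own classes. Note that this sketch swaps in AES in CBC mode (`AES/CBC/PKCS5Padding`) instead of IDEA or DES, simply because it is available in every standard JDK; the class name, helper names and sample sentence are assumptions made for the example, and a throwaway key and IV are generated on each call:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.zip.Deflater;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

class EncryptOrderDemo {
    // DEFLATE via the JDK, default compression level.
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }

    // Encrypt with AES-128 in CBC mode using a fresh random key and IV.
    // The key is discarded: this only demonstrates sizes, not real key handling.
    static byte[] encryptWithFreshKey(byte[] data) {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            SecretKey key = generator.generateKey();
            byte[] iv = new byte[16];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES/CBC not available", e);
        }
    }

    // Hypothetical sample data: one sentence repeated many times.
    static byte[] sample(int repeats) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < repeats; i++) {
            sb.append("The quick brown fox jumps over the lazy dog. ");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] plain = sample(2000);
        byte[] encryptThenCompress = deflate(encryptWithFreshKey(plain));
        byte[] compressThenEncrypt = encryptWithFreshKey(deflate(plain));
        System.out.println("plain:                 " + plain.length + " bytes");
        System.out.println("encrypt then compress: " + encryptThenCompress.length + " bytes");
        System.out.println("compress then encrypt: " + compressThenEncrypt.length + " bytes");
    }
}
```

Encrypting first should leave the compressor with nothing to work on, while compressing first keeps nearly all of the savings and only adds the cipher's padding.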

sharptooth