Hello,

Does anyone have any ideas for how to programmatically and quickly check whether a zip file is corrupted, based on file size? Ideally, the best way to check whether a zip file is corrupted is a CRC check, but this can take a long time, especially if there are a lot of large zip files. I would be happy just to be able to do a quick file size or header check.

Thanks in advance.

A: 

This page says that the compressed size is stored in 4 bytes starting at byte 18. You could try reading that and comparing it to the size of the file.
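As a rough sketch of what reading that field could look like in Python (the function name `first_entry_compressed_size` is made up for illustration, and the code assumes the file begins with a standard local file header, with the field stored little-endian at offset 18):

```python
import struct

LOCAL_HEADER_SIGNATURE = b"PK\x03\x04"

def first_entry_compressed_size(path):
    """Return the compressed size stored in the first local file header.

    Returns None if the file does not start with a local file header,
    or if the header defers the sizes to a trailing data descriptor.
    """
    with open(path, "rb") as f:
        header = f.read(30)  # fixed-size part of a local file header
    if len(header) < 30 or header[:4] != LOCAL_HEADER_SIGNATURE:
        return None
    # General purpose flags live at offset 6; bit 3 means the sizes
    # are written after the data rather than in this header.
    (flags,) = struct.unpack_from("<H", header, 6)
    if flags & 0x08:
        return None
    # Compressed size: 4 bytes, little-endian, at offset 18.
    (compressed_size,) = struct.unpack_from("<I", header, 18)
    return compressed_size
```

The value only needs to be smaller than the result of `os.path.getsize(path)` to be plausible; it will not equal the file size, since headers and the central directory take up space too.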

However, I think it's pretty much useless for checking if the zip file is corrupted for two reasons:

  1. Some zip files contain more bytes than just the zip part. For example, self-extracting archives have an executable part, yet they are still valid zip files.
  2. The file can be corrupted without changing its size.

So, I suggest calculating the CRC for a guaranteed method of checking for corruption.
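For what it's worth, here is a minimal sketch of a CRC check in Python, using the standard library's `zipfile.ZipFile.testzip()`, which reads every member and compares it against the CRC-32 recorded in the archive itself. The wrapper name `zip_passes_crc` is just illustrative, and treating `zlib.error` and `EOFError` as corruption is my own assumption about how to handle damaged compressed data.

```python
import zipfile
import zlib

def zip_passes_crc(source):
    """Return True if every member matches its stored CRC-32.

    `source` may be a path or a seekable binary file object.
    This reads and decompresses all data, so it is slow on large archives.
    """
    try:
        with zipfile.ZipFile(source) as zf:
            # testzip() returns the name of the first bad member, or None.
            return zf.testzip() is None
    except (zipfile.BadZipFile, zlib.error, EOFError):
        # Not a zip file at all, or the compressed stream is damaged.
        return False
```

Note that this still costs a full read of the archive, which is exactly the expense the question wants to avoid.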

imgx64
Also, many zip creation tools will write the header before they know the length of the file, so these bytes remain zero (to support streaming, presumably).
SimonJ
What @SimonJ said is true. In addition, the compressed size starting at byte 18 is the compressed size of a single entry in the zip file, not the compressed size of the zip file as a whole.
Cheeso
Also, this may be obvious, but it is worth stating: "calculating the CRC" verifies the file only if the original CRC is known.
Cheeso
A: 

DotNetZip, a free open source library for handling zip files in .NET languages, supports a CheckZip() method that does what you want. There are various levels of assurance available at your option. The basic level just checks the consistency of the metadata. The most complete level does a full extraction of the zip file into a bit bucket to verify that the actual compressed data is not corrupted.
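If you are not on .NET, a cheap metadata-only check can be approximated in other languages. The Python sketch below is not DotNetZip's implementation, just an illustration of the idea under my own assumptions: it parses the central directory with the standard `zipfile` module, then confirms that each entry points at a local header signature and that its compressed data would fit inside the file. The name `quick_metadata_check` is hypothetical.

```python
import os
import zipfile

def quick_metadata_check(path):
    """Cheap structural check: no decompression and no CRC verification."""
    file_size = os.path.getsize(path)
    try:
        with zipfile.ZipFile(path) as zf:  # parses the central directory
            entries = zf.infolist()
    except zipfile.BadZipFile:
        return False
    with open(path, "rb") as f:
        for info in entries:
            # Each central directory entry should point at a local header.
            f.seek(info.header_offset)
            if f.read(4) != b"PK\x03\x04":
                return False
            # The 30-byte fixed header plus the data must fit in the file.
            if info.header_offset + 30 + info.compress_size > file_size:
                return False
    return True
```

This catches truncated downloads and mangled directories quickly, but a flipped byte inside the compressed data will pass unnoticed; only a full read with CRC comparison detects that.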

Cheeso