When I create a zip archive via java.util.zip.*, is there a way to split the resulting archive into multiple volumes? Let's say my overall archive has a file size of 24 MB and I want to split it into 3 files with a limit of 10 MB per file. Is there a zip API which has this feature? Or are there any other nice ways to achieve this?

Thanks Thollsten

+3  A: 

Check: http://saloon.javaranch.com/cgi-bin/ubb/ultimatebb.cgi?ubb=get_topic&f=38&t=004618

I am not aware of any public API that will help you do that. (Although if you do not want to do it programmatically, there are utilities like WinSplitter that will do it.)

I have not tried it, but every ZipEntry used with ZipInputStream/ZipOutputStream has a compressed size, so you can get a rough estimate of the size of the zipped file while creating it. If you need 2 MB zip files, you can stop writing to a file once the cumulative size of the entries reaches 1.9 MB, leaving 0.1 MB for the manifest file and other zip-specific elements. So, in a nutshell, you can write a wrapper over ZipOutputStream as follows:

import java.util.zip.ZipOutputStream;
import java.util.zip.ZipEntry;
import java.io.FileOutputStream;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;

public class ChunkedZippedOutputStream {

    private ZipOutputStream zipOutputStream;

    private String path;
    private String name;

    private long currentSize;
    private int currentChunkIndex;
    private static final long MAX_FILE_SIZE = 16000000; // Whatever size you want
    private static final String PART_POSTFIX = ".part.";
    private static final String FILE_EXTENSION = ".zip";

    public ChunkedZippedOutputStream(String path, String name) throws FileNotFoundException {
        this.path = path;
        this.name = name;
        constructNewStream();
    }

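    public void addEntry(ZipEntry entry) throws IOException {
        // getCompressedSize() returns -1 when the size is unknown, so count such entries as 0.
        long entrySize = Math.max(entry.getCompressedSize(), 0);
        // Roll over to the next chunk first, then always write the entry so it is never dropped.
        if (currentSize > 0 && (currentSize + entrySize) > MAX_FILE_SIZE) {
            closeStream();
            constructNewStream();
        }
        currentSize += entrySize;
        zipOutputStream.putNextEntry(entry);
    }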

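    // Writes data for the entry most recently passed to addEntry().
    public void write(byte[] data, int offset, int length) throws IOException {
        zipOutputStream.write(data, offset, length);
    }

    // Finishes the current chunk; call this once after the last entry.
    public void close() throws IOException {
        closeStream();
    }

    private void closeStream() throws IOException {
        zipOutputStream.close();
    }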

    private void constructNewStream() throws FileNotFoundException {
        zipOutputStream = new ZipOutputStream(new FileOutputStream(new File(path, constructCurrentPartName())));
        currentChunkIndex++;
        currentSize = 0;
    }

    private String constructCurrentPartName() {
        // This will give names in the form of <file_name>.part.0.zip, <file_name>.part.1.zip, etc.
        StringBuilder partNameBuilder = new StringBuilder(name);
        partNameBuilder.append(PART_POSTFIX);
        partNameBuilder.append(currentChunkIndex);
        partNameBuilder.append(FILE_EXTENSION);
        return partNameBuilder.toString();
    }
}

The above program is just a hint of the approach and not a final solution by any means.
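
One weakness of relying on ZipEntry.getCompressedSize() is that it returns -1 until the size is actually known. A possible alternative, sketched here under the assumption that you control the stream chain, is to count the bytes that really reach the file. CountingOutputStream is a hypothetical helper class (it is not part of the JDK) that would sit between the ZipOutputStream and the FileOutputStream:

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper: counts every byte passed through to the wrapped stream.
class CountingOutputStream extends FilterOutputStream {

    private long count;

    CountingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        count++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Forward the whole range at once instead of byte by byte.
        out.write(b, off, len);
        count += len;
    }

    // Number of bytes written to the underlying stream so far.
    long getCount() {
        return count;
    }
}
```

With this in place, the wrapper could build each chunk as new ZipOutputStream(new CountingOutputStream(new FileOutputStream(...))) and check getCount() after each closeEntry() to decide whether to start a new chunk. The count at that point does not yet include the central directory, which is only written when the ZipOutputStream is closed, so some headroom is still needed.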

sakana
A: 

Not exactly what you want, but if you cannot do this with zip volumes, you may consider simply dividing the file (in Java) as described here.
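
A minimal sketch of that kind of plain byte-level split, not taken from the linked article. The class name FileSplitter and the <name>.001, <name>.002 naming scheme are assumptions made for illustration. Note that the resulting parts are not valid zip files on their own; they have to be concatenated in order before the archive can be opened.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: cuts any file into consecutive parts of at most maxPartSize bytes.
class FileSplitter {

    static List<Path> split(Path source, long maxPartSize) throws IOException {
        if (maxPartSize <= 0) {
            throw new IllegalArgumentException("maxPartSize must be positive");
        }
        List<Path> parts = new ArrayList<>();
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(source)) {
            // Never read more than fits into the part currently being filled.
            int read = in.read(buffer, 0, (int) Math.min(buffer.length, maxPartSize));
            while (read != -1) {
                // Parts are named <source>.001, <source>.002, ... next to the source file.
                Path part = source.resolveSibling(
                        String.format("%s.%03d", source.getFileName(), parts.size() + 1));
                long written = 0;
                try (OutputStream out = Files.newOutputStream(part)) {
                    while (read != -1 && written < maxPartSize) {
                        out.write(buffer, 0, read);
                        written += read;
                        // If this part is full, the next read already belongs to the next part.
                        long room = written < maxPartSize ? maxPartSize - written : maxPartSize;
                        read = in.read(buffer, 0, (int) Math.min(buffer.length, room));
                    }
                }
                parts.add(part);
            }
        }
        return parts;
    }
}
```

For the 24 MB archive from the question, FileSplitter.split(archive, 10L * 1024 * 1024) would be expected to produce three parts.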

VonC
+2  A: 

If the goal is to have the output be compatible with pkzip and winzip, I'm not aware of any open source libraries that do this. We had a similar requirement for one of our apps, and I wound up writing our own implementation (compatible with the zip standard). If I recall, the hardest thing for us was that we had to generate the individual files on the fly (the way that most zip utilities work is they create the big zip file, then go back and split it later - that's a lot easier to implement). It took about a day to write and 2 days to debug.

The zip standard explains what the file format has to look like. If you aren't afraid of rolling up your sleeves a bit, this is definitely doable. You do have to implement a zip file generator yourself, but you can use Java's Deflater class to generate the segment streams for the compressed data. You'll have to generate the file and section headers yourself, but they are just bytes - nothing too hard once you dive in.
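
To give a feel for the "just bytes" part, here is a rough sketch, not the implementation described above. ZipPieces and its method names are hypothetical. It compresses a buffer with Deflater in raw mode (nowrap = true, since zip entries store raw deflate data without the zlib wrapper) and builds a single local file header in the little-endian layout the zip specification defines. It assumes sizes below 4 GB (no Zip64), uses a fixed placeholder timestamp, and leaves out the central directory and everything specific to split archives.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical helper showing two building blocks of a hand-written zip generator.
class ZipPieces {

    // Compresses data as a raw deflate stream, as used by zip compression method 8.
    static byte[] deflateRaw(byte[] data) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    // Builds a local file header: 30 fixed bytes followed by the file name.
    static byte[] localFileHeader(String name, long crc32, long compressedSize, long uncompressedSize) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(30 + nameBytes.length).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(0x04034b50);                 // local file header signature
        buf.putShort((short) 20);               // version needed to extract (2.0)
        buf.putShort((short) 0x0800);           // general purpose flags: bit 11 = UTF-8 file name
        buf.putShort((short) 8);                // compression method: deflate
        buf.putShort((short) 0);                // last mod time: 00:00:00 (placeholder)
        buf.putShort((short) 0x0021);           // last mod date: 1980-01-01 (placeholder)
        buf.putInt((int) crc32);                // CRC-32 of the uncompressed data
        buf.putInt((int) compressedSize);
        buf.putInt((int) uncompressedSize);
        buf.putShort((short) nameBytes.length); // file name length
        buf.putShort((short) 0);                // extra field length
        buf.put(nameBytes);
        return buf.array();
    }
}
```

The CRC value can be computed with java.util.zip.CRC32 over the uncompressed bytes.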

Here's the zip specification - section K has the info you are looking for specifically, but you'll need to read A, B, C and F as well. If you are dealing with really big files (we were), you'll have to get into the Zip64 stuff as well - but for 24 MB, you are fine.

If you want to dive in and try it and you run into questions, post back and I'll see if I can provide some pointers.

Kevin Day
I'm having problems with multi-volume zip files, specifically when a single file component is split across more than one disk file. In file.zx01 I have the file header and the first part of the compressed data, then in file.zx02 I have the rest of the compressed data. But I'm not able to reassemble the files for some reason, and I'm not sure why. Do you have any experience here?
vy32