We have a project page that holds users' files, multimedia, and so on, and we want to let the user export all of it as a single zip file. We currently use Unix and MySQL to store all of this, and our primary goal is to minimize the load and processing time of compiling all the files into a zip file.

My idea was to cache the zip file in a temp dir and keep the CRC checksum of each file in the zip in a separate text file. Each time the user does an export, I would first compute each file's CRC and compare it to the list, then add or remove files from the zip file as needed.
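A minimal Python sketch of that checksum comparison, just to make the idea concrete. The function names `file_crc32` and `diff_against_manifest`, and the choice to hold the stored list as a dict of relative path to CRC, are assumptions for illustration and not part of the actual project.

```python
import zlib
from pathlib import Path


def file_crc32(path, chunk_size=1 << 16):
    """CRC-32 of a file, read in chunks so large media files are not loaded whole."""
    crc = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def diff_against_manifest(project_dir, manifest):
    """Compare files on disk with a stored {relative_path: crc} manifest.

    Returns (changed_or_new, removed, current_manifest).
    """
    root = Path(project_dir)
    current = {
        p.relative_to(root).as_posix(): file_crc32(p)
        for p in root.rglob("*")
        if p.is_file()
    }
    # A file is "changed" if it is new or its CRC differs from the stored one.
    changed = sorted(n for n, crc in current.items() if manifest.get(n) != crc)
    # A file is "removed" if the manifest lists it but it is no longer on disk.
    removed = sorted(n for n in manifest if n not in current)
    return changed, removed, current
```

Note that computing a CRC still reads every byte of every file on each export, so this check alone may cost a good part of what it is trying to save.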

My other concern is the disk space the cached zip files will occupy, as we might have a lot of users.

IMHO, this is probably the dumbest way possible to do this, so can any of you guys please suggest a better way to deal with this problem?

thanks ~codeNoobian

A: 

If bandwidth/download speed is not a concern, I recommend you use an uncompressed tar file. TAR is a very simple format, so it will be easy to write code to update sections of it when a few of the files have changed. Also, leaving it uncompressed will be a huge win on server CPU time.

Of course, leaving it uncompressed will take a lot of storage space on your server. But since it is uncompressed, you might not need to keep a cached copy of the file at all: if you can build it fast enough, you can just build it on the fly as needed. Then you don't have to worry about storing CRCs and updating the TAR, either.
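As a rough sketch of building the archive on the fly, here is how it could look with Python's standard `tarfile` module. The function name `stream_tar` and the assumption that all files live under one directory are mine; with files scattered around you would pass each source path and archive name explicitly.

```python
import tarfile
from pathlib import Path


def stream_tar(project_dir, out):
    """Write an uncompressed tar of project_dir to `out`, a writable binary object."""
    root = Path(project_dir)
    # "w|" means uncompressed stream mode: data is written sequentially with
    # no seeking, so `out` can be a socket file or HTTP response body.
    with tarfile.open(fileobj=out, mode="w|") as tar:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # arcname keeps the folder/subfolder layout relative to the root.
                tar.add(str(path), arcname=path.relative_to(root).as_posix())
```

Because nothing is compressed, the cost is essentially reading the files and writing them back out.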

SoapBox
+2  A: 

This reeks of premature optimization. Just use very light compression (the 'fastest' setting) and worry about speed only if it's actually a problem.
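For what it's worth, a minimal sketch of the "fastest" setting using Python's standard `zipfile` module (the `compresslevel` argument needs Python 3.7 or later). The function name `build_zip` and the single-root directory layout are assumptions for illustration.

```python
import zipfile
from pathlib import Path


def build_zip(project_dir, out, level=1):
    """Zip project_dir into `out` (a path or writable binary object).

    level=1 is the fastest deflate setting; 9 is the slowest and smallest.
    """
    root = Path(project_dir)
    with zipfile.ZipFile(
        out, "w", compression=zipfile.ZIP_DEFLATED, compresslevel=level
    ) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # arcname preserves the subfolder structure inside the zip.
                zf.write(path, arcname=path.relative_to(root).as_posix())
```

Passing `compression=zipfile.ZIP_STORED` instead skips compression entirely and still produces a regular zip file.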

TravisO
A: 

Hi SoapBox, the thing is that our users will all be on Windows, so I need to give them the archive in zip format.

Also, the zip is not primarily used to save space; it's rather a way to bundle everything into a neat folder-and-subfolder structure for the user to unzip.

So will tar still work in this case?

Hi TravisO, so you're saying I should just do it on the fly? Our files are scattered around. Is there a good way to benchmark the processing time on Unix or something?

Thanks, guys, for the prompt replies.

melaos
WinZip, WinRAR, and most decompression programs can read TAR files. Windows cannot read them natively, though (it CAN read zip files natively), but most people have an external program anyway.
SoapBox
A: 

Common sound and image files are pretty well compressed to start with, aren't they? It might be worth looking at your payload to see how much you're buying with compression.
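One quick way to look at the payload is to compress a sample of each file type and compare sizes. This is only a sketch: the helper name `compression_ratio` is made up, and it uses `zlib` at level 1 as a stand-in for a zip tool's fastest deflate setting.

```python
import zlib


def compression_ratio(data, level=1):
    """Compressed size divided by original size.

    Values near 1.0 mean compression buys almost nothing for this payload.
    """
    if not data:
        return 1.0
    return len(zlib.compress(data, level)) / len(data)
```

Running this over a few representative files (for example the bytes of a JPEG or MP3 versus a text file) should show quickly whether compressing is worth the CPU time.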

le dorfier