Greetings.

1 - Let's say I have about 500 folders of variable size with a total size of 100 GB.

2 - I want to distribute these folders automatically in other folders until the size of 700 MB is reached with the best optimization of space.

Example: In folder "CD--01" I want to have the maximum number of folders possible without passing the limit of 700 MB, and so on in "CD--02", "CD--03"...

Is there a tool that allows me to do this "on the fly" or will I have to code one myself?

Thanks

A: 

If you're on UNIX (including Mac OS X), you can script something like

tar cvzf allfolders.tgz ./allfolders
split -b 700m allfolders.tgz

This will create a (compressed) archive of all the folders and then split it into 700M-sized chunks. However, when you want to reconstitute the original folder set, you'll need to recombine all the pieces and then extract the archive with tar.
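For the way back, here is a minimal sketch. It assumes the pieces still carry split's default names (xaa, xab, ...) and that nothing else in the current directory matches x??. The function name rejoin_and_extract is made up for illustration.

```shell
#!/bin/bash
# Rejoin the pieces written by `split` and unpack the archive.
# Assumption: default piece names (xaa, xab, ...) in the current directory.
rejoin_and_extract() {
    # The glob expands in sorted order, so the pieces are concatenated
    # in the same order split wrote them.
    cat x?? > allfolders.tgz
    # Recreates ./allfolders from the rejoined archive.
    tar xzf allfolders.tgz
}
```

Copy all the pieces from the CDs into one directory first, then call rejoin_and_extract there.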

If you want to keep them as individual OS folders on the CD, that's fairly difficult (in fact I think it's a bin packing problem, a close relative of the knapsack problem, and it is NP-hard).

frankodwyer
A: 

There are tools that will do this. Similar to frankodwyer's answer, WinZip will take your 100 GB, zip it up and split it into 'chunks' of any size you'd like, e.g. ~700 MB.

Here's the page for the WinZip split feature

Andrew
A: 

This is a very naive and poorly coded solution, but it works. My bash-fu is not strong, but a shell script seems like the best way to approach this problem.

#!/bin/bash
# Pack every item in the current directory into CD_1, CD_2, ...
# Needs GNU du (-b reports the apparent size in bytes).
limit=700000000
dirnum=1
used=0
for i in *
    do
    # skip CD_* directories left over from an earlier run
    case "$i" in CD_*) continue ;; esac
    size=$(du -b -s "$i" | cut -f 1)
    if [ "$size" -gt "$limit" ]
     then
     echo "$i is too big for a single folder, skipping"
     continue
    fi
    # start the next CD directory if this item would overflow the current one
    if [ $((used + size)) -gt "$limit" ]
     then
     echo "CD_$dirnum is full"
     dirnum=$((dirnum + 1))
     used=0
    fi
    if [ ! -d "CD_$dirnum" ]
     then
     echo "creating directory CD_$dirnum"
     mkdir "CD_$dirnum"
    fi
    echo "moving $i to CD_$dirnum"
    mv "$i" "CD_$dirnum"
    used=$((used + size))
done
Sparr
Thanks Sparr.... I'm not under UNIX.... but I can always share a folder between Win and a Unix virtual machine and run that script. I'll give it a try.
Joao Heleno
bash is available on windows via cygwin, although some consideration must be given to issues such as drive letters and \ vs /
Sparr
Also, as joel.neely's answer points out, one obvious improvement is to look for smaller things to move into an almost-full directory, instead of creating a new one as soon as the next item won't fit into the current one.
Sparr
+2  A: 

Ultimately you're asking for a solution to the Knapsack Problem, which comes in many forms.

A simple approach is the greedy pseudocode below, but it will not produce optimal solutions for all inputs (see the articles above).

while (there are unallocated files) {
    create a new, empty directory
    set remaining space to 700,000,000
    while (the size of the smallest unallocated file is at most (<=) the remaining space) {
        copy into the current directory the largest unallocated file with size at most the remaining space
        subtract that file's size from the remaining space
        remove that file from the set of unallocated files
    }
    burn the current directory
}

(Of course, this assumes that no single file will be greater than 700MB in size. If that's possible, be sure to remove any such files from the unallocated list, else the above will produce infinitely many empty directories! ;-)
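The pseudocode above could be sketched in bash roughly as follows. This is only an illustration under a few assumptions: the function name pack_greedy and the CD_ prefix are made up, the input is one "size<TAB>name" line per item on stdin (the format du prints), names contain no newlines, and it prints a plan instead of copying or burning anything. Because the list is sorted largest first, a single pass per directory picks the largest item that still fits at each step.

```shell
#!/bin/bash
# pack_greedy LIMIT
# Reads "size<TAB>name" lines on stdin, prints "CD_n<TAB>name" lines.
pack_greedy() {
    local limit=$1 bin=0 remaining line size name
    local -a items=() rest=()
    # Load the items largest first; drop anything that can never fit,
    # which avoids the endless run of empty directories.
    while IFS=$'\t' read -r size name; do
        if [ "$size" -le "$limit" ]; then
            items+=("$size"$'\t'"$name")
        fi
    done < <(sort -rn)
    while [ "${#items[@]}" -gt 0 ]; do
        bin=$((bin + 1))
        remaining=$limit
        rest=()
        for line in "${items[@]}"; do
            size=${line%%$'\t'*}
            name=${line#*$'\t'}
            if [ "$size" -le "$remaining" ]; then
                # fits: assign it to the current directory
                printf 'CD_%d\t%s\n' "$bin" "$name"
                remaining=$((remaining - size))
            else
                # does not fit: keep it for a later directory
                rest+=("$line")
            fi
        done
        items=("${rest[@]}")
    done
}
```

With GNU du, something like `du -sb -- */ | pack_greedy 700000000` should print which CD_n each folder would go to; the moving or burning step can then read that list.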

joel.neely