views:

105

answers:

4

My drive has DMG-blocks. The sum of their sizes is strictly below 47GB. I have 11 DVDs, each of the size 4.7GB. I want to use as small amount of DVDs as possible, without using compressing (the problem may be superflous, since it considers the most optimal combinations in terms of DMG-files. You can think it in terms of compressed files, if you want.).

You can see that the DMG-files have arbitrary sizes. So many solutions are possible.

find . -iname "*.dmg" -exec du '{}' \; 3&> /dev/null

1026064 ./Desktop/Desktop2.dmg
5078336 ./Desktop/Desktop_2/CS_pdfs.dmg
2097456 ./Desktop/Desktop_2/Signal.dmg
205104  ./Dev/things.dmg
205040  ./Dev/work.dmg
1026064 ./DISKS/fun.dmg
1026064 ./DISKS/school.dmg
1026064 ./DISKS/misc.dmg
5078336 ./something.dmg

The files in DVDs can have an arbitrary order. For example, CS_pdfs.dmg and Signal.dmg do not need to be on the some disk.

So how can you find the way to use as small amount of DVDs as possible?

A: 

The most generic solution would involve implementing a simple backtracking algorithm, but I'm fairly certain that in this particular case you can just sort them by size and pick the largest file that fits on your disc over and over until it's full, then move on to the next with the remaining files.

Blindy
This is a `greedy strategy` and *may* waste lots of space. It depends on how precice the wanted solution must be.
Dario
or how fast! don't even need a computer to do it this way <.<
Blindy
+1  A: 

Your problem is called bin packing problem mathematically (which is related to the knapsack problem.)

Since it is np-hard, it very difficult to solve this efficiently! There is a recursive solution (dynamic programming + backtracking) but even this may require big amounts of space and computation time.

The most straightforward solution is a greedy algorithm (see Blindy's post), but this may give bad results.

It depends on how many items (n) you want to pack and how precise the solution must be (more precision will increase the runtime!). For small n the recursive/bruteforce or backtracking solution is sufficient, for bigger problems I'd advice to use some metaheuristic - especially genetic algorithms work quite well and yield good approximations in acceptable timespans.

Dario
A: 

You should probably try the greedy algorithm before anything else - That is, pick the largest item that can fit on the remaining DVD each time. While this is not guaranteed to work well, this problem is NP-complete and so no efficient solution exists. I had a similar problem recently, and the greedy algorithm worked quite well in my case - maybe it'll be good enough in yours as well.

Nathaniel Flath
rather than roll his own greedy solution, he'd be better off finding an existing bin-backing program and plugging his data into that.
Martin DeMello
Maybe, although it's a pretty trivial problem and might take him less time to just write his own.
Nathaniel Flath
+2  A: 

Totally alternate solution: Use split and cut up the borders onto multiple DVDs. You'll get 100% utilization of every disk but the last. http://unixhelp.ed.ac.uk/CGI/man-cgi?split

Autocracy