Hey folks, I'm trying to figure out an algorithm that will help me group an assortment of files of varying sizes into say, 'n' groups of approximately equal size.
Any ideas on how to achieve this?
Hey folks, I'm trying to figure out an algorithm that will help me group an assortment of files of varying sizes into say, 'n' groups of approximately equal size.
Any ideas on how to achieve this?
K means might help you. It's a good starting point to research about more advanced clustering algorithms, but given that your problem is 1-dimensional, k-means should be more than enough.
Find the target group size. This is the sum of all sizes divided by n.
Create a list of sizes.
Sort the files decreasing in size.
for each group
while the remaining space in your group is bigger than the first element of the list
take the first element of the list and move it to the group
for each element
find the elemnet for which the difference between group size and target group size is minimal
move this elemnt to the group
This doesn't produce optimal results, but is easy to implement and gets you good results. For the optimal solution you need an exhaustive search which is NP complete.
Your implicit optimization goal is most likely to minimize n, the number of groups. Then you have exactly the bin packing problem, sometimes called the cutting stock problem.
Netlib has this fortran code to solve the more general multiple knapsack problem (items have profit as well cost/weight values).