Greetings!
I feel like this problem is related to the bin packing problem, as well as potentially to the set partitioning problem... I just want to bounce this off of someone before I head down the path too deeply.
I have input data (in a datafile) as follows:
entry_one 55
entry_two 56
entry_three 61
entry_four 62
entry_five 62
entry_six 68
entry_seven 72
entry_eight 73
entry_nine 78
entry_ten 79
entry_eleven 84
entry_twelve 85
entry_thirteen 91
entry_fourteen 92
entry_fifteen 99
entry_sixteen 100
entry_seventeen 121
entry_eighteen 125
entry_nineteen 127
entry_twenty 161
With this data I want to have an algorithm that: groups the entries into groups such that the entries's associated numerical values within a group are within X (in my case, X is 16.) So for example, one arrangement could be:
group one:
entry_one
entry_two
entry_three
entry_four
entry_five
entry_six
group two:
entry_seven
entry_eight
entry_nine
entry_ten
entry_eleven
entry_twelve
group three:
entry_thirteen
entry_fourteen
entry_fifteen
entry_sixteen
group four:
entry_seventeen
entry_eighteen
entry_nineteen
group five:
entry_twenty
This particular arrangement was achieved using a naive greedy algorithm in which I started with the lowest value (entry_one's 55), and allowed all values that were under 55+16 to be part of that group. I then started with the very next entry which was not in that group (entry_seven's 72) and allowed all values that were under 72+16 to be part of that group (group two), and so on in that order.
I believe that although a naive greedy algorithm works, it is unlikely to give me an optimal grouping/categorization, where I define "optimal grouping" such that the total number of groups is what is being minimized (in my case, this is for job scheduling, so I want to group like work as best as possible to minimize changeover.)
Any thoughts, modules, algorithms, sample code out there that people can suggest?
Thanks!
EDIT: I thought I should add how this is different from the bin packing problem. In the bin packing problem the optimization is: "given these bins of fixed size, with these objects of fixed size, how can I stuff the most of these fixed size objects into these bins without overflowing each bin." In my case, what I have is bins of infinite size but of filtered entry, so that if an object does "match" the signature for a bin, it can be inserted into said bin, but what we want is to minimize the total number of bins that we need to create.