views:

53

answers:

2

I am trying to create a number of restricted permutations of a list of items. Each item has a category, and I need to find combinations of items such that each combination does not have multiple items from the same category. To illustrate, here's some sample data:

   Name      | Category
   ==========|==========
1. Orange    | fruit
2. Apple     | fruit
3. GI-Joe    | toy
4. VCR       | electronics
5. Racquet   | sporting goods

The combinations would be restricted to length three, I don't need every combination of every length. So a set of combinations for the above list could be:

(Orange, GI-Joe, VCR)
(Orange, GI-Joe, Racquet)
(Orange, VCR, Racquet)
(Apple,  GI-Joe, VCR)
(Apple,  GI-Joe, Racquet)
... and so on.

I do this fairly often, on various lists. The lists will never be more than 40 items in length, but understandably that could create thousands of combinations (though there will likely be around 10 unique categories per each list, restricting it somewhat)

I've come up with some pseudo-python for how I would implement it recursively. It's been too long since I took combinatorics, but from what I recall this is essentially a subset of the combinations of the set, something like C(list length, desired size). There's likely some library modules which can make this cleaner (or at least more performant)

I was wondering if perhaps there was a better approach than what I've got (perhaps one which uses itertools.combinations somehow):

# For the sake of this problem, let's assume the items are hashable so they
# can be added to a set.

def combinate(items, size=3):
    assert size >=2, "You jerk, don't try it."
    def _combinate(index, candidate):
        if len(candidate) == size:
            results.add(candidate)
            return
        candidate_cats = set(x.category for x in candidate)
        for i in range(index, len(items)):
            item = items[i]
            if item.category not in candidate_cats:
                _combinate(i, candidate + (item, ))

    results = set()
    for i, item in enumerate(items[:(1-size)]):
        _combinate(i, (item, ))

    return results
+2  A: 

Naive approach:

#!/usr/bin/env python

import itertools

items = {
    'fruits' : ('Orange', 'Apple'),
    'toys' : ('GI-Joe', ),
    'electronics' : ('VCR', ),
    'sporting_goods' : ('Racquet', )
}

def combinate(items, size=3):
    if size > len(items):
        raise Exception("Lower the `size` or add more products, dude!")

    for cats in itertools.combinations(items.keys(), size):
        cat_items = [[products for products in items[cat]] for cat in cats]
        for x in itertools.product(*cat_items):
            yield zip(cats, x)

if __name__ == '__main__':
    for x in combinate(items):
        print x

Will yield:

# ==> 
#
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('sporting_goods', 'Racquet')]
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('fruits', 'Orange')]
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('fruits', 'Apple')]
# [('electronics', 'VCR'), ('sporting_goods', 'Racquet'), ('fruits', 'Orange')]
# [('electronics', 'VCR'), ('sporting_goods', 'Racquet'), ('fruits', 'Apple')]
# [('toys', 'GI-Joe'), ('sporting_goods', 'Racquet'), ('fruits', 'Orange')]
# [('toys', 'GI-Joe'), ('sporting_goods', 'Racquet'), ('fruits', 'Apple')]
The MYYN
Your edited implementation looks to be more or less exactly what I need. Thanks!
Crast
You're welcome.
The MYYN
+1  A: 

What you seek to generate is the Cartesian product of elements taken from set of category.

The partitioning into multiple sets is relatively easy:

item_set[category].append(item)

With proper instantiation (e.g.) collections.defaultdict for item_set[category] and then itertools.product will give you the desired output.

msw
Ah yes, cartesian product. My brain kept thinking 'kleene star' and then kept rejecting it because those are infinite. Thanks, your comment and MYYN's edited implementation should get me going in the right direction
Crast
indeed, you seemed like you were looking to be taught to fish, not given a fish, but that certainly doesn't mean future questions aren't welcome.
msw