views:

253

answers:

5

I know that if I have the following group of numbers

{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }

I can have 5040 different 4-digit numbers, using 10! / (10 - 4)!

But if I repeat one number on the initial group, like

{ 1, 1, 2, 3, 4, 5, 6, 7, 8, 9 }

How many different 4-digit numbers can we build ? I know the answer is 3360, just don't know how to implement it.

Important: In this case, numbers like "1123" or "1213" should be valid combinations, but not "1113", as there are only two ones on the initial group.

Also, for the group

{ 1, 1, 2, 2, 3, 4, 5, 6, 7, 8 }

There should be 2190 4-digit numbers. Any ideas on how to calculate these answers ?

This is not homework, I'll get this numbers from a specific hardware and depending on the number of combinations, will return some data.

A: 

You might want to reference this outstanding Combinatorics implementation.

Eric J.
A: 

This would be pretty easy without duplication . . . for

{ 1, 1, 2, 3, 4, 5, 6, 7, 8, 9 }

Since you only have 9 distinct choices (as opposed to the 10 in the original problem), the answer should be 9! / (9 - 4)!

( By the way, you can actually have more different 4-digit numbers if you allow repetition, i.e. 4456. Then the answer would just be 9^4 4-digit numbers. )

Similarly, {1, 1, 2, 2, 3, 4, 5, 6, 7, 8 } has 8 distinct choices, so the answer should be 8! / (8 - 4)! by your original math.

Edit and actual answer: Maybe you meant that if the 1 is duplicated in your set, you can use two 1's in the answer?

Edit 2: Working, tested python module provided

In that case, you might try calculating the distinct number of possibilities, and then adding the results with duplicates, as the following python code suggests):

import math

def set_exclude(a, b):
    result = []
    for i in a:
        if not i in b:
            result.append(i)
    return result

def cnt_unique(aset, choices_picked, count_left, count_total):
    # Sanity check
    if count_left < 0:
        return 0
    if count_left == 0:
        return math.factorial(count_total)
    # Do non-duplicate combinations first
    # Set unprocessed = (set without any elements in choices_picked)
    unprocessed = set_exclude(aset, choices_picked)
    temp = len(set(unprocessed))

    # If we have no more valid items to choose from, this is impossible
    if temp >= count_left:
        result = math.factorial(temp) / math.factorial(temp - count_left) / \
                 math.factorial(count_left) * math.factorial(count_total)
    else:
        result = 0

    # Now find duplicate-involving combinations
    for member in set(unprocessed):
        valid = True
        for contest in choices_picked:
            if contest >= member:
                valid = False
                break

        if valid:
            count = unprocessed.count(member)
            new_choices = choices_picked + [ member ]
            for i in range(2, count+1):
                result_temp = cnt_unique(aset, new_choices, \
                  count_left - i, count_total)
                if result_temp != 0:
                    result_temp //= math.factorial(i)
                    result += result_temp
    return result

aset = [ 1, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
result_size = 4
combinations = cnt_unique(aset, [], result_size, result_size)

OK, I've verified by hand that algorithm works for all of your cases presented above. I'm fairly confident it works in more general cases, but I don't have time at the moment to do any additional test cases (for instance, if there were 3 1's, or 3 groups of duplicates). Note it would also blow up if there were no numbers in set which were not in choices_picked (ie you have one unique number duplicated 10 times).

Edit 3: Regarding how many function calls are made with this algorithm for large sets, I tested with the following function call, incrementing a variable once for each non-trivial (count_left >= 0) call to cnt_unique:

>>> def test():
        b = [0]
        c = time.time()
        result = cnt_unique(range(1,51) + range(1,51), [], 4, 4, b)
        c = time.time() - c
        print("Result: " + str(result))
        print("Time: " + str(c))
        print("Calls: " + str(b[0]))

>>> test()
Result: 6240150
Time: 0.0150001049042
Calls: 1276

So, for a 100 element set with 2 entries for each number 1-50, there were 1276 calls. And it executes quite fast; one tick with time.time() is 15 ms, so it executes in typically less than 15 ms.

Walt W
I think you should be able to have two 1s.
Sean Nyman
He wants to include repetition, so the vector, 1122 is valid, but it wont represent 4 different values in the n!/(n-d)! equation. at least based on his numbers, since 9!/(9-4)! != 3360
nlucaroni
exactly. On the group { 1,1,2,3,4,5,6,7,8,9 } 1123 should be a valid combination
guigouz
guigouz: Please try the above algorithm. If I need to give more of an explanation, let me know.
Walt W
Oh, and guigouz: You may want to redescribe your original question to remove the -1 ticker on it; edit the post to include what counts as a separate result (like if there are two 1's in the set, then 1124 or 1123 is valid, but not 1114). I think that's why you got the downvote.
Walt W
@ anyone: Is it appropriate to post my edit there as a new solution, since it has little in common with the original answer? When should multiple answers be used? I've been wondering about that...
Walt W
I implemented something like this early on while learning Haskell. This is mind-numbingly easy in a functional language.
Imagist
@Walt: I think ideally the question and answers evolve (in that order). The change history is always accessible so nothing is lost. Having multiple answers to the question when it was in different states would be confusing.
Troubadour
@Troubadour: Alright, thanks!
Walt W
Walt, it seems to work, but it would require me to list all the possible combinations... 10 elements is just an example, I may have 100+ characters on one request.Anyway it enlightened me, now I'm trying to figure out how many combinations may have the repeated numbers, so I can subtract them.
guigouz
Vote up please? ;) And it should expand to as many characters as you want. The biggest thing I don't know is how well it handles 3 duplicates or 3 groups of duplicates. If I tested that, I'd be convinced (personally). Good luck!
Walt W
I'll definatelly vote up as soons as I have 15 points here on stackoverflow. My main concern is performance and memory consuption as with 50 elements, I'll have 5 million+ possibilities to check for uniques.
guigouz
How do you get that figure? How many duplicates are you expecting in a set of 50?
Walt W
It's a simple math calculation, if you have 50 elements, you can build (50!) / (50-4!) combinations with them! that's 5527200 elements. Iterating through them on a high-load environment is not a choice.
guigouz
The algorithm above does not iterate through all of them. I don't have time to do an upper bound calculation at the moment, but it is somewhat faster than building them all, as it optimizes all combinations containing single elements.
Walt W
It probably is on the order of O(n^m) (where n is size of the selection, and m is the size of the set), given if you have a set where every element is repeated exactly once there will be some recursion, but in practice you should see it go pretty fast (unless you have lots of dups).
Walt W
guigouz: Added run-time / call results and working python implementation
Walt W
A: 

You need to break the problem in 2 cases:

Case 1: zero or one "1" in your 4-digit number Number of permutation is 9! / (9-4)! = 3024

Case 2: two "1"s in your 4-digit number You know two of the digits must be 1, so there are 8*7 ways to select the remaining two digits. And there are (4! / 2!) ways to arrange the two "1"s and two other digits. Hence, the number of permutation is 8*7*(4! / 2!) = 672

It looks like the answer is 3024+672 = 3696

David
His provided answers are right
Walt W
This: 8*7*(4! / 2!) should be 8*7/2!*(4! / 2!) because you account for the ordering of the non-ones in the 4! at the end.
Walt W
Why 8*7 ways ? to select the remaining 2 digits ? Does this assume they can be non-contiguous ? For example if I get 42 as the other 2 digits, I'll be able to have 1421, 1142, 4211, 4112, 4121 and so on.We're almost there, but I'm sure the two answers I provided are right.
guigouz
8 = number of letters in set that aren't 1. 7 = letters in set after removing the first picked letter.
Walt W
No, 3360 is correct for the case of { 1, 1, 2, 3, 4, 5, 6, 7, 8, 9 }. See http://stackoverflow.com/questions/1343412/combinatorics-with-repeated-chars-on-initial-group/1344110#1344110 for details.
Jason
+4  A: 
{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }

Here you are choosing four numbers from ten and they can be any order. Thus the solution is

(10 choose 4) * 4! = 5040.

Now we consider the case of

{ 1, 1, 2, 3, 4, 5, 6, 7, 8, 9 }

There are a few combinations here. We can have zero 1s, one 1 or two 1s. In the first case there are

(8 choose 4) * 4!

possible combinations. In the second case there are

(8 choose 3) * 4!

possible combinations. In the third case there are

(8 choose 2) * 4! / 2!

possible combinations. This last one requires explanation. There are eight possible non 1 digits to choose from and we are choosing two. The two remaining digits are 1s (by assumption that we are in the case where are our 4-string contains two 1s). We can put these four digits in any order however the 1s are interchangeable so we divide by the number of possible orderings of the 1s (2!).

Thus the answer is

(8 choose 4) * 4! + (8 choose 3) * 4! + (8 choose 2) * 4! / 2! = 3360.

For the case of

{ 1, 1, 2, 2, 3, 4, 5, 6, 7, 8 }

there are again a few combinations. We can have zero 1s and zero 2s, one 1 and zero 2s, two 1s and zero 2s, two 1s and one 2, two 1s and two 2s, zero 1s and one 2, zero 1s and two 2s, one 1 and two 2s, and one 1 and one 2. These can be worked out as above and the final answer is

(6 choose 4) * 4!
    + 2 * (6 choose 3) * 4!
    + 2 * (6 choose 2) * 4! / 2!
    + (6 choose 2) * 4!
    + 2 * (6 choose 1) * 4! / 2!
    + (6 choose 0) * 4! / (2! * 2!)
= 2190.

This a fairly simplistic approach to problems of this sort. There are others (for example, inclusion/exclusion) that are more sophisticated but the current approach is the easiest to understand if you're seeing problems of this sort for the first time.

Jason
what do you mean by (10 choose 4) * 4! ? I get the same result using 10! / (10-4)! = 5040.Can you please clarify this ?
guigouz
10 choose 4 = 10! / ((10 - 4!) * 4!), so multiplying by 4! again yields 10! / (10-4)!. He split it out to make the method more apparent.
Walt W
In other words, it means: Out of 10 objects, how many sets of 4 can I pick, without caring about which order I picked out the elements within each set?
Walt W
I'm testing this using bc, I defined the following functiondefine choose(x, y) { return f(x) / ( f(x - y) * f (y) ) * f(4) }f(x) returns the factorial of xit works for choose(10,4), but choose(8,4) + choose(8,3) + choose(8,2) returns 3696Maybe I'm missing something ?
guigouz
@guigouz: (m choose n) = m! / (n! * (m - n)!) is the formula that you want. See http://en.wikipedia.org/wiki/Binomial_coefficient.
Jason
thanks a lot, everything's clear now
guigouz
A: 

If you need to do this on any set larger than your example, watch out for overflows in your intermediate calculations. 13! is already large enough to overflow a 32-bit UINT. Only a few larger than that will overflow 64 bits. Using float/double isn't any better since you'll get the wrong answer without full precision.

You would need to use an arbitrary precision integer class or some custom numeric class that carries the full list of factors when calculating the factorial or any multiplies, then doing factor elimination to simplify the division.

Alan