views:

123

answers:

2

An SO post about generating all the permutations got me thinking about a few alternative approaches. I was considering a space/run-time trade-off and was wondering if people could critique this approach and point out possible hiccups in implementing it in C#.

The steps go as follows:

  1. Given a data structure of homogeneous elements, count the number of elements in the structure.

  2. Assuming the permutation consists of all the elements of the structure, calculate the factorial of the count from Step 1.

  3. Instantiate a new structure (a Dictionary) of type <key (some hash of the collection), Collection<data structure of homogeneous elements>> and initialize a counter.

  4. Hash (???) the seed structure from Step 1, and insert the hash/collection key/value pair into the Dictionary. Increment the counter by 1.

  5. Randomly shuffle (???) the order of the seed structure, hash it, and then try to insert it into the Dictionary from Step 3.

  6. If there is a conflict in hashes, repeat Step 5 to get a new order and hash, and check for a conflict again. Upon successful insertion, increment the counter by 1.

  7. Repeat Steps 5 and 6 until the counter equals the factorial calculated in Step 2.
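The steps above can be sketched as follows. This is a Python sketch rather than C# (the question's target language), and since the hash function in Steps 4-5 is left unspecified, a tuple of the elements stands in for it here:

```python
import math
import random

def permutations_by_shuffling(seed):
    """Generate all permutations of `seed` by repeated random shuffling,
    recording each distinct ordering in a dictionary (Steps 1-7 above)."""
    n = len(seed)                      # Step 1: count the elements
    total = math.factorial(n)          # Step 2: expect n! permutations
    found = {}                         # Step 3: dict of hash key -> ordering
    current = list(seed)
    counter = 0
    found[tuple(current)] = list(current)  # Step 4: insert the seed ordering
    counter += 1
    # Steps 5-7: shuffle, retry on conflicts, until counter reaches n!
    while counter < total:
        random.shuffle(current)        # Step 5: random reorder
        key = tuple(current)           # tuple stands in for the "hash"
        if key not in found:           # Step 6: conflict check
            found[key] = list(current)
            counter += 1
    return list(found.values())

perms = permutations_by_shuffling([1, 2, 3])
print(len(perms))  # 6 = 3!
```

Note that using the full tuple as the key sidesteps hash collisions between distinct orderings, which a real fixed-width hash would have to worry about.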

It seems like doing it this way, using some sort of randomizer (which is a black box to me at the moment), might help with getting all the permutations within a decent timeframe for datasets of obscene sizes.

It would be great to get some feedback from the great minds of SO to further analyze this approach, whose objective is to deviate from the traditional brute-force approach prevalent in algorithms of this nature, and to understand the repercussions of implementing such an algorithm in C#.

Thanks

+1  A: 

It seems like a complicated way to randomise the order of the generated permutations. In terms of time efficiency, you can't do much better than the 'brute force' approach.

Mau
Are there any examples of approaches doing better than the 'brute force' approach?
sc_ray
+1  A: 

This method of generating all permutations does not fare well compared to the standard known methods.

Say you had n items and M=n! permutations.

This method of generation is expected to produce about M*lnM permutations (counting duplicates) before discovering all M distinct ones.

(See this answer for a possible explanation: http://stackoverflow.com/questions/3282832/programing-pearls-random-select-algorithm/3282949#3282949)
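The M*lnM figure is the coupon collector expectation. A quick sketch (using the exact harmonic-number form M*H_M, which is ~M*lnM) shows how the expected number of shuffles outgrows M itself:

```python
import math

def expected_shuffles(n):
    """Expected number of random shuffles needed to see all n! permutations,
    by the coupon collector argument: M * H_M, where M = n! and H_M is the
    M-th harmonic number (H_M ~ ln M)."""
    M = math.factorial(n)
    harmonic = sum(1.0 / k for k in range(1, M + 1))
    return M * harmonic

# Compare M against the expected number of shuffles for small n:
for n in range(2, 6):
    M = math.factorial(n)
    print(n, M, round(expected_shuffles(n), 1))
```

For n = 3, for instance, there are M = 6 permutations but the expected number of shuffles is 6 * H_6 = 14.7, and the gap widens as n grows.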

Also, what would the hash function be? For a reasonable hash function, we would have to start dealing with very-large-integer issues pretty soon (certainly for any n > 50; I don't remember the exact cut-off point).

This method uses up a lot of memory too (the hashtable of all permutations).

Even assuming the hash is perfect, this method would take expected Omega(n*M*logM) operations and guaranteed Omega(nM) space, while standard well-known methods can do it in O(M) time and O(n) space.

As a starting point I suggest reading: Systematic Generation of All Permutations, which I believe is O(nM) time and O(n) space and still much better than this method.
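For contrast, here is a minimal sketch of one standard O(n)-space technique: the classic lexicographic next-permutation step, which derives each permutation from the previous one in place, with no hashing and no stored history:

```python
def next_permutation(a):
    """Transform list `a` in place into its next lexicographic permutation.
    Returns False when `a` was already the last (descending) permutation.
    Uses O(1) extra space beyond the list itself."""
    # Find the rightmost index i with a[i] < a[i+1].
    i = len(a) - 2
    while i >= 0 and a[i] >= a[i + 1]:
        i -= 1
    if i < 0:
        return False
    # Find the rightmost element greater than a[i] and swap it in.
    j = len(a) - 1
    while a[j] <= a[i]:
        j -= 1
    a[i], a[j] = a[j], a[i]
    # Reverse the suffix to get the smallest following arrangement.
    a[i + 1:] = reversed(a[i + 1:])
    return True

a = [1, 2, 3]
count = 1
while next_permutation(a):
    count += 1
print(count)  # 6 permutations visited, using only the one working list
```

Each call does O(n) work in the worst case, and the only memory used is the single working list, which is exactly the O(n)-space behaviour described above.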

Note that if one has to generate all permutations, any algorithm will necessarily take Omega(M) steps, and so the method I refer to above is optimal!

Moron
Thanks for the well-drafted response. This is the sort of analysis I have been looking for. The purpose of this was not to suggest a "better"/"worse"/"quite bad" method, just to explore alternatives. As part of a scientific community your analysis quantifies the shortcomings, but I think when responding we should refrain from using verbiage that borders on the hostile.
sc_ray
@sc_ray: Sorry, I didn't mean to offend you in any way. I will edit it out.
Moron
@Moron: No problem at all. So, in layman's words: from a time perspective, the problem stems from the randomization, which takes too many tries before it narrows down an arrangement that is unique; and from a space perspective, there is the unnecessary overhead of storing the entire collection again and again in memory, when storing just the hash would suffice (assuming the hash function is perfect)?
sc_ray
@sc_ray: From the time perspective, yes: you are likely to hit dupes (try googling Birthday Paradox, for instance). The MlogM figure is the expected number of tries you need before you find all permutations. From the space perspective, even if you store just the compact hash (instead of the whole permutation), the space is Omega(M), while the standard method can generate the next permutation by reordering the elements of the previous one, and so is O(n). That amounts to really huge savings in space (e.g., for 10 items, instead of ~40 bytes you will be using ~3MB).
Moron
.. for more than 100 items, the Omega(M) space usage makes it impossible! (100! is probably more than the estimated number of atoms in the universe, or some other cliché like that...)
Moron
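A quick check of the Omega(M) blow-up described in the comments above (M is just the number of hash entries the Dictionary must hold, before any per-entry byte counts):

```python
import math

# The dictionary must hold one entry per permutation, i.e. M = n! entries.
print(math.factorial(10))            # 3628800 entries for just 10 items

# For 100 items, M dwarfs the roughly 10^80 atoms estimated in the
# observable universe, so storing even one bit per permutation is hopeless.
print(math.factorial(100) > 10**80)  # True
```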
@Moron: Thanks. That clarifies a lot. This may be something I can look up but just for the edification of the readers, how do you gauge the space to be Omega(M) for storing the hashes?
sc_ray
@sc_ray: You have to store at least M-1 hashes, don't you?
Moron
@Moron: Thanks.
sc_ray