views:

546

answers:

2

Let's say I have an array of elements for which a total ordering exists. The bubble sort distance is the number of swaps that it would take to sort the array if I were using a bubble sort. What is an efficient (will probably involve dynamic programming) way to calculate the number of possible permutations of this array that will have a bubble sort distance less than or equal to some pre-specified number?

If it simplifies the problem, you may assume that all elements of the array are unique (no ties).

+1  A: 
Charlie Martin
I don't think that's useful or relevant... it is about something similar, but how does it help here?
ShreevatsaR
I disagree. I'm also suspecting this might be somewhat homeworky, so I'm hinting at a direction.
Charlie Martin
This is not a homework problem. I tried to get rid of all unnecessary context and keep just the part that I'm stuck on, but actually, it's for calculating the exact P-value for Kendall's tau correlation.
dsimcha
Okay, I was traumatized by a guy in grad school that made these dynamic programming things *every* problem. (The Wagner of Wagner-Fisher, in fact.) Still, I think this is the pattern you need.
Charlie Martin
I thought of one way of counting the numbers in the question, but I still don't see a way to use this answer. :-) It would be cool to see a connection, really... (beyond "they're both dynamic programming")
ShreevatsaR
Well, I haven't got code to demonstrate he solution, but consider: a swap is exactly one delete and insert. So it ought to be possible to use WF by reduction.
Charlie Martin
Ah, I see, thanks. The problem is that the edit distance (Wagner-Fisher / whatever) is the number of edits it would take if you choose to do the edits optimally, but the bubble-sort distance is the number of swaps that a particular fixed (and stupid) algorithm will take.
ShreevatsaR
+2  A: 

Ok, here's a solution. Let's assume that all elements of the array are distinct, and further, without loss of generality, we can assume that they are {1,...,n}. (We can always relabel the elements so that this is the case, and nothing gets affected.)

First, we can observe that the number of swaps performed by bubble sort is the number of inversions in the permutation a[1..n]: the number of pairs (i,j) such that i<j but a[i]>a[j]. (This is not too hard to prove.)

So we want the number of permutations of {1,...,n} with at most k inversions. Let c(n,k) denote this number. Any permutation of {1,...n} can be thought of as taking a permutation of {1,...,n-1} and inserting {n} into it somewhere. If you insert it at position i, it creates exactly n-i new inversions. So the old permutation must have had at most k-(n-i) inversions. This gives:

c(n,k) = sum_{i s.t. n-i≤k} c(n-1, k-(n-i))
       = sum_{i=max(1,n-k) to n} c(n-1, k-n+i)

And the base case:

c(1,0) = 1 (or better, c(0,0)=1)

(Note that k is at most n*(n-1)/2 < n2.)


Update: The above takes O(n^2k) — so upto O(n^4) — time to compute c(n,k), because each of the nk c(n,k)'s takes O(n) time to compute given the earlier ones. We can improve by a factor of n by making the recurrence shorter, so that each c(n,k) can be computed in O(1) time given earlier ones. Write j for k-n+i so that

c(n,k) = sum_{j=max(k-n+1,0) to k} c(n-1, j)

Note that most of the sum is the same for c(n,k) and c(n,k-1). Specifically,

When k≤n-1, c(n,k) = c(n,k-1) + c(n-1,k)
When k≥n,   c(n,k) = c(n,k-1) + c(n-1,k) - c(n-1,k-n)


Updated program: (I wrote a lazy memoised version; you can make it slightly more efficient by making it bottom-up, the usual way with dynamic programming.)

ct = {(0,0): 1}
def c(n,k):
    if k<0: return 0
    k = min(k, n*(n-1)/2) #Or we could directly return n! if k>=n*(n-1)/2
    if (n,k) in ct: return ct[(n,k)]
    ct[(n,k)] = c(n,k-1) + c(n-1,k) - c(n-1,k-n)
    return ct[(n,k)]

if __name__ == "__main__":
    n = input("Size of array: ")
    k = input("Bubble-sort distance at most: ")
    print c(n,k)
ShreevatsaR
Nice, implemented it and it works.
dsimcha
Cool, glad to hear that. :)
ShreevatsaR