Given two sorted vectors a
and b
, find all vectors which are sums of a
and some permutation of b
, and which are unique once sorted.
You can create one of the sought vectors in the following way:
- Take vector
a
and a permutation of vectorb
. - Sum them together so
c[i]=a[i]+b[i]
. - Sort
c
.
I'm interested in finding the set of b
-permutations that yield the entire set of unique c
vectors.
Example 0: a='ccdd'
and b='xxyy'
Gives the summed vectors: 'cycydxdx'
, 'cxcxdydy'
, 'cxcydxdy'
.
Notice that the permutations of b
: 'xyxy'
and 'yxyx'
are equal, because in both cases the "box c" and the "box d" both get exactly one 'x'
and one 'y'
.
I guess this is similar to putting M
balls in M
boxes (one in each) with some groups of balls and boxes being identical.
Update: Given a string a='aabbbcdddd'
and b='xxyyzzttqq'
your problem will be 10 balls in 4 boxes. There are 4 distinct boxes of size 2, 3, 1 and 4. The balls are pair wise indistinguishable.
Example 1: Given strings are a='xyy'
and b='kkd'
.
Possible solution: 'kkd'
, 'dkk'
.
Reason: We see that all unique permutations of b
are 'kkd'
, 'kdk'
and 'dkk'
. However with our restraints, the two first permutations are considered equal as the indices on which the differ maps to the same char 'y'
in string a
.
Example 2: Given strings are a='xyy'
and b='khd'
.
Possible solution: 'khd'
, 'dkh'
, 'hkd'
.
Example 3: Given strings are a='xxxx'
and b='khhd'
.
Possible solution: 'khhd'
.
I can solve the problem of generating unique candidate b
permutations using Narayana Pandita's algorithm as decribed on Wikipedia/Permutation.
The second part seams harder. My best shot is to join the two strings pairwise to a list, sort it and use it as a key in a lookup set. ('xx'
+'hd'
join→'xh','xd'
sort→'xd','xh'
).
As my M
is often very big, and as similarities in the strings are common, I currently generate way more b
permutations than actually goes through the set filter. I would love to have an algorithm generating the correct ones directly. Any improvement is welcome.